| Author |
Message |
   
Chinmoy Bhagwat
| | Posted on Friday, August 13, 2004 - 05:44 am: |    |
Hi, I want to use FPA to estimate a data warehousing project. The project involves developing the ETL mappings to extract data from various sources and, after transformation, load it into an ODS (Operational Data Store). The design of the target tables is already done, and reporting is not in the scope of the project. How do I go about counting FP? Are there any FP counting guidelines available for this kind of project? Thanks. |
   
Chinmoy Bhagwat
| | Posted on Friday, August 13, 2004 - 08:40 am: |    |
Just to add to the query: We are going to use the Informatica ETL tool. Are productivity figures available for this tool? |
   
Cindy Woodrow
| | Posted on Sunday, August 15, 2004 - 11:50 am: |    |
There will be a presentation on this at this year's IFPUG conference, held in San Diego, CA, September 21-24. For more information, go to http://www.ifpug.org/conferences/confAgenda.htm ...Cindy |
   
Carol Dekkers
| | Posted on Sunday, August 15, 2004 - 10:23 pm: |    |
Dear Chinmoy: I am teaching the one-day Data Warehouse workshop at the upcoming IFPUG workshops in San Diego, CA. However, in case you cannot attend, here is some information to get you started counting your application / project immediately:

1. Normally, the data warehouse project/application boundary includes the ODSs (operational data stores) created from the legacy database feeds (often this is the Extract, the E in the ETL process) as well as the T (transform) and L (load) processes. So the ETL processes are all contained within a single application boundary. This is NOT a 100% rule; however, it is the way MOST DW apps are seen from the user perspective.

2. Assuming we have a single application boundary as described in 1, AND assuming the purpose of the count is to determine the size of the entire DW, we now have the starting place to perform the count.

3. We need to step back from HOW the application is implemented (Informatica ETL) and consider the USER REQUIREMENTS for data. DW files differ from ODS files in that ODSs normally contain the snapshot (current) values of the consolidated data (i.e., everything about a customer currently), which may be populated from a variety of databases. I am assuming that your ETL tool will populate these ODSs -- correct? If so, then you need to figure out how many distinct, standalone, unique elementary processes are required (typically one High EI with a number of input datasets coming in to ETL the data into a single entity ODS such as Customer).

4. For the DW files themselves (which contain historical records and versioning of those records), you need to again consider how many distinct groups are created through each ETL elementary process (an EI each). You take the E + T + L processes that populate a single DW entity as one elementary EI process. You cannot extract alone and leave the business in a consistent state (the data sits in a staging set of tables which are, for all intents and purposes, "temporary"), nor can you simply transform it without the load processes. So the ETL for 1 DW entity is one EP (i.e., one EI).

5. The data mart(s) that are subsequently produced to meet the needs of various users in various applications are then counted as EOs or EQs, depending on the requirements and processing they entail, based on the traditional EO / EQ rules per IFPUG 4.2.

6. The ILFs of the DW application under the aforementioned scenarios include the ODSs and the DW entities that are maintained. They do not include Code Tables (which are no longer counted at all per IFPUG 4.2), Associative Entities (except by following the three rules of IFPUG 4.2), or any other tables required by Informatica processing (i.e., technical requirements).

I hope this clarifies things for you. Best regards, Carol Dekkers, President, Quality Plus Technologies, Inc. Software Measurement Solutions -- POWERful Results! www.qualityplustech.com |
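The counting approach in the post above can be sketched in a few lines of Python. This is only an illustration, not an official IFPUG tool: the weights are the standard IFPUG unadjusted function point weights, but the two entities, their names, and the complexity ratings assigned to the ILFs are hypothetical assumptions made up for the example.

```python
# Standard IFPUG unadjusted function point weights by type and complexity.
WEIGHTS = {
    "EI":  {"low": 3, "average": 4, "high": 6},
    "EO":  {"low": 4, "average": 5, "high": 7},
    "EQ":  {"low": 3, "average": 4, "high": 6},
    "ILF": {"low": 7, "average": 10, "high": 15},
    "EIF": {"low": 5, "average": 7, "high": 10},
}


def unadjusted_fp(functions):
    """Sum the weights for a list of (name, function_type, complexity) tuples."""
    return sum(WEIGHTS[ftype][complexity] for _name, ftype, complexity in functions)


# Hypothetical ODS with two entities (names and ILF ratings are assumptions).
# Each entity gets one High EI covering its whole E+T+L process, plus one ILF.
ods_functions = [
    ("Load Customer (E+T+L)", "EI", "high"),
    ("Customer", "ILF", "average"),
    ("Load Product (E+T+L)", "EI", "high"),
    ("Product", "ILF", "low"),
]

print(unadjusted_fp(ods_functions))  # 6 + 10 + 6 + 7 = 29
```

Staging tables and tool-specific tables are deliberately left out of the list, since the post treats them as technical rather than user-recognizable data.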
   
Chinmoy Bhagwat
| | Posted on Monday, August 16, 2004 - 03:01 am: |    |
Dear Carol, Thank you very much for the details. I will not be able to attend the workshop as I am in India. Anyway, I have a few further questions. 1. Basically, the project involves just populating the ODSs from various sources after applying some transformations. In that case, do your points #1 and #2 hold good? 2. I could not understand point #3 entirely, especially the last parenthetical part. Would you mind elaborating on this? 3. The DW that will be fed from these ODSs is not in scope as of now. Given this, will the size that we are calculating be accurate and help us arrive at an effort estimate? Thanks again, Chinmoy. |
   
Carol Dekkers
| | Posted on Monday, August 16, 2004 - 11:13 pm: |    |
Dear Chinmoy, You may not realize that there will be others from India travelling to San Diego for the IFPUG conference in September. You are certain to miss a great conference -- but I do understand that distance, budgets, and financial implications often make international travel less feasible. Now I will answer your questions as I understand them:

1. Your questions about points 1 and 2 of my prior posting: You are working solely with either a new project or a new application that takes the various disparate sources from within the business enterprise and assembles the data through a series of ETL processes to obtain the snapshot ODSs that your business enterprise needs, correct? To restate this -- you will have information extracts from various external sources that your software assembles (typically into staging tables), edits, massages/transforms, and then loads into comprehensive entities, which are the ODS tables. Correct? Your boundary for this project (or application so far) will have the external data sources counted either as EIF(s) or as DETs on an EI ETL process. That the DW is outside the scope of the project is okay; we can work specifically with the ODS ETL processes.

2. You did not understand (that happens when things are done via email / postings!) point 3 of my prior posting. I hope this helps: Having identified the boundary of our project / ODS application, we now have to figure out the unique elementary processes (EPs) for the ODS project, and to do so, we have to step back and ignore the "physical implementation".
For example, if your business is, say, banking, then your information warehouse (i.e., the ODSs and/or the DW files) will include the following entities:
- Account holders (aka customers) -- may be multiple types such as regular consumer account holders, commercial/corporate, homeowners (who may not hold any accounts other than a mortgage), etc.
- Bank location (aka branch) and its information -- may be multiple types such as storefront, ATM only, full service, etc.
- Service types (aka products) -- loans (car, home, equity, collateral, business financing, etc.)
- and so on...

When we take each of these comprehensive entities (each consolidates the disparate data about the entity gathered, massaged, and transformed from a variety of feeder sources), we need to look at what it takes to create a complete entity. For example, if one of the entities is, in fact, account holders, then there might be tens of input feeds that each contribute some DETs to the eventual complete ODS entity called "Account Holders". Physically there might be multiple input streams -- however, we need ALL of the data from all the sources converging together to be transformed into a complete and self-consistent Account Holder entity. This would be a single (albeit complex -- HIGH) EI process which encompasses the E+T+L processing. In other words, it is like saying we are going to create a consolidated entity called Account Holders that contains the current data for all account holders from all sources pulled together into a single repository (Account Holder would be an ILF in the ODS project/application). So our (physical) data feeds are typically combined because logically they are all needed to complete the ETL processes for Account Holders. (Note that there may be 1 EI per Account Holder type if the data and ETL processing for each type are distinct from those of the other types.) I hope that that clarifies things -- let me know.
As a result, you would have at least 1 ETL "create account holder" EI process (a High EI) per consolidated entity in your ODS. If update and deletion are also possible, then there would be 3 EIs (add, change, delete) for each entity in the ODS app/project.

3. You asked about using the FP size of just the ODS portion to estimate the effort to create the ODS itself. FP are only as good as the model they (and other critical input variables) are used within. If you have a rudimentary hrs / FP guesstimating equation, then you cannot expect great and accurate results. If your tool/model(s) are more robust and tailored to your environment, then the estimate for developing the ODS (assuming care was used to identify values for the other estimating input variables) should be good. Remember to always cite an estimated "range" of hours and provide rounded numbers (i.e., no decimal places!!!), especially when estimating effort at a very early stage / phase. Too many novices (and even practitioners who ought to know better) will publish and communicate "estimates" that contain what appear to be "exact numbers", complete with decimal places in the figures for hours by task. Don't fall into that trap! Make sure that the level of implied precision you cite matches the phase at which you are estimating. Now, is that any more clear? (I ought to get paid by the word count!)
Regards, Carol Dekkers, President, Quality Plus Technologies, Inc. Software Measurement Solutions -- POWERful Results! www.qualityplustech.com |
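Two of the points in the reply above (one EI per consolidated entity, or three if add, change, and delete are all possible; and quoting effort as a rounded range) can be sketched as follows. This is a rough illustration only: the hours-per-FP rates are invented placeholders, not published Informatica productivity figures, and rounding to the nearest 10 hours is just one assumed way of avoiding false precision.

```python
def ei_count(entities, maintained=False):
    """One 'create' EI per consolidated ODS entity; three (add, change,
    delete) per entity when update and deletion are also possible."""
    return len(entities) * (3 if maintained else 1)


def effort_range(size_fp, hours_per_fp_low, hours_per_fp_high, round_to=10):
    """Return a (low, high) effort range in whole hours, coarsely rounded."""
    def coarse(hours):
        return int(round(hours / round_to) * round_to)
    return coarse(size_fp * hours_per_fp_low), coarse(size_fp * hours_per_fp_high)


# Entities taken from the banking example in the post.
entities = ["Account Holders", "Bank Location", "Service Types"]

eis = ei_count(entities, maintained=True)  # 3 entities x 3 EIs = 9
ei_size = eis * 6                          # every EI rated High (weight 6) = 54 FP

# 8.5 and 14.25 hours/FP are hypothetical rates chosen only for illustration.
print(effort_range(ei_size, 8.5, 14.25))   # (460, 770)
```

The ILFs and any EIFs would be added to the size in the same way before applying the rates; they are omitted here to keep the sketch short.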
   
Chinmoy Bhagwat
| | Posted on Tuesday, August 17, 2004 - 01:02 am: |    |
Dear Carol, That indeed was very detailed. :-) Thanks very much for a wonderful response. It definitely has cleared my doubts. Thanks again !!! Chinmoy. |
   
Shubhashree
| | Posted on Wednesday, September 22, 2004 - 01:23 pm: |    |
Dear Chinmoy, You can read the article at the URL below. It is quite useful and covers many of the issues that come up in a DW project: http://www.dpo.it/english/resources/papers/2001-fesma-fpdw-en.pdf Regards, Shubha |
   
Carol Dekkers
| | Posted on Wednesday, September 29, 2004 - 01:40 am: |    |
Chinmoy, Two additional resources for DW projects: 1. My company (Quality Plus Tech) teaches a one-day FP for DW Projects workshop -- the materials are available for an onsite workshop or as a correspondence course. I taught this workshop last week prior to the IFPUG conference in San Diego. 2. Chris Kohnz gave a one-hour presentation on DW project counting, and it is available on the IFPUG conference program CD (orderable from the IFPUG office -- send an email to ifpug@ifpug.org for details and pricing). Regards, Carol Dekkers, President, Quality Plus Technologies, Inc. Software Measurement Solutions -- POWERful Results! www.qualityplustech.com |
|