Loading a fact table
I have a set of dimension tables with are mainly 1 or 2 column detail and also one column also for the surrogate key. I am trying to populate the fact table and I wanted to know what the best methods are to populate the fact table. Now the fact table previously had information like the SalesRepName, siteLocation. I have now separated these onto dimension tables namely Staff and also Location. The idea is that instead of having the staff name, Location name in the fact table, we can represent them with their respective foreign keys. However when loading the data to the tables. Do I not need to join the tables so as to get the respective foreign key. In this particular case, is it god practice to index those columns on the dimension tables and also is SSIS better to do this job using lookups or is old SQL queries better ? StaffID StaffRepName 1 Tom 2 Jack 3 Phil 4 Jill LocationID SiteName 1 Paris 2 Lens 3 Monaco
May 24th, 2012 8:48am

Yes, you need to join to the dimension tables to get the surrogate keys. So make sure you load the dimensions prior to loading the fact table. I usually put a clustered index on the surrogate key (good for joining in reporting) and a non-clustered index on the business key (good for doing the lookup). You can do it in TSQL, but the SSIS lookup works pretty well if your dimensions aren't too big (which they normally aren't). Make sure you use the full caching option in the lookup and that you select only the surrogate key and the business key.MCTS, MCITP - Please mark posts as answered where appropriate.
Free Windows Admin Tool Kit Click here and download it now
May 24th, 2012 8:59am

If I wanted to go down the SQL route, will this involve the use of left joins on all the dimension tables, to cater for lookups that dont match and return a null ? Thanks.
May 24th, 2012 10:58am

There should be no such occurrence as a row in the fact table not having a corresponding row in a dimension table. You should set up foreign keys to enforce this if you're not confident in your SSIS ETL process. I don't understand your "SQL route". You're loading facts with SSIS, correct? Then you need to look up surrogate keys using SSIS in order to write rows to your fact table correctly, do you not? Using the SSIS Lookup component is how you do this. You can't do that in T-SQL after you've loaded the fact table, because you don't have any business keys to look up. When you report on the fact+dimension tables, yes, you'll join facts to dimensions by their surrogate keys. Again, LEFT joins should NOT be necessary. Talk to me now on
Free Windows Admin Tool Kit Click here and download it now
May 24th, 2012 11:21am

You can use TSQL with LEFT OUTER JOINS before you load the fact table. If you want to do it in the dataflow, you put this in the source so you can lookup all your dimension keys there. If your dataset isn't really large, it doesn't really matter which method you choose - TSQL or Lookup - just do what you're comfortable with.MCTS, MCITP - Please mark posts as answered where appropriate.
May 24th, 2012 4:49pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics