fuzzy lookup maintain index incredibly slow
I am using SQL 2005 SP3. I have an ssis package that uses a fuzzy lookup to match records. The created lookup table is about 750k rows and the FLRef table is about 1.5 million rows. The performance of the lookup operation is pretty good. If I use the same
index, I am able to quad thread it and refer all of the lookups to that single table, which seems to perform a lot better than if I make separate indexes.
The index table itself seems to take about 2-4 minutes to create from scratch. However, the issue is when I do the maintain index. When I insert 20 rows, it takes about 11 seconds. Once a month, I need to do an insert that is about 15-25k rows. I tried to
do that last night and it ran for 9 hours and then I stopped it. It seemed to insert all of the rows to the table, but I couldn't tell if the trigger was still trying to run or what.
Does this sound normal? I know that with a trigger (which I know very little about), it is basically performing the operation one row at a time. It seems to mysteriously create tables that the sp procedure uses to update the eti. Is there anyway to batch
run the sys.sp_FuzzyLookupTableMaintenanceInvoke procedure so that it does all of the inserted rows at once? Could this be if I deleted the trigger manually before, and then recreated the index table?
November 11th, 2010 9:15am
Is the issue with slow inserts or long maintenance? What is your goal?Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
November 11th, 2010 11:10am
Slow inserts. This runs as part of a procedure that inserts a months worth of data right before the fuzzy match is run. I only need to insert 15-25k rows at a time. I tried batching it with 200-500 rows but that was slow as well. It seems wierd to me because
the 750k table is very fast building the index from scratch. It wouldn't be an issue if I wasnt trying to multithread it, which means that all of the fuzzies should ideally hit the same index (or so I have found), and you can't specify a new index with the
existing fuzzies, because even if you just created it, it says that it cant find flref_234085n2 and such.So, maintaining is the idea way because then I don't have to rebuild the index and package every time I want to run it.
I'm working on a work around now that will parse out the rows from the input and the fuzzy matching by one of the exact match specifications (this might even be faster since i will ultimately shrink the fuzzy match indicies). But, I'd really like the insert
/ maintenance to work though so I don't have to set up special views every time I want to do this.
November 11th, 2010 11:18am
Drop the index (you can also disable it which I'd prefer), insert data, recreate or re-enable the index.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
November 11th, 2010 11:22am
Is this the FuzzyLookupTableMaintenanceRefTableCopyIndex that is on the FLRef table?
Are you saying that if I insert into the table with the trigger (and I am assuming the trigger is disabled at this point) and rebuild the above (or whichever) index, it will update the match index automatically?
November 11th, 2010 11:28am
Or wait, do you mean keep the trigger active while the index is disabled (and it wont rebuild every time), and then enable it after the doing the insert as normal?
Free Windows Admin Tool Kit Click here and download it now
November 11th, 2010 11:37am
Exactly this.Arthur My Blog
November 11th, 2010 11:38am
Thanks, i will give that a shot--
Free Windows Admin Tool Kit Click here and download it now
November 11th, 2010 11:41am
Arthur, are you absolutely sure, this is the fuzzy index used and maintained by the fuzzy lookup transformation , which is not really an index at all but several tables?
mcollins1264, I think the simplest option would be to not use the maintain index option at all, and just re-create the fuzzy index followng your bulk load. If you really do want to keep the maintain option on for smaller inserts and updates then add a clean-up
step to drop the objects before the bulk load. This also assumes that the bulk load can be run during a period of downtime as you won't have the fuzzy capability without the index.http://www.sqlis.com | http://www.konesans.com
November 11th, 2010 11:49am
I think he means the FuzzyLookupTableMaintenanceRefTableCopyIndex, which is on one of those tables, or are you saying that is a separate set of tables in itself?
My proc is in shambles right now, so I wont be able to test this for a little while, but I think he is saying that if that rebuilding index , of which is presumably the slow part of the trigger, will not occur if the index is disabled or gone. Then, if it is
re-enabled / rebuilt, it will say "oh here are a bunch of new rows" rather than doing that on every insert.
I am making a bunch of assumptions, so please correct me if I am wrong. Again I wont be able to test this for some time, but I am definately interested as this will be key on a bunch of these that I built recently.
Free Windows Admin Tool Kit Click here and download it now
November 11th, 2010 12:01pm