SSIS Update only subset of rows
I have an UPDATE tasks in the package. We will have an issue if these run as is against our production database. The issue is that due to the size of these two tables being updated, these UPDATE statements will fill the transaction log. Since a given UPDATE statement is an atomic transaction, each row that is modified by the UPDATE statement will be written to the transaction until all the rows affected by the UPDATE statement have been modified. As such, you'll need to execute the UPDATE in a loop, updating only a subset of rows (say 500,000) with each update statement to avoid filling up the transaction log. The Update Statement : UPDATE Table1 SET Table1.Field1 = StagingTable.Field1 FROM Table1 INNER JOIN StagingTable ON Table1.Field2 = StagingTable.Field2 Any ideas ? Thanks
May 5th, 2011 11:11am

In loop or not, how the Transaction log will not be filled? And what kind of a database space constraint you have so you cannot update data, it is already an un-healthy environment, beware.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
May 5th, 2011 11:33am

You might write script which loop and update only 1000-2000 rows each iteration, commit data before pull another set that way you can avoid fill up this transactional log. We have a similary kind of problem to update 5million rows every day - for that we implimented process which pull 5000 rows update and commit before pull next 5000 row set. Hope this helps you. If this is a data warehouse (not OLTP) then you can change recovery model simple before doing this updates and revert it back full once you finish this process. http://uk.linkedin.com/in/ramjaddu
May 5th, 2011 11:33am

Here is an example on how to do an update in batches: http://www.codeproject.com/KB/aspnet/BatchUpdate.aspxArthur My Blog
Free Windows Admin Tool Kit Click here and download it now
May 5th, 2011 11:36am

Thanks RamJaddu and Arthur !! But i think writing it all in T-SQl would be nice to avoid those round trips, lets say we are updating 5000 row for an iteration with over 680 million rows between those two tables, then it will take 136,000 round trips. Could you suggest me how to write it all in a T-SQL ? here is my update statement UPDATE Table1 SET Table1.Field1 = StagingTable.Field1 FROM Table1 INNER JOIN StagingTable ON Table1.Field2 = StagingTable.Field2 Thanks again !!
May 5th, 2011 12:28pm

I do not think there are round trips involved, all the SQL gets executed on the SQL Server anyways.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
May 5th, 2011 2:10pm

You did not specify what is vendor of your database is that you are trying to update but if you want to learn something new in T-SQL you can use the MERGE command to do that update on SQL Server 2005 and up as described here: http://charliedigital.com/2009/06/05/t-sql-merge-my-new-friend/Arthur My Blog
May 5th, 2011 2:13pm

I know that you have a Logging issue Just heads-up you may need to some tunning check http://blogs.msdn.com/b/mattm/archive/2010/06/29/performance-tuning-ssis-video.aspx check it all or just jump to 33:30sec and see what its saying about updates in SSIS story short 4:30Sec using a UPDATE in ssis becomes 1s using a different technique, anyways nice to check that Good luck Nik Sincerely SH -- MVP, MCITP 2008, MCTS 2008 & 2005 -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
Free Windows Admin Tool Kit Click here and download it now
May 5th, 2011 4:15pm

Hi Can anyone suggest how to do it I mean how to modify the update query in sql script task for batch update ? Lets say for this statement : The Update Statement : UPDATE Table1 SET Table1.Field1 = StagingTable.Field1 FROM Table1 INNER JOIN StagingTable ON Table1.Field2 = StagingTable.Field2 Sorry if that sounds silly ...Im just learning Thank You :)
May 5th, 2011 6:45pm

Assuming that Field2 is a sequential integer and that there's one row per value, here's some code to do the next 5,000 rows. --Find out how the point that has already been updated DECLARE @Field2AlreadyUpdated INTEGER SELECT @Field2AlreadyUpdated=MAX(Table1.Field2) FROM Table1 WHERE Table1.Field2 is not null -- Assuming your field is null until updated -- Update only the next 5,000 rows UPDATE Table1 SET Table1.Field1 = StagingTable.Field1 FROM Table1 INNER JOIN StagingTable ON Table1.Field2 = StagingTable.Field2 WHERE StagingTable.Field2 Between @Field2AlreadyUpdated + 1 and @Field2AlreadyUpdated + 5000
Free Windows Admin Tool Kit Click here and download it now
May 5th, 2011 10:16pm

I am assuming table TABLE1 column value null until updated Use SET ROWCOUNT ---- some thing like this While (select count(1) from Table1 where Field1 is null) > 0 begin begin tran SET ROWCOUNT 1000; ---- this restrict below update to 1000 rows UPDATE Table1 SET Table1.Field1 = StagingTable.Field1 FROM Table1 INNER JOIN StagingTable ON Table1.Field2 = StagingTable.Field2 Where Table1.Field1 is null If @@ERROR = 0 Commit tran Else Rollback tran end http://uk.linkedin.com/in/ramjaddu
May 6th, 2011 7:50am

Thanks a lot Ramjaddu Can you please modify the following update statement for better understanding !! i was in military for a while and just changed job to software update cds set [mailing_history] = cdst.[mailing_history], [mailing_history_initial] = cdst.[mailing_history_initial], [mailing_history_followup] = cdst.[mailing_history_followup] from dbo.campaign_detail_static cds inner join dbo.campaign_detail_static_temp cdst on cdst.campaign_code_skey = cds.[campaign_code_skey] and cdst.sequence_nr = cds.sequence_nr Thank You :) Learning bee
Free Windows Admin Tool Kit Click here and download it now
May 6th, 2011 11:57am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics