Re: Fastest way to run DELETE on a large table

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance

From: Andrew J. Kelly (sqlmvpnooospam_at_shadhawk.com)
Date: 01/14/05


Date: Fri, 14 Jan 2005 10:40:58 -0500

I don't know of any reason why not assigning an output value would be a
problem but if you don't use them then why have them?

-- 
Andrew J. Kelly  SQL MVP
"Reik" <Reik@discussions.microsoft.com> wrote in message 
news:3CE71788-2949-415A-8813-F27A6B3906DC@microsoft.com...
> Hi Andrew,
> thanks for helping. I will change the SP to do the insert with smaller
> batches.
> About the UNKNOWN TOKEN 0x00 exception, I have read that it is a
> bug in the MS JDBC Driver, dont know if it is true. The SP is supposingly
> notifying in a way the JDBC driver cannot handle.
> http://www.dbforumz.com/JDBC%20Driver-Unhandled-token-type-error-MS-ftopict145467.html
>
> Iam gonna change to the SP3 version of the driver tonight and see if the
> improved SP helps it. Do you know if it is a problem that my SP's use 
> output
> parameters but never assign output values to the parameters?
>
>
>
> "Andrew J. Kelly" wrote:
>
>> First off I know little of Java and hope to never have to use it 
>> again<g>.
>> My guess would be that they got a timeout while you are inserting the 
>> rows.
>> You are still inserting them all at once and you will most likely lock 
>> the
>> entire table while doing so and if not you will certainly block other 
>> users.
>> A couple more suggestions.  One is to always use SET NOCOUNT ON at the
>> beginning of your sp's to cut down on network traffic and speed execution 
>> of
>> the sp's with lots of activity.    Does your clustered index have 
>> Merchant
>> as the first column?  If so you can probably delete a lot more than 200 
>> rows
>> at a time.  Make sure these sp's are not wrapped in a transaction that 
>> was
>> issued before you called them otherwise you defeat the  whole purpose of
>> doing the small batches.   And finally do the inserts in smaller batches 
>> as
>> well.  You can add an identity colum to the staging table and cluster on 
>> it.
>> Then loop and do something like this:
>>
>> DECLARE @b INT, @e INT
>> SET @b = 1
>> SET @e = 1000
>>
>> WHILE 1 = 1
>> BEGIN
>>
>>     INSERT INTO TableA (...)  SELECT ....
>>         FROM cms_temp_records WITH (NOLOCK)
>>     WHERE YourIdentityCol BETWEEN @B and @end
>>
>>     IF @@ROWCOUNT = 0
>>         BREAK
>>
>>     SET @b = @b + 1000
>>     SET @e = @e + 1000
>>
>> END
>>
>>
>> -- 
>> Andrew J. Kelly  SQL MVP
>>
>>
>> "Reik" <Reik@discussions.microsoft.com> wrote in message
>> news:0EFB2EA1-B116-466D-B9C2-39DD20968432@microsoft.com...
>> > Okay I have been using stored procedures last night. The import batch 
>> > ran
>> > through. However I received a bunch of exception for people querying 
>> > the
>> > live
>> > table during that time. The live table and the staging table are in the
>> > same
>> > database. Here is the exception I got couple of times:
>> >
>> > 2005-01-14 07:17:29,921 [Thread-6038] ERROR - Error executing the
>> > following
>> > statement: SELECT * FROM cms_imported_products p WHERE p.merchant = 588
>> > AND
>> > p.artnr = 'CoR04'
>> > java.sql.SQLException: [Microsoft][SQLServer 2000 Driver for
>> > JDBC]Unhandled
>> > token type: Unknown token: 0x00
>> > at com.microsoft.jdbc.base.BaseExceptions.createException(Unknown 
>> > Source)
>> > at com.microsoft.jdbc.base.BaseExceptions.getException(Unknown Source)
>> > at 
>> > com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReplyToken(Unknown
>> > Source)
>> > at
>> > com.microsoft.jdbc.sqlserver.tds.TDSRPCRequest.processReplyToken(Unknown
>> > Source)
>> > at com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReply(Unknown
>> > Source)
>> > at com.microsoft.jdbc.sqlserver.tds.TDSCursorRequest.openCursor(Unknown
>> > Source)
>> > at com.microsoft.jdbc.sqlserver.SQLServerImplStatement.execute(Unknown
>> > Source)
>> > at com.microsoft.jdbc.base.BaseStatement.commonExecute(Unknown Source)
>> > at com.microsoft.jdbc.base.BaseStatement.executeQueryInternal(Unknown
>> > Source)
>> > at com.microsoft.jdbc.base.BaseStatement.executeQuery(Unknown Source)
>> >
>> >
>> >
>> > The stored procedures:
>> >
>> > USE cms_products
>> > GO
>> > IF EXISTS(SELECT name FROM sysobjects WHERE name =
>> > 'sp_delete_products_for_merchant' AND type = 'P')
>> >   DROP PROCEDURE sp_delete_products_for_merchant
>> > GO
>> > USE cms_products
>> > GO
>> > CREATE PROCEDURE sp_delete_products_for_merchant @@retVal int OUTPUT,
>> > @@merchantid varchar(255) = '-1'
>> > AS
>> > SET ROWCOUNT 200 --  Or what ever your system can handle
>> >
>> > WHILE 1 = 1
>> > BEGIN
>> >
>> > DELETE FROM cms_imported_products WHERE merchant = @@merchantid
>> >
>> > IF @@ROWCOUNT = 0
>> >    BREAK
>> > END
>> >
>> > SET ROWCOUNT 0
>> > GO
>> > ------
>> >
>> > AND:
>> >
>> > USE cms_products
>> > GO
>> > IF EXISTS(SELECT name FROM sysobjects WHERE name =
>> > 'sp_copy_temp_product_rows' AND type = 'P')
>> >   DROP PROCEDURE sp_copy_temp_product_rows
>> > GO
>> > USE cms_products
>> > GO
>> > CREATE PROCEDURE sp_copy_temp_pr args[i]);
>> >            }
>> >            callableStatement.setQueryTimeout(120);
>> >            callableStatement.execute();
>> >        } catch (SQLException x) {
>> >            log4j.error(x, x);
>> >            throw x;
>> >        } finally {
>> >            conn.setAutoCommit(true);
>> >        }
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > "Andrew J. Kelly" wrote:
>> >
>> >> The WHILE 1 = 1 just says keep looping.  The SET ROWCOUNT 5000 will 
>> >> limit
>> >> all sql statemtens to a mx of 5000 rows including deletes.  So this 
>> >> will
>> >> keep looping and deleteing 5000 rows at a time until @@ROWCOUNT = 0 
>> >> and
>> >> then
>> >> it will break out of the loop.  @@ROWCOUNT is always set to the number 
>> >> of
>> >> rows affected by the very last sql statement in this case the delete. 
>> >> So
>> >> as
>> >> long as there are rows to delete with that ID it will be greater than 
>> >> 0.
>> >> Once they are all deleted it will show 0 on the next pass and break 
>> >> the
>> >> loop.  The SET ROWCOUNT 0 resets the rowcount limiter to essentially 
>> >> be
>> >> unlimited as is the normal default behavior.
>> >>
>> >>
>> >> -- 
>> >> Andrew J. Kelly  SQL MVP
>> >>
>> >>
>> >> "Reik" <Reik@discussions.microsoft.com> wrote in message
>> >> news:D0875DB2-C1C3-4EBD-AD5A-722BA905DF14@microsoft.com...
>> >> > Hi Andrew,
>> >> > thanks for your good hints. I have one question left.
>> >> > Could you explain what your batch delete does, especially
>> >> > the @@rowcount and why I haveough. However I received a bunch of 
>> >> > exception for people querying the
>> > live
>> > table during that time. The live table and the staging table are in the
>> > same
>> > database. Here is the exception I got couple of times:
>> >
>> > 2005-01-14 07:17:29,921 [Thread-6038] ERROR - Error executing the
>> > following
>> > statement: SELECT * FROM cms_imported_products p WHERE p.merchant = 588
>> > AND
>> > p.artnr = 'CoR04'
>> > java.sql.SQLException: [Microsoft][SQLServer 2000 Driver for
>> > JDBC]Unhandled
>> > token type: Unknown token: 0x00
>> > at com.microsoft.jdbc.base.BaseExceptions.createException(Unknown 
>> > Source)
>> > at com.microsoft.jdbc.base.BaseExceptions.getException(Unknown Source)
>> > at 
>> > com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReplyToken(Unknown
>> > Source)
>> > at
>> > com.microsoft.jdbc.sqlserver.tds.TDSRPCRequest.processReplyToken(Unknown
>> >now if it is true. The SP is supposingly notifying in a way the JDBC 
>> >driver cannot handle.
> http://www.dbforumz.com/JDBC%20Driver-Unhandled-token-type-error-MS-ftopict145467.html
>
> Iam gonna change to the SP3 version of the driver tonight and see if the
> improved SP helps it. Do you know if it is a problem that my SP's use 
> output
> parameters but never assign output values to the parameters?
>
>
>
> "Andrew J. Kelly" wrote:
>
>> First off I know little of Java and hope to never have to use it 
>> again<g>.
>> My guess would be that they got a timeout while you are ins to use a 
>> WHILE 1=1 ?
>> >> >
>> >> > Thanks Reik
>> >> >
>> >> > "Andrew J. Kelly" wrote:
>> >> >
>> >> >> I almost never import directly to the final production table.  Use 
>> >> >> a
>> >> >> staging
>> >> >> database on the same server and import the rows there first.  You 
>> >> >> will
>> >> >> get
>> >> >> fastest results by doing a Bulk Insert type of operation and not 
>> >> >> doing
>> >> >> individual inserts thru Java.  Then prep the data in any way you 
>> >> >> may
>> >> >> need
>> >> >> so
>> >> >> it is ready for inserting into the prod table.  Then create a 
>> >> >> process
>> >> >> that
>> >> >> will loop and delete the rows in smaller batches.
>> >> >>
>> >> >> SET ROWCOUNT 5000 --  Or what ever your system can handle
>> >> >>
>> >> >> WHILE 1 = 1
>> >> >> BEGIN
>> >> >>
>> >> >> DELETE FROM YourTable WHERE [ID] = xxx
>> >> >>
>> >> >> IF @@ROWCOUNT = 0
>> >> >>     BREAK
>> >> >>
>> >> >> END
>> >> >>
>> >> >> SET ROWCOUNT 0
>> >> >>
>> >> >> You can then insert the new rows either all at once for each ID (if
>> >> >> your
>> >> >> system can handle this) or use a similar technique to do it in
>> >> >> batches.
>> >> >>
>> >> >> INSERT INTO YourTable (Col1, Col2...)
>> >> >> SELECT (Col1, Col2 ...) FROM otherdb..StageingTable WITH (NOLOCK)
>> >> >> WHERE
>> >> >> [ID]
>> >> >> = xxx
>> >> >>
>> >> >> -- 
>> >> >> Andrew J. Kelly  SQL MVP
>> >> >>
>> >> >>
>> >> >> "Reik" <Reik@discussions.microsoft.com> wrote in message
>> >> >> news:CC44D9DD-3290-4625-B7DE-D88D42FC4A9E@microsoft.com...
>> >> >> > Hi,
>> >> >> > in my database I have a table that contains 250.000 records. 
>> >> >> > Every
>> >> >> > night
>> >> >> > Iam
>> >> >> > importing new records into that table. The table is also being
>> >> >> > queried
>> >> >> > from a
>> >> >> > java based web application during that time. The thread that runs
>> >> >> > the
>> >> >> > import
>> >> >> > is also java based. Before I import new records I remove old ones
>> >> >> > for a
>> >> >> > specific merchant: DELETE FROM table1 WHERE id = xx
>> >> >> > This almost every times creates a timeout in the JDBC driver 
>> >> >> > since
>> >> >> > the
>> >> >> > action takes too long. How can I improve this process?
>> >> >> >
>> >> >> > Another idea was to set up a second table with the same 
>> >> >> > structure.
>> >> >> > Then
>> >> >> > import new records into Table2. When complete, delete all relying
>> >> >> > records
>> >> >> > from Table1 and copy over the new records from TABLE2 to TABLE1. 
>> >> >> > Iam
>> >> >> > looking
>> >> >> > for a solution in TSQL that does the copying the fastest way. 
>> >> >> > Thanks
>> >> >> > in
>> >> >> > advance.
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>> >>
>>
>>
>> 


Relevant Pages

  • Negative results
    ... Andrew J. Kelly wrote: ... Sorry thanks for pointing that out Kevin. ... Andrew J. Kelly SQL MVP ... Your best option to begin with is to look at the virtual file stats to seehow ...
    (microsoft.public.sqlserver.server)
  • Re: Extremely high % Disk Time .. How is this possible??
    ... Andrew J. Kelly SQL MVP ... > Slicing them into logical drives may look better but it does not increase ... I prefer to look at disk queue ... >>> Andrew J. Kelly SQL MVP ...
    (microsoft.public.sqlserver.clustering)
  • Re: disk I/O tool
    ... Andrew J. Kelly SQL MVP ... "Jay" wrote in message ... I guess he really doesn't understand SAN's very well either because SAN's are notoriously slow for writing small but frequent I/O's and direct attached drives will often outperform them hands down. ...
    (microsoft.public.sqlserver.server)
  • Re: Performance really poor if calling data from sql clients
    ... Andrew J. Kelly SQL MVP ... The TCP connection is running from the client ... On a computer running the test which is not the database hosting ...
    (microsoft.public.sqlserver.clients)
  • Re: Procedure execution problem - Clients performances
    ... Again check the parameters and plan with a trace. ... Andrew J. Kelly SQL MVP ... Dim conn As New SqlConnection ...
    (microsoft.public.sqlserver.clients)