Re: Fastest way to run DELETE on a large table

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance

From: Andrew J. Kelly (sqlmvpnooospam_at_shadhawk.com)
Date: 01/14/05


Date: Fri, 14 Jan 2005 09:21:59 -0500

First off I know little of Java and hope to never have to use it again<g>.
My guess would be that they got a timeout while you are inserting the rows.
You are still inserting them all at once and you will most likely lock the
entire table while doing so and if not you will certainly block other users.
A couple more suggestions. One is to always use SET NOCOUNT ON at the
beginning of your sp's to cut down on network traffic and speed execution of
the sp's with lots of activity. Does your clustered index have Merchant
as the first column? If so you can probably delete a lot more than 200 rows
at a time. Make sure these sp's are not wrapped in a transaction that was
issued before you called them otherwise you defeat the whole purpose of
doing the small batches. And finally do the inserts in smaller batches as
well. You can add an identity colum to the staging table and cluster on it.
Then loop and do something like this:

DECLARE @b INT, @e INT
SET @b = 1
SET @e = 1000

WHILE 1 = 1
BEGIN

    INSERT INTO TableA (...) SELECT ....
        FROM cms_temp_records WITH (NOLOCK)
    WHERE YourIdentityCol BETWEEN @B and @end

    IF @@ROWCOUNT = 0
        BREAK

    SET @b = @b + 1000
    SET @e = @e + 1000

END

-- 
Andrew J. Kelly  SQL MVP
"Reik" <Reik@discussions.microsoft.com> wrote in message 
news:0EFB2EA1-B116-466D-B9C2-39DD20968432@microsoft.com...
> Okay I have been using stored procedures last night. The import batch ran
> through. However I received a bunch of exception for people querying the 
> live
> table during that time. The live table and the staging table are in the 
> same
> database. Here is the exception I got couple of times:
>
> 2005-01-14 07:17:29,921 [Thread-6038] ERROR - Error executing the 
> following
> statement: SELECT * FROM cms_imported_products p WHERE p.merchant = 588 
> AND
> p.artnr = 'CoR04'
> java.sql.SQLException: [Microsoft][SQLServer 2000 Driver for 
> JDBC]Unhandled
> token type: Unknown token: 0x00
> at com.microsoft.jdbc.base.BaseExceptions.createException(Unknown Source)
> at com.microsoft.jdbc.base.BaseExceptions.getException(Unknown Source)
> at com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReplyToken(Unknown
> Source)
> at 
> com.microsoft.jdbc.sqlserver.tds.TDSRPCRequest.processReplyToken(Unknown
> Source)
> at com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReply(Unknown 
> Source)
> at com.microsoft.jdbc.sqlserver.tds.TDSCursorRequest.openCursor(Unknown
> Source)
> at com.microsoft.jdbc.sqlserver.SQLServerImplStatement.execute(Unknown
> Source)
> at com.microsoft.jdbc.base.BaseStatement.commonExecute(Unknown Source)
> at com.microsoft.jdbc.base.BaseStatement.executeQueryInternal(Unknown 
> Source)
> at com.microsoft.jdbc.base.BaseStatement.executeQuery(Unknown Source)
>
>
>
> The stored procedures:
>
> USE cms_products
> GO
> IF EXISTS(SELECT name FROM sysobjects WHERE name =
> 'sp_delete_products_for_merchant' AND type = 'P')
>   DROP PROCEDURE sp_delete_products_for_merchant
> GO
> USE cms_products
> GO
> CREATE PROCEDURE sp_delete_products_for_merchant @@retVal int OUTPUT,
> @@merchantid varchar(255) = '-1'
> AS
> SET ROWCOUNT 200 --  Or what ever your system can handle
>
> WHILE 1 = 1
> BEGIN
>
> DELETE FROM cms_imported_products WHERE merchant = @@merchantid
>
> IF @@ROWCOUNT = 0
>    BREAK
> END
>
> SET ROWCOUNT 0
> GO
> ------
>
> AND:
>
> USE cms_products
> GO
> IF EXISTS(SELECT name FROM sysobjects WHERE name =
> 'sp_copy_temp_product_rows' AND type = 'P')
>   DROP PROCEDURE sp_copy_temp_product_rows
> GO
> USE cms_products
> GO
> CREATE PROCEDURE sp_copy_temp_product_rows @@retVal int OUTPUT
> AS
>
> INSERT INTO cms_imported_products (
> [titel] ,
> [features] ,
> [teaser_text] ,
> [expires] ,
> [bigimage] ,
> [product_name] ,
> [price] ,
> [merchant] ,
> [merchant_name] ,
> [bannerimage] ,
> [artnr] ,
> [shiptime] ,
> [shipfee] ,
> [manufacturer] ,
> [deeplink] ,
> [subcategory]
> )
> SELECT
> [titel] ,
> [features] ,
> [teaser_text] ,
> [expires] ,
> [bigimage] ,
> [product_name] ,
> [price] ,
> [merchant] ,
> [merchant_name] ,
> [bannerimage] ,
> [artnr] ,
> [shiptime] ,
> [shipfee] ,
> [manufacturer] ,
> [deeplink] ,
> [subcategory]
> FROM cms_temp_records WITH (NOLOCK)
>
> GO
>
> ----
>
> my Java Code:
>
>
>
> try {
>
>            String statement = "{call " + name + "(?";
>            if (args != null) {
>                for (int i = 0 ; i < args.length ; i++) {
>                    statement += ",?";
>                }
>            }
>
>            statement += ")}";
>
>            CallableStatement callableStatement = conn.prepareCall(
>                statement
>            );
>            callableStatement.registerOutParameter(1,
> java.sql.Types.INTEGER); // this will hold the scalar
>
>            for (int i = 0 ; i < args.length ; i++) {
>                callableStatement.setString(i+2, args[i]);
>            }
>            callableStatement.setQueryTimeout(120);
>            callableStatement.execute();
>        } catch (SQLException x) {
>            log4j.error(x, x);
>            throw x;
>        } finally {
>            conn.setAutoCommit(true);
>        }
>
>
>
>
>
>
>
> "Andrew J. Kelly" wrote:
>
>> The WHILE 1 = 1 just says keep looping.  The SET ROWCOUNT 5000 will limit
>> all sql statemtens to a mx of 5000 rows including deletes.  So this will
>> keep looping and deleteing 5000 rows at a time until @@ROWCOUNT = 0 and 
>> then
>> it will break out of the loop.  @@ROWCOUNT is always set to the number of
>> rows affected by the very last sql statement in this case the delete. So 
>> as
>> long as there are rows to delete with that ID it will be greater than 0.
>> Once they are all deleted it will show 0 on the next pass and break the
>> loop.  The SET ROWCOUNT 0 resets the rowcount limiter to essentially be
>> unlimited as is the normal default behavior.
>>
>>
>> -- 
>> Andrew J. Kelly  SQL MVP
>>
>>
>> "Reik" <Reik@discussions.microsoft.com> wrote in message
>> news:D0875DB2-C1C3-4EBD-AD5A-722BA905DF14@microsoft.com...
>> > Hi Andrew,
>> > thanks for your good hints. I have one question left.
>> > Could you explain what your batch delete does, especially
>> > the @@rowcount and why I have to use a WHILE 1=1 ?
>> >
>> > Thanks Reik
>> >
>> > "Andrew J. Kelly" wrote:
>> >
>> >> I almost never import directly to the final production table.  Use a
>> >> staging
>> >> database on the same server and import the rows there first.  You will
>> >> get
>> >> fastest results by doing a Bulk Insert type of operation and not doing
>> >> individual inserts thru Java.  Then prep the data in any way you may 
>> >> need
>> >> so
>> >> it is ready for inserting into the prod table.  Then create a process
>> >> that
>> >> will loop and delete the rows in smaller batches.
>> >>
>> >> SET ROWCOUNT 5000 --  Or what ever your system can handle
>> >>
>> >> WHILE 1 = 1
>> >> BEGIN
>> >>
>> >> DELETE FROM YourTable WHERE [ID] = xxx
>> >>
>> >> IF @@ROWCOUNT = 0
>> >>     BREAK
>> >>
>> >> END
>> >>
>> >> SET ROWCOUNT 0
>> >>
>> >> You can then insert the new rows either all at once for each ID (if 
>> >> your
>> >> system can handle this) or use a similar technique to do it in 
>> >> batches.
>> >>
>> >> INSERT INTO YourTable (Col1, Col2...)
>> >> SELECT (Col1, Col2 ...) FROM otherdb..StageingTable WITH (NOLOCK) 
>> >> WHERE
>> >> [ID]
>> >> = xxx
>> >>
>> >> -- 
>> >> Andrew J. Kelly  SQL MVP
>> >>
>> >>
>> >> "Reik" <Reik@discussions.microsoft.com> wrote in message
>> >> news:CC44D9DD-3290-4625-B7DE-D88D42FC4A9E@microsoft.com...
>> >> > Hi,
>> >> > in my database I have a table that contains 250.000 records. Every
>> >> > night
>> >> > Iam
>> >> > importing new records into that table. The table is also being 
>> >> > queried
>> >> > from a
>> >> > java based web application during that time. The thread that runs 
>> >> > the
>> >> > import
>> >> > is also java based. Before I import new records I remove old ones 
>> >> > for a
>> >> > specific merchant: DELETE FROM table1 WHERE id = xx
>> >> > This almost every times creates a timeout in the JDBC driver since 
>> >> > the
>> >> > action takes too long. How can I improve this process?
>> >> >
>> >> > Another idea was to set up a second table with the same structure. 
>> >> > Then
>> >> > import new records into Table2. When complete, delete all relying
>> >> > records
>> >> > from Table1 and copy over the new records from TABLE2 to TABLE1. Iam
>> >> > looking
>> >> > for a solution in TSQL that does the copying the fastest way. Thanks 
>> >> > in
>> >> > advance.
>> >>
>> >>
>> >>
>>
>>
>> 


Relevant Pages

  • Re: Fastest way to run DELETE on a large table
    ... SET ROWCOUNT 200 -- Or what ever your system can handle ... > Andrew J. Kelly SQL MVP ... >>> individual inserts thru Java. ... >>> Andrew J. Kelly SQL MVP ...
    (microsoft.public.sqlserver.programming)
  • Re: Fastest way to run DELETE on a large table
    ... the @@rowcount and why I have to use a WHILE 1=1? ... > SET ROWCOUNT 0 ... > Andrew J. Kelly SQL MVP ... >> java based web application during that time. ...
    (microsoft.public.sqlserver.programming)
  • Re: eclipse & java 5
    ... I'm not Andrew. ... I have not installed the Sun JDK on my new FC4 on AMD64 ... box for fear that it might mess up the Java stuff from FC4. ... >!MESSAGE Missing required bundle ...
    (Fedora)
  • Re: eclipse & java 5
    ... > I'm not Andrew. ... I have not installed the Sun JDK on my ... > box for fear that it might mess up the Java stuff ... For instructions about how to install the last version ...
    (Fedora)
  • Re: Fastest way to run DELETE on a large table
    ... SET ROWCOUNT 5000 -- Or what ever your system can handle ... system can handle this) or use a similar technique to do it in batches. ... > in my database I have a table that contains 250.000 records. ... > java based web application during that time. ...
    (microsoft.public.sqlserver.programming)