Re: StreamReader->regex->Win32 api marshalling



Lots of info here, but I'll try to split it out.

Regex's are a notoriously bad performance wise way to accomplish what
you're trying to do.
http://www.codeproject.com/cs/database/CsvReader.asp
http://www.csvreader.com/csv_benchmarks.html

I would suggest you either implement the code yourself using raw char
array handling to find the commas, etc, etc, or purchase a commercial
csv parser like the one I sell, http://www.csvreader.com .

bcp is not what I would recommend for this because it does not properly
handle escape sequences, and does not necessarily have a performance
benefit that I've seen over dts. I would recommend calling a dts job
instead to import this data. If you don't need to do any data
manipulation to the original csv file, then there's no reason to even
do any csv parsing, just let dts do it for you. If you do want to try
the batching of insert statements like one of the previous comments
suggested, make sure you start a transaction before doing the chunks of
inserts, then commit at the end of a chunk. Otherwise, you'll get a
performance hit as the db tries to commit after every single insert.

Scott wrote:
> Tim wrote:
>
> > db. It all works fine, but the performance is just not good enough -
> > I need to at least halve the time its taking to process the data.
> > I've profiled the code and it looks like the overhead is more or less
> > evenly split between the reading/parsing on the one hand and the data
> > loading/bcp/Win32 api marshalling on the other hand - so I'm looking
>
> Are you using the Compile flag on the regex?
>
> I would also try to insert to the database directly from your code by
> batching the statements: insert xxx xxxx xxx;insert xxx xxxx xxx;insert
> xxx xxxx xxx;... I would start by batching into groups of a 100 and
> then try to tweak from there.
>
> I've had very good performance success with these two strategies.
>
> Scott C.

.



Relevant Pages

  • Re: StreamReader->regex->Win32 api marshalling
    ... When I said the csv data was rough, ... I would recommend calling a dts job ... > performance hit as the db tries to commit after every single insert. ... >> Scott C. ...
    (microsoft.public.dotnet.framework.performance)
  • Re: dts package and t log
    ... INSERTING and UPDATING is not necessarily committed straight away. ... Default DTS says "Only commit the rows after the last one has gone through". ... Allan Mitchell MCSE,MCDBA, (Microsoft SQL Server MVP) www.SQLDTS.com - The site for all your DTS needs. ...
    (microsoft.public.sqlserver.dts)
  • Re: DTS vs Biztalk
    ... or a presentation delivered some time ago where Scott ... Woodgate along with a DTS expert tried to show when to use ... Or maybe it was from the Tech Ed DVD from last year. ...
    (microsoft.public.biztalk.general)