Fastest Fetch for Client Application



I'm a total luddite and am writing a brand new C++ ODBC application
for numerical analysis against a SQL Server database. I'm not using
MFC or ATL or any foundation classes whatsover. It's just C++, my
little ODBC wrapper (and a thin one at that), and a few things I wrote
to do lots of multithreading.

Now, I initially picked ODBC because, well, it was the easiest and
most portable database API out there, but, having timed it, it seems
that performance "feels" pretty good. Going across a LAN to the
database server, a query that returns 80,000 rows runs from
SQLExecDirect to the last row SQLFetch'd in about .5 seconds. So, I'm
pulling in around 150,000 rows per second off of a six column set with
a pair of joins tossed in.

My proposition is that I can pull in rows fairly quickly into either
RAM or even crunch them as they are being read in, and produce results
dynamically. The key is, for this value proposition to fly, I have to
be able to get rows out of SQL Server as fast as reasonably possible.
So, before I get too heavy into things, I'd like to understand your
takes on ODBC / SQL Server performance with respect to the goal of
pulling in big datasets for number crunching in C++.

Is Firehose the Fastest Read?
--------------------------------------------
I'm using a firehose connection to do this, that is, forward only
readonly cursors, row size = 1, and, lucky for me, that's the default
out of the box setting. The question is, is a firehose the fastest
possible way to read data in ODBC? I've read a lot saying one way or
the other using SQL Server 2000 will, if you don't use a firehose, do
something with a bunch of server side cursor calls, and that will
ultimately slow you down. On the other hand, it seems that fetching
an array of stuff, rather than one row at a time, would be faster, but
then again, just how much of a performance hit is it to do a simple
function call, if, behind the scenes, ODBC is just pulling data out of
its own buffer on read.

I don't care at all if I have to architect my app to do funky things
with firehose cursors throught. Honestly, I don't need to read that
often, just, when I do, its going to be a lot.

Is OLEDB vs ODBC overrated?
---------------------------------------------
I think that there wouldn't be too much of a difference in performance
between ODBC and OLEDB, especially if they both are just speaking TDS
to the server. Is that suspicion born out? If anything, I would
think COM based APIs might be slower because the method calls are more
expensive than going through a DLL. Again, forget the asthetics of
the API, I'm just interested in reading lots of rows quickly
performance.

How Dead is DB-Library, and should we care?
-------------------------------------------------------------------
As an aside, just how dead is db-library anyway? It seems pretty
dead, as in, Microsoft has pooh-poohed it, stopped supporting it, and
I can't find it anywhere. But, if db-library is still the speed king
on the block, where could I get it? Is there a free-tds for Windows,
and if so, how good is it?

Firehose Bulk Fetch on Non-Prepared (SQLExecDirect Statements)?
---------------------------------------------------------------------------------------------------
Can I have the best of all worlds and do an array like fetch (rows >
1) on non-prepared statements? It seems like I could, but the docs
seem to read that doing so would turn off the firehose. Is this true?

SQLBindCol vs SQLGetData?
-----------------------------------------
Is it faster to bind, or to get?

Is it more than 10% faster
-------------------------------------
Are any of the above options going to be more than 10% faster for read
than just out of the box bound columns, SQLFetch?

.


Loading