Re: Efficient coordinated queries??



Ooh, no... processing is way to complicated for a stored procedure. That's a whole Can O' Worms(TM) there!! :-) We are distributing processing over a large server farm -- bringing it back onto the SQL server would probably kill our performance.

Regards,
-- Tom

"William Vaughn (MVP)" <billva@xxxxxxxxx> wrote in message news:%2352%23glGuJHA.4632@xxxxxxxxxxxxxxxxxxxxxxx
Resending...

Ah, I would take a different approach. I would NOT try to bring the data to the client to process, but write a stored procedure that processes the rowsets.

--
__________________________________________________________________________
William R. Vaughn
President and Founder Beta V Corporation
Author, Mentor, Dad, Grandpa
Microsoft MVP
(425) 556-9205 (Pacific time)
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
http://betav.com http://betav.com/blog/billva
____________________________________________________________________________________________



"Tom" <tbrown92030@xxxxxxxxx> wrote in message news:8F03DE44-9279-49C8-8FDD-072B63711993@xxxxxxxxxxxxxxxx
Bear with me, here is some background...
I need to process a lot of data that's being brought in from SQL. A large chunk of it can be brought in with a single query (albeit joining to many tables) and each row represents an item to process. However, there is a significant amount of data that exists in a 0..n relationship to this initial row of data, basically five different sets of data that have a 0..n relationship. For a hypothetical example consider a customer order database where we need to process through all customers and can bring in a row of customer data as our main query but there are 0..n1 rows of customer order data, 0..n2 rows of customer feedback information, 0..n3 rows of customer discounts available, etc.

There is way too much data to bring in at once, into a one big DataSet, and process. Instead I want to bring in the data as individual DataSets representing an item to process. That is, table 0 contains the one row from the main query and table 1 through table 5 have their respective 0..n rows that correspond to the data in table 0.

So, what is the best way to get these individual DataSets built?
1. Establish a SqlDataReader on the first query and, for each row obtained, separately query for the other five sets of data to get their 0..n rows.

2. Establish a SqlDataReader on each query ensuring that the five correlated queries are ordered the same as the first for its primary key (this key is common across all these sources, of course) and examine the primary key/foreign key values for the correlated readers to merge the appropriate rows into a DataSet once the primary row has been obtained.

3. Fully denormalize in a gi-normous SQL query and go back to just a single row of data per item.

4. ???

I've only just started thinking about (3) but would like to avoid this since the n values of some of the 0..n subqueries could be significant (10s, 20s of correlated rows, but enough to make denormalization unpalatable). I've not tried (1) thinking that the act of querying so many times would kill performance (there are potentially millions of items to process) and have been going with (2). However this is still proving to be a bottleneck in processing. I'm hoping there is another way to go that avoids having to issue separate subqueries for each row returned on the primary query but moves more of the burden of merging the correlated data on the SQL server instead of my process.

Any and all feedback and suggestions will be most welcome!

-- Tom


.



Relevant Pages

  • Re: Discussing 3 different strategies for deleting from multiple tables
    ... I will be using SQL Server but I am riding on top of a third party ... FYI, Account contains around 20K ... >>> This results in one parameterized query followed by two more trips to ...
    (microsoft.public.data.ado)
  • Re: Efficient coordinated queries??
    ... Microsoft MVP ... Hitchhiker's Guide to Visual Studio and SQL Server ... A large chunk of it can be brought in with a single query and each row represents an item to process. ... For a hypothetical example consider a customer order database where we need to process through all customers and can bring in a row of customer data as our main query but there are 0..n1 rows of customer order data, 0..n2 rows of customer feedback information, 0..n3 rows of customer discounts available, etc. ...
    (microsoft.public.dotnet.framework.adonet)
  • Re: Calculated field in pass through query
    ... Using the same hardware, with a single user, Jet will be faster than MS SQL ... Server simply because Jet does not use log. ... Even within MS SQL Server, ... same query on the same data but on another MSSQL db, ...
    (microsoft.public.access.queries)
  • Re: Website Connectivity?
    ... does the website just query our server? ... How do I give it a login to query the SQL Server running on the server? ... It would be a serious security breach to allow a remote website to ...
    (microsoft.public.windows.server.sbs)
  • Re: Efficient coordinated queries??
    ... A large chunk of it can be brought in with a single query and each row represents an item to process. ... For a hypothetical example consider a customer order database where we need to process through all customers and can bring in a row of customer data as our main query but there are 0..n1 rows of customer order data, 0..n2 rows of customer feedback information, 0..n3 rows of customer discounts available, etc. ... Instead I want to bring in the data as individual DataSets representing an item to process. ... I'm hoping there is another way to go that avoids having to issue separate subqueries for each row returned on the primary query but moves more of the burden of merging the correlated data on the SQL server instead of my process. ...
    (microsoft.public.dotnet.framework.adonet)

Quantcast