Re: large queries: optimal settings (sqlserver, jtds)
- From: Joe Weinstein <joeNOSPAM@xxxxxxx>
- Date: Wed, 11 May 2005 09:27:45 -0700
Mikee wrote:
Hi
I'm using java servlets with jdbc to query MS sqlserver.
Some of the queries can be large i.e. querying a table of 1 billion rows and resultsets of 1 million rows. The queries are all read only looping through the resultset up to a maximum of 5 million rows. Sometimes the query is performed using joined tables (select * from a,b where a.ib=b.id). A full table scan on our hardware takes 20-30 minutes.
Hi. I'd have to say that you should probably spend whatever resources you have to redesign this. It's not right for a servlet to select millions of rows of data out of the DBMS. Operate on raw data where it is, in the DBMS. Build your saw mill where the trees are. Only bring out data that a user is going to need to look at. I helped change a payroll application from external processing that pulled all the raw data out and processing it, to moving the same algorithms to DBMS stored procedures. This changed the system from requiring a 16-CPU HP box and taking 8 hours to run, to taking under 50 minutes. Joe Weinstein at BEA
I'm using the sqlserver supplied driver and more recently trying the jtds one.
So what are the best settings I should use for my connection and statement objects.
Specifics: i) When doing a statment.execute() does this only return once the query has finished or are results passed to the client on the fly.
ii) selectMethod - this option only seems to apply to the sqlserver supplied driver. Default is "direct" but the docs say this caches the enire result set to memory, so for large queries use "cursor". However using direct I don't seem to witness more memory being used up than when using cursor. Speed performance seems about the same if the setFetchSize is not set too small.
iii) Is there a way of roughly knowing waht setFetchSize to use.
iv) I want to use the statement.setQueryTimeout to halt long running queries and put them into a queue. The queryTimeout only seems to have an effect if the statement.execute has not yet returned in the time set but this goes back to i). If the execute returns has the query completely finished and are all the results stored somewhere or is the query still running. Logic tells me the former but I've run a query that takes several minutes but results start appearing after 30 secs or so.
Thanks Mike
.
- Prev by Date: Re: cancelled at the user's request
- Next by Date: Re: JDBC ResultSet writes to tmp file
- Previous by thread: Re: Using JDBC Connection Pooling via Websphere JDBC Provider using a DataSournce Name
- Next by thread: syntax error on escape sequences with bind variables
- Index(es):
Relevant Pages
|
Loading