Re: dataset Performence Issue
- From: "Nigel Norris" <nospam@xxxxxxxxxx>
- Date: Thu, 4 Aug 2005 09:01:06 +0100
Sahil,
An interesting and well-argued post. However, I fear you are setting up a
straw man here. I never suggested that a DataSet is an alternative to a
DBMS. All I'm suggesting is that the OP's use of a 70 MB DataSet is not
necessarily 'awful'. Indeed, there may be cases where that's a very good
thing to do.
Clearly an in-memory approach has inherent scaling limits - the virtual and
physical memory on the machine. But 70 MB is a perfectly reasonable amount
of data to hold in memory in some applications. A suitably structured
in-memory solution tailored to specific queries can outperform a DBMS.
Network and disk costs see to that.
More comments in-line.
Nigel
"Sahil Malik [MVP]" <contactmethrumyblog@xxxxxxxxxx> wrote in message
news:u$hUoCFmFHA.3300@xxxxxxxxxxxxxxxxxxxxxxx
> 1. First of all, Dataset is guess what - Managed Code, and to send all the
> .NET lovers in tailspin, Managed Code can never be as fast and as
> optimized as native code.
OK, managed code has a cost. But a DBMS requires network and disk operations,
which cost far more. You have to look at all the costs before you
can conclude that one approach is faster than the other.
> 2. Secondly, the Garbage collector is that animal that makes things very
> very good for 90% of the situations i.e. normal memory usage, but when you
> start storing many megabytes or close to a gigabyte of information
> completely in RAM - it will actually hurt your application performance. In
> those scenarios, you don't want an external policeman who doesn't
> understand the specific needs of your app. In that situation, you want
> fine control on the memory where you specify when it gets cleaned, or
> serialized to the disk etc. You need paging mechanisms etc. which are
> possible to write for the dataset but are a real royal pain to write and
> even then they don't work quite as well as - guess what - native code
> (i.e. most of what SQL Server).
>
My data is going to fit in virtual memory - I would never suggest writing any
form of paging mechanism. Then you really are re-inventing a DBMS. The only
paging mechanism I'm relying on is the OS paging, which is - guess what -
native code that has been optimized to the max. In any case I'd expect to
have enough physical memory not to need paging.
> 3. SQL Server and any database comes with a "Query Engine". The number of
> optimizations built into that is the work of many Phds (or dudes with
> similar smarts and specialization), they have written up SQL Server's
> query engine to take advantage of automatic paging, locking algorithms,
> spilling over to the disk when needed, "query plans", caching those query
> plans - when you compare the object model of a Dataset (or any biz object
> for that matter), the comparison is like comparing a candle with the sun.
>
And it needs all that complexity because it's trying to minimize disk IO,
and because it's got to service every kind of request thrown at it. An
in-memory DataSet used for a specific purpose doesn't need any of that to
achieve the same performance.
> 4. The algorithms in a DataSet is rudimentary, they rely on simple
> techniques such as string matching, string manipulation - that level of
> simplicity. They work on an "Object structure", every value they access
> goes over a dereferenced segment calculation.
DataSets can maintain indexes, as far as I understand. I presume an index
will use a standard Hashtable - which is a reasonably optimized lookup
algorithm. So for a single indexed column lookup performance should be very
good.
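
To make the indexed-lookup point concrete, here is a minimal sketch in Python
rather than .NET. It only illustrates the general idea of a hash-based index
on one column versus a full scan; it is not a description of how DataSet
implements its indexes. The rows, column names, and the helper functions
build_index and scan are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for the contents of an in-memory table.
rows = [
    {"id": 1, "city": "Leeds"},
    {"id": 2, "city": "York"},
    {"id": 3, "city": "Leeds"},
]


def build_index(rows, column):
    """Map each value found in `column` to the list of rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def scan(rows, column, value):
    """Unindexed lookup: examines every row in the table."""
    return [row for row in rows if row[column] == value]


# Build the index once, then answer lookups with a single hash probe.
city_index = build_index(rows, "city")
indexed = city_index.get("Leeds", [])
scanned = scan(rows, "city", "Leeds")
```

Both lookups return the same rows. The scan touches every row on each query,
while the index pays a one-off build cost and then answers each single-column
lookup with one hash probe on average.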
>
> 5. Lets not forget transactional locks and many other such points, I
> blogged about it earlier over here -
> http://codebetter.com/blogs/sahil.malik/archive/2005/01/23/47547.aspx
>
Don't need any of that stuff. In-memory is fast - if you really need writes
then just serialize everything with a reader/writer lock.
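
As a rough illustration of guarding an in-memory table with a reader/writer
lock, here is a minimal sketch in Python. Python's standard library has no
built-in reader/writer lock, so the ReadWriteLock class below is a
hand-written assumption built on threading.Condition, and SharedTable is a
hypothetical stand-in for the in-memory data.

```python
import threading


class ReadWriteLock:
    """Minimal reader/writer lock: many readers or one writer at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers currently inside
        self._writing = False  # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


class SharedTable:
    """Hypothetical in-memory table guarded by the lock above."""

    def __init__(self):
        self._lock = ReadWriteLock()
        self._rows = {}

    def get(self, key):
        # Reads may run concurrently with other reads.
        self._lock.acquire_read()
        try:
            return self._rows.get(key)
        finally:
            self._lock.release_read()

    def put(self, key, value):
        # Writes are exclusive: no readers or other writers while updating.
        self._lock.acquire_write()
        try:
            self._rows[key] = value
        finally:
            self._lock.release_write()
```

This version makes no attempt at fairness, so a steady stream of readers can
delay a writer. That is acceptable for a read-mostly table, which is the case
being argued for here.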
> 6. Datasets are or any such object - AN IN MEMORY disconnected cache of
> data. Being completely in memory lends them to the disadvantage of a 32
> bit OS's 2 GB memory allocation limit,
Yep - if you have, or may need, 2 GB of data, use a DBMS. Or use a 64-bit
machine.
> Again, I strongly and vehemently disagree with an architecture that puts 1
> GB data into a DataSet. That is complete stupidity in both .NET 1.1 and
> 2.0.
>
As part of your straw man argument, you've escalated the OP's 70 MB a bit!