Handling substantial volumes of time-series data
From: JGD (news_at_prodata.co.uk)
Date: 06/12/04
Date: Sat, 12 Jun 2004 11:49:01 +0100
Wonder if I can ask for some very broad advice here as to the best way
to approach a project, which will involve developing a program under
VB.Net to do some long-term environmental monitoring.
Scientific data readings will be collected at shortish time intervals
(user-specified but typically every 5 or 10 minutes) and added
automatically to a data store of some type. (This 'some type' is what
I'm trying to pin down.) So data could accumulate at 100,000 records
per year (and potentially at up to 500,000 pa for anyone using
1-minute intervals) and could, in theory at least, be accumulated for
5-10 years. Each record would consist of maybe 10-20 numeric fields.
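
As a rough check on those figures, the arithmetic can be sketched in a few lines of Python (used here purely for illustration; the project itself is VB.Net). The storage estimate rests on assumptions that are not part of the requirements above: 8-byte floating-point fields, an 8-byte timestamp per record, and no allowance for database or file-format overhead.

```python
# Back-of-envelope sizing for the record counts quoted above.
# Assumptions (illustrative only): 8 bytes per numeric field, an 8-byte
# timestamp per record, and zero storage overhead.

MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600, ignoring leap years


def records_per_year(interval_minutes):
    """Number of readings logged in one year at a fixed interval."""
    return MINUTES_PER_YEAR // interval_minutes


def raw_size_bytes(interval_minutes, years, fields, bytes_per_field=8):
    """Raw payload size: one timestamp plus `fields` numeric values per record."""
    record_bytes = 8 + fields * bytes_per_field
    return records_per_year(interval_minutes) * years * record_bytes


for interval in (10, 5, 1):
    n = records_per_year(interval)
    size_mb = raw_size_bytes(interval, years=10, fields=20) / 1_000_000
    print(f"{interval:>2}-minute interval: {n:,} records/year, "
          f"about {size_mb:.0f} MB raw over 10 years with 20 fields")
```

Under these assumptions even the worst case (1-minute readings, 20 fields, 10 years) comes to a little under 1 GB of raw values, and the more typical 5-minute case to well under 200 MB.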
Data would be retrieved from the store for presentation and analysis
in consecutive blocks covering eg a day/week/month - the start and end
date/times would be arbitrary (ie user-selected) but otherwise some
specified fields from all records between start and end would always
be retrieved.
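
To make that retrieval pattern concrete, here is a minimal sketch of a start/end block query. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for "a data store of some type"; it is not a recommendation of any particular product, and the table name, column names and sample values are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " ts TEXT PRIMARY KEY,"  # ISO 8601 text sorts in chronological order
    " temp_c REAL,"
    " humidity REAL)"
)

# Populate one day of made-up readings at 5-minute intervals (288 records).
start = datetime(2004, 6, 12, 0, 0)
rows = [
    ((start + timedelta(minutes=5 * i)).isoformat(sep=" "), 15.0 + i * 0.01, 60.0)
    for i in range(288)
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)


def fetch_block(conn, begin, end):
    """Return (ts, temp_c) for every record with begin <= ts < end."""
    cur = conn.execute(
        "SELECT ts, temp_c FROM readings WHERE ts >= ? AND ts < ? ORDER BY ts",
        (begin.isoformat(sep=" "), end.isoformat(sep=" ")),
    )
    return cur.fetchall()


# Arbitrary user-selected window: 06:00 up to (but not including) 12:00.
block = fetch_block(conn, datetime(2004, 6, 12, 6, 0), datetime(2004, 6, 12, 12, 0))
print(len(block), block[0][0], block[-1][0])
```

Because the timestamp is the primary key, the range condition can be answered from the index rather than by scanning every record, which is the property that matters as the store grows.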
The data might typically be reviewed by just an individual user on the
same PC used to store the database, but at the same time it might be
nice to build in the flexibility for future development to allow
access by more than one user (but probably never more than 2-5) across
a LAN or the net.
The user environment will always be a Windows PC and the program needs
to be distributable to users with, if possible, no minimum
requirements for installed software other than a Home version of
Windows and the .Net framework. Each user will need to populate their
own version of the data store and so distribution might require either
that the program creates its own data store from scratch or that an
empty store is distributed.
I can see that there might be some advantages eg for ease of
developing the presentation and analysis tools under VB.Net in using
a formal database to hold the data rather than a custom file-system
file. But I'm struggling somewhat to choose a formal database that
might balance the conflicting requirements for simplicity of
distribution and use, substantial (eventual) data volumes but very
delimited data structure and retrieval requirements.
If anyone's read this far, any comments and suggestions would be very
welcome. I guess I really have two main questions at present:
1. Is it really feasible and sensible to use a formal database for
this project rather than eg a large text or binary file, given the
requirements for ease of distribution and use?
2. If the answer's yes, then my reading thus far suggests that
possibly an Access-type database might be preferable?
And if anyone has any book recommendations or web links that might be
relevant to this type of database project then that would be great.
(By way of background, I'm an experienced VB6 developer in the process
of migrating to VB.Net, but - possibly to the amazement of some
readers here - the focus of my projects has always been scientific
data handling and I've never really had a need until now to become
more knowledgeable about database use. So even basic comments about
databases wouldn't be out of place.)
JGD
John Dann
www.weatherstations.co.uk (major update now online)
NB Reply address needs an obvious edit