Re: census database design
- From: talktobatchu <talktobatchu@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 11 Jul 2005 20:22:02 -0700
First of all thanks for the suggestion. The database design, mean the table
structure each having its own specific data sounds perfect to me. I'm sure
this works fine for my census data.
Will implement it right now to look into the real scenario(s).
"tina" wrote:
> you're making a common newbie mistake in table design: storing data in
> table names (counties) and field names (age groups). you should have one
> table for the counties (not one table for each county), and one record for
> each age group. i would probably use the following table design, as
>
> tblDistricts (your ctyDistrictdata table)
> DistrictID (primary key)
> CountyName
> Region
> (it's not clear if a district may have multiple counties in it, or vice
> versa, or if a region may have multiple counties, or vice versa. so this
> table design may or may not be correct.)
>
> tblRaces
> RaceID (pk)
> RaceName
>
> tblAgeGroups
> GroupID (pk)
> GroupDescription (field should be Text data type)
> (one record for each defined age group: "Under 5 years", "5 to 9", etc)
>
> tblCensus
> CensusID (pk)
> DistrictID (foreign key from tblDistricts)
> RaceID (fk from tblRaces)
> CensusYear
> GroupID (fk from tblAgeGroups)
> CensusCount
>
> so if, in 2004, one county district had a count of people for each of five
> races in five age groups, then tblCensus would have 25 records for that
> data - one record for the number of persons in each age group of each race.
>
> with the table setup outlined above, you can "slice and dice" the data just
> about any way you want, for statistical analysis. for example, to get the
> query results you described in your post, you would set criteria for the
> counties you want to see, the race, the specific years, and the age group -
> all on tblCensus, only. you can easily use a Totals query to present that
> data as two records, one for each year, with a sum of the person count for
> each year.
>
> the ironclad rule of database design is: FIRST, define the
> tables/relationships to correctly model the process, according to proven
> normalization guidelines; build the tables and set the relationships
> according to that model. then, and ONLY then, begin to build your queries,
> form, reports on top of your table structure. to learn the basics of
> relational data modeling, one good text is Database Design for Mere Mortals
> by Michael Hernandez. also, the following webpage has many useful links to
> help you:
> http://www.ltcomputerdesigns.com/JCReferences.html
>
> hth
>
>
> "talktobatchu" <talktobatchu@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> news:4367ACED-6F19-4786-81BB-E87FE22CFD6F@xxxxxxxxxxxxxxxx
> > Hello
> > I need to store information regarding race wise population
> of
> > every county of my state year wise, which is nothing but census data. Here
> > the data is categorized basing on different age groups. After thinking
> about
> > a lot of options for storing this kind of data, came up with a table(s)
> > design as follows.
> >
> > Pop_RaceAndAgeGrpWiseFrm2001C1 - this table stores data for county one.
> here
> > the ctyDistrictID is 1.
> >
> > the fields are: Pop_RaceAndAgeGrpWiseFrm2001C1.EntryID,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.ctyDistrictID,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.RaceName,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.statsYear,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.Under5Years,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.5to9Years,
> > Pop_RaceAndAgeGrpWiseFrm2001C1.9to14Years...
> > Pop_RaceAndAgeGrpWiseFrm2001C1.85AndAbove
> >
> > there are 33 counties in our state and have 33 different tables whose
> > structure looks same as the above one with
> > 'Pop_RaceAndAgeGrpWiseTblC1.ctyDistrictID' being different for each
> county.
> > There is a 'ctyDistrictdata' table which stores the information relating
> to
> > the counties like its name, ID and region etc... After storing the census
> > data into all those different county tables year wise, everything looks
> fine.
> > We have the relationships as, the 'ctyDistrictID' field from all the 33
> > tables are linked to 'ctyDistrictdata' table onto the same ID(field).
> >
> > But when a query is written to pull the 'Under5Years' data for a
> particular
> > race and for two specified years from 5 different counties(tables) by
> linking
> > them, the result has so many duplicates in it. But the expected result
> should
> > only have 2 tuples: first for year one and second for year two. The query
> > looks as follows:
> >
> > SELECT DISTINCT Pop_RaceAndAgegrpwiseFrm2001C1.statsYear,
> > Pop_RaceAndAgegrpwiseFrm2001C1.RaceName,
> > Pop_RaceAndAgegrpwiseFrm2001C1.[under 5 years],
> > Pop_RaceAndAgegrpwiseFrm2001C2.[under 5 years],
> > Pop_RaceAndAgegrpwiseFrm2001C3.[under 5 years],
> > Pop_RaceAndAgegrpwiseFrm2001C4.[under 5 years],
> > Pop_RaceAndAgegrpwiseFrm2001C5.[under 5 years]
> > FROM Pop_RaceAndAgegrpwiseFrm2001C1, Pop_RaceAndAgegrpwiseFrm2001C2,
> > Pop_RaceAndAgegrpwiseFrm2001C3, Pop_RaceAndAgegrpwiseFrm2001C4,
> > Pop_RaceAndAgegrpwiseFrm2001C5
> > WHERE (Pop_RaceAndAgegrpwiseFrm2001C1.RaceName="White" And
> > Pop_RaceAndAgegrpwiseFrm2001C2.RaceName="White" And
> > Pop_RaceAndAgegrpwiseFrm2001C3.RaceName="White" And
> > Pop_RaceAndAgegrpwiseFrm2001C4.RaceName="White" And
> > Pop_RaceAndAgegrpwiseFrm2001C5.RaceName="White");
> >
> > I know there should be something wrong in the query which is written above
> > or in the way table join is performed. It might be that the design of the
> > tables itself for storing the census data is in a bad shape. Even after
> > looking at things, nothing is working out good. Which way the above kind
> of
> > census data can be stored in a database with a better design. Please
> assist
> > me in this.
> >
> > Thanks for any help or suggestions provided.
>
>
>
.
- References:
- census database design
- From: talktobatchu
- Re: census database design
- From: tina
- census database design
- Prev by Date: Re: census database design
- Next by Date: Re: Inspection ***
- Previous by thread: Re: census database design
- Index(es):