RE: Attention Ken Sheridan
- From: Ken Sheridan <KenSheridan@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 27 Jan 2009 08:57:01 -0800
Dave:
Lets first deal with the Cities issue as that's just a question of basic
normalization. I can illustrate it best with an example. Firstly with a
table Contacts which has columns for City and State as well as ContactID
FirstName, LastName, AdddreesLine1, AddressLine2 etc, we might have rows with
data like this (excluding some columns for simplicity)
1 John Brown San Francisco California
2 Jane Green San Francisco New Hampshire
Jane Green's row is obviously incorrect because we all know which state San
Francisco is in, but there is nothing in the design of the table to prevent
this incorrect data being entered, nor would a visitor from Mars know which
is the correct row or would assume that there are cities called San Francisco
in both states. This is because the table is not properly 'normalized'. Its
actually not normalized to Third Normal Form (3NF), which requires that all
non-key columns in the table must be functionally dependent solely on the
whole of the table's primary key.
In the table The FirstName, LastName and City fields do meet this criterion
because the sole determinant for them is Contact ID. The State column
however is not determined solely by ContactID, but also by City (assuming,
albeit wrongly, for the moment that all city names are unique) as once we
know that the city is San Francisco we know that the contact is in
California, or we would in a correctly designed database.
The solution is to 'decompose' (yes, that is the term used) the table into
therr normalized tables, Contacts, Cities and States. Cities will have
columns CityID (because city names are duplicated we can't the name as the
key), City and State. The States column needs just the one column State (or
two if you want one for the abbreviation and one for the full name of the
state). The State column would be the key as state names are not duplicated,
so we don't need a 'surrogate' numeric key (though some people prefer to use
one anyway). So the tables would now look like this:
Contacts:
1 John Brown 42
2 Jane Green 42
Cities:
42 SanFrancisco CA
States:
CA California
In the relationships between States and Cities referential integrity would
be enforced, which means only a valid state can be entered in Cities, and
cascade updates would also be informed as theoretically a state's abbreviated
name could be changed, so changing it once in States would automatically
change it in the matching rows in Cities. In the relationship between Cites
and Contacts referential integrity would be enforced, but as CityID is
probably an autonumber whose value can't be changed then its not necessary to
enforce cascade updates.
You can enter all the state in the states table in one go of course, and
probably many of the cities in the city table, but when you need to enter a
city in the contacts table which doesn't yet exist in the Cities table you
need to enter the latter first. This is something you'd usually build in via
the interface so that when entering a contact in a contacts form you can pop
up a form to enter a new city before completing the entry in the contacts
form.
With this set of normalized tables you can see that the same city can't be
mistakenly put in two different states as the only place where we are told
which state its in is in one row in the Cities table. If by any chance the
visiting Martian was right in thinking that there are San Francisco's in both
states (probably not, but I know there are 4 Staffords in the USA because we
are twinned with them, so it’s a real possibility) then ther would be two
rows in Cities , both with San Francisco in the City column, but with
different CityID and State values, and the value in CityID in Contacts would
be that for whichever of the two is the correct one.
BTW you'll find a demo of how to handle this sort of data via correlated
combo boxes on forms at:
http://community.netscape.com/n/pfx/forum.aspx?nav=libraryMessages&tsn=1&tid=23626&webtag=ws-msdevapps
It uses the local administrative areas of Parish, District and County in my
area, but the principle is exactly the same.
Your statement "the one reason why I had put it into one table is that even
though some of the fields are not used they are all printed out on a report"
illustrates another generic point. One thing to remember is that by 'table'
we don't just mean a 'base table' i.e. a real stored table; it also means
the 'result table' of a query, so for a report you'd join the necessary
tables in a query and use that as the RecordSource of the report, which would
be just like using a single base table. If we joined the Contacts, Cities
and States tables in the above example for instance we'd end up with a Result
table which is exactly the same as the original single table before we
decomposed it, but without the scope for inconsistent data which the single
table allowed. BTW as an example of these sort of errors in a real life
scenario I once found three versions of my name as author of technical papers
in my own field (environmental planning) in one database. As far as the
database is concerned I'm three separate people!
Applying these principles to your own database is going to be down to you;
we can point you in the right direction but we don't have the insider
knowledge of you business model to be able to tell you exactly what you need
in the way of tables and relationships. Lets look at one or two points of
detail, though, which might help illustrate how the principles will be
applied in your case:
"I have all 200 of our stores in one table"
That sounds like how it should be. 'Stores' is an entity type and will have
attributes such as its address data, so each row in the table will represent
one store and each column position in a row will represent the attribute
value, e.g. a sore in Boston Mass. (where my grandmother was brought up
incidentally) might have a CityID value of 21 where 21 is the value in the
CityID column of the row in the Cities table for Boston.
"I tried to get it to work where when a person entered the market_hub the
store_and_location only listed the stores for that hub"
That's what the demo I gave the link for above does in fact. Firstly,
though, you need to identify the relationships. There are several
possibilities:
1. Each store relates to one hub only, but two stores in one City say,
could relate to different hubs. In this case the relationship is a simple
many-to-one relationship from Stores to Hubs, so you'd have a Hub (or HubID)
foreign key column in the Stores table referencing the primary key of the
Hubs table.
2. Alternatively all stores in one City could relate to one Hub, and each
Hub could cover more than one City. In this case the relationship is bwteen
Cities and Hubs so you'd have a Hub (or HubID) foreign key column in the
Cities table referencing the primary key of the Hubs table. There'd be no
Hub or HubID column in the Stores table as once you know which city its in
you know which hub it relates to.
3. Another possibility could be the same as 2 but all stores in one State
could relate to one Hub, in which case you'd have a Hub (or HubID) foreign
key column in the States table referencing the primary key of the Hubs table.
Again there'd be no Hub or HubID column in the Stores table as once you know
which state its in you know which hub it relates to.
4. Over here we tend to use Post Codes (equivalent of Zip codes) for this
sort of thing, and once a post code is entered you know not only which city
or town and county (we don't have states of course) but which street and
which part of the street an address is in. My post code for example shows
that I am on the east side of my street. You could have something similar as
regards hubs in your case you'd have a Hub (or HubID) foreign key column in
a ZipCodes table referencing the primary key of the Hubs table. Again
there'd be no Hub or HubID column in the Stores table as once you know a
store's Zip code you know which hub it relates to.
I hope that gives you some idea of the sort of questions toy need to ask
yourself when setting up your tables. I think the ball is now in you court
to come up with a model, but I'd strongly recommend that you draw it out on
paper forts before creating the tables and relationships, using boxes for
each table and arrowed lines between then for the relationships (rather like
you see in the relationships window in Access). This is called, not
surprisingly, an entity relationships diagram. Mentally test the paper model
as you go along by rigorously asking yourself does it accurately represent
your real world business model. You'll find this, like the prospect of
execution, not only concentrates the mind wonderfully, but will save you a
lot of headaches later when you come to build the database itself. Getting
the 'logical model' right at the outset is the real key to a robust and
efficient database; the interface will follow naturally from the model if its
right, but will be a PITA to design if its wrong.
Ken Sheridan
Stafford, England
"dave@homedeliverygroup" wrote:
Ken,
First off thanks for the time that you have put into your replies...
the one reason why i had put it into one table is that even though some of
the fields are not used they are all printed out on a report...
(BTW we are a delivery company for Lowes)
You have broke down my table this way
Claims (1 -9)
Customers (10-19)
WorkProgramme (20 – 25)
Materials (26 -30)
Vehicles (31 – 34)
you were almost correct in your description 1-9 is the claim info and what
hub it was out of(our hubs are mainly on the east coast) as well as what
store and the loacation of that store, which is a drop down(storesTBL) list
containg all our store information, 10-19 is the customers info where the
damage happened. 20-25 is actually the delivery contractor that caused the
damage, how much he needs to repay(as we pay off all the claims then deduct
from our contractors), the number of weeks the payoff will be deducted till
paid back in full and a brief description of the damage... 26-30 is actually
info for a merchadise damage claim if our contractor damages the merchandise
then the cost to replace the damaged merchandise would also go into the 20-25
section... 31-34 is if the contractor has an accident involving another auto,
or hits something with the delviery truck again the cost to pay back would go
into the 20-25 section. So section 1-9 is always entered as is 10-19, and
20-25... 26-30 only if merchandise damage, and 31-34 only if an accident has
occured.
I am not sure what you mean by the city and state and having a seperate city
and state table as we deliver to hundreds of citys across several states i do
see what you mean as the same city could be entered in New York as well as
Pennsylvania. Would that mean i would have to create a city/state table and
enter the info or create them and let the data entry build the tables????
Also the addresses and telphone numbers you mentioned to have a separate
addresses table that references the customers table is way beyond me...more
reading on that for me.
Currenlty there is not another person in our company that has any Access
experience other than myself and i only had it when i was in college which
was a bit of time ago... so i am moving forward myself with plenty of
reading... the DB that i have created will work for the time being, however i
want to be able to enhance, make changes and make it better functional wise i
see the flaws in it now. for instance i have all 200 of our stores in one
table... LOL one table again... must be my way of thinking... i tried to get
it to work where when a person entered the market_hub the store_and_location
only listed the stores for that hub... sounded simple, but did not work... we
have currently 10 hubs so i had 10 stores tables which made sense but then i
just entered all the info into one table... but as i read and re-read your
"stock" summaries somethings are coming to light...
dave
--
thanks for your help
Dave
"dave@homedeliverygroup" wrote:
Dave:
It seems to me that you have two entity types here, one being the claim,
i.e. the 'case' as a whole, the other being the works associated with a
claim. The claim will have attributes such as the policyholder, policy
number etc, i.e. those attributes which are specific to the claim as a whole.
The works will have attributes such as the type of work, its costs etc and
these will be specific to each set of works rather than the claim a whole.
In a relational database entity types are modelled by tables and their
attributes by the columns (fields) of the tables. So you would have a claims
table with columns for its attributes, and a works table with columns for its
attributes, the two being related in a one-to-many relationship by means of a
foreign key ClaimNumber column in the Works table referencing the primary key
ClaimNumber of the Claims table.
You are right to think in terms of a subform, but you would have only one, a
works subform within a claims form, the two linked on ClaimNumber. All
works, including those associated with the initial claim, would be entered as
records in the subform. Consequently when claim 6582188 is retrieved all the
works associated with it would be shown in the subform and as many additional
works as necessary can be added at any time simply by inserting another
record in the subform.
As it sounds like you have everything in the one table at present you will
need to 'decompose' it into the two tables. This is very simple to do.
having created the Works table you then create an 'append' query which
inserts rows into the works table from your current table. You'd append the
values of the CaseNumber column along with the values from the other columns
which are attributes of the works, and which are now represented by columns
in the works table. You can then create a relationship between the tables,
enforcing referential integrity, cascade updates (the latter to cater for a
claim number being changed after its associated works record(s) had been
entered – this might merely be the correction of a simple error by the user),
and, if appropriate, cascade deletes (this ensures that if a claim record is
deleted the works records associated with it are also deleted).
Once you are satisfied that the works table is correctly populated you can
delete the now redundant columns from the original table.
Finally you'd redesign your form to reflect the tables, deleting the
redundant bound controls from the form and adding the works subform.
You mention that the ability to add further works is only one of the
enhancements proposed. Be sure to look at the totality of these when
considering the modifications to the database which are needed as there may
well be further entity types involved and further relationships with existing
or yet to be created tables. The basis of a solid relational database is
that it is a 'logical model' of the real world entities and the relationships
between them. Getting the logical model right is the key to success. Do so
and the interface will fall into place naturally, but get it wrong and you'll
end up constantly jumping through hoops to get round the defects in the
model. I'd strongly endorse Jeff's advice that you take time to become
familiar with the principles of the database relational model before getting
too deep into the application design.
Ken Sheridan
Stafford, England
Ken thanks for you above reply you are correct i have all the info in one
table as follows...
one table name is hdg_claimTBL
1. today_date
2. hdg_ticket_number (primary key)
3. claim_status (from drop down list)
4. date_of_occurence
5. market_hub (from drop down list)
6. store_and_location (from drop down list)
7. checkpayableto (from drop down list)
8. claimType(from drop down list)
9. original_invoice
----------------------------------------------------------------
10. first_name
11. last_name
12. address
13. second_address
14. apartment_number
15. city
16. state
17. zip_code
18. telephone
19. alt_telephone
------------------------------------------------------------------------------
20. contractor_name
21. driver_name
22. deduction_amount
23. repayment_amount
24. number_of_weeks
25. description_of_claim
------------------------------------------------------------------------------
26. sku_number
27. model_number
28. merch_description
29. item_retail_cost
30. item_cost
-----------------------------------------------------------------------------
31. year
32. make
33. model
34. vin_number
as you can see it is a very larger tabel(34 items)... as you stated... now
according to "normalization" there are no duplicative columns unrelated to
the ticket number, however i do see where i can separate the table down into
five different tables of related data as i serperated them with the dashed
lines... does that make more sense? I also realize now that i can't make the
ticket number the primary key...
--
thanks for your help
Dave
--
thanks for your help
Dave
.
- Follow-Ups:
- Re: Attention Ken Sheridan
- From: Jeff Boyce
- RE: Attention Ken Sheridan
- From: dave@homedeliverygroup
- Re: Attention Ken Sheridan
- References:
- Attention Ken Sheridan
- From: dave@homedeliverygroup
- RE: Attention Ken Sheridan
- From: dave@homedeliverygroup
- Attention Ken Sheridan
- Prev by Date: Re: HOW do I insert a number in a field....
- Next by Date: RE: Adding a form to a table
- Previous by thread: RE: Attention Ken Sheridan
- Next by thread: RE: Attention Ken Sheridan
- Index(es):
Relevant Pages
|
Loading