Re: Database Design & Normalization Question

From: Jason Hedges (jasonh_at_bsmgr.com)
Date: 11/19/04


Date: Fri, 19 Nov 2004 08:50:45 -0800

Anthony,

Thanks for your reply and the book recommendation.

Can you give me more direction on the date issue? I
understand the logic behind the books, videos, software,
etc. What do you do with data like the date fields?
Should each date type (pub date, street date, etc.) be a
separate field/column in a table? I thought the way I had
done it allowed for alot of flexibility because you
didn't have to change the database design to store new
types of data that were similar to other types already
being stored. Here's another example (similar to the date
scenario):

I have a table that defines "title types" (sub title,
foreign title, title as it appears on the product, etc.).
I have a many-to-many relationship table that stores a
product identifier (relating back to the master product
table), a title, and a title type identifer (relating to
the title types table). This seemed flexible since I can
decide to store a "misc title" simply by adding
another "title type" to my title types table and using
it's identifer.

How should I be storing this type of data? Should each
type of title be a separate field in a table? Should I
just store NULLS or empty strings when I don't have a
value for one of these fields?

Thanks!
Jason

>-----Original Message-----
>Mark is on a different path, although, headed in the
right direction.
>
>First, you need to understand this, the goals of
database design and those
>of application design are predicated on two different,
sometimes opposing,
>sometimes cooperative, criteria. Applications are
designed for data
>MANIPULATION and PROCESS. Databases, however, are
designed for data
>INTEGRITY and IDENTIFICATION.
>
>In the sense that you must understand what the User
wants is absolutely
>correct. For the application, the GUI and the process
of information is a
>perfectly acceptable viewpoint to model the appliction
design and process.
>Howerver, this is completely wrong for the design of the
database system.
>
>I am assuming that many of those "unique" attributes are
based on the
>product "type." If there are truely common
caracteristics of Entity Classes,
>then perhaps a Super Table Subordinate Table relation
would work better,
>although the current SQL DBMSes do not fully support
this sort of structure
>well.
>
>The idea is that there are Products, which have common
attributes, but,
>then, there are Books that are dissimilar from software
and videos. Now, you
>can call all of these Products, but, then, we must deal
with the type
>specific attributes. All in the same table allowed
NULL? Seperate table and
>make entries for only those that have data? For all of
the unique columns or
>only some because it is likely that you will only
populate some base on type.
> Aha, sounds like a subordinate class. You create a
Book table for the Book
>entity, a Video table for the Video entity, etc, each
with its own specific
>attributes. You can tie all of these back to a common
Products table to
>carry the common attributes if you wish. The point is
that Books are not
>Videos are not Software, even if they do have common
characteristics. Just
>because humans and fish have eyes and mouths does not
make a human a fish nor
>a fish a human. Aha, but they are both animals. See
the structure?
>
>The point is that data modeling comes down to describing
the real world, not
>as a process, but as a classification, for delineating,
reducing, individual
>attributes.
>
>I think you basically had the right ideas for most of
what you described;
>however, take a look at the seperate or super-sub table
structure. But the
>dates, no, that is where you started getting into the
MUCK area. A
>pulication date is not a release date any more than a
start time is the same
>as an end time. Is your birthdate the same as your
death date? I certainly
>hope not!
>
>Forget about all of that garbage about normalization
performance. Research
>has shown that the proper normalization of data to
ensure the integrity and
>reduce the duplicity of data is by far the best means to
garauntee the best
>performance. That is why we have DBMSes, to maximize
the performance GIVEN a
>normalized database. A DBMS is a physical mechanism,
the data model and its
>normalization is a logical one. The two are indepenent,
although related, or
>better, derived. That is, the physical is derived from
the logical, not the
>other way around. So, performance should never be a
constraint on the
>design, only the integrity of the system.
>
>Feel free to follow up if you think we could provide you
with any further
>assistance.
>
>Check out "An Introduction to Database Systems" by C. J.
Date. I think you
>would find it enlightening.
>
>Sincerely,
>
>
>Anthony Thomas
>
>
>"Jason Hedges" wrote:
>
>>
>> Ok, maybe I've travelled too far down the "everything
is
>> a type" road already.
>>
>> Here's the scenario: I am designing a new database and
>> application in SQL Server and .Net for managing
product
>> information for bookstores (strictly product data, not
>> sales, customers, etc.). The current database and
>> application have been developed with COBOL. The main
>> types of products we deal with are books, music,
bibles,
>> videos, software, and gifts. Every product has some
>> common attributes (item number, title, retail price,
>> vendor) and some products have additional attributes
(sub-
>> title, measurements, publish date, release date,
foreign
>> language title, etc.). Products can have from 1 to 3
>> contributors (author, artist) and be categorized in 1
to
>> 3 categories.
>>
>> I've ended up with alot of many-to-many tables. For
>> example, I have the main products tables, a master
>> category table, and a many-to-many table that links
>> products with multiple categories. Same thing for
>> contributors. Another example: since there are several
>> types of dates that can be stored for a product
(publish
>> date, release date, etc.), I have a "Date Type" table
and
>> a many-to-many table that holds a list of dates and
their
>> respective "date type" and the product item number
they
>> belong to. I thought this would be the best way since
>> the number of each was an unknown.
>>
>> I was under the impression that having a lot of fields
in
>> a table that were not used was poor design. Example:
>> having a SubTitle field in my main product database
would
>> be poor design since many products don't have a sub-
>> title. Is that true or false?
>>
>> I'd appreciate some more input if you have enough info
>> here to go on.
>>
>> Thanks!
>> Jason
>>
>> >-----Original Message-----
>> >I agree with the assessment that no sound judgement
may
>> be passed on any
>> >particular design without the full business rules the
>> design was based on;
>> >however, there are a few problems that crop up now
and
>> again.
>> >
>> >First, database normalization is a classification
>> mechanism that attempts to
>> >model a real world system. It is a method of
logically
>> seperating out
>> >individually defined "nouns" or entities.
>> >
>> >What I often see is that developers new to database
>> design tend to apply
>> >leasons learned from OOP/OOA to data modeling.
Database
>> design is a
>> >reductionist exercise to classify and reduce data
>> redundancy. OOP/OOA is an
>> >aggregation of process consolidation whereby a user,
by
>> generalization, may
>> >abstract out common features. These are at two polar
>> extremes, or orthogonal
>> >to one another. Be careful.
>> >
>> >The first sign that you may have strayed too far is
when
>> you start building
>> >MUCK tables and overly use Many-to-Many
relationships.
>> Everything is a
>> >"type" with an ID, Name, and description, right?
>> WRONG! Don't go down that
>> >track.
>> >
>> >Sincerely,
>> >
>> >
>> >Anthony Thomas
>> >
>> >
>> >"Anith Sen" wrote:
>> >
>> >> >> How far should I go with normalizing my database?
>> >>
>> >> As far as you can. In other words, full
normalization
>> upto BCNF in case of
>> >> tables with single column keys ( which will be in
5NF
>> anyway ) and upto 5NF
>> >> in case of tables with composite keys and
overlapping
>> sets of values in
>> >> different rows are mandatory to avoid data
>> modification anomalies.
>> >>
>> >> >> Is this the correct way to design the database?
Is
>> it worth the trouble
>> >> >> to do it this way?
>> >>
>> >> Without a detailed knowledge of your business model
>> and existing entity
>> >> types, its attributes and applicable relationships
>> others cannot comment on
>> >> a particular design narrative. If you are familiar
>> with higher normal forms
>> >> beyond 1NF, every principled decomposition to avoid
>> moification anomalies is
>> >> definitely worth the "trouble".
>> >>
>> >> BTW, NULLs have nothing to do with any normal forms
>> beyond 1NF.
>> >>
>> >> >> It seems like it would be alot easier for
INSERTS,
>> UPDATES, and SELECTS
>> >> >> to have the extra title fields as part of the
>> Products database and just
>> >> >> store empty strings in them if I don't have a
value.
>> >>
>> >> "Ease" of writing INSERTS, UPDATES, and SELECTS is
not
>> a design principle;
>> >> but integrity preservation is. You wouldn't
consider a
>> single table for
>> >> representing the entire schema by such an
assessment
>> of ease, would you?
>> >>
>> >> >> Just want to confirm that I'm heading in the
right
>> direction.
>> >>
>> >> If you are considering full normalization in your
>> logical design process,
>> >> you are on the right track.
>> >>
>> >> --
>> >> Anith
>> >>
>> >>
>> >>
>> >.
>> >
>>
>.
>



Relevant Pages

  • Re: MVC design questions
    ... Instead of a single CRUD GUI program I've got two, the idea being that one can be given to anybody because it can't damage the database (no insert/update/ delete capability at all while the other, which can do these things, is intended only for suitable authorised users: most data is bulk loaded and the update GUI is only there for removing to odd piece of junk and tidying up. ... In fact, i'd create three - a full-access account for the DBA, an account with the ability to edit the data, but not do DDL, for the update app, and a read-only one for the search app. ... I have almost no OO background and have no idea who the currently recognised OO design gurus are. ... That era was dominated by big ideas - big processes, big notations, big books, big words. ...
    (comp.lang.java.programmer)
  • Re: Database design
    ... On my single table database I have 28 categories and ... Jason's design is much preferable. ... Contacts who qualify for SAY; ... I have three books that cost me over $500 and none cover this topic! ...
    (microsoft.public.access.tablesdbdesign)
  • Re: I want to learn access
    ... up the database properly. ... > CDs that assume that you know how to design forms, tables, etc, which I ... All the books that I have seen on ... I am even ready to preapre applications for people ...
    (microsoft.public.access.gettingstarted)
  • Re: books about database design / modeling
    ... Frank Swarbrick wrote: ... Most 'database' books seem to focus a lot on administration and not much on ... Any good books I should look for? ... Database design and application development for Oracle: ...
    (comp.databases)
  • Re: Recommended books on Top Down Design
    ... "Top down design" still happens, but books these days don't promote it. ... The Art of Computer Programming by Knuth ... some dynamic typing abilities. ...
    (comp.programming)