Re: Why does 1:3 relationsihp require another table?

From: Louis Davidson (dr_dontspamme_sql_at_hotmail.com)
Date: 05/15/04


Date: Sat, 15 May 2004 13:20:26 -0500

I don't quite understand why the order of values in columns makes a
difference to 1NF? As long as each row has the same number of values, it
would not be an issue.

> with NULLs that mean UNKNOWN. But what NULL, or any element any
> domain, for that matter, means, is irrelevant to the determination of
> whether something is 1NF or not, and this argument boils down to the
> argument "You can't use NULL for anything but UNKNOWN". Of course you
> can, but if you do, SQL does become a little less intuitive.

I think that the entire (and pretty much unimportant from a reality
standpoint) argument would change if NULL could have any meaning, but as a
matter of principal, the value of null is UNKNOWN, based on Codd's Rule #3.

DoesNotExist <> NULL, because logically DoesNotExist = DoesNotExist,
certainly if the values are of the same domain. This is why we have so much
trouble with comparisons of two column where NULL may mean doesNotExist. It
is why in our models we should always use NULL to mean Unknown, as in This
object has a value for Blurg, it is just not known currently. Take a person
who goes by one name, say Bob. Now if we enter his name as Bob Null, and
Susie (who has a last name that is not known) Null is also in there, it is
indistinguishable that Bob has no last name and Susie just isn't sharing
right now. So we should store 'Bob' '' and Susie Null and we know the
correct answer now. If we want to look for users whose last name
DoesNotExist, we look for last_name = ''.

Anyhow, I will concede that this may not be a major violation of 1NF, but it
is a minor one because by using DoesNotExist, you are in essence
reproducing:

R={(Louis,DEF,ABC),(Lewis,012,789,456,123)}

using

R={(Louis, DEF,ABC,doesNotExist,doesNotExist),(Lewis,012,789,456,123)}

With logical trickery. It works, not going to argue that, and as such is a
very minor point, but I go back to my argument (that you kind of disagreed
with in another post) that to do it right the first time and it will be
easier later. Sure if the rules change you can simply rebuild the
structures, but I like to do my work up front, rather than taking the
chance.

-- 
----------------------------------------------------------------------------
Louis Davidson (drsql@hotmail.com)
Compass Technology Management
Pro SQL Server 2000 Database Design
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are
interested in consulting services.  All other replies will be ignored :)
"Steve Kass" <skass@drew.edu> wrote in message
news:upieODkOEHA.1456@TK2MSFTNGP09.phx.gbl...
> Louis,
>
>   I agree that for quite a few kinds of small variations on the original
> real-world model, the two-table design can more easily be changed into a
> correct design for the new real-world model.  That's a separate and
> no-less important issue from the issue of whether the one-table model is
> 1NF or not.
>
>   The reason you gave for calling the one-table model non-1NF, that not
> all rows have the same number of columns, says to me that you want the
> use of NULL to represent "doesn't exist" to always be a violation of
> 1NF.  That's worth considering, but I think I can weasel out from under
> it.  ;)  In order to do that, I'll assume you believe in NULLs, and that
> there can be a relation with NULLs in its tuples (hence 1NF models with
> NULLs, since every relation is 1NF).
>
>   Suppose Louis's original password was ABC, and he changed it once, to
> DEF.  Lewis started out with 123 as his password, and has changed it to
> 456, then again to 789, and again to 012.  I think you're claiming that
> the three-password model is not 1NF because it then takes the form of
> this non-relation:
> NR={(Louis, DEF,ABC),(Lewis,012,789,456,123)}.  NR is definitely not a
> relation, since it's not a subset of any cartesian product of domains
> (I'm using a notation that can only represent this kind of non-relation
> if the"missing" attributes are all at the end, but I don't think that
> obscures anything).
>
>   But R={(Louis,
> DEF,ABC,doesNotExist,doesNotExist),(Lewis,012,789,456,123)} is a
> relation, and 1NF, since R is a subset of {strings} X {strings union
> {doesNotExist}}X {strings union {doesNotExist}}X {strings union
> {doesNotExist}}, assuming we are allowed to work in the mathematical
> world of three-valued logic.  doesNotExist is what some people call NULL
> when they extend non-NULL relational algebra and step into the muddy
> waters of 3VL.  The fact that I didn't call it NULL is certainly
> irrelevant.  Since you believe in NULL, you believe that R={(Louis,
> DEF,ABC,doesNotExist,doesNotExist),(Lewis,012,789,456,123)} is 1NF, so I
> can only conclude that you do not believe Louis and Lewis are correctly
> represented by R, and that NR is what my model is in the scenario I
> described.
>
>   Here's a good argument for not letting me represent Louis and Lewis by
> S.  If I did, Louis would not be able to ask for some good password
> suggestions by inquiring
>
> select p from someGoodPasswordSuggestions
> where p not in (
>   select password from S where Name=Louis
>   union all
>   select oldPass1 from S where Name=Louis
>   union all
>   select oldPass2 from S where Name=Louis
>   union all
>   select oldPass3 from S where Name=Louis
> )
>
> I agree that this query returns no value when applied to R, regardless
> of the cardinality of someGoodPasswordSuggestions.  I also know this is
> an ugly query, but that doesn't prove anything.
>
> It's not a bad attempt to show that S can't be what we have, but it's
> based on the false assumption that SQL implements relational algebra,
> which it doesn't.  SQL is an attempt to implement relational algebra
> with NULLs that mean UNKNOWN.  But what NULL, or any element of any
> domain, for that matter, means, is irrelevant to the determination of
> whether something is 1NF or not, and this argument boils down to the
> argument "You can't use NULL for anything but UNKNOWN".  Of course you
> can, but if you do, SQL does become a little less intuitive.
>
> In my book, R={(Louis, DEF,ABC),(Lewis,012,789,456,123)} does represents
> the state of a real-world model.  Not the one we have, but one where
> Louis is only prohibited from using his most recent password, and Lewis
> is prohibited from using his three most recent.  The model R, which is
> not 1NF, represents, happens indeed to be a good example of a variation
> on the original model that a simple variation on the one-table model
> can't represent.  I don't claim to be able to design a 1NF model for
> that scenario, though, and the fact that the one-table model can't be
> changed only slightly to represent something it isn't claiming to
> represent is a separate issue from whether it's 1NF.  ("That can't be an
> animal, because it doesn't look anything like a pea***").  In my model,
> each user is prohibited from using his or her last three passwords, and
> Louis's NULLs means that there is a place in the model for a second and
> a third most recent password, but no string currently has that role.
> Fourth most recent passwords and beyond are not modeled, and there is no
> place for them.
>
> Well, I doubt I've made a convert of anyone, but I hope we all agree
> that the questions are interesting and worth thinking hard about.
>
> SK
>
>
>
> Louis Davidson wrote:
>
> >Actually address1 and address2 are very good examples of a 1nf violation
> >that we all just kind of live with.  ( I personally just go with a column
> >called streetAddress and let it include carriage returns and line feeds
most
> >of the time.
> >
> >The problem with the solution of the old passwords twofold.  1nf requires
> >that every row have exactly the same number of values.  This is a problem
> >when the user is created, since they only have 0 old passwords, so the
row
> >looks like:
> >
> >UserId  ...  Password    OldPassword1  OldPassword2   OldPassword3
> >1                fred             null                    null
> >null
> >
> >Ignoring that symantically what you are saying here is that the old
> >passwords are unknown (they are actually known to exist) what do you do
then
> >the user changes their password to fred1
> >
> >UserId  ...  Password    OldPassword1  OldPassword2   OldPassword3
> >1                fred1           fred                    null
> >null
> >
> >you put the value in OldPassword1, easy enough, but when they change to
> >fred2, where do you put the value?  You shift right, using tacky
procedural
> >code:
> >
> >Set oldPassword3 = oldPassword2, oldPassword2 = oldPassword1, etc
> >
> >Okay, but the real problem comes in when your users realize, that to keep
> >their old password, all they have to do is change their password 4 times,
> >and they are back to where they started (windows lets you, I have used
this
> >trick on a server when the password was being forced not reuse passwords
for
> >3 times.  So you have the last time they changed the password (it is one
of
> >the columns represented by ...) but what if you want to get smart and not
> >allow a password reuse for 2 months, or a year, or ever?  Or what if you
> >want to do some sort of matching algorithm to determine if the user is
just
> >reusing a root value (which users frequently do) or to look for some sort
of
> >pattern.
> >
> >By building the table like:
> >
> >UserPasswordHistory
> >
> >UserId   DateChanged  Password
> >1           20040101         fred
> >1           20040201         fred1
> >1           20040301         fred2
> >
> >You give yourself the power to do this very easily, and with way more
> >flexibility.  Most of the time, more tables does not equal more
complexity,
> >usually less (from the standpoint of the database developer, which
frankly
> >is all we care about here :) because now the meatier bits of code are
done
> >in simple set based operations.  A trigger can be written to maintain the
> >number of password revisions allowed, and even a settings table could be
> >created to allow this setting to be changed by changing a single value.
The
> >trigger could delete all rows that were greater than the number set in
the
> >settings, and/or to disallow reuse of a password for N months.
> >
> >The goal of doing proper design is flexibility to store and use the data
> >using relational techniques, not using hokey procedural code, since the
SQL
> >language was created to deal with sets of data, not scalar values.  I am
not
> >going to lie to you, it can certainly be done the way you have suggested,
> >and you will not likely have any real problems with it, unless the user
> >changes their mind (which almost never happens.)   It will be more
trouble
> >to code (in comparison to using SQL as it is intended,) and you will lose
> >information that may become useful to you ("When did I change my password
> >last month?")
> >
> >
> >
> >
> >
> >
> >
>

Quantcast