Re: VFP 9 and file size limits
From: Ken Dibble (balderdash_at_spongemop.com)
Date: 12/10/04
- Next message: rajani: "Re: loop never exits on ESC"
- Previous message: Gene Wirchenko: "Re: Moving VFP8- project from W2K to WXP"
- In reply to: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Next in thread: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Reply: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Messages sorted by: [ date ] [ thread ]
Date: Fri, 10 Dec 2004 03:24:45 GMT
>Hi Gene, could you, please, be more specific. Few words responses doesn't
>>clarify anything.
>>If you add a new record, all fields have initial values. In this scenario
>>haw can one doesn't enter something, if just there is "something" in the
>>first place?
>>In a multi-column record, how can you assign values for just few columns?
>
> You do not have the column in the first place in that table if it
>might not have data for that relation. The canonical example of this
>might be the reports-to link in an employee table. At least one
>person does not have a boss. It looks something like:
>
> EMPLOYEES
> EmpNbr EmpName ReportsTo
> ------ ------- ---------
> 1 Big Cheese NULL
> 2 Mid Cheese 1
> 3 Medium Cheese 1
> 4 Small Cheese 2
> 5 Little Cheese 2
> 6 Tiny Cheese 3
> 7 Auditor NULL
>(Say, the auditor is externally appointed and does not report to
>anyone.)
>
> NULLs are not necessary with:
>
> EMPLOYEES
> EmpNbr EmpName
> ------ -------
> 1 Big Cheese
> 2 Mid Cheese
> 3 Medium Cheese
> 4 Small Cheese
> 5 Little Cheese
> 6 Tiny Cheese
> 7 Auditor
>
> REPORTSTO
> EmpNbr ReportsTo
> ------ ---------
> 2 1
> 3 1
> 4 2
> 5 2
> 6 3
>
> Note that employees 1 and 7 do not have a row in the second table
>since they do not report to someone.
>
> If it were possible to have more than one report-to, another row
>can be added without having to redo the structure. (Constraints might
>have to be modified if only one report-to was allowed before.)
It seems to me that any theory can be carried to extremes. I think
there's a normal form rule along the lines of, "never record the same
piece of data twice". Taking that super-seriously in every case would
result in stuff like:
Table FirstNames
PK Name
1 Adam
2 Alex
3 Alice
Table LastNames
PK Name
1 Abercrombie
2 Adams
3 Ainsley
Table Birthdates
PK Bdate
1 01/01/2001
2 01/02/2001
3 01/03/2001
Table People
PK FirstName LastName Bdate
1 1 3 2
2 2 2 1
3 2 2 2
4 3 1 3
Sometimes theory gets unwieldy.
The approach to avoiding nulls that you've described will, in
applications that have to handle more than a small number of data
points across large numbers of records, likely have performance
problems when trying to marshall significant information about
something to show to a user, and may also result in a very
slow-running system when doing maintenence on those hundreds of
tables.
IMO, separating out a table when it's possible to record more than one
of the same kind of item is almost always necessary and worth the
overhead (I can see value to using memo fields like little
"mini-tables" too, on occasion--and the VFP IDE's extensive use of
them for such purposes demonstrates that there really isn't anything
to fear from that approach). But creating extra tables merely to avoid
the use of nulls is not. Especially when there are workarounds that
can accomplish that same goal in most cases:
I don't see a real need for NULL to indicate "I don't know" very often
with VFP--which is one of the reasons I believe that the xBase data
system is superior to the set-based systems that are out there. But
there are a couple data type categories that, when nothing has been
deliberately stored to them, can be confused with "empty": logical and
numeric (encompassing integer, double, and currency).
In the case of numeric fields one can, in most cases, just store the
data as character instead and convert it with VAL() and INT() as
needed. Then MyNumber == SPACE(FieldLength) is, indeed, "I don't
know." The exception here would probably be for calculations involving
large numbers of records where 0 would be significant, like averages;
there'd be a performance hit involved in the conversions. On the other
hand, you'd have to screen out the null records before doing the
calculations--also a performance hit, though probably a much smaller
one. And in cases where the thing being averaged is more than one of
the same data type belonging to the same entity (order totals, say,
but not ages), you'd indeed have a reason for having separated out a
new table anyway.
As for logical fields, I really do think it's preferable to use a
1-width character field for stuff where there are allegedly only two
possible values (gender -- "M"/"F" for example). Then, once again,
empty is unambiguously "I don't know", and you can introduce other
codes for recording newly-discovered genders.
One might suggest dates as another example, but unlike logical or
numeric data, an empty date value is not ambiguous. It will result in
wrong results in date math, but you can test for EMPTY() on it to
avoid that. Ditto for character fields, which in VFP always contain at
least spaces. And VFP 9 has VARCHAR coming.
That's my take on it.
In any case, it's just fun to have other people to talk about this
stuff with.
Ken
- Next message: rajani: "Re: loop never exits on ESC"
- Previous message: Gene Wirchenko: "Re: Moving VFP8- project from W2K to WXP"
- In reply to: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Next in thread: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Reply: Gene Wirchenko: "Re: VFP 9 and file size limits"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|