Re: DB Architecture Questions (for joe celko)
- From: --CELKO-- <jcelko212@xxxxxxxxxxxxx>
- Date: Tue, 5 May 2009 16:19:21 -0700 (PDT)
Why don't you buy some of my books, rather than ask me to post them in
a Newsgroup? You are asking for a complete RDBMS education in a
newsgroup!
1) I've seen Celko state that "id" keys make no sense (product_id etc) and should be product_number instead (refer to the "product" instead of the "record [sic]"). <<
Huge conceptual error; rows are nothing like records. You are still
in a file system mindset.
No. Learn the ISO-11179 rules for data element names, the differences
among "_id", "_nbr", "_seq", etc. When you look for a key, you start
with the standards of your industry. I would estimate that this
covers ~80% of the properly defined systems -- VIN, ISBN, UPC, EAN,
etc. which can be verified and validated via a trusted external
source. In the old days, this took time; you can Google it today.
Does your industry actually have "product_id" as a standard?
After that, there is a natural key in the data, such as (longitude,
latitude), store, cash register, ticket_nbr, etc which can be verified
and validated in the reality of the data. This is ~18% of the cases.
if you have no natural key, then your schema is probably wrong.
Then, if that fails and we are looking at <2% of all situations, we
invent a key with check digits, validation rules and an audit trail
within our enterprise. This is a great screaming pain IF you do it
right. This is why industry standards exists -- people got tired the
pain Would you like to do Retail without standard UPC/GTIN barcodes on
products?
He says that this is the industry standard <<
No, I did not. You are the guy with a "magical universal product_id"
that fits all data; I am the guy with a zillion industry specific
identifiers.
This makes sense from an application design standpoint because product_id is the "identifier" for the "record [sic]". <<
This is another example of your mindset. Files work on the location
of records; RDBMS has keys. Totally different concepts. In SQL, the
unit of work is the entire schema (in data modeling it is the entire
Universe of Discourse). Unlike a field in the record of a file, the
data elements are named for what they are; not for their location,
local use, etc. Think global, not local. A VIN is a VIN
everywhere.
Also, if you change product_number, you've to change all the related records [sic] as well. <<
So? That is why SQL has UPDATE, DRI actions and ALTER statements; SQL
does it for me. In a file system, you have to change each file, one
by one. Where were you when retail went from 10-digit UPC to the 13-
digit EAN and 15-digit GTIN codes a few years ago
2) What is Celko's opinion of using Pascal naming conventions?
Fine for old Pascal programs, but this is SQL and we use ISO-11179
rules for data element names. And since the feds other major data
users are requiring it, it is a good idea.
Many "enterprise" developers I've talked to follow Pascal naming conventions and recommends against underscores and prefixes. <<
Have you seen the actual research? I collected some of that stuff for
AIRMICS. Sorry, underscores work, especially for the long data
element names we use today. We are used to seeing ruled paper and
spaces between words now.
For an example, if you have a product table, why would you have product_id, product_name, product_price where it is already a given that these columns belong to the product table. <<
And does it change names when it is in another table? How many
different names does one data element? If every table has a "id"
column, how do you tell them apart in the Data Dictionary? Again,
tables are not files; data elements are global to the schema, not
local to the table (file).
3) I prefer that the DB just store and retrieve data and that everything else should be handled by the application tiers. <<
Pretty much, but I want to put as much of the data integrity into the
DB as I can. That way there is one and only one version of a business
rule for data integrity, the optimizer can use the constraints and I
don't have to write hundreds of lines of code in the apps.
We use ad-hoc queries instead of using stored procedures because it provides much more flexibility and scalability. <<
At the expense of compile time, and lack of consistency for regular
jobs? Doing the end of the month reports the same way every month is
important.
For an example, our DAL contains ad-hoc methods to CRUD business entities. Stored procedures makes no sense in this case. <<
So you let anyone update your DB on the fly in whatever ad-hoc way
they feel like today?
Think about that.
Also, stored procedures can lead to out-of-scope and obsolete objects. I've worked on legacy apps that had 1000s of stored procedures, some not used even. Debugging, maintaining, and proper dependency checks are very "primitive" because there is no proper IDE. What's Celko's opinion on this? <<
You don't know how to manage an application system. There are lots of
tools now. You can find dead code, do global changes, etc.
4) What is Celko's opinion on NULL-able fields [sic] (especially date fields [sic])? Often, in the application and design requirements, there is data that is "unknown" or "not yet set" and a NULL-able type makes perfect sense. The alternative is to use a "null constant" but that makes no sense in "object oriented" designs. <<
You were not around for the "Null Wars" between Chris and myself :)
5) What is Celko's opinion of bit fields [sic]?
Don't use them. That is "punch card" programming. SQL is a predicate
language -- we discover facts from data.
Again, all of my research and opinions are in my books.
.
- Follow-Ups:
- Re: DB Architecture Questions (for joe celko)
- From: sqlguru
- Re: DB Architecture Questions (for joe celko)
- From: Geoff Schaller
- Re: DB Architecture Questions (for joe celko)
- References:
- DB Architecture Questions (for joe celko)
- From: sqlguru
- DB Architecture Questions (for joe celko)
- Prev by Date: information_Schema.Routines
- Next by Date: Re: information_Schema.Routines
- Previous by thread: Re: DB Architecture Questions (for joe celko)
- Next by thread: Re: DB Architecture Questions (for joe celko)
- Index(es):
Relevant Pages
|