Re: Stored Procedures - Patterns and Practices
From: Hugo Kornelis (hugo_at_pe_NO_rFact.in_SPAM_fo)
Date: 10/31/04
- Next message: Hugo Kornelis: "Re: question on HAVING"
- Previous message: Uri Dimant: "Re: SELECTing Next X From JOINed Tables"
- Next in thread: Dan Guzman: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: Dan Guzman: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: JXStern: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: Galore: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: --CELKO--: "Re: Stored Procedures - Patterns and Practices"
- Messages sorted by: [ date ] [ thread ]
Date: Sun, 31 Oct 2004 12:58:07 +0100
On Sat, 30 Oct 2004 12:13:01 -0700, Wingenious wrote:
>There's obviously a clash of professional opinions here. One side has
>published the reasoning behind its opinions. The other side has provided no
>justification for its opinions.
Hi Wingenious,
I started writing a detailed justification yesterday. When Agent crashed
on me, destroying most of my work, I decided to get some sleep first. Even
though Zach and Steve have now beaten me to the punch, I still decided to
take what was left after the crash, retype what I remembered of the part
that was lost and post it anyway. Sorry if I seem to be echoing Zach and
Steve at some places, but I couldn't bring myself to revising the part
that did survive Agnet's crash.
I looked over the e-book and I'll start off with saying that it does
contain some useful information. But unfortunately, this is mixed up with
so much bad advise and even wrong information that I'd never advise a SQL
Server newbie to read this whitebook, much less to follow it's advise.
Page 4: The relational model has tables, rows and columns. Files, records
and fields are terms from other models. Though I am not as allergic to
them as some others in this group are, and I must admit that fighting this
terminology seems to be a losing battle, considering MS themselves use
these terms in the documentation of Access - I still recommend you (or
whoever wrote that whitepaper) to change your terminology: use the terms
table, row and column and include the terms field and record between
brackets on page 4 (not the other way around as you now have it).
Page 4: I disagree that the debate in SQL Server related discussion forums
is rarely useful. I never counted, but I guess that over 90% of the
questions asked in this group get resolved by answers from the group. Of
course, there are different opinions, and of course the poster should test
each answer himself before implementing it.
Page 5: Fixed length strings should not be applied "only" when each
position is used. It should be used when most values are at or almost at
the maximum length, unless the data may contain significant trailing
spaces (in which case varchar should be used only). The effect of the
added trailing spaces is neither bizarre, not inconsistent. It is very
consistent, well-documented in BOL and only bizarre to those who don't
know what they're doing.
There is no automatic trimming of trailing spaces in SQL Server. There is
an automatic addition of trailing spaces on fixed length. Also, there is
one function that returns the length without trailing spaces (obviously
intended primarily for use with fixed length strings); another function
(DATALENGTH) returns the real length, including trailing spaces (and this
function is therefor not useful with fixed length strings).
Page 7: Why make up your own naming convention? There IS an international
standardized naming convention, ISO-11179, why not use that?
Page 7: Prefixes? No, no, no!!! Definitely no to the advise to use
prefixes on all names. Apart from violating ISO-11179, you are also
destroying one of the most beautiful aspects of the relational model: the
transparent use of views instead of tables. I've used this to minimize
impact of a change: create a new table (with the new structure), create a
view (that mimicks the old structure from the data in the new structure)
with the old table name and then change only those programs that actually
need to do something different - all other programs continue to work as if
the table structure never changed! Of course, that would have been
completely impossible if the table name had started with "tbl" - I'd
either have been stuck with a view named "tblFoo" or I'd have been forced
to change, recompile and retest all other programs as well, just to change
"tblFoo" to "vwFoo".
Page 8: "Most of the strong objections to using prefixes come from mere
personal opinion instead of being based on sound logic" - hogwash. Neither
ISO-11179, nor the ability to use a view to mimick an upgraded table has
anything to do with personal opinion. OTOH - your arguments for prefixes
are flawed.
Take the argument from page 7: "to preserve the sanity of DBAs and
developers" - look at this code: "SELECT MyColumn FROM xyz WHERE ..." and
tell me if there can be any doubt that xyz has to be either a table or a
view (the difference between those is irrelevant in this context) and can
never be a trigger, procedure, constraint, or whatever?
The example on page 8 is flawed as well. The name of a stored procedure
should not be tied to the table against which it acts, it should be tied
to the business object against which it acts. It may well act against
multiple tables.
Page 8: Base table names: plural or singular. Again, check out ISO-11179.
The name should always indicate what's in it. Since most tables are
designed to have more than one row, most names will either be plural or a
group denunciation (sp?). The table that holds employee information should
not be called Employee, but Employees or Personnel. A table that holds the
current rundate (for applications that need the ability to simulate a
rundate other than CURRENT_TIMESTAMP) should be called Rundate, to
indicate that this table should hold only one row. The counterargument (in
the text) that plural names can be misspelled is hogwash again - as if
singular names are never misspelled!
Page 9: Column name prefixes: Again: in violation of ISO-11179. And again:
an invitation to extra work. One day, all of the company will be
celebrating the 32,768th customer while you are feverishly hacking away,
changing millions of references from "intCustNo" to "lngCustNo", compiling
and testing the code.
"However, if an appropriate amount of requirements discovery is done prior
to data modeling and database design, such changes are very rare". Rare,
yes, but not impossible. Ever heard of Murphy's law?
"Further, it's very likely that such a change in data type would require a
change to the code anyway". If you are doing most code in another
language, yes. But I generally prefer to do most work in queries; they
need not be changes when I change the datatype for CustNo from smallint to
int in all tables.
"... prefixes can be very helpful in understanding and changing code that
references the fields" - use a data dictionary instead.
Page 9: Parent and child are (again) not the correct terms for the
relational model. Use the terms referencing table and referred table
instead.
Page 10: It is not true that the values of a primary key must not change.
They should be reasonably stable, but since the introduction of cascading
updates in SQL Server, a changed PK value is no longer the nightmare it
once was.
Page 10/11: There are situations where adding a surrogate key in addition
to the natural key is useful. If the natural key is long or spans too many
columns, AND the table is referred to from other tables, adding a
surrogate primary key and declaring the natural key as UNIQUE instead of
PK can be a wise choice. But the advise to have an identity column in
every table is unwise; giving this advise without urging to have a natural
key in each table as well is plain dangerous. If you routinely read this
group (like I do), you'd know how many posts are about removing unwanted
duplicates - the one thing these posts have in common is that they are all
caused by a table design without a declared natural key.
Another thing the author forgfot to mention is that using surrogate keys
has the implication that many common queries require the joining in of
more tables, which will impact overall performance.
The example at the end of page 11 is just plain silly. The choice of key
for customers is bad (zip code is not reasonably stable, date/time account
created is an open invitations to duplicate rows). The choice is keys for
orders is even worse (basic accounting rules call for order numbers that
are unique within the company and varifyable) and using product name
instead of UPC as key for products is just plain silly. With natural keys
and good database design, the PK for OrderDetail is not five, but only two
columns: OrderNo and UPC. To comment on the final sentence: not natural
keys, but improper design without adequate research is a bad database
design decision.
Page 12 (and 11): The recommendation that the PK column be "the first
field" shows a complete lack of understanding of relational theory. There
IS no such thing as "first", "second" or "last" in an UNordered set!
Page 14: The clustered should not be reserved for a column that is used
frequently in queries, but for a column (or column combination) that is
used frequently in range searches or in ORDER BY or GROUP BY clauses.
Page 14: The recommendation to include a counter to implement optimistic
locking is silly. Use SQL Server's rowversion (aka timestamp) column
instead; that is provided by SQL Server for exactly this goal!
Page 16: Why advocate a column "RecordMask" to serve as a status
indicator? Why not call it "StatusIndicator"? Or would this name imply
that it should be a bit mask instead of human readable status indicators?
What is the advantage of 0x0040 over 'Pending'?
Page 16: Don't use an insert trigger to set the create date/time; use a
DEFAULT instead.
Page 17: Please don't advocate the use of dynamic SQL without at least
warning about the risks of SQL injection. Better yet is to not recommend
dynamic SQL isntead. About 99% of all dynamic SQL would not be needed if
the programmer wasn't too lazy to do some more typing.
Page 17/18: Stored procedures should not be tied to tables, but to
business objects. I want to call a stored procedure that adds a customer
without having to know how many tables are used to store this customer's
details.
Page 18: What is the use of a stored procedure that takes a column name
and a value as (limited) search argument, then returns all columns from a
table? Why not use SELECT ... FROM ... WHERE ...?
The same question for the user-defined function that basically does the
same as the stored procedure and the view on page 19.
Page 19: ORDER BY is illegal in a view. The common workaround to include a
TOP 100 PERCENT clause as well doesn't guarantee that the results of the
view are always returned in the expected order. Again: sets in the
relational model are _UN_ordered BY DEFINITION!
Page 23: Don't use system tables, even BOL includes warnings about this.
Use the INFORMATION_SCHEMA views or MS supplied stored procedures if you
must query the metadata.
Page 26: The date format 01/01/2000 is ambiguous - is it dd/mm/yyyy or
mm/dd/yyyy? The only reliable date formats are yyyymmdd (date only),
yyyy-mm-ddThh:mm:ss (date plus time) and yyyy-mm-ddThh:mm:ss.mmm (date
plus time, including milliseconds).
Page 28: SQL Server doesn't do any automatic RTRIM when comparing
character strings. It does automatic padding. Not the same!!
Page 28: Double quotes to qualify object names are recommendded by the
ANSI/ISO standards for SQL, not brackets.
Page 28: Don't give a blanket advise to prefer table variables over temp
tables. Both variations should be tested (if a temp table or table
variable is needed at all!)
Page 30: The first two function provide functionality that should be
handled by the presentation layer; if it is really needed to do this at
the server, it should be done with inline functions instead of UDF to
preserve performance.
The third function is better replace by SELECT num FROM Numbers WHERE num
BETWEEN @A AND @B.
Page 32: Don't use system tables, even BOL includes warnings about this.
Use the INFORMATION_SCHEMA views or MS supplied stored procedures if you
must query the metadata.
I think that I could find more if I could bring myself to investigate this
even further, but I believe that I've written sufficient to falsify your
claim that "The other side has provided no justification for its
opinions". I tried to attack only the actual statements in the whitepaper,
avoiding personal attacks at the credibility of anyone who dares to
disagree - a debating style that unfortunately IS used in the whitepaper.
Best, Hugo
-- (Remove _NO_ and _SPAM_ to get my e-mail address)
- Next message: Hugo Kornelis: "Re: question on HAVING"
- Previous message: Uri Dimant: "Re: SELECTing Next X From JOINed Tables"
- Next in thread: Dan Guzman: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: Dan Guzman: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: JXStern: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: Galore: "Re: Stored Procedures - Patterns and Practices"
- Maybe reply: --CELKO--: "Re: Stored Procedures - Patterns and Practices"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|