Re: do all primary keys use autonumber
- From: Vincent Johns <vjohns@xxxxxxxxxxxxxxxxxx>
- Date: Sat, 12 Nov 2005 03:29:14 GMT
peregenem@xxxxxxxxxxxx wrote:
Vincent Johns wrote:
Autonumber is a convenient way to be sure that you won't have any duplicate values, and an Autonumber field occupies only about 4 bytes per record, much shorter than many other fields you might choose to use.
Bear in mind that PRIMARY KEY has a special meaning which makes an autonumber the choice of last resort for the PRIMARY KEY designation.
You can look at it this way if you wish (and I consider Autonumber the choice of first resort, because it satisfies the requirements for a primary key, is small, and is not bound to anything else in the model, making it practically immune to any need for revision). But you might consider that Autonumber isn't required; it's an optional feature for those who wish to use it.
Take the example of a table which has two 'unique' columns: a random integer and an industry standard key, of which the ISBN is an excellent example. In relational theory there is no issue: all keys are equal, simply use the one most appropriate to your needs. However, with SQLs you can only have one PRIMARY KEY per table. You are forced to choose just one but what criteria should you use to make this choice? Both could be given NOT NULL UNIQUE constraints and your 'uniquifier' needs would be satisfied.
I suppose I'd choose the ISBN if the Autonumber isn't already present, and the Autonumber if it is. That's because, if I later discover a transcription error in one of the ISBN fields, it's a lot easier to correct that if I don't also have to worry about other Tables linked using it. The "Cascade Update" feature will take care of many such changes, but not if a linked Table happens to reside in a separate database file. (OK, this is a bit far-fetched, I admit, but I still think that the Autonumber lets me avoid having to worry about such stuff.)
But everyone (correctly) says a table should have a PRIMARY KEY. So what does PRIMARY KEY give you that NOT NULL UNIQUE does not? For Access/Jet, there's only one answer: physical order on disk a.k.a. the table's clustered index.
Even assuming that one can tease the "physical order on disk" out of a key value, when is that useful? If you wish to know what record is the most recently updated, you can use a date/time field, for example. If you know of some other uses, I'd be interested.
What makes the best clustered index? In a nutshell, the column(s) primarily used in queries that use GROUP BY and BETWEEN constructs. Does anyone write SQL that includes GROUP BY (autonumber_col) or BETWEEN (this autonumber value) AND (that autonumber value)? I doubt it.
I certainly wouldn't use an Autonumber field that way. Normally, my Autonumber fields are Random, so that I won't be tempted to use them for anything besides acting as a primary key. But I'm not fanatical about this -- if I had a Table in which I could use something like ISBN as a primary key, and space were tight, I might very well choose to use that instead of my usual random Autonumber. Maybe.
Also note that, in "Northwind Traders", the [Orders].[OrderID] field that generates a new order number is an (incremental) Autonumber field, so Microsoft is implicitly suggesting here that you might want to make an Autonumber field do double duty: primary key, and source of a datum used outside the database (in this case, it appears on invoices). But I don't normally use them that way.
Incidentally, assuming ISBN is unique in a Table, for what possible reason would one use GROUP BY on that field? I can possibly imagine a BETWEEN involving ISBNs, if one is trying to correlate items in a catalog organized by ISBN.
In the earlier example, ISBN is the natural key and wins the PRIMARY KEY designation hands down. What about a table where the autonumber is the only one 'unique' column?
Say, a table with just two columns, a random integer column named ID and a last_name column. The last_name is unlike to be unique. Then again, last_name is far more likely than the random integer to be used in GROUP BY or BETWEEN constructs. The best approach is to constrain the random integer (ID) with NOT NULL UNIQUE and make the PRIMARY KEY designation to be (last_name, ID) in that order. This will cause the physical order of the disk to be rebuilt on compact in last_name order.
How does knowing the "physical order of the disk" (assuming you can determine what that is) help you do ordinary database operations? It sounds as if you're involved in teasing apart database files without going through Access to do it. I suggest not trying to do that, not only for legal reasons (it might involve violation of trade-secret laws, or of contract law -- the EULA to which you agreed when you installed Access), but also because whatever method you use is likely to become invalid under the next version of Access.
There are alternatives to hacking the file format. For example, you will probably always be able to perform operations such as exporting the contents of a Table to a flat tab-delimited text file, and import it into any of a myriad of other applications (including into any version of Access).
Choosing an autonumber as the sole PRIMARY KEY column is likely to have the effect of a performance hit on your database because a random integer makes for a lousy clustered index.
This is a feature of Access that I'm not aware of. People are more likely, from what I've heard (and I'm not privy to the source code of Access, nor to the details of its algorithms), to suffer performance hits from such mistakes as failing to define indices where they should, and not from choosing a "lousy clustered index". Again, I'd be interested in knowing more details if you have them.
-- Vincent Johns <vjohns@xxxxxxxxxxxxxxxxxx> Please feel free to quote anything I say here. .
- Follow-Ups:
- Re: do all primary keys use autonumber
- From: peregenem
- Re: do all primary keys use autonumber
- From: peregenem
- Re: do all primary keys use autonumber
- References:
- Re: do all primary keys use autonumber
- From: Vincent Johns
- Re: do all primary keys use autonumber
- From: peregenem
- Re: do all primary keys use autonumber
- Prev by Date: Re: Is there a maximum size of field caption property in a table?
- Next by Date: Re: problem linking tables in split database
- Previous by thread: Re: do all primary keys use autonumber
- Next by thread: Re: do all primary keys use autonumber
- Index(es):
Relevant Pages
|