Re: do all primary keys use autonumber

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance



peregenem@xxxxxxxxxxxx wrote:
Vincent Johns wrote:

Autonumber is a convenient way to be sure
that you won't have any duplicate values, and an Autonumber field
occupies only about 4 bytes per record, much shorter than many other
fields you might choose to use.

Bear in mind that PRIMARY KEY has a special meaning which makes an autonumber the choice of last resort for the PRIMARY KEY designation.

You can look at it this way if you wish (and I consider Autonumber the choice of first resort, because it satisfies the requirements for a primary key, is small, and is not bound to anything else in the model, making it practically immune to any need for revision). But you might consider that Autonumber isn't required; it's an optional feature for those who wish to use it.


Take the example of a table which has two 'unique' columns: a random
integer and an industry standard key, of which the ISBN is an excellent
example. In relational theory there is no issue: all keys are equal,
simply use the one most appropriate to your needs. However, with SQLs
you can only have one PRIMARY KEY per table. You are forced to choose
just one but what criteria should you use to make this choice? Both
could be given NOT NULL UNIQUE constraints and your 'uniquifier' needs
would be satisfied.

I suppose I'd choose the ISBN if the Autonumber isn't already present, and the Autonumber if it is. That's because, if I later discover a transcription error in one of the ISBN fields, it's a lot easier to correct that if I don't also have to worry about other Tables linked using it. The "Cascade Update" feature will take care of many such changes, but not if a linked Table happens to reside in a separate database file. (OK, this is a bit far-fetched, I admit, but I still think that the Autonumber lets me avoid having to worry about such stuff.)


But everyone (correctly) says a table should have a PRIMARY KEY. So
what does PRIMARY KEY give you that NOT NULL UNIQUE does not? For
Access/Jet, there's only one answer: physical order on disk a.k.a. the
table's clustered index.

Even assuming that one can tease the "physical order on disk" out of a key value, when is that useful? If you wish to know what record is the most recently updated, you can use a date/time field, for example. If you know of some other uses, I'd be interested.


What makes the best clustered index? In a nutshell, the column(s)
primarily used in queries that use GROUP BY and BETWEEN constructs.
Does anyone write SQL that includes GROUP BY (autonumber_col) or
BETWEEN (this autonumber value) AND (that autonumber value)? I doubt
it.

I certainly wouldn't use an Autonumber field that way. Normally, my Autonumber fields are Random, so that I won't be tempted to use them for anything besides acting as a primary key. But I'm not fanatical about this -- if I had a Table in which I could use something like ISBN as a primary key, and space were tight, I might very well choose to use that instead of my usual random Autonumber. Maybe.


Also note that, in "Northwind Traders", the [Orders].[OrderID] field that generates a new order number is an (incremental) Autonumber field, so Microsoft is implicitly suggesting here that you might want to make an Autonumber field do double duty: primary key, and source of a datum used outside the database (in this case, it appears on invoices). But I don't normally use them that way.

Incidentally, assuming ISBN is unique in a Table, for what possible reason would one use GROUP BY on that field? I can possibly imagine a BETWEEN involving ISBNs, if one is trying to correlate items in a catalog organized by ISBN.

In the earlier example, ISBN is the natural key and wins the PRIMARY
KEY designation hands down. What about a table where the autonumber is
the only one 'unique' column?

Say, a table with just two columns, a random integer column named ID
and a last_name column. The last_name is unlike to be unique. Then
again, last_name is far more likely than the random integer to be used
in GROUP BY or BETWEEN constructs. The best approach is to constrain
the random integer (ID) with NOT NULL UNIQUE and make the PRIMARY KEY
designation to be (last_name, ID) in that order. This will cause the
physical order of the disk to be rebuilt on compact in last_name order.

How does knowing the "physical order of the disk" (assuming you can determine what that is) help you do ordinary database operations? It sounds as if you're involved in teasing apart database files without going through Access to do it. I suggest not trying to do that, not only for legal reasons (it might involve violation of trade-secret laws, or of contract law -- the EULA to which you agreed when you installed Access), but also because whatever method you use is likely to become invalid under the next version of Access.


There are alternatives to hacking the file format. For example, you will probably always be able to perform operations such as exporting the contents of a Table to a flat tab-delimited text file, and import it into any of a myriad of other applications (including into any version of Access).

Choosing an autonumber as the sole PRIMARY KEY column is likely to have
the effect of a performance hit on your database because a random
integer makes for a lousy clustered index.

This is a feature of Access that I'm not aware of. People are more likely, from what I've heard (and I'm not privy to the source code of Access, nor to the details of its algorithms), to suffer performance hits from such mistakes as failing to define indices where they should, and not from choosing a "lousy clustered index". Again, I'd be interested in knowing more details if you have them.


  -- Vincent Johns <vjohns@xxxxxxxxxxxxxxxxxx>
  Please feel free to quote anything I say here.
.



Relevant Pages

  • Re: New Database - Primary Key
    ... primary key and the autonumber field as a surrogate key. ... It seems your definition of "primary key" is narrower than mine. ... We consistently use surrogate keys as primary keys, ...
    (microsoft.public.access.tablesdbdesign)
  • Re: do all primary keys use autonumber
    ... > that you won't have any duplicate values, and an Autonumber field ... autonumber the choice of last resort for the PRIMARY KEY designation. ... Say, a table with just two columns, a random integer column named ID ... physical order of the disk to be rebuilt on compact in last_name order. ...
    (microsoft.public.access.tablesdbdesign)
  • Re: Limit numbers in primary key
    ... You ran limit the numbers in the Primary Key by inputting ... property box of your autonumber field. ... > AutoNumber field can generate three kinds of numbers: ... > creating two or more special copies (replicas) of an Access database. ...
    (microsoft.public.access.gettingstarted)
  • Re: Recordset Order and Autonumber
    ... - that the AutoNumber field properties in table design show Sequential; ... If Access is actually assigning primary key values that are wild, ... BTW, you can create a new table with an AutoNumber field, and import the ...
    (microsoft.public.access.tablesdbdesign)
  • Re: Primary key problem
    ... primary key where all of the relationships between the ... the same sequence every time I opened the database I did ... >Why not just use a autonumber field for the relations? ... >aware that you have a autonumber behind the scenes. ...
    (microsoft.public.access.formscoding)