Re: Record source at runtime
- From: "BruceM" <bamoob@xxxxxxxxxxxxxxxx>
- Date: Mon, 15 Oct 2007 07:27:27 -0400
I have been following maybe 10% of this, but I'm starting to think I should
include the PK in the query when I specifically need that field. What I am
getting from the discussion (and other things I have read) is that it
probably does not make much difference on recordsets of the sizes that are
typical on my projects so far, and that proper indexing of other fields that
are actually used as output (e.g. LastName, FirstName) has more to do with
performance in any case.
"David W. Fenton" <XXXusenet@xxxxxxxxxxxxxxxxxxx> wrote in message
news:Xns99C8995FDC9C1f99a49ed1d0c49c5bbb2@xxxxxxxxxxxxxxxxx
"Arvin Meyer [MVP]" <a@xxxxx> wrote in
news:OWzKqXTDIHA.464@xxxxxxxxxxxxxxxxxxxx:
"David W. Fenton" <XXXusenet@xxxxxxxxxxxxxxxxxxx> wrote in message
news:Xns99C7AFAABDC0Af99a49ed1d0c49c5bbb2@xxxxxxxxxxxxxxxxx
The wording of your two answers puzzles me. You seem to me to be
saying that including the PK in the SELECT clause improves
performance somehow. I can't see how.
IIRC, here's the reasoning supplied to me. The primary index is
the PK, if you have the PK in the initial return of indexes, when
it fetches the data, it will use the PK to bring those rows back
to the workstation. If you don't have the PK, it will still find
the data using the supplied index, but must do a second scan,
instead of going right to the row.
Seems to me it must *always* use the PK, no matter what (except for
a table scan, I guess), because the PK is the lookup for the data
pages (since the tables are stored in PK order). This has always
been my assumption, that, no matter what indexes are used for the
WHERE clause, those indexes are keyed to the PK value, which means
the PK index has to be used to find the data pages.
However, now that I think about it, I'm not entirely certain where
this idea came from.
Jet uses clustered indexes for the primary key, and that means
that the table's data is written in PK order after a compact.
That means that the data pages have to retrieved via the PK no
matter what you select on. That is, you have to go through the PK
no matter *what* criteria you use, since the data pages are
stored according to the PK.
So, what would happen is this (oversimplied for clarity):
1. retrieve indexes for Lastname and Firstname.
2. those indexes will point to the PKs of the records that match.
3. request the data pages for those PKs.
That's how I understand it too.
It may be that the PK index has to be retrieved, but I doubt it.
I'd think it would be retrieved only when there are criteria on
the PK. It depends on whether the non-primary indexes store
pointers to the PK values or pointers to the data pages. I'm not
sure what actually happens. If it's the former, then the PK index
would be required to find the actual data pages.
I am pretty sure that the PK does need to be retrieved if it isn't
in the query.
Sure, but having it in the SELECT statement shouldn't have any
effect on how the query plan is processed. Really, the SELECT
statement is not operational until the data pages have already been
retrieved, and all it does is determine which fields are displayed.
On the other hand, since the PK index is clustered, it may be
that the pages function as the index (using an offset to find the
PK values). I'm not sure about this.
I'm not sure that the PK is clustered. Where did you read that?
Jet Database Engine Programmer's Handbook, as well as many postings
in these newsgroups. I can't find it in the index of the book after
a quick scan, though.
Here's some documentation:
http://support.microsoft.com/default.aspx?id=137039
AND
http://support.microsoft.com/default.aspx?scid=kb;en-us;209769
Of course, the PK clustered index is not as full-featured a
clustered index as one finds in SQL Server, for example, and it
applies only to the PK, and it's not maintained except during a
compact, and it is perhaps not the kind where the lowest level of
the index contains the data pages.
Are you sure about that? I'd think that the query optimizer would
fix that for you. That is, it's likely to use the least sparse
index first.
As I remember, the query optimizers uses the index index (ie:
index(0), index(1), etc.)
But I just don't see a reason why having the PK in the SELECT makes
a difference at all. The Jet Programmer's Guide says nothing about
this at all, and I've never used it. When I tested SHOWPLAN below,
there was absolutely no difference in performance between the two on
a pretty large table (though it was running locally, not across a
network).
Have you checked whether SHOWPLAN gives you any indication on
exactly what happens in regard to the order of filtering by
indexes in the same table?
No I haven't but if you have a large set of data, perhaps 100K
rows, you might see the way that ShowPlan looks at it.
Hmm. I just checked on a table with 300-400K records, and SHOWPLAN
shows no difference for including the PK or not (which means
nothing). Showplan only documents index usage in joins, I guess.
--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/
.
- References:
- Record source at runtime
- From: BruceM
- Re: Record source at runtime
- From: Marshall Barton
- Re: Record source at runtime
- From: BruceM
- Re: Record source at runtime
- From: Arvin Meyer [MVP]
- Re: Record source at runtime
- From: BruceM
- Re: Record source at runtime
- From: Arvin Meyer [MVP]
- Re: Record source at runtime
- From: David W. Fenton
- Re: Record source at runtime
- From: Arvin Meyer [MVP]
- Re: Record source at runtime
- From: David W. Fenton
- Record source at runtime
- Prev by Date: Re: Starting Internet Explorer from Command Button
- Next by Date: date calculations - please help
- Previous by thread: Re: Record source at runtime
- Next by thread: Re: Record source at runtime
- Index(es):