Filter in the Join vs Filter in the Where

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance



To start off, this is with SQL Server 2005 data.

The next question comes from running into someone that likes to write
their
code a different way than I do. I've long ago established how I do
it, and I'm sure I had a good reason at the time, but I can't recall.
And this other person, I'll call Bob, also likes his way, and feels
it's better.

The actual data this question is based on is tens of millions of rows
in 3nf normalized database that stores data. Not quite OLTP, but not
quite datawarehouse. Typically people are working with 4 or 5 sets of
data from this database at any one time, with each set of data having
hundreds of thousands of rows. We can assume (for this discussion)
that appropriate indexes are in place, with sufficient memory, cpu,
and i/o available. Of course we simplify here for an example:
CREATE TABLE #One (CarBrand NVARCHAR(40), Employees INT)
INSERT INTO #one VALUES ('Ford',625)
INSERT INTO #one VALUES ('Chevrolet',865)
INSERT INTO #one VALUES ('Porche',700)
INSERT INTO #one VALUES ('Citroën',250)

CREATE TABLE #two (CarBrand NVARCHAR(40), PopularityRank SMALLINT)
INSERT INTO #two VALUES ('Ford',1)
INSERT INTO #two VALUES ('Chevrolet',3)
INSERT INTO #two VALUES ('Porche',2)
INSERT INTO #two VALUES ('Citroën',4)

CREATE INDEX #myindex ON #one(Carbrand)
CREATE INDEX #myindex3 on #one(Employees)
CREATE INDEX #myindex2 ON #two(Carbrand)

The conflict is that creating a query that joins multiple tables, and
requires filtering to be applied to various columns from these various
tables, I will more often than not use the Join On to join the common
fields, and place any filters in the Where clause. For example I
might write a query like this:

SELECT a.carbrand, a.Employees, b.Rank
FROM #One a
JOIN #Two b
ON a.CarBrand = b.carbrand
WHERE a.Employees > 500

Bob will write a query that looks like this:

SELECT a.carbrand, a.Employees, b.Rank
FROM (select 'x' as x) as x
JOIN #One a
on a.Employees > 500
JOIN #Two b
ON a.CarBrand = b.carbrand

Bob's theory being that since the order of operations for sql
databases is:
1. FROM clause
2. WHERE clause
3. GROUP BY clause
4. HAVING clause
5. SELECT clause
6. ORDER BY clause

sticking any filtering conditions into the FROM clause is going to
lead to higher performance through less data needing to be processed
as it is filtered before the where needs to deal with it. I've found
others preaching this as well:http://www.bennadel.com/blog/70-SQL-
Query-Order-of-Operations.htm

In general, that's never really something I've worried about. Should
I be? Won't SQL Server, Oracle, DB2 optimize your query and run it
however it feels is most optimal, regardless of where you put
filtering conditions? Or is there some incremental benefit that might
appear, so you should just try and avoid the where?

It was news to me and I hadn't found much discussion online so I
thought I'd see what folks around here say.
.



Relevant Pages

  • Re: Filter in the Join vs Filter in the Where
    ... And since your query and Bob's query both return the same result, it is not suprising that the optimizer generates the same plan. ... requires filtering to be applied to various columns from these various ... I will more often than not use the Join On to join the common ... and place any filters in the Where clause. ...
    (microsoft.public.sqlserver.programming)
  • Re: The Many Flavors of SQL - Can a SQL Server query work in MS Access?
    ... One thing is that if you have more than one join in the From clause you have to use Parentheses ... You might try building the base of the query using the query grid. ... inner join cmsopen.dbo.apm_vendor v ... in the Query Analyzer on a Windows 2000 Server box running SQL Server ...
    (microsoft.public.access.queries)
  • Re: Filter a Recordset
    ... Select A,B,C Into #tblTmp From AAA Where A Like 'AA' Order By D ... Query instead of nested SQL statement. ... DB objects (like Queries in Access or SP in SQL Server). ... > filtering conditions to any field in a Recordset ...
    (microsoft.public.vb.database.ado)
  • Re: Maximum number of elements ina an IN() Clause
    ... if the IN clause is the ONLY criteria... ... If you are using MS SQL Server, can you define an heterogenous server, on it, pointing to an Access-Jet database, and push the values in a table there? ... Vanderghast, Access MVP ... would making the passtrough query WITHOUT the in clause and then, ...
    (microsoft.public.access.queries)
  • Re: ORDER BY in VIEW not working
    ... SQL Server is NOT guaranteed to repect an ORDER BY ... in the view unless you use ORDER BY also when you *query* the view. ... CREATE VIEW TopView AS ... Although the view definition contains an ORDER BY clause, ...
    (comp.databases.ms-sqlserver)