Re: SQL to Linq - Left, Right and Inner Joins



Jon Skeet [C# MVP] wrote:
Frans Bouma [C# MVP] <perseus.usenetNOSPAM@xxxxxxxxx> wrote:
Michel Walsh wrote:
If used alone, yes, but with a WHERE clause, MS SQL Server optimizer should recognize an (implicit) inner join and thus, limit the effective amount of data transferred through the wire.
That's not what makes it inefficient. The cross-join execution path is often less ideal as the cross-join action by itself is first handled, and THEN the where clause is handled to filter out rows you don't want.

I'd certainly hope that the query optimiser would do the job there though. For instance, if the where clause is on an indexed column I'd be *hugely* surprised if the inner join were fully evaluated by SQL server before looking at the index.

In database land, 'hope' isn't a tool in your toolbox. In simple queries like:
select distinct c.* from customers c cross join orders o
where c.CustomerId = o.Customerid

vs.
select distinct c.* from customers c inner join orders o
on c.Customerid = o.CustomerId

the execution plan is the same. It could be that in database ABC this isn't the case however. It also can be that with more explicit joins and explicit where clauses, the optimizer gives up and the execution plans will differ. For example, in some queries, moving the where clause to the ON clause of a left join can increase performance. For example:
select distinct c.* from customers c inner join orders o
on c.Customerid = o.CustomerId
inner join [order details] od on
o.OrderId = od.Orderid
where c.Country='USA'

select distinct c.* from customers c inner join orders o
on c.Customerid = o.CustomerId
and c.Country='USA'
inner join [order details] od on
o.OrderId = od.Orderid

The second query has less reads than the first query. The main trick is that with more complicated joins, every next table to join with the set is joined more efficiently if the set it has to be joined with is smaller. So if you can already limit that set with the filter you'll apply anyway, it will bring down the overall query burden. With simple queries this might not be noticable, but these are fast anyway. It's the large queries which are the point of interest as these will also take the majority of the performance eaten by the application.

It's also the case that you can't mimic all queries with crossjoins + wheres. For example, some LEFT join constructs are hard or impossible to formulate with solely cross-joins and where clauses.

It's not like LINQ passes the join in one SQL statement and then applies the where clause separately: the query optimiser has the full query to work with.

That's not the point. The point is that Linq contains a set-theory operator 'join' but has just 1 argument for it and has no freedom how to formulate the predicate to use in the join of the two sets. I know linq works with sequences and not sets, but it would have been way better if 'join' in Linq would have had normal predicate support and a join argument, it wouldn't have hurt Linq in any way shape or form.

Btw, The cross joins in MS code often rely on a T-SQL specific statement 'CROSS APPLY', which isn't a known statement in many other databases.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
.



Relevant Pages

  • Re: update query: still having problems
    ... I will have to adapt the other query. ... So I will need the inner join. ... UPDATE Department INNER JOIN Sheet1 ON Department.employeeID = ... If this is what you want, fine, else, add a WHERE clause to ...
    (microsoft.public.access.queries)
  • Re: Sum based on date range
    ... FROM dbo_EMPLOYEE INNER JOIN dbo_EMP_ABSENCE ON ... And I am not sure why the Where clause has three date references in it when I am trying to get today as one date and the date 6 months ago as the other date. ... You can then try to switch back to design view or try to run the query. ... I need to be able to query by EMP_ID total number SEVERITY over the last ...
    (microsoft.public.access.queries)
  • Re: SQL to Linq - Left, Right and Inner Joins
    ... if the where clause is on an indexed column I'd ... be *hugely* surprised if the inner join were fully evaluated by SQL ... select distinct c.* from customers c inner join orders o ... The second query has less reads than the first query. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: update query: still having problems
    ... new query, BUT using the one that works right now). ... UPDATE Department INNER JOIN Sheet1 ON Department.employeeID = ... If this is what you want, fine, else, add a WHERE clause to limit ... SELECT Department.*, sheet1.* ...
    (microsoft.public.access.queries)
  • Re: Multiple Joins Using DAO
    ... Try building the query through the query builder and looking at the SQL ... FROM Employees INNER JOIN (Orders INNER JOIN ON ... FROM Employees INNER JOIN Orders INNER JOIN ON ...
    (microsoft.public.access.formscoding)

Loading