Re: Wanted: Discussion on MSSQL Internals Interview Qustions



On Tue, 23 Jun 2009 12:11:40 -0700, Siegfried Heintze wrote:


I just had an interview and there were a few questions I did not answer to
my own satisfaction.

Hi Siegfried,

What job was this for? Developer for Microsoft's SQL Server team?
Co-author of Kalen Delaney's next book? Tutor of a SQL Internals class?
Because for most other jobs, some of these questions would be totally
irrelevant.

I'll add some comments to some of the the questions already adressed by
Rick and try to answer the rest.

(2) When are stored procedures compiled? Everytime he SQL server process
starts?

Rick's answer is more complicated that what I would have said: when the
stored procedure is executed and no usable cached execution plan is
found.

(6) When does MSSQL use tempdb?

Rick gives a few examples, I'll add two more: hash tables when executing
a query that involves a hash operator; version store when using snapshot
isolation.

The list is still incomplete!

(7) Is there an assembler I can call to create compiled TSQL code if I think
I can do a better job than the TSQL compiler?

No.

(8) Is there an API I can call if I want to manually perform my own query
optimization (where I manually decompose and optimize my SQL SELECT/JOIN
statements myself instead of relying on the TSQL compiler to parse and
optimize my SQL SELECT/JOIN statements)?

No.

(9) Is the TSQL code the only input to the query optimizer or does it
consider other inputs as well such as the number of rows in each table it is
performing a join on?

Rick's reply gave me a good chuckle, but I'm sure you understood that he
wasn't serious. Other inputs the query optimizer uses are the schema
(layout of tables, available indexes [VERY IMPORTANT], possibly indexed
views [Enterprise edition only]), index statistics (already mentioned by
Rick), and constraints. Possibly more, I'm not sure.

Oh, and technically the query optimizer does not take the TSQL code as
its input. It gets the parse tree (or query tree) that the parser
generates from the T-SQL code.

(10) Is the caching that is implemented by a C# dataset redundant with the
caching implemented in MSSQL? Why not?

The C# dataset caching can only cache results from a single query. Also,
I think (not sure - C# is not my specialty) that multiple clients
submitting the same query will each cache their own copy.

SQL Server caches raw table data. So even if a completely different
query is used that just happens to use the same data, it will benefit
from the data being cached.

(11) How do I decided whether to do my joins in SQL Select statements or in
the relationships among my C# datasets? As you know, there is the feature of
datasets where I can define keys that are to be matched and effectively
perform a join after my data tables in my data set have been populated.

I already said that I don't know much about C#. But this sounds as if
you first fetch all data from the server to the client and then somehow
"fake" a join on the client side.

Frankly, I wouldn't know of any reason why you would ever want to do
that. You'd probably be overburdening the network beyond recognition.
Maybe if you have a very fast network, very fast client computers and a
completely crappy server?

(12) What are the alternatives to @@IDENTITY and why might I perfer to use
them instead?

First: Avoid overuse of IDENTITY columns. They have their place, but
many people slap them routinely on every table and that is very bad.

For single-row inserts, SCOPE_IDENTITY is better. @@IDENTITY returns the
last identity value generated within the connection, which might be the
result of a fired trigger instead of the value you expect. This can be
avoided by using SCOPE_IDENTITY, since the trigger is in a different
scope.

For all inserts, you can use the real key of the table to join the input
data to the table and retrieve the generated identity values. With a
proper design, this is always possible. With an improper design, you
have more serious issues than not being able to retrieve the generated
identity values.

(13) When might I perfer alternatives to stored procedures?

Depends on the kind of alternatives.

* Views and inline table-valued user-defined functions: When the
procedure returns a single resultset and the code to generate it can be
expressed as a single query; extra benefit over stored procedure is that
you can use the view or UDF directly in a query.

* Multi-statement user-defined functions: As above, but when there is no
way to get the results by a single query, and when you don't mind the
huge performance hit this can incur. There are also some limitations
(the function may not have side effects).

* Scalar user-defined functions: Hardly ever, again due to the
performance hit. (And the no side effect limitation).

* Triggers: I don't consider these alternatives to stored procedures,
but some do so I'll mention them. Use these when the code does not have
to be executed by request, but must be automatically executed in
response to certain events.

* Dynamically generated ad-hoc SQL: Never. Just google "SQL injection".

* CLR stored procedures (or CLR code in general): Only when the code
uses lots of computations and does little or no data access, hen the
ease of just using a standard call in CLR exceeds the difficulty of
doing it a different way, or when some external component has to be
called. Examples include heavy trigonometric computations, regular
expression evaluations, and credit card check.

Did I miss an alternative to stored procedures?

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
.



Relevant Pages

  • Extreme performance issues (SQL Server 2000/ADO.NET/C#)
    ... This process runs very quickly if run through Query ... same exact stored procedures and views, run in the same exact order, through ... system that runs SQL Server (a 4-cpu Xeons system with 2gigs of physical ... When I execute these steps manually through query analyser,, ...
    (microsoft.public.dotnet.framework.adonet)
  • Re: SQL Server
    ... Where is the processing done in a pass through query? ... One of the SQL Server programmers is pushing stored procedures, ... pass-through queries. ...
    (microsoft.public.access.queries)
  • Re: Securing Stored Procedures from being seen
    ... I think my issue was that the query I was issuing was ... work is that the user will be logging into SQL Server with their username ... granting the group execute permissions. ... see the names of any stored procedures to which he has no rights. ...
    (microsoft.public.sqlserver.security)
  • Re: Get sql server sp Parameters from system tables or sp...
    ... You can use the INFORMATION_SCHEMA views in SQL Server to get at the ... contains all the parameters for stored procedures. ... > the data access application block. ... > to do this I would like to use a query that given the name of an sp will ...
    (microsoft.public.sqlserver.programming)
  • RE: Heavy performance slowdown after migrating from SQL Server 200
    ... What I'm, really wondering about is, why SQL Server 2008 queries every part ... as it was a single query on its own without any linkage. ... that the query plan optimizer would have produced a query plan first querying ...
    (microsoft.public.sqlserver.fulltext)