I have a database where tables are structured so that deleting a
record in
one of the table must also result in deletes from 8 other tables.
I am guessing that this is not atypical in applications out there.
This
is due to how the tables are hierarchically related to one another.
In some
situations, it may involve more than 12 tables even.
Having said that I would like to discuss and get some feedback on
what the
best strategy may be to accomplish this goal. Note that the solution
obviously
must incorporate ADO.
Also note that none of the databases have "hard" forein key
relationships
defined and assume that I cannot create them either.
I will only use 3 tables for brevity - Account, Contact, Opportunity
- and
assume the following relationship. FYI, Account contains around 20K
records,
Contact 80K, and Opportunity 700K.
Account.Accountid = Contact.Accountid
Contact.Contactid = Opportunity.Contactid
So a deleting an account must result in a delete of all contacts
under that account and a delete of all opportunities under those
contacts as well.
--------------------
Method 1 -
DELETE FROM ACCOUNT WHERE ACCOUNTID = ?
DELETE FROM CONTACT WHERE ACCOUNTID NOT IN (SELECT ACCOUNTID FROM
ACCOUNT)
DELETE FROM OPPORTUNITY WHERE CONTACTID NOT IN (SELECT CONTACTID FROM
CONTACT)
I can see this method as the slowest but at the same time simple and
foolproof. This results in one parameterized query followed by two
more trips to the database to delete any orphaned records. This
method also has the added side-effect of possibly cleaning up any
"dirty" data from any other sessions.
And perhaps the above queries can be rewritten using EXISTS to make
it more efficient. But the idea is same. Perhaps something like
this:
DELETE FROM OPPORTUNITY WHERE NOT EXISTS (SELECT CONTACTID FROM
CONTACT WHERE
CONTACTID = OPPORTUNITY.CONTACTID)
--------------------
Method 2 -
DELETE FROM OPPORTUNITY WHERE OPPORTUNITYID IN (
SELECT OPPORTUNITYID FROM OPPORTUNITY A1 INNER JOIN CONTACT A2 ON
(A1.CONTACTID
= A2.CONTACTID) WHERE A2.ACCOUNTID = ?
)
DELETE FROM CONTACT WHERE ACCOUNTID = ?
DELETE FROM ACCOUNT WHERE ACCOUNTID = ?
This method works by deleting from the "leaf" and works up to the
"root".
This may be more efficient but as the number of tables involved gets
larger
and larger, the queries will get more complex. Also, the first
delete statement
in the example will not work since you cannot have a parameter in the
subquery.
Although it's not essential, I would rather have a solution that does
not
use string concatenation to generate a query.
--------------------
Method 3 -
SELECT A1.ACCOUNTID
FROM ACCOUNT A1 LEFT JOIN CONTACT A2 ON (A1.ACCOUNTID = A2.ACCOUNTID)
LEFT JOIN OPPORTUNITY A3 ON (A2.CONTACTID = A3.CONTACTID)
WHERE A1.ACCOUNTID = ?
I would retrieve the result into a recordset and simply delete all
rows returned
by calling .Delete method on each row. And when I Update the
recordset,
ADO will generate all necessary delete sql statements. I am
basically benefitting
from not specifying Unique Table property and thus letting ADO delete
all
involved rows from all tables in the query. The added benefit of
this method
is that there is a single trip to the database (well two, if you
count the
select). However, as more tables are involved, the query gets
complex and
the number of joins required goes up.
--------------------
I am leaning towards method 1 for the moment (at least while in
initial development). Method 2 seems like it doesn't provide much
more benefit to Method 3 as the level of depth grows larger, the
resulting subquery will approximate the query in Method 3 anyway.
I would love for some feedback on this and experiences you've had on
similar
situations.
Thanks so much.
Jiho Han
Senior Software Engineer
Infinity Info Systems
The Sales Technology Experts
Tel: 212.563.4400 x216
Fax: 212.760.0540
jhan@xxxxxxxxxxxxxxxx
www.infinityinfo.com