Re: storage of arbitrary attributes and their names
- From: "Mike C#" <xyz@xxxxxxx>
- Date: Sun, 16 Jul 2006 13:50:10 -0400
"Alan Krueger" <thuktun@xxxxxxxxx> wrote in message
news:kc_sg.1000$cv.595@xxxxxxxxxxx
Mike C# wrote:<snip/>
It's called "EAV" or "Entity-Attribute-Value" model. It's generally
considered a horrible idea.
I think I understand the concern, but I'm not sure what the best
alternative would be.
Assume we have something called an Item with an ItemType and an Owner. At
design time, we've placed all the fields we support directly into the
database schema. However, for a particular ItemType and Owner, our
customer may sometimes want to pass arbitrary custom information to
associate with the Item.
If a customer wants to associate "arbitrary custom information" to an item,
why can't you store that information in a well-defined column (i.e.,
"Arbitary_Info")? If you want the client to be able to store the "abitrary
custom information" in a key->value format, why not use XML? Another
thought is to create a well-defined "standard" schema, and if the customer
wants X, Y and Z added to their schema it would fall into a separate
customization project the customer would need to pay for.
One approach to this would be to add custom columns to the Item table, one
for each custom item with the appropriate type, most likely nullable.
This adds a non-trivial amount of extra information to the Item table, and
since there were going to be a *low* of Item rows, this seemed like a bad
idea.
Another approach would be to separate this into an ItemCustomValue table,
but all the custom values for all ItemTypes and Owners would still be
collected together, so this still seems horribly de-normalized. However,
it wouldn't seem to scale well to create separate tables for custom values
per ItemType and/or Owner, since there might be a non-trivial number of
those, too.
Add to this that we would like these custom columns to not need explicit
database configuration, it suggests either automatic schema updates by the
system (which seemed like a bad idea to us) or the "EAV" table, as you've
said it's called. Is one of the other approaches better?
They're both horrible. If you think about it, in the first approach every
user who can add an ItemType has to have permissions to alter the database
schema. This is probably far greater permissions than you probably want
them to have. The EAV approach, and the reasons it is so bad, has been
discussed ad infinitum in the newsgroups.
What might represent this data efficiently and scalably?
What you're asking for are requirements that are in total opposition to one
another. On the one hand you want the users to be able to modify the schema
(or simulate schema modification using "EAV"), and on the other hand you
want efficiency and scalability. The most efficient and scalable schema
would be a baseline schema with schema modifications created on a
per-project basis. Are you planning on querying based on the values in
these "user-defined columns"? If you are, then users need the capability
(and technical knowledge) to create their own indexes based on the queries
your application is submitting. If you're not searching based on the
user-defined extensions to the schema, and no index is required, then why
not just use XML in a column? If you're using SQL 2K5, how about an XML
column and XQuery FLWOR expressions?
.
- References:
- storage of arbitrary attributes and their names
- From: thuktun
- Re: storage of arbitrary attributes and their names
- From: Mike C#
- Re: storage of arbitrary attributes and their names
- From: Alan Krueger
- storage of arbitrary attributes and their names
- Prev by Date: Problem deploying Reports in SQL 2005
- Next by Date: Re: enterprise manager question
- Previous by thread: Re: storage of arbitrary attributes and their names
- Next by thread: need suggestions for data model
- Index(es):
Relevant Pages
|