Re: Design Question - Accounts/Transactions

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance




Comments inline

"Craig Lister" <cdotlister@xxxxxxxxx> wrote in message news:904be758-57bf-48f8-931e-669ae9e61ce5@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Thanks guys! Amazing answers - extremely helpful.

Firstly, Bob... am I understandin you right in that, you would add a
'Balance' column to the transaction, which would hold a running
balance? Then, I could just get the last transaction to get the
current balance? If so, my only problem would be, what if the user
enters a back-dated transaction?

The application I am writing is basically a home finance management
app. Accounts are bank accounts. So, the user manually enters
transaction from their bank statement - or where ever. It's possible
they may enter them in the wrong order...

The is_debit column. I think this might be futile. I think having
negative values would be fine. I have a transaction type field...
which indicates if it's a Payment, a Deposit, a transfer out, or a
transfer in. So, I'll drop the is_debit, and just allow for negative
numbers. The user, of course, enters it as a positive, no matter what,
but logic will work out what it should be written in the table as.

Tom - Thanks.

Just to check, you put:

CREATE NonClustered Index ix_Transaction_account
ON dbo.[Transaction] (account_id) INCLUDE (amount)

In my real table, I have clustered the account_id index for
performance. I think physically grouping the account_ids would improve
performance. Am I right? Would the cluster still apply with the
INCLUDE?

If your clustered index on the Transaction table is on account_id, then you probably don't need the above nonclustered index. There are a few cases where performance can be improved by having a nonclustered index on the same column(s) as the clustered index, but that's probably not true here. The INCLUDE clause can only be used for nonclustered indexes. Effectively every column in the table is "included" in the clustered index (because the clustered index is essentially the table), so you don't need and are not allowed to have the INCLUDE clause in a clustered index.


I'm worried about the volume of data. 500 users, each having maybe 4
accounts, with maybe 5 years of history - is a lot of data.

Let's do some calculations. Assume you have 500 users averageing 4 accounts each. This is a total of 2,000 accounts. Assume each user does 10 transactions per day per account. That is a total of 20,000 transactions per day. 5 years is about 1825 days, but I like round numbers for this of work, so lets assume we are keeping 2,000 days of transaction data. That means we will have a table with 40,000,000 rows. Each row as you have given the table (assuming you get rid of the is_debit column and change the amount column to decimal(19,2) - surely big enough to handle any amount you need) has two INTs (4 bytes each) and the decimal(19,2) which takes 9 bytes for a total of 17 bytes. But I assume your real table also has other columns. Maybe a datetime column (8 bytes), a transaction description column (I assume varchar - maybe averaging 20 bytes per row) and some overhead (say 5 bytes). That's a total of 50 bytes per row. So the whole table would take something on the order of 2,000,000,000 bytes. That's 2GB which is not a very large table.

Further, when you need to compute the balance for an account, you only need to scan the part of the table for that account. Each account has approximate 1/2000 of the whole table because there are 2,000 accounts. So to compute the balance you only need to read 1/2000 of the 2GB or 1MB. And the cost of doing that will be fairly cheap. I tried a simulation and on my test it took about 75-100 milliseconds to read the rows and add them up. Your mileage will definitely vary. And whether 100ms is fast enough depends on how often you have to do this. Of course, reading the balance directly from the account table would be many times faster.

In this case, the result denormalization you need if you decide to denormalize is so minimal, I don't think it will be a problem. And, if you don't denormalize, the disk space and response time is also not likely to be a problem, depending, of course, on of often you have to compute the balance. So you can probably go either way.


Maybe this would be an OK case for denormalization. Maybe I need a
balance column in my Account table (This balance is shown a lot,
throughout the site). Would it be acceptable to have an Update/Insert
trigger that updated the Balance amount in the account table? Not sure
how it would work (Haven't worked much with triggers - a lot of people
seem to stay clear of them).

A trigger is certainly a possibility (don't forget that it would also need to handle deletes to the transaction table). It's perfectly fine (IMO) to use triggers when they are appropriate. When you have triggers, you should write them so they run quickly. And you want to write them to be as bullet proof as posible. E.g., make sure they are written so that they don't get any avoidable errors, like foreign key violations or divide by zero. The most common error beginners make is to write the trigger in a manner that it only works properly if exactly one row was inserted/updated/deleted. A trigger fires once per statement, not one per row. So if you execute an UPDATE statement that updates 10 rows in the transaction table, the trigger will only be fired once, and inside the trigger, the inserted and deleted pseudo-tables will each have 10 rows. Your trigger needs to handle that correctly.

The other viable alternative is to only allow inserts/updates/deletes thru a stored procedure. And have that stored procedure handle the updating of the balance column in the account table. If you do this make sure you have proper transaction control to ensure that either both the account table and transaction table are updated or niether of them are.


I liked the running total plan too - but, not sure what happens when
someone enters an out-of-order transaction. Say they enter all their
transactions... get their bank statement, and realise they forgot
one... so they enter it. How would the balance column cope?


As you siad, storing a running balance in the transaction table imposes more updatation problems (and more potential for data anomalies). I agree it is useful to be able to display a running balance. But you may be able to calculate that when you need to display it. If you do, I would wand to do the calculation in the front end, not on the database server. So if the front end wanted to display the transactions and running balance for the last 30 days for an account, it would get from the database the current balance and the transactions for the last 30 days. Then the front end can loop thru the transactions backwards by datetime computing the running balance. So, for example, if the current balance is $200.00, and the last transaction is a debit of $75.00, then before that transaction, the balance must have been $275.00. And if the transaction before that was a credit of $100.00, then the running balance before that transaction must have been $175,00, etc. Then the front end can display the result.

Thanks again for yout time, guys!


You're welcome.

Tom

.



Relevant Pages

  • After delete Trigger on Transaction
    ... Please Advise ..After delete Trigger on Transaction ... WHERE code = (SELECT account FROM INSERTED) ... So i open journalDetail table and delete record one by one the after ...
    (microsoft.public.dotnet.framework.adonet)
  • Re: (LONG) More newbie questions on constructor/set+get methods
    ... Seeing as you couldn't figure out how to access your transAction method ... > * a transAction method, a getBalance method, and a setBalance method. ... > * to compute the final balance for each account. ... > * the initial and final balances for each account. ...
    (comp.lang.java.help)
  • (LONG) Student Question: constructors and setMethods
    ... program requires two classes, an App class and a Worker class, and several ... methods within the worker class (a constructor, a transaction method, a ... and forth to manipulate the balance and continue to adjust it with each ... //Get the starting balance, deposits, and withdrawals from the first user. ...
    (comp.lang.java.help)
  • Re: VALUE MANIPULATION IN MySQL with JAVA question..
    ... making tables for each account name upon input of the user. ... for each account name has four columns (date, debit, credit and ... For each row the balance column should be equal to debit- ... When you insert a new debit or credit, you exclude the current transaction ...
    (comp.lang.java.programmer)
  • Re: Lloyds Internet Banking problems?
    ... "available balance" showing on the on-line system was reduced by the ... However, no transaction, and the available ... Sounds like the sort of account we could all do with. ... Email address maintained for newsgroup use only, ...
    (uk.finance)