Re: Finding the nearest match without reusing results
- From: David Hilberg <david.hilberg@xxxxxxxxx>
- Date: Sat, 15 Sep 2007 08:45:26 -0000
Oh, only a couple of seconds for 49 formula instances--good to know.
An hour to calculate 1500 is a bit more cumbersome. Thanks for running
the test!
You're very welcome for the formula. It was an interesting challenge.
I noticed a simplification for the 4 sections that evaluate previously-
unused/used stores. Instead of:
MMULT ... <> ... = COUNTA($E$1:$E4)
MMULT ... <> ... <> COUNTA($E$1:$E4)
MMULT ... <> ... = COUNTA($E$1:$E4)
MMULT ... <> ... <> COUNTA($E$1:$E4)
One may use:
MMULT ... = ... = 0
MMULT ... = ... > 0
MMULT ... = ... = 0
MMULT ... = ... > 0
........still don't know how it works. :)
I'll leave my notes on the formula, in case you want to dissect.
[I'll assume you know the uses of not-equal-to (<>), exponentiation
(^), and arrays { }. And I'll assume you know, or could look up, how
comparing an array with a value generates an array of Trues and
Falses, that you can convert those to numbers by multiplying by
numbers, or must sometimes convert to 1's and 0's by doubly-negating
(--), that multiplying (*) two horizontal (or vertical) arrays of the
same length results in a horizontal (or vertical) array; but two
arrays of different lengths and same orientation can be multiplied (*)
or matrix-multiplied (MMULT) if the orientation of one is transposed
(the result being a rectangular array); that in using MMULT, the row
entries of matrix1 are multiplied by the column entries of matrix2 and
summed to form a single entry in the resulting array.]
*Current test formula, omitting the IsError function*
=INDEX($A$9:$A$12,MATCH(
MIN(
($B5=$B$9:$B$12)*ABS($C5-$C$9:$C$12)*
(MMULT(--($A$9:$A$12=TRANSPOSE($E$1:$E4)),ROW($E$1:$E4)^0)=0)+
999999999*(($B5<>$B$9:$B$12)+
(MMULT(--($A$9:$A$12=TRANSPOSE($E$1:$E4)),ROW($E$1:$E4)^0)>0))
),
($B5=$B$9:$B$12)*ABS($C5-$C$9:$C$12)*
(MMULT(--($A$9:$A$12=TRANSPOSE($E$1:$E4)),ROW($E$1:$E4)^0)=0)+
111111111*(($B5<>$B$9:$B$12)+
(MMULT(--($A$9:$A$12=TRANSPOSE($E$1:$E4)),ROW($E$1:$E4)^0)>0)),
0))
*Formula*
IsError (Index( StoresInTable2, Match( Min(UnusedStore SalesDiffs
across CurrentState), (UnusedStore SalesDiffs across CurrentState'),
exactmatch)), "No Match")
*Errors*
IsError returns "No Match" for no match or for general snafu. Could
use IsNA to trap #N/A caused by no match, but not hide other errors
(e.g. #VALUE! caused by not array-entering formula). IsNA would have
to be nested as If( IsNA(formula), "No Match", formula), duplicating
the long Index formula.
*Arrays*
UnusedStore SalesDiffs across CurrentState:
1D array of Diffs interspersed with 999999999's in the first mention
and 111111111's in the second, but start w/Diffs and 0's:
( IsGoodState * AbsoluteDiff * IsUnusedStore)
Any diff with wrong State or with used Store is implicitly zeroed out.
($B5=$B$9:$B$12) * ABS($C5-$C$9:$C$12) * (MMULT ... = ... = 0)
Finally, Add the below array to the above to fill in zeroed diffs with
huge numbers, so MIN won't grab a filler zero instead of a sales diff.
+ HugeNumber*( IsBadState + IsUsedStore ):
1D array of 0's and huge numbers. 0's where the diffs are above, huge
numbers where the 0's are above.
+ 999999999 * (($B5<>$B$9:$B$12)+(MMULT ... = ... > 0))
or
+ 111111111 * (($B5<>$B$9:$B$12)+(MMULT ... = ... > 0))
HugeNumber:
E.g., 999999999 the first time, something different the next, e.g.,
111111111.
Each must be larger than the absolute value of the maximum Sales
figure over both tables.
Within MATCH, different HugeNumber from 1st array (inside MIN) to 2nd
array, so when no match should be available, i.e., MIN is a
HugeNumber, no match is found among the different HugeNumbers of the
2nd array.
Note: MIN Diff could still legitimately be zero, if sales1-sales2=0
for unused Store2 in a legit State. Such 0's are not replaced by huge
numbers.
*Evaluating Store list in Table2 as Unused/Used according to Previous
Formula Results in Col E*
IsUNusedStore: 1D array of True and False:
Compares the entries of the 1D array resulting from MMULT collapse to
0.
MMULT_result = 0
MMULT((EachStore <> StoresUsed), (1's list)) = 0
(MMULT(--($A$9:$A$12=TRANSPOSE($E$1:$E4)),ROW($E$1:$E4)^0) = 0)
Similarly for the T/F list IsUSEDStores, MMULT_result > 0.
(EachStore <> StoresUsed): 2D array of True and False
($A$9:$A$12<>TRANSPOSE($E$1:$E4))
Compare a long vertical list of all table2 store numbers (in A) with a
short horizontal list (need transpose) of the store numbers already
used (E), to form a 2D rectangular matrix. Each row has Trues and
Falses according to whether each store matches any already used. Up to
1 True per row.
List of 1's to collapse the 2D matrix above into a 1D list:
ROW($E$1:$E4)^0
{1;2;3;4}^0 = {1;1;1;1}
Required by MMULT.
Nice trick seen in a Harlan Grove post.
Okay, that is probably both more and less than you wanted to know, if
you wanted any explanation at all. :)
- David
On Sep 14, 1:36 pm, goofy11 <goof...@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Luckily, I only had 49 stores that I needed to copy this down (there were
over 3,000 stores where it looked for a match). Copying this down 50 rows
was fast, just a couple of seconds).
To satisfy your curiousity I ran a test last night to see what happens with
more rows. I broke my tables up into similar sizes (1,564 and 1,566 stores
respectively). After dropping the formula down 1,566 rows, it took exactly 1
hour to finish calculating. There ended up being 310 stores with no matches.
I ran this on my Dell Latitude D620 laptop (Duo Core processors, 2GB RAM).
Thanks again for the help. Very cool formula........still don't know how it
works. :)
..
..
..
<snipped messages>
..
..
..
David Hilberg wrote:
Hi Goofy11,
Since no-one has posted VBA yet, would you like a working formula? As
you requested, it requires an exact match for State, then returns the
store number representing the closest unused match for sales (whether
higher or lower).
=INDEX($A$9:$A$12,MATCH(
MIN(
($B2=$B$9:$B$12)*ABS($C2-$C$9:$C$12)
*(MMULT(--($A$9:$A$12<>TRANSPOSE($E$1:$E1)),ROW($E$1:$E1)^0)=COUNTA($E$1:$E1))
+MAX($C$2:$C$5,$C$9:$C$12)
*(($B2<>$B$9:$B$12)
+(MMULT(--($A$9:$A$12<>TRANSPOSE($E$1:$E1)),ROW($E$1:$E1)^0)<>COUNTA($E$1:$E1)))
),
($B2=$B$9:$B$12)*ABS($C2-$C$9:$C$12)
*(MMULT(--($A$9:$A$12<>TRANSPOSE($E$1:$E1)),ROW($E$1:$E1)^0)=COUNTA($E$1:$E1)),
0))
It's on several lines to show that some expressions are similar or the
same.
It is an "array formula" that must be entered with Ctrl+Shift+Enter,
not just Enter, or you'll get #VALUE!
If a match cannot be made it returns #N/A.
In my test data,
Table1 = A2:C4
Table2 = B9:C12
Formula was entered in E2:E4.
In E1 (just above the formula) there must be a header (such as "Best
Unused Match").
I hope this is useful!
- David
goofy11 wrote:
I'm hoping someone can provide me with some vba to make quick work of
my task. I have 2 tables (call them table1 and table2) that are
setup like:
Store State Sales
1 MO 2,120
7 TX 3265
12 CA 4565
37 TX 3375
Both tables have these same fields, but each table has different
store numbers (no duplicates between the tables). I want to add a new
column to table1. For each store in table1, I want to return the
store number from table2 that is the closest match to itself based on
State and Sales. To elaborate further, I want to find a store that
is in the same state, and is closest in sales to itself. For the
sales parameter, it doesn't matter how high or low it has to
go.....it just needs to return the closest match.
The last condition, is that a result can only be used once. So if
store 7 found that store 325 was the closest match, and then later,
store 37 also found that 325 was the closest match for it, the code
would have to look for the next best match, and so on. Table 1 would
then look like:
Store State Sales Table2 match
1 MO 2,120 56
7 TX 3265 325
12 CA 4565 68
37 TX 3375 652
Any help is appreciated.
.
- Follow-Ups:
- Re: Finding the nearest match without reusing results
- From: David Hilberg
- Re: Finding the nearest match without reusing results
- References:
- Re: Finding the nearest match without reusing results
- From: David Hilberg
- Re: Finding the nearest match without reusing results
- From: David Hilberg
- Re: Finding the nearest match without reusing results
- From: David Hilberg
- Re: Finding the nearest match without reusing results
- From: goofy11
- Re: Finding the nearest match without reusing results
- From: goofy11
- Re: Finding the nearest match without reusing results
- From: David Hilberg
- Re: Finding the nearest match without reusing results
- From: goofy11
- Re: Finding the nearest match without reusing results
- Prev by Date: Changing a two-dimensional, one row array to one-dimensional
- Next by Date: Invoke Edit Mode
- Previous by thread: Re: Finding the nearest match without reusing results
- Next by thread: Re: Finding the nearest match without reusing results
- Index(es):
Relevant Pages
|