
The problem I'm essentially trying to solve is a VLOOKUP that is checking Columns A:E for a value, and returning the value held in Column F should it be found in any of these.

With VLOOKUP not being up to the task I have looked into the INDEX-MATCH syntax, but I am struggling to get my head around how to complete this for an array of values, as opposed to a single column. I've built an example data set below to try and explain this:

A------B------C------D------E------F

1------2------3------4------5------Apple

12-----13--------------------------Banana

14---------------------------------Carrot

Should the cell being checked contain 1,2,3,4 or 5, the result of the formula should be Apple. If it is 12 or 13, it should return Banana and finally if it contains 14, it should return Carrot.

The second half to this comes from the fact that the cell being referenced isn't a single value, but a full table itself. As such, this search will be completed a large number of times according to different values.

So to demonstrate, there is another table elsewhere (as below) that has these values in. I am attempting to have the system identify which row, and therefore which of the "Apple, Banana, Carrot" values to associate with each column. The table would look as below

H------I------------

1------(Apple)----

2------(Apple)----

12-----(Banana)-

etc.-----------------

The values in brackets are where the formula is calculating these values.

4 Answers

4

You have a number of different cases. Let's consider one case:

Somewhere in columns A through E there is one and only one cell containing 13; return the contents of the cell in column F in the same row.

We will use a "helper" column. In G1 enter:

=COUNTIF(A1:E1,13)

and copy down. This allows us to identify the row:

Now we can use MATCH()/INDEX():

Pick a cell and enter:

=INDEX(F:F,MATCH(1,G:G,0))


If the "rules" change and there could be more than one 13 in a row or several rows containing 13, we would modify the helper column.
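As a rough check of the logic outside Excel, the helper-column steps above can be sketched in Python. This is only a minimal model, not something to put in the workbook: the `ROWS` list is a hypothetical in-memory copy of the example data, and `helper_column` and `lookup` are illustrative names that mirror the COUNTIF and INDEX/MATCH formulas.

```python
# Hypothetical in-memory copy of the example data:
# the cells of columns A:E (None = blank), then the value in column F.
ROWS = [
    ([1, 2, 3, 4, 5], "Apple"),
    ([12, 13, None, None, None], "Banana"),
    ([14, None, None, None, None], "Carrot"),
]


def helper_column(target):
    """Mirror of =COUNTIF(A1:E1,target) copied down: one count per row."""
    return [cells.count(target) for cells, _ in ROWS]


def lookup(target):
    """Mirror of =INDEX(F:F,MATCH(1,G:G,0)): first row whose count is 1."""
    counts = helper_column(target)
    if 1 not in counts:
        return None  # in Excel, MATCH would return #N/A here
    return ROWS[counts.index(1)][1]


print(helper_column(13))
print(lookup(13))
```

As in the worksheet version, a value that appears twice in one row gives a count of 2 and is not found, which is why the helper column would need to change if the "rules" change.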

EDIT#1:

Based on your update, the first step would be to pull the hard-coded 13 out of the formulas in the "helper" column and put it in its own cell (say H1). Then you can run different cases simply by changing a single cell.

If you have a large number of cases in a table, you could create a macro to set up each case (update H1) and record the results.

2

For a single formula in I1 (the lookup value is in H1):

=INDEX($F$1:INDEX(F:F,MATCH("ZZZ",F:F)),AGGREGATE(15,6,ROW($A$1:INDEX(E:E,MATCH("ZZZ",F:F)))/($A$1:INDEX(E:E,MATCH("ZZZ",F:F))=H1),1))

This is an array formula, so we need to confine the references to the size of the data set. Each INDEX(...,MATCH("ZZZ",F:F)) does that: the MATCH returns the last row in column F that contains text, and the INDEX sets that as the last row to iterate over.

@Gary'sStudent's method avoids array formulas and may be the method needed. As the data set and the number of formulas increase, so does the calculation time, eventually to the point of crashing Excel. This usually takes a few thousand formulas, but the warning is worth making.



EDIT

To avoid using Array formulas and still be one formula:

=IFERROR(INDEX(F:F,MIN(IFERROR(MATCH($H1,A:A,0),1050000),IFERROR(MATCH($H1,B:B,0),1050000),IFERROR(MATCH($H1,C:C,0),1050000),IFERROR(MATCH($H1,D:D,0),1050000),IFERROR(MATCH($H1,E:E,0),1050000))),"")

This is based on the OP's answer; it just combines that method into one formula.

This formula will ignore duplicate entries and return the first row in which the number is found.

And because it is not an array formula, full-column references are not detrimental to calculation times.
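To make the MIN-of-MATCH idea easier to follow, here is a minimal Python sketch of the same logic. It is only a model under stated assumptions: `COLUMNS` and `NAMES` are hypothetical copies of columns A:E and column F from the example, and `match` and `lookup` are illustrative names, not Excel functions.

```python
SENTINEL = 1050000  # same "not found" placeholder used in the formula

# Hypothetical copy of columns A:E, read top to bottom (None = blank).
COLUMNS = {
    "A": [1, 12, 14],
    "B": [2, 13, None],
    "C": [3, None, None],
    "D": [4, None, None],
    "E": [5, None, None],
}
NAMES = ["Apple", "Banana", "Carrot"]  # hypothetical copy of column F


def match(value, column):
    """1-based position like MATCH(value,column,0), or SENTINEL if absent."""
    return column.index(value) + 1 if value in column else SENTINEL


def lookup(value):
    """Smallest row number found in any column, then the name on that row."""
    row = min(match(value, col) for col in COLUMNS.values())
    # The outer IFERROR turns the out-of-range INDEX into an empty string.
    return NAMES[row - 1] if row <= len(NAMES) else ""


print(lookup(13))
print(lookup(4))
```

The sentinel only has to be larger than any real row number, so that a column without the value never wins the MIN.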


Scott Craner
2

Based on my own research & discussions with @Gary'sStudent, the solution I used was to create a MATCH formula for each of the possible columns that the value could be contained within, along with a Blank catching "IFERROR" statement.

I1 =IFERROR(MATCH($H1,A$1:A$3,0),"")     
J1 =IFERROR(MATCH($H1,B$1:B$3,0),"")     
K1 =IFERROR(MATCH($H1,C$1:C$3,0),"")    
L1 =IFERROR(MATCH($H1,D$1:D$3,0),"")    
M1 =IFERROR(MATCH($H1,E$1:E$3,0),"")
etc.

These columns can now be hidden to prevent user confusion/interaction.

I then created an INDEX formula which accumulates these into a single value, which should match the row in question. Again, there is a check (the first SUM) to enter this as a blank value if the value isn't found in the table.

N1 =IF(SUM(I1:M1)=0,"",INDEX($A$1:$F$3,SUM(I1:M1),6))

Finally, I entered a few conditional formatting formulas to ensure that the user identifies and replaces/removes any duplicate data.

A1:E3 Cell contains a blank value                [Formatting None Set, Stop if True]
A1:E3 =COUNTIF($A$1:$E$3,A1)>1                   [Formatting Text:White, Background:Red]

H1:N1 =COUNTIF($A$1:$E$3,H1)>1       [Formatting Text:Red, Background:Red]

This is merely a cue to the user to remove this duplicate data.
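The reason duplicates matter for this approach is that SUM adds up every match. A minimal Python sketch of that behaviour follows, assuming a hypothetical `GRID` copy of A1:E3; `summed_row` and `duplicates` are illustrative names standing in for SUM(I1:M1) and the COUNTIF(...)>1 rule.

```python
from collections import Counter

# Hypothetical copy of A1:E3 (None = blank).
GRID = [
    [1, 2, 3, 4, 5],
    [12, 13, None, None, None],
    [14, None, None, None, None],
]

# Same grid with 13 entered a second time, in column C of row 3.
GRID_WITH_DUPLICATE = [
    [1, 2, 3, 4, 5],
    [12, 13, None, None, None],
    [14, None, 13, None, None],
]


def summed_row(value, grid):
    """Mirror of SUM(I1:M1): add the 1-based MATCH result of every column."""
    total = 0
    for column in zip(*grid):  # iterate over columns A..E
        if value in column:
            total += column.index(value) + 1
    return total


def duplicates(grid):
    """Values that the COUNTIF(...)>1 rule would highlight."""
    counts = Counter(c for row in grid for c in row if c is not None)
    return sorted(v for v, n in counts.items() if n > 1)


print(summed_row(13, GRID))                 # row 2, as intended
print(summed_row(13, GRID_WITH_DUPLICATE))  # 2 + 3 = 5, not a valid row
print(duplicates(GRID_WITH_DUPLICATE))
```

With unique values the sum is just the single matching row number; once a value appears in two columns the sum no longer points at a real row, which is what the red highlighting is there to flag.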


1

A different method would be based on an auxiliary table, which represents how this "should" have been structured in the first place. This would avoid the monster equations that are annoying to debug and change afterwards, and it's able to cleanly solve a varying number of columns, unlike the idea of having 5 lookup columns.

If the above is in Sheet1, add a Sheet2. On that sheet, place four columns: Row, Column, ID, Name.

The formula in Row should be (in pseudocode, "Last" means "for the row above in Sheet2"):

=IF(Column = 1, Last row + 1 , Last row)

Formula in Column:

=IF(OR(Last Column = 5, INDEX(StartTable, last row, last column + 1) = ""), 1, Last column+1)

Formula in ID and Name:

=INDEX(StartTable, Row, Column)    
=INDEX(NameColumn, Row, 1)

Then you fill this down (basically until row>number of rows in the original table).

Finally you use the new table with an ordinary vlookup or index/match.
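The auxiliary table described above is essentially an "unpivot" of the original range. A minimal Python sketch of the same idea, assuming a hypothetical `SHEET1` copy of the example data (`build_aux_table` and `NAME_BY_ID` are illustrative names):

```python
# Hypothetical copy of Sheet1: cells in A:E (None = blank) and the name in F.
SHEET1 = [
    ([1, 2, 3, 4, 5], "Apple"),
    ([12, 13, None, None, None], "Banana"),
    ([14, None, None, None, None], "Carrot"),
]


def build_aux_table(rows):
    """Return (Row, Column, ID, Name) tuples, one per non-blank cell."""
    aux = []
    for row_no, (cells, name) in enumerate(rows, start=1):
        for col_no, cell in enumerate(cells, start=1):
            if cell is None:
                break  # a blank cell ends the row, as in the Column formula
            aux.append((row_no, col_no, cell, name))
    return aux


AUX = build_aux_table(SHEET1)

# The "ordinary vlookup" step: a flat ID -> Name mapping.
NAME_BY_ID = {cell: name for _, _, cell, name in AUX}

for line in AUX:
    print(line)
print(NAME_BY_ID[12])
```

Once the data is in this one-ID-per-line shape, the lookup itself is a plain single-column match, whatever the number of columns in the original table.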

PRO: Much simpler formulas, easier to use and understand.

CONS: Needs an extra table, and the length of that table must be maintained. Performance-wise there is a risk, since this pretty much requires a single thread for the entire "string" of values.

Also, if a couple of error rows are acceptable, the formulas can be somewhat simpler and possibly more performant: we can then assume that the number of columns is always 5, which gives both row and column directly.

NiklasJ