While using Excel 365 recently, I needed to create one dynamic array that would consist of two arrays, with one stacked on top of the other. Or, in relational database terms, I wanted to append two arrays.
I thought about it for a minute or two, and saw no obvious way to do it. I searched online, but found no solution.
So I had to figure it out for myself.
While playing around with the arrays, I also discovered how to create one array from two arrays placed side-by-side…just as books are placed on a shelf.
You can download the workbook with both solutions here.
So let’s look at the solutions…
Stacked Arrays
Here, our goal is to stack the data from DataA on top of the data from DataB and return the Final Array shown here:
I decided to use the LET function to manage the task. By doing so, I then could wrap a LAMBDA function around it to define a simple function that I could use whenever I needed to stack two arrays in the future.
Best practice for the LET function uses this syntax:
=LET(name1, name_value1, [name2, name_value2], […, …], named_result)
That is, you set up any number of pairs of names and their values (usually returned by a formula), and then in the last argument, you specify which named value you want to return. Usually, the named_result will be the name you defined last. But when you want to debug the formula, you can return any other name you’ve defined in the formula.
As you can see in the LET function’s formula below, the formula named Result relies on the values defined in earlier arguments. Let’s look at those arguments one at a time.
=LET(
NumCols,COLUMNS(DataA),
RowsA, ROWS(DataA),
RowsB, ROWS(DataB),
SeqCol, SEQUENCE( RowsA + RowsB ),
SeqRow, SEQUENCE(1, NumCols),
Result, IF( SeqCol <= RowsA, INDEX( DataA, SeqCol, SeqRow ),
INDEX( DataB, SeqCol – RowsA, SeqRow ) ), Result )
NumCols: Because both arrays must have the same number of columns, NumCols contains the number of columns in one of the arrays.
RowsA & RowsB: These names capture the number of rows in each array, of course.
SeqCol: The Sequence Column is the key array. Its height determines the number of rows in the final array.
The SEQUENCE function…
= SEQUENCE( rows, [columns], [start], [step] )
…allows us to specify important characteristics about the sequence that the function returns. But in this case, all we needed was a column of numbers from 1 through the number of rows in the final array.
SeqRow: Here, we capture one row of sequence values, where SEQUENCE counts from 1 to the total number of columns in either array.
And then we get to the Result. Let’s look just at the Result formula:
IF( SeqCol <= RowsA, INDEX( DataA, SeqCol, SeqRow ),
INDEX( DataB, SeqCol – RowsA, SeqRow ) )
Here’s an important thing to keep in mind with this formula: In any argument in any formula, if the argument expects a single value but we provide an array instead, the function will calculate a value for each item in the array and then return an array with every calculated result.
So in the Result formula, we begin by giving the IF function the SeqCol name, which contains a column array of seven values, numbering one through seven. That means that we automatically force the IF function to return an array of seven values, which is exactly what we want in this case.
The test in the IF function—SeqCol <= RowsA—will cause the function to first return values for the DataA array and then for the DataB array.
The DataA array values are simple: INDEX( DataA, SeqCol, SeqRow )
Here, INDEX is expecting single values in its second and third arguments. But because we pass arrays to both arguments, we’ll get an array of values. Specifically, the SeqCol provides one column of row numbers and the SeqRow provides one row of column numbers.
In short, this section of the IF function returns an array for every cell in DataA.
But then, when the SeqCol values are greater than the RowsA value, the second INDEX function in this formula does its work.
IF( SeqCol <= RowsA, INDEX( DataA, SeqCol, SeqRow ),
INDEX( DataB, SeqCol – RowsA, SeqRow ) )
The second INDEX returns data from the DataB array. But it must reference that array in rows 1, 2, and 3 of that array. That’s no problem because we can calculate those row values merely by subtracting the total number of rows in DataA from the current rows in the SeqCol. That is, to get the data from DataB, the second INDEX function uses these calculations:
INDEX( DataB, 5 – 4, {1, 2, 3} )
INDEX( DataB, 6 – 4, {1, 2, 3} )
INDEX( DataB, 7 – 4, {1, 2, 3} )
And that gives us these results in the Final Array:
The Shelved Array
In this case, we want to “shelve” the arrays side-by-side, just as we’d set books on a shelf, as shown here:
The logic for shelving multi-column arrays follows about the same pattern as for stacking arrays:
=LET(
ColsC, COLUMNS(DataC),
ColsD, COLUMNS(DataD),
NumRows, ROWS(DataC),
SeqCol, SEQUENCE( NumRows ),
SeqRow, SEQUENCE(1, ColsC + ColsD),
Result, IF( SeqRow <= ColsC, INDEX( DataC, SeqCol,SeqRow),
INDEX( DataD, SeqCol, SeqRow-ColsC ) ), Result )
Work through each line of this formula to test your knowledge about how both LET formulas return its results.
And finally, if you want to shelve only single-column arrays like this, the formula is much shorter:
=CHOOSE({1,2},DataE, DataF )
I spent a few minutes trying to make the CHOOSE formula work with multi-column source arrays, but I struck out. Perhaps you’ll have better luck.