# Test for Array vs Range in my TEXTJOIN UDF

Posted on

Problem

I use the following code in many of my answers on Stack Overflow, to mimic the new TEXTJOIN function available in Office 365 Excel:

``````Function TEXTJOIN(delim As String, skipblank As Boolean, arr)
Dim d As Long
Dim c As Long
Dim arr2()
Dim t As Long, y As Long
t = -1
y = -1
If TypeName(arr) = "Range" Then
arr2 = arr.Value
Else
arr2 = arr
End If
On Error Resume Next
t = UBound(arr2, 2)
y = UBound(arr2, 1)
On Error GoTo 0

If t >= 0 And y >= 0 Then
For c = LBound(arr2, 1) To UBound(arr2, 1)
For d = LBound(arr2, 1) To UBound(arr2, 2)
If arr2(c, d) <> "" Or Not skipblank Then
TEXTJOIN = TEXTJOIN & arr2(c, d) & delim
End If
Next d
Next c
Else
For c = LBound(arr2) To UBound(arr2)
If arr2(c) <> "" Or Not skipblank Then
TEXTJOIN = TEXTJOIN & arr2(c) & delim
End If
Next c
End If
TEXTJOIN = Left(TEXTJOIN, Len(TEXTJOIN) - Len(delim))
End Function
``````

I have successfully made it so it works with both ranges and arrays, so it works with:

``````=TEXTJOIN(",",TRUE,A1:B7)
``````

As well as

``````{=TEXTJOIN(",",TRUE,IF(A1:A7 = "x",B1:B7,""))}
``````

The question is: Is this the best method to check for array vs. range? Is there a better method?

As stated this works, nicely with both, but I have to think that I am doing it in a round about manner.

Solution

Okay, let’s first eliminate the easy stuff Rubberduck points out:

• Function is implicitly `Public` and implicitly returns a `Variant` – should be a `String`.
• All parameters are implicitly passed `ByRef` (and semantically should be `ByVal`).
• Parameter `arr` is implicitly `Variant`,
• Local variables `c`, `d`, `t`, `y`, `arr2` all have terrible, meaningless names (`arr`, matey! ..I like calling it Pirate Notation ðŸ˜‰
• `String`-returning `Left\$` function should be used over `Variant`-returning `Left` function.
• `vbNullString` should probably be preferred over `""` empty string literal.

With default settings Rubberduck will also complain about `Dim t As Long, y As Long`, because having multiple declarations in a single instruction isn’t ideal.

The variables are declared at the top of the function rather than as close as possible to their usage, which makes it hard to see what’s used where.

Type-checking isn’t type safe:

``````If TypeName(arr) = "Range" Then
arr2 = arr.Value
Else
arr2 = arr
End If
``````

If the Microsoft Word object model is referenced, you can pass a `Word.Range` object and it will happily take it; same if I made my own `Range` class and gave it an instance. Use the `TypeOf` operator to perform compile-time type-safe type checks:

``````If TypeOf arr Is Excel.Range Then
'definitely an Excel Range object
Else
'could be anything
End If
``````

Note, in the `Else` branch `arr` could literally be anything – but you’re assuming it’s an array. You could use the `IsArray` function to make sure of that, and then you can also assert on the upper bound using a rather ugly helper function, to make sure you’re looking at the 2D array you’re expecting.

``````Else
Debug.Assert IsArray(arr)
Debug.Assert GetArrayDimSize(arr) = 2
arr2 = arr
End If
``````

With the array dimension count known/asserted, the `On Error Resume Next` statement that follows can be removed… and `t` can be renamed `sourceColumns` or `colCount` or whatever, and `y` can be renamed `sourceRows` or `rowCount` or whatever – and `arr2` could be renamed to `sourceArray`:

``````sourceColumns = UBound(sourceArray, 2)
sourceRows = UBound(sourceArray, 1)
``````

And with that naming scheme I think I’d rename the `arr` parameter to `source`.

``````If t >= 0 And y >= 0 Then
``````

Ah, ok, so that condition is leveraging the fact that VBA would have thrown an error given a 1D array, leaving `t = 0`. Not very obvious, let’s improve that.

But first we need to tweak the assertions – we don’t really want a 2D array, we just want any array with at most two dimensions. So… let’s be less forgiving given anything other than that:

``````Dim dimensionCount As Long
If TypeOf(arr) Is Excel.Range Then
sourceArray = source.Value
dimensionCount = 2 'Range.Value is always a 2D array
Else
Dim isValidArray As Boolean
isValidArray = IsArray(source)
If isValidArray Then dimensionCount = GetArrayDimSize(source)
isValidArray = dimensionCount <> 0 And dimensionCount <= 2
If Not isValidArray Then Err.Raise 5, "TEXTJOIN", "Expected: 1D or 2D array"
End If
``````

And now that `If` statement can be much more explicit about what’s going on and why:

``````If dimensionCount = 2 Then
'handle 2D array
Else
Debug.Assert dimensionCount = 1
'handle 1D array
End If
``````

So, `c` iterates dimension 1 / rows, `d` iterates dimension 2 / columns.

Now all these concatenations are inherently slow. I realize that’s “just a worksheet function” and you’d like to share it as a single, simple, cohesive and focused little piece of code… but given thousands of iterations, VBA’s string-handling will start becoming the performance bottleneck of the function.

There’s a lightning-fast StringBuilder class right here on this site that you can use to address that.

As a bonus, using a `StringBuilder` makes the function assign to its return value only once – as opposed to what you have now, which is treating the return-value-identifier as if it were a local variable (which it technically is, …I just don’t like doing that).

Not sure why the `d` / column loop is inconsistent here:

``````For d = LBound(arr2, 1) To UBound(arr2, 2)
``````

You’re iterating the 2nd dimension, the `LBound` should be off the 2nd dimension too (yes, it should be the very same as that of the 1st dimension). Also, the upper bounds of both dimensions are already known and stored in local variables:

``````For currentRow = LBound(sourceArray, 1) To sourceRows
For currentColumn = LBound(sourceArray, 2) To sourceColumns
``````

This code will throw an error (by design?) if the array contains an `Error` value:

``````If arr2(c, d) <> "" Or Not skipblank Then
``````

You could have a parameter that helps you decide how to treat errors – perhaps an `Enum` could be used:

``````Public Enum TEXTJOIN_ErrorValues
ThrowOnError
SkipError
IncludeErrorText
End Enum
``````

`ThrowOnError` would be the current/default behavior; `SkipError` would treat errors as blanks, and `IncludeErrorText` would include the e.g. `#N/A` error text into the result.

Except, once you have an `Variant/Error` value and not a `Range`, it’s pretty much impossible to get the text back (unless you want to map `CVErr(xlErrWhatever)` values to a corresponding string… probably not worth it) – so scratch that enum value, and the behavior becomes either throw on error, or skip error values. And that’s entirely possible, and quite easy to do – an additional optional `Boolean` parameter could be helpful for that.

``````Dim sb As StringBuilder
Set sb = New StringBuilder
'...

For currentRow = LBound(sourceArray, 1) To sourceRows
For currentColumn = LBound(sourceArray, 2) To sourceColumns
If Not IsError(sourceArray(currentRow, currentColumn)) Then
If sourceArray(currentRow, currentColumn) <> vbNullString Or Not skipBlank Then
sb.Append sourceArray(currentRow, currentColumn)
sb.Append delim
End If
ElseIf Not skipErrors Then
sb.Append delim
End If
Next
Next

TEXTJOIN = sb.ToString
``````

Now, that’s the 2D loop.. the 1D loop does essentially the same thing.. and that’s annoying. I’d extract a method for that.

``````Private Sub ProcessValue(ByVal value As Variant, ByVal sb As StringBuilder, ByVal delim As String, ByVal skipBlanks As Boolean, ByVal skipErrors As Boolean)
If Not IsError(value) Then
If CStr(value) <> vbNullString Or Not skipBlanks Then
sb.Append CStr(value)
sb.Append delim
End If
ElseIf Not skipErrors Then
sb.Append delim
End If
End Sub
``````

That turns the logic into:

``````If dimensionCount = 2 Then
For currentRow = LBound(sourceArray, 1) To sourceRows
For currentColumn = LBound(sourceArray, 2) To sourceColumns
ProcessValue sourceArray(currentRow, currentColumn), sb, delim, skipBlanks, skipErrors
Next
Next
Else
Debug.Assert dimensionCount = 1
For currentRow = LBound(sourceArray, 1) To sourceRows
ProcessValue sourceArray(currentRow), sb, delim, skipBlanks, skipErrors
Next
End If

TEXTJOIN = sb.ToString
``````

The order of the parameters strikes me as unintuitive – might be by design to match Microsoft’s function, by I would have made the source array/range the first parameter, followed by an optional delimiter, followed by an optional flag to skip blanks (followed by an optional flag to skip errors).

Disclaimer: I know you are asking a relatively straightforward question, but this is CR after all…

Checking an array versus a range is perfectly fine (assuming this function will generally be called from the worksheet), but there are some important considerations to be made.

First, it took me a few minutes to decode what your code is doing. Given that you are quite active in the community (and thus you are helping reach many people) it would be worth investing in a copy of ‘Clean Code’ by Robert C Martin. This is a book that was recommended to me by Mat’s Mug, and has completely change how I think about code, and how I approach coding.

As a result, the first thing I noticed was how the small things within your code add up towards a larger potential for bugs. For example:

``````On Error Resume Next
t = UBound(arr2, 2)
y = UBound(arr2, 1)
On Error GoTo 0

If t >= 0 And y >= 0 Then

...

End If
``````

If I interpret this bit correctly, you’re just testing whether there is a second dimension within the array, and if there is, you’re looping through the array as a 2d array. Otherwise, you’re looping through as a one-dimensional array. Wouldnt it be nice if we could explicitly say that in VBA?

``````Private Function ArrayIsTwoDimensional(ByVal TestArray As Variant) As Boolean
Dim Test As Variant

On Error Resume Next
Test = TestArray(LBound(TestArray, 1), LBound(TestArray, 2))

' Note: This will work for a 3d array, or a 4d array, etc. If there is any risk of a higher-dimension array, use a function that returns
' the exact number of dimensions.
ArrayIsTwoDimensional = (Err.Number = 0)
On Error GoTo 0
End Function

Private Function JoinFromTwoDimensionalArray(ByVal Delimeter As String, ByVal SkipBlanks As Boolean, ByVal InputArray As Variant)
Dim Join As String

Dim i As Long
For i = LBound(InputArray, 1) To UBound(InputArray, 1)
Dim j As Long
For j = LBound(InputArray, 2) To UBound(InputArray, 2)
If InputArray(i, j) <> vbNullString Or Not SkipBlanks Then
If Join <> vbNullString Then Join = Join & Delimeter
Join = Join & InputArray(i, j)
End If
Next
Next

JoinFromTwoDimensionalArray= Join
End Function

Private Function JoinFromOneDimensionalArray(ByVal Delimeter As String, ByVal SkipBlanks As Boolean, ByVal InputArray As Variant) As String
Dim Join As String

Dim i As Long
For i = LBound(InputArray) To UBound(InputArray)
If InputArray(i) <> vbNullString Or Not SkipBlanks Then
' Note the placement of the delimeter forces the joined text to only ever add a delimeter when needed.
If Join <> vbNullString Then Join = Join & Delimeter
Join = Join & InputArray(i)
End If
Next i

JoinFromOneDimensionalArray = Join
End Function
``````

That was the first step I took in making your code make a little bit more sense. By extracting the two loops, and the dimension check, the main routine becomes much cleaner, and gets to rely on a few `Private Function`s to do the work it needs to do. The beauty of this is that your code now explicitly says what it is doing (anyone could read the code, regardless of whether they have been coding for days or years).

The next step I took was explicitly checking for the supported types. For example:

``````If TypeName(arr) = "Range" Then
arr2 = arr.Value
Else
arr2 = arr
End If
``````

If `arr` is a `Worksheet` for example, you will get a `With block or Object variable not set` error when trying to assign `arr2` to `arr` (the names here are troubling as well). Let’s explicitly raise an error if someone passes a value we don’t currently support:

``````Private Const ERR_NUMBER_TYPE_NOT_SUPPORTED As Long = 513
Private Const ERR_MESSAGE_TYPE_NOT_SUPPORTED As String = "Please provide a Variant() or Range for the InputValues argument of TEXTJOIN. Alternate types are not currently supported."

Dim Values As Variant
Select Case TypeName(InputValues)
Case "Range"
Values = InputValues.Value
Case "Variant()"
Values = InputValues
Case Else
Err.Raise ERR_NUMBER_TYPE_NOT_SUPPORTED, ERR_MESSAGE_TYPE_NOT_SUPPORTED
End Select
``````

The beauty here is that not only will our code raise an error explicitly related to the source of the problem, but we also have a very modular way of adding additional support. For example, if we wanted to support a worksheet (for whatever reason) we would want to update the error message, and add just a bit of additional code:

``````Dim Values As Variant
Select Case TypeName(InputValues)
Case "Range"
Values = InputValues.Value
Case "Variant()"
Values = InputValues
Case "Worksheet"
Values = GetArrayFromWorksheet(InputValues)
Case Else
Err.Raise ERR_NUMBER_TYPE_NOT_SUPPORTED, ERR_MESSAGE_TYPE_NOT_SUPPORTED
End Select
``````

Finally, making these few small changes has a huge impact on the readability/maintainability of the code. Here is the finished product (identical in function):

``````Private Const ERR_NUMBER_TYPE_NOT_SUPPORTED As Long = 513
Private Const ERR_MESSAGE_TYPE_NOT_SUPPORTED As String = "Please provide a Variant() or Range for the InputValues argument of TEXTJOIN. Alternate types are not currently supported."

Public Function TEXTJOIN(ByVal Delimeter As String, ByVal SkipBlanks As Boolean, ByVal InputValues As Variant) As String
Dim Values As Variant
Select Case TypeName(InputValues)
Case "Range"
Values = InputValues.Value
Case "Variant()"
Values = InputValues
Case Else
Err.Raise ERR_NUMBER_TYPE_NOT_SUPPORTED, ERR_MESSAGE_TYPE_NOT_SUPPORTED
End Select

If ArrayIsTwoDimensional(InputValues) Then
TEXTJOIN = JoinFromTwoDimensionalArray(Delimeter, SkipBlanks, InputValues)
Else
TEXTJOIN = JoinFromOneDimensionalArray(Delimeter, SkipBlanks, InputValues)
End If
End Function

Private Function ArrayIsTwoDimensional(ByVal TestArray As Variant) As Boolean
Dim Test As Variant

On Error Resume Next
Test = TestArray(LBound(TestArray, 1), LBound(TestArray, 2))

' Note: This will work for a 3d array, or a 4d array, etc. If there is any risk of a higher-dimension array, use a function that returns
' the exact number of dimensions.
ArrayIsTwoDimensional = (Err.Number = 0)
On Error GoTo 0
End Function

Private Function JoinFromTwoDimensionalArray(ByVal Delimeter As String, ByVal SkipBlanks As Boolean, ByVal InputArray As Variant)
Dim Join As String

Dim i As Long
For i = LBound(InputArray, 1) To UBound(InputArray, 1)
Dim j As Long
For j = LBound(InputArray, 2) To UBound(InputArray, 2)
If InputArray(i, j) <> vbNullString Or Not SkipBlanks Then
If Join <> vbNullString Then Join = Join & Delimeter
Join = Join & InputArray(i, j)
End If
Next
Next

JoinFromTwoDimensionalArray= Join
End Function

Private Function JoinFromOneDimensionalArray(ByVal Delimeter As String, ByVal SkipBlanks As Boolean, ByVal InputArray As Variant) As String
Dim Join As String

Dim i As Long
For i = LBound(InputArray) To UBound(InputArray)
If InputArray(i) <> vbNullString Or Not SkipBlanks Then
' Note the placement of the delimeter forces the joined text to only ever add a delimeter when needed.
If Join <> vbNullString Then Join = Join & Delimeter
Join = Join & InputArray(i)
End If
Next i

JoinFromOneDimensionalArray = Join
End Function
``````

We get rid of all of those counter variables (opting instead for i and j which are pretty standard). Our other variables are quite clearly named, and everything is what you would expect. As a result if you (however many years from now) come back to the function to add something new, or if it breaks, you can quickly find the source instead of trying to remember what `t` and `y` were and why you set `t` or `y` to `-1` and so on so forth.

Again, highly recommend picking up a copy of `Clean Code`. The idea of levels of abstraction, functions, classes, etc all can be difficult to understand at first but once you get going your code will become much easier to read.

I really like the OP’s concept but I feel that a pseudo Excel `Application.WorksheetFunction.TextJoin` function should take a ParamArray of mixed data types.

In my implementation I use a combination of a string buffer and the Mid function improve the speed by avoiding concatenating large strings. To test the speed of my function I filled 500K cells with random strings ranging from 5 to 50 characters in length. Using the TheSpreadsheetGuru: Timer I determined that it took 1.95 seconds to create a string of 14,256,557 characters.

I’m sure that by utilizing the lightning-fast StringBuilder that @Mat’sMug mentioned would increase the speed 5 fold. I did not use it myself because I wanted convenience of having a single function do all the work.

## Formulas

=TextJoin2(“,”,FALSE,”Numbers”,A6:C6,A7:C9,{10,11,12})

=TextJoin2(“,”,TRUE,”Numbers”,A6:C6,A7:C9,{10,11,12})

## Code

``````Function TextJoin2(Delimiter As String, Ignore_Emtpy As Boolean, ParamArray Args() As Variant) As Variant
Dim results As String
Dim count As Long, i As Long, j As Long, length As Long, pos As Long
Dim argument As Variant, v As Variant

Select Case TypeName(Args(0))
Case "Empty"
argument = Array()
Case "Range"
If Args(0).count = 1 Then
argument = Array(Args(0).value)
Else
argument = Args(0).value
End If
Case "String"
argument = Array(Args(0))
Case "Variant()"
argument = Args(0)
End Select

For Each v In argument
length = length + Len(v)
count = count + 1
Next

results = Space(length + count * Len(Delimiter))
If count - 1 + LBound(argument) = UBound(argument) Then
For Each v In argument
If Not Ignore_Emtpy Or Len(v) > 0 Then
Mid(results, pos + 1, Len(v) + Len(Delimiter)) = v & Delimiter
pos = pos + Len(v) + Len(Delimiter)
End If
Next
Else
For i = LBound(argument) To UBound(argument)
For j = LBound(argument, 2) To UBound(argument, 2)
If Not Ignore_Emtpy Or Len(argument(i, j)) > 0 Then
Mid(results, pos + 1, Len(argument(i, j)) + Len(Delimiter)) = argument(i, j) & Delimiter
pos = pos + Len(argument(i, j)) + Len(Delimiter)
End If
Next
Next
End If

'Trim results needed to adjust for skipping empty values
results = Left(results, pos)

For i = 1 To UBound(Args)
results = results & TextJoin2(Delimiter, Ignore_Emtpy, Args(i)) & Delimiter
Next

Debug.Print Left(results, Len(results) - Len(Delimiter))
TextJoin2 = Left(results, Len(results) - Len(Delimiter))
End Function
``````