Problem
I am working on learning more about arrays and loops! I have been trying to create a method which will allow me to take for example an unlimited amount of arrays and append or concatenate them to each other!
The method takes multiple arrays as arguments and creates a merged array in the order in which the arrays where passed through the parameter. My question is:
Is there a more efficient way to do this which I am maybe not aware of?
I have done my research but most ways usually use a set amount of arrays in order to achieve this. This is just something I am doing for the sake of learning, and these functions may not be useful in real life scenarios.
public static Object[] merge(Object[]... arrays) {
int[] lenghts = new int[arrays.length];
int wholeLenght = 0;
int maxLenght = 0;
for(int i = 0; i<arrays.length; i++){
lenghts[i] = arrays[i].length;
}
for(int i = 0; i<lenghts.length; i++){
wholeLenght += lenghts[i];
}
Object[] merged = new Object[wholeLenght];
for(int i = 0; i<arrays.length; i++){
System.arraycopy(arrays[i], 0, merged, maxLenght, lenghts[i]);
maxLenght+=lenghts[i];
}
return merged;
}
Version which allows generic types. A similar function can be easily made for
primitive a primitive type array of choice.
@SuppressWarnings("unchecked")
public static <T> T[] mergeGenerics(T[]... arrays) {
int[] lenghts = new int[arrays.length];
boolean allowOperation = true;
int wholeLenght = 0;
int maxLenght = 0;
T[] merged = null;
for (int i = 0; i < arrays.length; i++) {
if (arrays[0].getClass().getComponentType() != arrays[i].getClass().getComponentType()) {
allowOperation = false;
System.err.println("The arrays are not all of the same type!!");
}
}
if (allowOperation) {
for (int i = 0; i < arrays.length; i++) {
lenghts[i] = arrays[i].length;
}
for (int i = 0; i < lenghts.length; i++) {
wholeLenght += lenghts[i];
}
merged = (T[]) Array.newInstance(arrays[0].getClass().getComponentType(), wholeLenght);
for (int i = 0; i < arrays.length; i++) {
System.arraycopy(arrays[i], 0, merged, maxLenght, lenghts[i]);
maxLenght += lenghts[i];
}
}
return merged;
}
Test The method maybe with:
Object[] obj1 ={"A","B"};
Object[] obj2 ={"C","D","E","F"};
Object[] obj3 ={"G","H","I","J","K"};
Object[] obj4 ={"L","M","N"};
String[] obj5 ={"O","P","Q","R"};
Object[] obj6 ={"S"};
Object[] merge = Class.merge(obj1,obj2,obj3,obj4,obj5,obj6);
for(int i = 0; i<merge.length; i++)
System.out.print(merge[i]+" ");
Solution
The approach suggested by @Caridorc has the limitation that it requires the objects passed to be Iterable
s, which an array is not a subtype of due to certain limitations. Therefore, you either have to change your data type from an array to some Iterable
extension or use a different approach.
A simple, naive solution would be to simple convert arrays to List
s and add all to a combined list like this:
private static <T> T[] merge2(T[] ... arrays) {
final List<T> list = new ArrayList<>();
for (T[] array : arrays) {
list.addAll(Arrays.asList(array));
}
return list.toArray(arrays[0]);
}
which has certain runtime penalties as you create multiple new objects which furthermore on adding new lists via addAll(...)
will increase the capacity of the list by 50% of its current length on exceeding the current size and therefore contain some unused space.
A further approach is based on your generic approach but contains only two loops. The first one to learn the total length, the second one to copy the data from the source array to the target array:
private static <T> T[] merge3(T[] ... arrays) {
Class<?> componentType = arrays[0].getClass().getComponentType();
int size = 0;
for (T[] array : arrays) {
size += array.length;
}
T[] result = (T[])arrays[0].getClass().cast(Array.newInstance(componentType, size));
int pos = 0;
for (T[] array : arrays) {
System.arraycopy(array, 0, result, pos, array.length);
pos += array.length;
}
return result;
}
This method is shorter and IMO easier to read then your implementation. A short comparison between the 4 algorithms reveals also, that it is faster then the other ones:
Original - Total: 51.402688 ms, Average: 5.1402688E-4 ms
Original Generics - Total: 42.150794 ms, Average: 4.2150794E-4 ms
Merge2 - Total: 87.06185 ms, Average: 8.706185E-4 ms
Merge3 - Total: 28.701731 ms, Average: 2.8701731E-4 ms
Note however, that I did a naive performance test and didn’t take care of compiler optimizations or warm-up phases. The comparison therefore may be flawed and data messy.
Also note that I did not take care of checking if all of the passed arrays are of the same type. Mingling multiple arrays of different types might only work if the first passed array is of type object (or at least a supertype of any further arrays). In general, I wouldn’t allow to pass arrays of different types though.
BTW: the double cast in merge3
is to get rid of Unchecked cast
warnings
Please note that arrays or lists are sub-optimal for this task as the result will be given only after the processing of the last array is finished (time overhead) and the whole result array has to be stored in memory (space overhead).
Iterables do not store new values and yield from the iterables as soon as possible, so I suggest that you use them as data structure.
A standard implementation already exists so you may compare it to what you come up with to enhance the learning experience.
Spelling
Length
is (surprisingly consistently) misspelled as lenght
, it’s highly recommended to fix this mistake. 🙂
Stream-based processing
You can take advantage of the Stream
semantics in Java 8 to express the steps in another way:
@SuppressWarnings("unchecked")
public static <T> T[] merge(T[]... arrays) {
return Arrays.stream(arrays)
// MAGIC HERE
.toArray(x -> (T[]) Array.newInstance(
arrays.getClass().getComponentType().getComponentType(), x));
}
- Use
Arrays.stream(T[])
to convert the array ofT[]
arrays to aStream<T[]>
stream. MAGIC HERE
… will be explained below, essentially we need to convertStream<T[]>
to justStream<T>
.- Use the reflection-based
Array.newInstance(Class, int)
method to create our generic array.getComponentType()
is called twice, as the first time only gives usClass<T[]>
, and the second call on that will returnClass<T>
.
There’s two possible approaches for MAGIC HERE
.
-
A single
reduce(U, BiFunction, BinaryOperator)
step to reduce all theT[]
stream elements into justStream<T>
:.reduce(Stream.empty(), (s, a) -> a == null ? s : Stream.concat(s, Arrays.stream(a)), Stream::concat)
Stream.empty()
is our starting point, and I’ve chosen to specially handle the case where aT[]
array might benull
by skipping over it. To combine twoStream
pipelines together,Stream.concat(Stream, Stream)
is used as a method reference as the final step. -
Requires two steps, but each step is arguably simpler to digest, and overall performance is much faster too:
.map(a -> a == null ? Stream.empty() : Arrays.stream(a)) .flatMap(Function.identity())
The first step also takes care of a
null
array, by treating it as an empty stream. The second step replaces eachStream<T>
element with its contents on a resultingStream<T>
.
Using the second approach as the foundation, you can attempt to parallelize the process by adding the Stream.parallel()
, and if you are OK to drop the null
-handling, a tweaked solution can be just:
@SuppressWarnings("unchecked")
public static <T> T[] merge(T[]... arrays) {
return Arrays.stream(arrays)
.parallel()
.map(Arrays::stream)
.flatMap(Function.identity())
.toArray(x -> (T[]) Array.newInstance(
arrays.getClass().getComponentType().getComponentType(), x));
}