Problem
I have this function that takes a number N, and N C strings, concatenates and returns the result:
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* mystrcat(int count, ...)
{
char** p;
char* result;
char* ptr;
size_t* len_array;
va_list ap;
int j;
size_t total_length;
if (count < 1)
{
return "";
}
va_start(ap, count);
p = malloc(sizeof(char*) * count);
len_array = calloc(count, sizeof(size_t));
total_length = 0;
for (j = 0; j != count; ++j)
{
p[j] = va_arg(ap, char*);
total_length += (len_array[j] = strlen(p[j]));
}
result = malloc(sizeof(char) * (total_length + 1));
ptr = result;
for (j = 0; j != count; ++j)
{
strcpy(ptr, p[j]);
ptr += len_array[j];
}
*ptr = ' ';
va_end(ap);
return result;
}
int main(int argc, char* argv[]) {
puts(mystrcat(5, "Hello", ", ", "world", ", ", "friends!"));
}
As always, please tell me anything that comes to mind.
Solution
Always return allocated string or never do it
This code here:
if (count < 1)
{
return "";
}
is bad because you return a static string whereas the normal case returns a string allocated by malloc
. If the caller then tries to free this static string, it will cause some kind of error. You can fix this by either returning an allocated empty string:
if (count < 1)
{
return calloc(1, 1);
}
or by just returning NULL:
if (count < 1)
{
return NULL;
}
Memory leaks
Your function never frees p
or len_array
. Actually, I would probably not even use temporary arrays like these. If I were to write this function, I would measure the total length on one pass, and then copy the strings on a second pass.
Rewrite
After fixing the above issues, my rewrite of your function would look like this:
char *mystrcat(int count, ...)
{
va_list ap;
size_t len = 0;
if (count < 1)
return NULL;
// First, measure the total length required.
va_start(ap, count);
for (int i=0; i < count; i++) {
const char *s = va_arg(ap, char *);
len += strlen(s);
}
va_end(ap);
// Allocate return buffer.
char *ret = malloc(len + 1);
if (ret == NULL)
return NULL;
// Concatenate all the strings into the return buffer.
char *dst = ret;
va_start(ap, count);
for (int i=0; i < count; i++) {
const char *src = va_arg(ap, char *);
// This loop is a strcpy.
while (*dst++ = *src++);
dst--;
}
va_end(ap);
return ret;
}
Automate consistency
p = malloc(sizeof(char*) * count);
len_array = calloc(count, sizeof(size_t));
If you instead say
p = malloc(count * sizeof *p);
len_array = calloc(count, sizeof *len_array);
You can change the types of p
and len_array
without having to change these calls as well.
You should also check that these functions are successful. Both return NULL
on failure.
Prefer checking regions rather than points
for (j = 0; j != count; ++j)
If you write this as
for (j = 0; j < count; ++j)
then it is more robust in the face of future changes. For example, if you increment j
an extra time in an iteration, the original might overshoot the mark. This version will always stop.
C++ iterators have to compare against single values because of the way they work. Working with a loop variable doesn’t have that restriction.
Prefer memcpy
over strcpy
strcpy(ptr, p[j]);
Since you already know the length and don’t need to null terminate, you can use memcpy
.
memcpy(ptr, p[j], len_array[j]);
It might even be faster, as there may be a machine instruction for copying blocks of memory efficiently. The strcpy
command has to copy byte by byte, as the null could be anywhere.
Avoid memory leaks
It won’t matter in a toy program like this, but you should free
p
and len_array
before returning. Otherwise each call to mystrcat
would leak memory.
free(p);
free(len_array);
You should also do that in the caller.
puts(mystrcat(5, "Hello", ", ", "world", ", ", "friends!"));
Should be
char *concatenated = mystrcat(5, "Hello", ", ", "world", ", ", "friends!");
puts(concatenated);
free(concatenated);
This won’t matter in this program as it just ends and releases all the memory then anyway. But in a longer running program that calls mystrcat
many times, this could be a problem.
-
Avoid
sizeof(type)
. It disconnects type from the variable, and leads to double maintenance. Preferp = malloc(sizeof(p[0]) * count); len_array = calloc(count, sizeof(len_array[0]));
-
len_array
leaks memory. -
len_array
id strictly speaking unnecessary. Considerfor (j = 0; i != count; ++j) { strcpy(ptr, p[j]); ptr += strlen(p[j]); }
Of course it trades time (each
strlen
is evaluated twice) for space (no need for alen_array
). I prefer this version because it avoids extracalloc
(could be costly), and keeps the semantics of the increment clear. -
Surely you will get plenty recommendations to test the retrun value of
malloc
.
Others have pointed out flaws in your code (although I’d add that sizeof(char) is 1 by definition), so I’ll just offer an alternative strategy, namely to write a simple 2-string concatenation and use that if you need to join more strings:
char *mystrcat(const char * s, const char * t)
{
size_t slen = s ? strlen(s) : 0;
size_t tlen = t ? strlen(t) : 0;
char *cat = malloc(slen + tlen + 1);
if (cat) {
memcpy(cat, s, slen);
memcpy(cat + slen, t, tlen);
cat[slen + tlen] = ' ';
}
return cat;
}
Three and four-string functions are simple extensions of this.