C delete files and directories, recursively if directory

Posted on

Problem

This is my function that uses recursion to recursively delete a directory. I like my way of skipping entries . and ...

But should I be casting the name lengths to unsigned int because I am quite sure there will not be file paths longer than 4 gigacharacters? Or is storing the result of strlen in any type other than size_t bad practice?

Any other comments here?

Also is using VLAs in a recursive function risky, could it cause a stack overflow error? Is the VLA OK, or should I just use malloc() (I vehemently refuse to simply use a fixed size buffer that should be big enough, like char Entry[1024])?

int Remove(const char *const Object) {
    DIR *const Dir = opendir(Object);
    if (Dir) {
        struct dirent *Dirent;
        const unsigned int Len = (unsigned int)strlen(Object);
        while ((Dirent = readdir(Dir))) {
            const char *const Name = Dirent->d_name;
            const unsigned int NameLen = (unsigned int)strlen(Name);
            if (Name[0] == '.' && (NameLen == 1 || (Name[1] == '.' && NameLen == 2))) {
                continue;
            }
            char Entry[Len+NameLen+2], *const CurrPtr = memcpy(Entry, Object, Len)+Len;
            *CurrPtr = '/';
            *(char *)(memcpy(CurrPtr+1, Name, NameLen)+NameLen) = 0;
            if ((Dirent->d_type == DT_DIR? Remove : remove)(Entry)) {
                return -1;
            }
        }
        if (closedir(Dir)) {
            return -1;
        }
    }
    return remove(Object);
}

Solution

We’re missing the necessary headers for struct dirent and the Standard library string functions (strlen, memcpy etc).

It’s probably worth providing an explanatory comment that tells the user what the return values mean. It appears that it returns 0 on success, or -1 with errno set to a suitable value if it fails.

I don’t think it’s a great idea to use a name that differs only in case from the Standard Library function remove(). Call it remove_recursively() or something instead.

        const unsigned int Len = (unsigned int)strlen(Object);
        const unsigned int NameLen = (unsigned int)strlen(Name);

Why the cast? We should just keep them as size_t.

I’m not a fan of the cutesy

        *(char *)(memcpy(CurrPtr+1, Name, NameLen)+NameLen) = 0;

To be correct, we need to convert the return value of memcpy() to char* before adding NameLen:

        *((char *)memcpy(CurrPtr+1, Name, NameLen)+NameLen) = 0;

But honestly, I’d avoid the casting entirely and write as a pair of much more readable lines:

        memcpy(CurrPtr+1, Name, NameLen);
        CurrPtr[NameLen+1] = '';

Or simpler still, get rid of the manual concatenation and let the library do it for us:

        sprintf(Entry, "%s/%s", Object, Name);
       if ((Dirent->d_type == DT_DIR? Remove : remove)(Entry)) {

The conditional operator is unnecessary – just call our Remove(), which will get a null return from opendir() and then simply unlink the file. Actually, the test is a good idea, as it prevents us following symlinks.

        if ((Dirent->d_type == DT_DIR? Remove : remove)(Entry)) {
            return -1;
        }

Oops – we forgot to closedir() before returning, meaning that we leak a descriptor for every level of recursion.

    if (closedir(directory)) {
        return -1;
    }

This check may be overkill – the only documented failure possibility is EBADF if we pass a directory that’s e.g. already closed. We know the full history of the descriptor, so that can never happen.


Simplified version

#include <sys/types.h>
#include <dirent.h>

#include <stdio.h>
#include <string.h>


int remove_recursive(const char *const path) {
    DIR *const directory = opendir(path);
    if (directory) {
        struct dirent *entry;
        while ((entry = readdir(directory))) {
            if (!strcmp(".", entry->d_name) || !strcmp("..", entry->d_name)) {
                continue;
            }
            char filename[strlen(path) + strlen(entry->d_name) + 2];
            sprintf(filename, "%s/%s", path, entry->d_name);
            int (*const remove_func)(const char*) = entry->d_type == DT_DIR ? remove_recursive : remove;
            if (remove_func(entry->d_name)) {
                closedir(directory);
                return -1;
            }
        }
        if (closedir(directory)) {
            return -1;
        }
    }
    return remove(path);
}

It might be possible to avoid copying the filenames, if it’s okay for the function to change the working directory. We can chdir into the directory to be removed, remove all its content (using the relative names directly from the dirent structure), then change back when we’re done. That keeps the per-frame overhead absolutely minimal.

Actually we can avoid changing working directory if we’re on a platform that provides unlinkat() (thanks to Peter Cordes for suggesting this). The code then looks very simple:

        int unlinkflags = entry->d_type == DT_DIR ? AT_REMOVEDIR : 0;
        if (unlinkat(dirfd(directory), entry->d_name, unlinkflags)) {
            closedir(directory);
            return -1;
        }

Leave a Reply

Your email address will not be published. Required fields are marked *