I decided I’m going to try to reimplement the tree
utility from
scratch in a very limited version. Mine will only work from the current
directory and will support no arguments. This article will take you
through my implementation of the problem.
The tree
program produces a diagram of the directory structure from a
given starting point, generally the current working directory. You can
see an example in the screenshot below.
My version is much simpler. It does not implement:
- Sorting of each subdirectory result.
- Colored output based on file type by
dircolors
- Command arguments to configure behavior.
- Multiple output formats.
The first order of business is to open a directory and read it’s
content. For that we need opendir
and readdir
.
These functions from the posix standard. opendir
opens a directory
stream, and readdir
reads from that stream to produce a dirent
structs. These structs contain information about the current entry, such
as the name and type of file.
// from http://man7.org/linux/man-pages/man3/readdir.3.html
struct dirent {
ino_t d_ino; /* Inode number */
off_t d_off; /* Not an offset; see below */
unsigned short d_reclen; /* Length of this record */
unsigned char d_type; /* Type of file; not supported
by all filesystem types */
char d_name[256]; /* Null-terminated filename */
};
The next task is to separate out the results of readdir
from what we
want. We need to be careful of the current directory entry, and the
previous directory entry, if we recurse on those directories we could
get an infinite loop during our in order traversal. We also need to
filter out hidden files, because the normal find does not include
hidden files by default. To do this, we examine the contents of the
dirent.d_name
field.
The next step is to recurse on entries that contain more entries.
According to the man page when d_type
is equal to DT_DIR
, that entry
is a directory, and we should recurse on them. But we run into a problem,
we don’t have the full path. We could do string concatenation, and build
on each step, but I didn’t feel like it. Instead I used chdir
to move our context to that directory, and drop down once we finished
up. This may not be very efficient(system calls), but it avoids buffer
overflows, and C-style strings are miserable.
The final piece we must consider is counting up the number of files and the number of directories. I used a struct containing the counts, and returned them from each layer of the walk in the classic recursive way of building up a solution.
The important takeaways are:
- Be careful of how you walk a directory, you need to ignore the current and parent directories.
- Watch out for exhusting the available file descriptors for your program. You could do this if you try to go too deep or end up with an infinite loop.
Below is the finished version of this code. It’s quite simple as you can see.
/*
* tree.c
* Henry J Schmale
* September 30, 2018
*
* A trivial reimplementation of the tree program with no support for
* arguements. Works from the current directory, and produces a nice
* ascii diagram of the directory structure. Once it has fully walked
* the directory tree it prints out some stats.
*
*/
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
const int INDENT_SIZE = 4;
typedef struct {
size_t num_dir;
size_t num_files;
} dir_res_t;
/*
* unix_error - unix-style error routine
*/
void
unix_error (char *msg)
{
fprintf (stderr, "%s: %s\n", msg, strerror(errno));
exit (1);
}
static inline int
self_ref (const char* s) {
return strcmp (s, ".") == 0 || strcmp(s, "..") == 0;
}
static inline int
hidden (const char* s) {
return s[0] == '.' && s[1] != '\0';
}
dir_res_t
walk_dirtree (const char* start, int depth) {
struct dirent* dir;
dir_res_t stats = {0};
DIR* dirstream = opendir (start);
if (dirstream == NULL) {
unix_error ("Failed to open dirstream");
}
while ((dir = readdir(dirstream)) != NULL) {
if (self_ref (dir->d_name) || hidden(dir->d_name))
continue;
// print directory name indented
printf ("%*c%s\n", depth, ' ', dir->d_name);
if (dir->d_type == DT_DIR) {
if (chdir(dir->d_name) == -1)
unix_error ("Failed to go deeper");
dir_res_t d_res = walk_dirtree (".", depth + INDENT_SIZE);
stats.num_files += d_res.num_files;
stats.num_dir += d_res.num_dir + 1;
if (chdir ("..") == -1)
unix_error ("Failed to go back up");
} else {
stats.num_files += 1;
}
}
if (closedir (dirstream)) {
unix_error ("Failed to close dirstream");
}
return stats;
}
int
main (int argc, char** argv) {
dir_res_t stats = walk_dirtree (".", 0);
printf ("%d directories, %d directories\n", stats.num_dir, stats.num_files);
return 0;
}