sean cassidy : Windows ruins everything

in: programming

Recently, at work, we had an annoying bug in our code which came about from a seemingly harmless refactor.

The Bug

We have an API that was recently refactored from this:

public void upload(File file, String remoteDir, String remoteName) {
    client.upload(file, remoteDir, remoteName);
}

To this using Apache Commons-IO's FilenameUtils:

public void upload(File file, String remotePath) {
    final String remoteDir = FilenameUtils.getPath(remotePath);
    final String remoteName = FilenameUtils.getName(remotePath);

    client.upload(file, remoteDir, remoteName);
}

This API worked better with our general use case and we were all happy.

Until we had a bug.

The differences

Can you see what the differences are between these two snippets of code? Obviously the use of FilenameUtils, but what problem would that cause?

Well, it turns out some code that was using this API generated a superfluous slash for the remotePath, so that it looked like this:

//directory1/directory2/file.txt

But we all know how UNIX paths work! Extra slashes are harmless. And the client object worked with the extra slash before. So what was the issue?

Windows and cross-platform support

In Windows, you can easily reference files on a networked server with what's called UNC paths:

\\MainServer1\rootdir\seconddir\file.txt

It's pretty useful. And, since you can use forward slashes instead of back slashes, the following is equivalent:

//MainServer1/rootdir/seconddir/file.txt

So, when FilenameUtils wants to get the path from a given String, it needs to first get rid of the server name.

FilenameUtils.getPath("//MainServer1/rootdir/seconddir/file.txt") == "rootdir/seconddir/file.txt"

Which is not what we wanted.

The fix

Well our paths were not being cleanly normalized, so I set out to do so. My first instinct was this:

final String remoteDir = FilenameUtils.getPath(remotePath.replaceAll("/+","/"));

But then I thought, well certainly it would be better to completely normalize our paths. And hey! FilenameUtils has that as well.

Unfortunately, even normalization of UNIX style paths is not possible with FilenameUtils. From the documentation:

//server/foo/../bar  -->   //server/bar
//server/../bar      -->   null

This is the case even if you set the boolean unixSeparator to true. You can't disable UNC behavior in FilenameUtils.

We can use getFullPath instead of getPath, but that doesn't strip the extraneous starting slash.

So, instead, I used my replaceAll function. Perhaps replaceAll and then normalization is even better.

Why Windows ruins everything

Getting cross platform behavior right is tricky. It's cool that there is support for UNC paths in Commons IO, but not being able to normalize UNIX paths properly sucks.

And this surprising behavior caused a bug in our code. Well, generating an extra starting slash was a small bug, but since we knew that it was "harmless" we left it alone.

Yeah, we should have read the documentation more closely. But, since we're not even running on Windows, UNC paths were not on our minds.

This is why Windows ruined my day: we weren't using it, and yet we still had to know about it. It makes people (like the Apache folks) create surprising behavior because of the large number of different path formats needed to support it.

Sean is the CTO and co-founder at DefenseStorm, a cloud cybersecurity startup.

Follow @sean_a_cassidy