The Unix File System -- a Gotcha?

A Powerful File System

When handling files under Unix, you work with a mechanism that is completely different from the file system available under MS-Windows. Programmers who are used to MS-Windows often do not understand one of the most powerful features of a Unix file system.

Each file is assigned what is called an inode. When a file is being accessed, its inode gets "locked" (really just a resource reference count that is incremented while the file is open), and once the process is done with it, it gets unlocked.

While it is locked this way (that is, while some process still has it open), the file can be deleted. If that happens, the file disappears from the file system (i.e. an ls command does not reveal the file), yet the file is still on disk and can still be read from, written to, locked, unlocked, etc.
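To make this concrete, here is a minimal sketch in Python (my choice of language; the file name demo.txt and the function name are made up for the illustration). It opens a file, deletes it, checks that a directory listing no longer shows it, and then keeps reading and writing through the handle that is still open.

```python
import os
import tempfile


def read_after_unlink():
    """Open a file, unlink it, then keep using the still-open handle."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")  # hypothetical file name

    with open(path, "w+") as handle:
        handle.write("still here")
        handle.flush()

        os.unlink(path)                               # remove the directory entry
        listed = "demo.txt" in os.listdir(directory)  # roughly what ls would show
        links = os.fstat(handle.fileno()).st_nlink    # names still pointing at the inode

        handle.write(", even after unlink")           # writing still works
        handle.seek(0)
        content = handle.read()                       # reading still works

    os.rmdir(directory)
    return listed, links, content


listed, links, content = read_after_unlink()
print(listed, links, content)
```

The disk space is only reclaimed once the last handle on the inode is closed.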

Under the MS-Windows operating system, a file that is in use generally cannot be deleted at all. Instead, you get an error when attempting to do so.

Why is this such a powerful feature? Isn't it bad that a file that exists can be made invisible?

Well... there is one definite drawback: a Unix computer virus can be made invisible... However, the enormous advantage is the fact that the file can be replaced with a new version even while still open. This is extremely valuable for executables and dynamic libraries because they can be replaced with an upgrade without the need to reboot your Unix operating system!

The upgrade process will first delete the existing file, then create a new file with the new version of the file contents. Active processes are still using the old version while new processes will be given access to the new version.
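The delete-then-create sequence can be sketched as follows. This is only an illustration with a made-up file name (libdemo.txt), and a plain text file stands in for an executable or library. A handle opened before the "upgrade" keeps seeing the old contents, while a handle opened afterwards sees the new ones.

```python
import os
import tempfile


def upgrade_while_open():
    """Replace a file while an older handle on it is still open."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "libdemo.txt")  # hypothetical file name

    with open(path, "w") as f:
        f.write("version 1")

    old_user = open(path)                  # an "active process" using the old file
    old_inode = os.fstat(old_user.fileno()).st_ino

    os.unlink(path)                        # upgrade step 1: delete the existing file
    with open(path, "w") as f:             # upgrade step 2: create the new version
        f.write("version 2")

    new_user = open(path)                  # a "new process" started after the upgrade
    new_inode = os.fstat(new_user.fileno()).st_ino

    seen_by_old = old_user.read()
    seen_by_new = new_user.read()

    old_user.close()
    new_user.close()
    os.unlink(path)
    os.rmdir(directory)
    return seen_by_old, seen_by_new, old_inode != new_inode


print(upgrade_while_open())  # ('version 1', 'version 2', True)
```

The two handles refer to two different inodes that happen to have carried the same name at different times.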

Of course, if you are upgrading many versions of many different files, this may break since you may end up with an old version of one library and the new version of another library, a combination that was probably never even tested. However, in most cases, Unix users can upgrade their system without having to reboot, which is really a good thing, especially if you are running a server that cannot be shut down.

So... why can this be a Gotcha?

Today I discovered a bug in a Drupal module where the author would delete a file used as a lock between processes (using flock() with exclusive access). His process was:

  1. Create new file, or truncate existing file
  2. Lock the file exclusively
  3. Run protected process
  4. Delete the file

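Step 4 is used to unlock the file. He simply deletes the file, which appears to have the side effect of releasing the exclusive lock acquired in step 2. Strictly speaking, the lock is only released when the file handle is closed or the process exits, but since nobody can reach the deleted file by name anymore, the deletion looks like an unlock.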
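For reference, the four steps look roughly like this when written out in Python rather than PHP. This is my own sketch of the pattern, not the module's actual code, and the names run_with_deleted_lock and demo.lock are invented for the example.

```python
import fcntl
import os
import tempfile


def run_with_deleted_lock(lock_path, protected):
    """Sketch of the flawed pattern: the lock file is deleted to 'unlock'."""
    # Step 1: create a new file, or truncate the existing one
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        # Step 2: lock the file exclusively (blocks while another handle holds it)
        fcntl.flock(fd, fcntl.LOCK_EX)
        # Step 3: run the protected process
        result = protected()
        # Step 4: delete the file (this is the problematic step)
        os.unlink(lock_path)
        return result
    finally:
        # Closing the descriptor is what actually releases the lock
        os.close(fd)


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")  # hypothetical path
print(run_with_deleted_lock(lock_path, lambda: "protected work done"))
print(os.path.exists(lock_path))  # False: the lock file is gone
```

Run by a single process, this works fine, which is probably why the problem goes unnoticed.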

Unfortunately, if you delete the file, a third process coming along will not see it, and instead of waiting for the second process to release the lock it holds, it will start running the protected process immediately, in parallel with the second process.

Let's put it in a table for clarification, say that those 3 processes are A, B, and C:

Step              Lock File
A creates file    lock file appears
A locks file      lock file is locked by A
B opens file
B locks file      B waits for A to be done
A is done
A unlinks file    lock file disappears
C creates file    new lock file appears
C locks file      new lock file is locked by C
B is done
C is done

While B runs, C comes in and starts running immediately because the file created by A does not exist anymore. That file still exists as a hidden inode since it is held by process B, but process C has no clue and it creates a new inode.
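The sequence can be replayed inside a single process, because flock() locks belong to each separate open() of the file, so three descriptors can stand in for A, B and C. The sketch below is an approximation under that assumption: it uses the non-blocking flag (LOCK_NB) so that "B waits" shows up as a refused lock instead of a hang, and all names are invented for the example.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Return True if an exclusive flock() is granted without waiting."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def replay_race():
    """Replay the A/B/C steps, with one file descriptor per 'process'."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.lock")  # hypothetical lock file
    create = os.O_CREAT | os.O_RDWR | os.O_TRUNC

    a = os.open(path, create, 0o644)     # A creates file
    a_locked = try_lock(a)               # A locks file
    b = os.open(path, os.O_RDWR)         # B opens file
    b_must_wait = not try_lock(b)        # B would have to wait for A

    os.unlink(path)                      # A unlinks file: the name disappears
    os.close(a)                          # A's handle closes, its lock is released
    b_locked = try_lock(b)               # B now holds the lock on the hidden inode

    c = os.open(path, create, 0o644)     # C creates a new lock file
    c_locked = try_lock(c)               # C locks it at once, while B still runs
    different = os.fstat(b).st_ino != os.fstat(c).st_ino

    os.close(b)
    os.close(c)
    os.unlink(path)
    os.rmdir(directory)
    return a_locked, b_must_wait, b_locked, c_locked, different


print(replay_race())  # (True, True, True, True, True)
```

B and C both hold an "exclusive" lock at the same time, each on its own inode. One common way around this, as far as I can tell, is to leave the lock file in place and release the lock by closing the handle instead of deleting the file.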

Reference:

Creating severe load on system by logging thousands of unlink errors in DB log