Unix Power Tools

10.4. More About Links

Unix provides two different kinds of links:

Hard links
With a hard link, two filenames (i.e., two directory entries) point to the same inode and the same set of data blocks. All Unix versions support hard links. They have two important limitations: a hard link can't cross a filesystem (i.e., both filenames must be in the same filesystem), and you can't create a hard link to a directory (i.e., a directory can only have one name).[41] They have two important advantages: the link and the original file are absolutely and always identical, and the extra link takes no disk space (except an occasional extra disk block in the directory file).

[41]Actually, every directory has at least two names. See the last section of this article.

Symbolic links (also called soft links or symlinks)
With a symbolic link, there really are two different files. One file contains the actual data; the other file just contains the name of the first file and serves as a "pointer." We call the pointer the link. The system knows that whenever it opens a symlink, it should read the contents of the link and then access the file that really holds the data you want. Nearly all Unix systems support symbolic links these days. Symbolic links are far more flexible than hard links. They can cross filesystems or even computer systems (if you are using NFS or RFS (Section 44.9)). You can make a symbolic link to a directory. Unlike an extra hard link, a symbolic link has its own inode and takes a small amount of disk space to store.
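As a rough illustration of the two definitions above, the following sketch builds both kinds of link in a scratch directory and compares i-numbers. The names file, hardlink, and symlink are only examples, and the script assumes that mktemp -d is available (it is on most modern systems, though not on every Unix).

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
echo "some data" > "$dir/file"

ln "$dir/file" "$dir/hardlink"   # hard link: a second directory entry for the same inode
ln -s file "$dir/symlink"        # symlink: a separate small file that stores the name "file"

# ls -di prints the i-number first; -d makes sure ls reports on the link itself.
file_ino=$(ls -di "$dir/file" | awk '{print $1}')
hard_ino=$(ls -di "$dir/hardlink" | awk '{print $1}')
sym_ino=$(ls -di "$dir/symlink" | awk '{print $1}')
echo "file=$file_ino hardlink=$hard_ino symlink=$sym_ino"

# Reading through the symlink reaches the same data.
via_symlink=$(cat "$dir/symlink")
echo "via symlink: $via_symlink"

rm -rf "$dir"
```

The first two i-numbers should match and the third should differ; the actual numbers depend on your filesystem.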

You obviously can't do without copies of files: copies are important whenever users need their own "private version" of some master file. However, links are equally useful. With links, there's only one set of data and many different names that can access it. Section 10.5 shows how to make links.

10.4.1. Differences Between Hard and Symbolic Links

With a hard link, the two filenames are identical in every way. You can delete one without harming the other. The system deletes the directory entry for one filename and leaves the data blocks (which are shared) untouched. The only thing rm does to the inode is decrement its "link count," which (as the name implies) counts the number of hard links to the file. The data blocks are only deleted when the link count goes to zero -- meaning that there are no more directory entries that point to this inode. Section 9.24 shows how to find the hard links to a file.
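To watch the link count at work, here is a minimal sketch (the filenames are made up for the example, and mktemp -d is assumed to exist on your system). It reads the link count from the second field of ls -l output, removes one name, and confirms that the data is still reachable through the other.

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "shared data" > "$dir/file"
ln "$dir/file" "$dir/hardlink"

# The link count is the second field of ls -l output.
before=$(ls -l "$dir/file" | awk '{print $2}')

# rm removes one directory entry and decrements the link count.
rm "$dir/file"
after=$(ls -l "$dir/hardlink" | awk '{print $2}')

# The data blocks are still there, reachable through the remaining name.
contents=$(cat "$dir/hardlink")
echo "link count before rm: $before, after rm: $after"
echo "data still readable: $contents"

rm -rf "$dir"
```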

With a symbolic link, the two filenames are really not the same. Deleting the link with rm leaves the original file untouched, which is what you'd expect. But deleting the original file removes both the filename and the data, and renaming it leaves the data under a name that the link doesn't know about. Either way, you are left with a link that doesn't point anywhere. Remember that the link itself doesn't have any data associated with it; it holds only a pathname. Despite this disadvantage, you rarely see hard links on Unix versions that support symbolic links. Symbolic links are so much more versatile that they have become omnipresent.
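A dangling link is easy to produce and to detect. The sketch below renames a symlink's target and then tests the link; it assumes a shell whose test command supports -L and -e, and a readlink command, both of which are common but not guaranteed on very old systems. All filenames are hypothetical.

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "original" > "$dir/file"
ln -s file "$dir/symlink"

# Renaming the target breaks the link, although the data survives under the new name.
mv "$dir/file" "$dir/renamed"
survivor=$(cat "$dir/renamed")

# -L is true for any symlink; -e follows the link and is false if the target is missing.
if [ -L "$dir/symlink" ] && [ ! -e "$dir/symlink" ]; then
    state="dangling"
else
    state="ok"
fi
echo "symlink is $state"

# The link still stores the old name.
target=$(readlink "$dir/symlink")
echo "symlink still points to: $target"

rm -rf "$dir"
```

Renaming the file back to its old name (or creating any new file with that name) would make the link work again, because a symlink stores only a pathname.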

Let's finish by taking a look at the ls listing for a directory. This directory has a file named file with another hard link to it named hardlink. There's also a symlink to file named (are you ready?) symlink:

$ ls -lai
total 8
 140330 drwxr-xr-x   2 jerry    ora    1024 Aug 18 10:11 .
  85523 drwxr-xr-x   4 jerry    ora    1024 Aug 18 10:47 ..
 140331 -rw-r--r--   2 jerry    ora    2764 Aug 18 10:11 file
 140331 -rw-r--r--   2 jerry    ora    2764 Aug 18 10:11 hardlink
 140332 lrwxrwxrwx   1 jerry    ora       4 Aug 18 10:12 symlink -> file

You've seen ls's -l option (Section 50.2) and, probably, the -a option (Section 8.9) for listing "dot files." The -i option lists the i-number (Section 14.2) for each entry in the directory; see the first column. The third column has the link count: this is the number of hard links to the file.

When you compare the entries for file and hardlink, you'll see that they have a link count of 2. In this case, both links are in the same directory. Every other entry (i-number, size, owner, etc.) for file and hardlink is the same; that's because they both refer to exactly the same file, with two links (names).
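Because both names refer to one file, a change made through either name shows up through the other. Here is a small sketch of that, using made-up filenames in a scratch directory (mktemp -d is assumed to be available):

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "first line" > "$dir/file"
ln "$dir/file" "$dir/hardlink"

# Append through one name...
echo "second line" >> "$dir/hardlink"

# ...and read through the other. tr strips the padding some wc versions add.
lines=$(wc -l < "$dir/file" | tr -d ' ')
echo "file now has $lines lines"

# cmp -s is silent and succeeds when the two names hold identical data.
if cmp -s "$dir/file" "$dir/hardlink"; then same=yes; else same=no; fi
echo "identical: $same"

rm -rf "$dir"
```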

A symbolic link has an l at the start of the permissions field. Its i-number isn't the same as that of the file to which it points because a symbolic link takes a separate inode; so, it also takes disk space (which an extra hard link doesn't). The name has two parts: the name of the link (here, symlink) followed by an arrow and the name to which the link points (in this case, file). The size of the symlink is just four bytes, which is exactly enough to store the pathname (file) to which the link points.
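You can check the relationship between a symlink's size and its target yourself. This sketch assumes a readlink command and a shell that supports the ${#var} length expansion; the filenames are only examples. Most filesystems report a symlink's size as the length of the stored pathname, but that is a convention you should verify on your own system rather than a guarantee.

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "data" > "$dir/file"
ln -s file "$dir/symlink"

# readlink prints the pathname stored in the link.
target=$(readlink "$dir/symlink")
length=${#target}

# The size is the fifth field of ls -l output.
size=$(ls -l "$dir/symlink" | awk '{print $5}')
echo "target=$target length=$length size=$size"

rm -rf "$dir"
```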

10.4.2. Links to a Directory

While we're at it, here's a section that isn't about linking to files or making symbolic links. Let's look at the first two entries in the previous sample directory in terms of links and link counts. This should help to tie the filesystem together (both literally and in your mind!).

You've seen . and .. in pathnames (Section 1.16); you might also have read an explanation of what's in a directory (Section 10.2). The . entry is a link to the current directory; notice that its link count is 2. Where's the other link? It's in the parent directory:

$ ls -li ..
total 2
 140330 drwxr-xr-x   2 jerry    ora      1024 Aug 18 10:11 sub
  85524 drwxr-xr-x   2 jerry    ora      1024 Aug 18 10:47 sub2

Look at the i-numbers for the entries in the parent directory. Which entry is for our current directory? The entry for sub has the i-number 140330, and so does the . listing in the current directory. So the current directory is named sub. Now you should be able to see why every directory has at least two links. One link, named ., is to the directory itself. The other link, in its parent, gives the directory its name.

Every directory has a .. entry, which is a link to its parent directory. If you look back at the listing of our current directory, you can see that the parent directory has four links. Where are they?

When a directory has subdirectories, it will also have a hard link named .. in each subdirectory. You can see earlier, in the output from ls -li .., that the parent directory has two subdirectories: sub and sub2. That's two of the four links. The other two links are the . entry in the parent directory and the entry for the parent directory (which is named test in its parent directory):

(For the ls -d option, see Section 8.5.)

$ ls -dli ../. ../../test
  85523 drwxr-xr-x   4 jerry    ora      1024 Aug 18 10:47 ../.
  85523 drwxr-xr-x   4 jerry    ora      1024 Aug 18 10:47 ../../test

As they should, all the links have the same i-number: 85523. Make sense? This concept can be a little abstract and hard to follow at first. Understanding it will help you, though -- especially if you're a system administrator who has to interpret fsck's output when it can't fix something automatically, or who has to use strong medicine like clri. For more practice, make a subdirectory and experiment in it the way shown in this article.
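One way to run that experiment is sketched below. It rebuilds a layout like the one in this article (a directory named test with subdirectories sub and sub2) inside a scratch directory and collects the i-numbers of all four links to test. The ino helper function is invented for this example, and mktemp -d is assumed to be available. On traditional filesystems the link count printed at the end should be 4; some newer filesystems (Btrfs, for example) don't keep directory link counts this way and always report 1, so treat that number as something to check on your system.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir "$dir/test" "$dir/test/sub" "$dir/test/sub2"

# Hypothetical helper: print the i-number of one pathname.
ino() { ls -di "$1" | awk '{print $1}'; }

# Four names that should all lead to the same directory.
test_ino=$(ino "$dir/test")            # its name in its parent
dot_ino=$(ino "$dir/test/.")           # its own . entry
sub_dotdot=$(ino "$dir/test/sub/..")   # the .. entry in sub
sub2_dotdot=$(ino "$dir/test/sub2/..") # the .. entry in sub2
echo "i-numbers: $test_ino $dot_ino $sub_dotdot $sub2_dotdot"

# The link count is the second field of ls -ld output.
links=$(ls -ld "$dir/test" | awk '{print $2}')
echo "link count for test: $links"

rm -rf "$dir"
```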

By the way, directories and their hard links . and .. are created by the mkdir(2) system call. That's the only way that normal users can create a directory (and the links to it).

--JP and ML




Copyright © 2003 O'Reilly & Associates. All rights reserved.