
9.20. Finding Files (Much) Faster with a find Database

If you use find to search for files, you know that it can take a long time to work, especially when there are lots of directories to search. Here are some ideas for speeding up your finds.

NOTE: By design, setups like these that build a file database won't have absolutely up-to-date information about all your files.

If your system has "fast find" or locate, that's probably all you need. It lets you search a list of all pathnames on the system.

Even if you have "fast find" or locate, it still might not do what you need. For example, those utilities only search for pathnames. To find files by the owner's name, the number of links, the size, and so on, you have to use "slow find." In that case -- or, when you don't have "fast find" or locate -- you may want to set up your own version.

The basic "fast find" has two parts. One part is a command, usually a shell script named updatedb or locate.updatedb, that builds a database of the files on your system; if your system has it, take a look to see a fancy way to build the database. The other part is the find or locate command itself, which searches the database for pathnames that match the name (regular expression) you type. (slocate can build and update its own database, with its -u option, as well as search the database.)

To make your own "fast find":
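Here's one minimal sketch of the two pieces, not a definitive setup. It assumes the database lives in $HOME/.fastfind (the file searched later in this article), that your grep accepts -E for extended regular expressions, and that ffind is written as a shell function rather than a separate script. The names ffind_build and FASTFIND_DB are made up for this illustration.

```shell
# Assumed location of the database; set FASTFIND_DB to keep it elsewhere.
FASTFIND_DB=${FASTFIND_DB:-$HOME/.fastfind}

# ffind_build [dir]: list every pathname under dir (default: your home directory).
ffind_build() {
    root=${1:-$HOME}
    # Build into a temporary file and rename it only if find succeeded,
    # so a search never reads a half-written database.
    find "$root" -print > "$FASTFIND_DB.new" &&
        mv -f "$FASTFIND_DB.new" "$FASTFIND_DB"
}

# ffind regexp: print the stored pathnames that match an extended regexp.
ffind() {
    grep -E -e "$1" "$FASTFIND_DB"
}
```

Rerun ffind_build whenever the list gets too stale. Because find is handed an absolute directory here, the stored pathnames come out absolute, like the /usr/freddie pathnames in this article's examples.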

To search the database, type:

% ffind somefile
/usr/freddie/lib/somefile
% ffind '/(sep|oct)[^/]*$'
/usr/freddie/misc/project/september
/usr/freddie/misc/project/october

You can do much more: I'll get you started. If you have room to store more information than just pathnames, you can feed your find output to a command like ls -l. For example, if you do a lot of work with links, you might want to keep the files' i-numbers as well as their names. You'd build your database with a command like this:

% cd
% find . -print | xargs ls -id > .fastfind.new
% mv -f .fastfind.new .fastfind
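
One thing to watch: xargs splits its input on whitespace, so a pathname containing a space reaches ls as two arguments. Also, some versions of ls pad the i-numbers with leading blanks to line them up, which would defeat a search anchored with ^. Here's a hedged sketch of a variant that sidesteps both, assuming your find supports the -exec ... {} + form (it's in POSIX, but older systems may lack it). The function name build_inode_db and its two arguments are invented for this example.

```shell
# build_inode_db dir dbfile: write "i-number pathname" lines for everything
# under dir into dbfile. (Hypothetical helper, not a standard command.)
build_inode_db() {
    root=$1 db=$2
    [ -d "$root" ] || return 1
    # -exec ... {} + hands pathnames straight to ls, so blanks in names survive.
    # sed strips any leading padding that ls puts in front of the i-number.
    ( cd "$root" && find . -exec ls -id {} + ) |
        sed 's/^[[:space:]]*//' > "$db.new" &&
        mv -f "$db.new" "$db"
}
```

You'd call it as build_inode_db "$HOME" "$HOME/.fastfind" to get the same kind of database as the commands above.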

Or, if your version of find has the handy -ls operator, use the next script. Watch out for really large i-numbers; they might shift the columns and make cut give wrong output. The exact column numbers will depend on your system:

% cd
% find . -ls | cut -c1-7,67- > .fastfind.new
% mv -f .fastfind.new .fastfind
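
If the shifting columns worry you, you can pick fields instead of character positions. This is only a sketch under loud assumptions: it lists regular files only (-type f), and it takes the last whitespace-separated field of each line as the pathname, so it will mangle any name that contains whitespace. The function name build_ls_db is made up for this example.

```shell
# build_ls_db dir dbfile: write "i-number pathname" lines for the regular
# files under dir, using fields rather than fixed columns.
# ASSUMPTION: no pathname contains whitespace.
build_ls_db() {
    root=$1 db=$2
    [ -d "$root" ] || return 1
    # In find -ls output the i-number is the first field; for a regular
    # file with no blanks in its name, the pathname is the last field.
    ( cd "$root" && find . -type f -ls ) |
        awk '{ print $1, $NF }' > "$db.new" &&
        mv -f "$db.new" "$db"
}
```

Because awk splits on runs of blanks, a wider i-number no longer pushes the pathname out of place.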

Then, your ffind script could search for files by i-number. For instance, if you had a file with i-number 1234 and you wanted to find all its links:

% ffind "^1234 "

The space at the end of that regular expression prevents matches with i-numbers like 12345. You could search by pathname in the same way. To get a bit fancier, you could make your ffind a little perl or awk script that searches your database by field. For instance, here's how to make awk do the previous i-number search; the output is just the matching pathnames:

awk '$1 == 1234 {print $2}' "$HOME/.fastfind"
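
That one-liner prints only the second field, so a pathname with a blank in it would be cut short. Here's a sketch of a slightly more careful version, assuming a database of "i-number pathname" lines like the ones built earlier in this article. The function name ffind_inode and its optional second argument are hypothetical.

```shell
# ffind_inode inum [dbfile]: print every pathname stored with i-number inum.
# The database defaults to $HOME/.fastfind.
ffind_inode() {
    inum=$1 db=${2:-$HOME/.fastfind}
    # Compare the first field with the requested i-number, then delete that
    # field (and any padding around it) so the whole pathname is printed,
    # blanks and all.
    awk -v inum="$inum" '
        $1 == inum { sub(/^[ \t]*[0-9]+[ \t]+/, ""); print }
    ' "$db"
}
```

Since awk compares the whole first field, an i-number like 12345 won't match a search for 1234, and leading blanks in the database don't matter.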

With some information about Unix shell programming and utilities like awk, the techniques in this article should let you build and search a sophisticated file database -- and get information much faster than with plain old find.

-- JP




Copyright © 2003 O'Reilly & Associates. All rights reserved.