sed & awk

11.3. Commercial awks

There are also several commercial versions of awk. In this section, we review the ones that we know about.

11.3.1. MKS awk

Mortice Kern Systems (MKS) in Waterloo, Ontario (Canada)[80] supplies awk as part of the MKS Toolkit for MS-DOS/Windows, OS/2, Windows 95, and Windows NT.

[80]Mortice Kern Systems, 185 Columbia Street West, Waterloo, Ontario N2L 5Z5, Canada. Phone: 1-800-265-2797 in North America, 1-519-884-2251 elsewhere. URL is http://www.mks.com/.

The MKS version implements POSIX awk. It has the following extensions:

11.3.2. Thompson Automation awk (tawk)

Thompson Automation Software[81] makes a version of awk (tawk)[82] for MS-DOS/Windows, Windows 95 and NT, and Solaris. Tawk is interesting on several counts. First, unlike other versions of awk, which are interpreters, tawk is a compiler. Second, tawk comes with a screen-oriented debugger, written in awk! The source for the debugger is included. Third, tawk allows you to link your compiled program with arbitrary functions written in C. Tawk has received rave reviews in the comp.lang.awk newsgroup.

[81]Thompson Automation Software, 5616 SW Jefferson, Portland OR 97221 U.S.A. Phone: 1-800-944-0139 within the U.S., 1-503-224-1639 elsewhere.

[82]Michael Brennan, in the mawk(1) manpage, makes the following statement: "Implementors of the AWK language have shown a consistent lack of imagination when naming their programs."

Tawk comes with an awk interface that acts like POSIX awk, compiling and running your program. You can, however, compile your program into a standalone executable file. The tawk compiler actually compiles into a compact intermediate form. The intermediate representation is linked with a library that executes the program when it is run, and it is at link time that other C routines can be integrated with the awk program.

Tawk is a very full-featured implementation of awk. Besides implementing the features of POSIX awk (based on new awk), it extends the language in some fundamental ways, and also has a very large number of built-in functions.

11.3.2.1. Tawk language extensions

This section provides a "laundry list" of the new features in tawk. A full treatment of them is beyond the scope of this book; the tawk documentation does a nice job of presenting them. By now you should be familiar enough with awk that the value of these features will be apparent. Where relevant, we'll contrast the tawk feature with a comparable feature in gawk.

Whew! That's a rather long list, but these features bring additional power to programming in awk.

11.3.2.2. Additional built-in tawk functions

Besides extending the language, tawk provides a large number of additional built-in functions. Here is another "laundry list," this time of the different classes of functions available. Each class has two or more functions associated with it. We'll briefly describe the functionality of each class.

  • Extended string functions. Extensions to the standard string functions and new string functions allow you to match and substitute for subpatterns within patterns (similar to gawk's gensub() function), assign to substrings within strings, and split a string into an array based on a pattern that matches the elements rather than the separator. There are also additional printf formats and string translation functions. While some of these functions could undoubtedly be written as user-defined functions, having them built in provides greater performance.

  • Bit manipulation functions. You can perform bitwise AND, OR, and XOR operations on (integer) values. These could also be written as user-defined functions, but with a loss of performance.

  • More I/O functions. There is a suite of functions modeled after those in the stdio(3) library. In particular, the ability to seek within a file, and do I/O in fixed-size amounts, is quite useful.

  • Directory operation functions. You can make, remove, and change directories, as well as remove and rename files.

  • File information functions. You can retrieve file permissions, size, and modification times.

  • Directory reading functions. You can get the current directory name, as well as read a list of all the filenames in a directory.

  • Time functions. There are functions to retrieve the current time of day, and format it in various ways. These functions are not quite as flexible as gawk's strftime() function.

  • Execution functions. You can sleep for a specific amount of time, and start other programs running. Tawk's spawn() function is interesting because it allows you to provide values for the new program's environment, and also indicate whether the program should or should not run asynchronously. This is particularly valuable on non-UNIX systems, where the command interpreters (such as MS-DOS's command.com) are quite limited.

  • File locking. You can lock and unlock files and ranges within files.

  • Screen functions. You can do screen-oriented I/O. Under UNIX, these functions are implemented on top of the curses(3) library.

  • Packing and unpacking of binary data. You can specify how binary data structures are laid out. This, together with the new I/O functions, makes it possible to do binary I/O, something you would normally have to do in C or C++.

  • Access to internal state. You can get or set the value of any awk variable through function calls.

  • Access to MS-DOS low-level facilities. You can use system interrupts, and peek and poke values at memory addresses. These features are obviously for experts only.
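The idea of splitting a string by a pattern that matches the elements, rather than the separator, can be approximated in portable awk with a user-defined function. The following is only a sketch: the name match_split and its argument order are our own invention for illustration, not tawk's actual interface, and the shell wrapper is there simply to make the example easy to run.

```sh
#!/bin/sh
# match_split STRING REGEX
# Hypothetical helper (not a tawk function): print every substring of
# STRING that matches REGEX, one per line, using only POSIX awk.
match_split() {
    awk -v input="$1" -v pattern="$2" '
    # Fill arr[1..n] with successive matches of regex in s; return n.
    function match_split(s, arr, regex,    n) {
        n = 0
        while (s != "" && match(s, regex)) {
            if (RLENGTH == 0) {          # empty match: skip one character
                s = substr(s, 2)
                continue
            }
            arr[++n] = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)   # continue after this match
        }
        return n
    }
    BEGIN {
        n = match_split(input, parts, pattern)
        for (i = 1; i <= n; i++)
            print parts[i]
    }'
}

# Example: pull the runs of digits out of a string.
match_split "a1b22c333" "[0-9]+"
```

Because each search restarts on the remainder of the string, anchors such as ^ do not behave as they would in a single left-to-right scan. A built-in version can avoid this kind of limitation as well as the repeated substr() copying.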

From this list, it becomes clear that tawk provides a nice alternative to C and to Perl for serious programming tasks. As an example, the screen functions and internal state functions are used to implement the tawk debugger in awk.
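To see why built-in bit manipulation functions are attractive, here is a sketch of what a user-defined bitwise AND might look like in portable awk. The function name bit_and is an assumption made for this illustration (we do not show tawk's own function names here), and the code handles only non-negative integers.

```sh
#!/bin/sh
# bit_and A B
# Hypothetical helper: print the bitwise AND of two non-negative integers,
# computed one bit at a time in POSIX awk.
bit_and() {
    awk -v a="$1" -v b="$2" '
    function bit_and(x, y,    result, bit) {
        result = 0
        bit = 1                      # value of the bit position being tested
        while (x > 0 && y > 0) {
            if (x % 2 == 1 && y % 2 == 1)
                result += bit        # both low-order bits are set
            x = int(x / 2)           # shift both operands right by one bit
            y = int(y / 2)
            bit *= 2
        }
        return result
    }
    BEGIN { print bit_and(a, b) }'
}

# 12 is binary 1100 and 10 is binary 1010, so the result is binary 1000.
bit_and 12 10    # prints 8
```

The loop runs once per bit of the smaller operand, which illustrates the loss of performance compared with a single built-in call.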

11.3.3. Videosoft VSAwk

Videosoft[84] sells software called VSAwk that brings awk-style programming into the Visual Basic environment. VSAwk is a Visual Basic control that works in an event-driven fashion. Like awk, VSAwk gives you startup and cleanup actions, splits the input record into fields, and lets you write expressions and call the awk built-in functions.

[84]Videosoft can be reached at 2625 Alcatraz Avenue, Suite 271, Berkeley CA 94705 U.S.A. Phone: 1-510-704-8200. Fax: 1-510-843-0174. Their site is http://www.videosoft.com.

VSAwk resembles UNIX awk mostly in its data processing model, not its syntax. Nevertheless, it's interesting to see how people apply the concepts from awk to the environment provided by a very different language.
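For comparison, the data processing model that VSAwk borrows looks like this in awk itself: a startup action, an action run for each record after it has been split into fields, and a cleanup action. This is a minimal sketch in ordinary awk, not VSAwk code; the name sum_fields and the sample data are made up for the illustration.

```sh
#!/bin/sh
# sum_fields: hypothetical example that reads "name count" records on
# standard input and reports how many records it saw and their total.
sum_fields() {
    awk '
    BEGIN { total = 0; count = 0 }    # startup action: runs before any input
    { total += $2; count++ }          # per-record action: $2 is the second field
    END { printf "%d records, total %d\n", count, total }   # cleanup action
    '
}

printf 'apples 3\npears 4\n' | sum_fields    # prints: 2 records, total 7
```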




Copyright © 2003 O'Reilly & Associates. All rights reserved.