DFORMAT - A Program for Typesetting Data Formats

This is Jon Bentley's dformat program, reconstituted from a PDF version of the original memo describing the program.

Introduction

dformat reads descriptions of data formats and turns them into pic specifications. It's intended for use as yet another troff preprocessor.

It's of interest to me since it's written in awk. The program has been around since 1988, but I had only ever seen PostScript or PDF versions of the memo.

I've often wanted to at least extract the awk program and make it usable, but never got “a round tuit.”

In the fall of 2019 I came across a PDF copy of the memo and saved it so that I could reconstitute the original troff -ms input for it, as well as the awk code. I finally stole some time to do this in June of 2020.

Process

I started by simply copying all the text from the PDF into a text file via copy/paste.

The next step was to reassemble the awk script well enough for gawk (my favorite awk interpreter) to run it. Once that was done, I used gawk --pretty-print to format the code nicely.
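As a rough illustration of the pretty-printing step, here is a minimal sketch. The function name and the file names in the usage note are hypothetical placeholders, not the names used in this repository. It relies on gawk's documented --pretty-print[=file] option, which in current gawk releases writes a formatted copy of the program without running it.

```shell
#!/bin/sh
# pretty_print_awk: hypothetical helper, not part of this repository.
#   $1 = awk source file to read
#   $2 = file that receives the formatted program
pretty_print_awk() {
    # Standard input is redirected from /dev/null as a precaution, in case
    # an older gawk also tries to run the program while pretty-printing.
    gawk --pretty-print="$2" -f "$1" < /dev/null
}
```

For example, `pretty_print_awk raw.awk pretty.awk` would leave the formatted program in pretty.awk. If the `=file` part is omitted, gawk writes to awkprof.out in the current directory.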

I then started on the memorandum itself, inserting troff -ms requests, formatting the text into lines of reasonable length, and getting the document into shape to match the original as much as possible.

Fortunately, the memo contained the dformat input for all of the figures displayed, so it was simple enough to copy/paste the displayed input into the real input to be processed, and then the result could be compared visually to the original.

Along the way, I had to re-read the original documentation on tbl so that I could format the tables properly. There was additional fun here, as copy/paste of a table dumped the text by columns, not by rows. I ended up saving each column to a small text file and then writing a throw-away awk script to read the files and merge them back into lines.
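The throw-away script itself was not kept, but the idea can be sketched as follows. This is an assumption about how such a merge might look, not the script actually used; the function name and file names are hypothetical. Each argument is a file holding one table column, one cell per line, and the output joins the cells of each row with tabs, which is tbl's default column separator.

```shell
#!/bin/sh
# merge_columns: hypothetical helper, not the original throw-away script.
# Usage: merge_columns col1.txt col2.txt ... > table.txt
merge_columns() {
    awk '
    FNR == 1 { ncols++ }              # first line of a file: start a new column
    {
        cell[FNR, ncols] = $0         # store the cell under (row, column)
        if (FNR > nrows) nrows = FNR  # remember the longest column
    }
    END {
        # Emit the table row by row, cells separated by tabs.
        for (r = 1; r <= nrows; r++) {
            line = cell[r, 1]
            for (c = 2; c <= ncols; c++)
                line = line "\t" cell[r, c]
            print line
        }
    }' "$@"
}
```

Short columns are padded with empty cells, and a completely empty file is skipped rather than producing an empty column. The standard paste utility does much the same job; the awk version is shown only because that is the approach described above.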

Finally, I hand-edited the awk program to match what was in the original memorandum.

Bugs In the Document

While doing this work, I found a few bugs in the document. In two cases the input for a figure was shown, but the figure itself was not. I restored the actual figures.

Another figure needed an additional directive in order to be drawn correctly; this directive was missing in the dformat input shown in the PDF file. This too I corrected.

Finally, the dash alias for dashed did not work. I noted this in a footnote and simply replaced dash with dashed in both the real input and in the sample input shown in the memo.

Other Files Here

The original version of the memorandum was published as Bell Labs Computing Science Technical Report 142. The file 142.ps is this original memorandum.

Jon Bentley was kind enough to send me his original troff and awk source; they are included in the jlb directory.

Jon also pointed me to an article he wrote, published in the AT&T Technical Journal, called “Little Languages for Pictures in awk.” I've included a copy here for convenience in the file little-languages-for-pictures-in-awk.pdf.

Creating the Document

If you have GNU troff installed, with its preprocessors, it should be enough to just type make to create dformat.pdf.

Conclusion

I'm pretty pleased with the result. I'm making everything available on GitHub so that others may also take advantage of this nice little program.

Last Modified

Wed Feb 24 17:53:51 IST 2021