Feedback on Locale::Po4a::Man(3): "Use the plain macro set" #547

g-branden-robinson opened this issue Dec 16, 2024 · 0 comments
g-branden-robinson commented Dec 16, 2024

Hi there. I maintain GNU roff ("groff"), and have some experience with man page formatting. I noticed the following section of the Locale::Po4a::Man(3) page.

Use the plain macro set
       There are still some macros which are not supported by po4a::man. This is
       only  because  I failed to find any documentation about them. Here is the
       list of unsupported macros used on my box.  Note  that  this  list  isn’t
       exhaustive  since  the program fails on the first encountered unsupported
       macro. If you have any information  about  some  of  these  macros,  I’ll
       happily add support for them. Because of these macros, about 250 pages on
       my box are inaccessible to po4a::man.

        ..               ."              .AT             .b              .bank
        .BE              ..br            .Bu             .BUGS           .BY
        .ce              .dbmmanage      .do                             .En
        .EP              .EX             .Fi             .hw             .i
        .Id              .l              .LO             .mf
        .N               .na             .NF             .nh             .nl
        .Nm              .ns             .NXR            .OPTIONS        .PB
        .pp              .PR             .PRE            .PU             .REq
        .RH              .rn             .S<             .sh             .SI
        .splitfont       .Sx             .T              .TF             .The
        .TT              .UC             .ul             .Vb             .zZ

I believe this list can be trimmed down some. There are troff requests, typos, man(7) macros, mdoc(7) macros, and other macros, possibly page-local ones, in evidence here. I'll go through them.

.. is standard roff syntax for ending a macro definition (de) or ignored content (ig). If you parse de or ig, you should parse this as terminating the corresponding scope. If not, you should ignore it.
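To illustrate the scope handling, here is a rough sketch in Python (not po4a's Perl; the function name `strip_definitions` is hypothetical). It assumes you only care about `de` and `ig`, that the terminator appears at the start of a line, and that groff variants such as `de1` or `am` can be left out for now.

```python
import re

# Matches ".de NAME [END]" or ".ig [END]". Variants such as .de1 and .am
# are deliberately left out of this sketch.
_OPEN = re.compile(r"^[.'][ \t]*(de|ig)\b[ \t]*(.*)$")


def strip_definitions(lines):
    """Yield only the lines that lie outside .de/.ig scopes."""
    end = None  # name of the terminator we are waiting for, or None
    for line in lines:
        if end is not None:
            # Inside a scope: drop everything up to and including the
            # terminator, which is "." followed by the end name ("." by
            # default, giving the familiar "..").
            if re.match(r"^\." + re.escape(end) + r"(?:[ \t]|$)", line):
                end = None
            continue
        m = _OPEN.match(line)
        if m:
            args = m.group(2).split()
            if m.group(1) == "de":
                # .de NAME [END]: the optional second argument names the terminator.
                end = args[1] if len(args) > 1 else "."
            else:
                # .ig [END]: the optional first argument names the terminator.
                end = args[0] if args else "."
            continue
        if re.match(r"^\.\.(?:[ \t]|$)", line):
            # A stray ".." with no open scope: ignore it silently.
            continue
        yield line
```

Fed the lines `.TH FOO 1`, `.de Vb`, `.nf`, `..`, `.SH NAME`, this yields only `.TH FOO 1` and `.SH NAME`.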

." is probably misspelled comment syntax. I would warn about it.

groff_man_style(7):

       \"     Comment.  Everything after the double‐quote to the end of
              the input line is ignored.  Whole‐line comments should be
              placed immediately after the empty request (“.”).

.AT is a man(7) extension from 4.3BSD in 1986. I would ignore it.

groff_man(7):

   Deprecated features
       Use of the following in man pages for public distribution is
       discouraged.

       .AT [system [release]]
              Alter the footer for use with legacy AT&T man pages,
              overriding any definition of the footer‐inside argument to
              .TH.  This macro exists only to render man pages from
              historical systems.

              system can be any of the following.

                     3      7th edition (default)

                     4      System III

                     5      System V

              The optional release argument specifies the release
              number, as in “System V Release 3”.

The following are me(7) macros. Possibly the page author was confused regarding which macro package they were using (or somebody ran po4a-gettextize -f man mistakenly on an me document).

.b
.i
.pp
.sh

The following look like errors, typos, calls of page-local macros, calls of man(7) extension macros from an obscure source, or a primitive form of commenting (note how some are the same as common man section headings). That last trick would have worked because roff implementations traditionally silently ignored attempts to call undefined macros. I would warn upon encountering them.

.bank
.BE
.Bu
.BUGS
.BY
.dbmmanage
.EP
.Fi
.Id
.l
.LO
.mf
.N
.NF
.nl
.NXR
.OPTIONS
.PB
.PR
.PRE
.PU
.RH
.S<
.SI
.splitfont
.T
.TF
.The
.TT

..br looks like a typo for a break request, .br. I would warn about it.

.ce is a roff centering request. I would ignore it.

.do is a GNU roff extension that interprets the remainder of the input line as a control line using GNU troff's syntax even when the formatter is operating in AT&T troff compatibility mode. Idiomatically, what follows is likely to be a groff extension request or a traditional AT&T request using a groff syntax extension (like long identifiers within square brackets). You can ignore it, but if it would be just as easy to treat what follows as an ordinary control line (one that has begun with a dot), it would slightly increase your parser's capability with no further logic required, I think. (You can moreover apply this rule recursively; .do do do do do do nr my-long-register 1 is perfectly valid, if silly.)
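The recursive rule is short enough to sketch. This is Python for illustration only, with a hypothetical helper name `unwrap_do`; it assumes the parser works on one control line at a time and that keeping the original control character is what you want.

```python
import re

# A control character, optional blanks, "do", then at least one blank.
# The mandatory blank keeps names such as ".double" from matching.
_DO = re.compile(r"^([.'])[ \t]*do[ \t]+(.*)$")


def unwrap_do(line):
    """Rewrite '.do NAME ARGS' as '.NAME ARGS', repeating while 'do' remains."""
    while True:
        m = _DO.match(line)
        if m is None:
            return line
        # Keep the control character and reinterpret the rest as a control line.
        line = m.group(1) + m.group(2)
```

With this, `.do do do nr my-long-register 1` comes out as `.nr my-long-register 1`, and lines that do not start with `do` pass through unchanged.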

The following are mdoc(7) macros.

.En
.Nm
.Sx

.EX is a man(7) extension macro. I would ignore the macro call itself, but not the text that lies between it and its closing macro, .EE.

groff_man(7):

       .EX
       .EE    Begin and end example.  After .EX, filling is disabled and
              a constant‐width (monospaced) font is selected.  Calling
              .EE enables filling and restores the previous font.

              These macros are extensions introduced in Ninth Edition
              Research Unix.  Systems running that troff, or those from
              Documenter’s Workbench, Heirloom Doctools, or Plan 9 troff
              support them.  To be certain your page will be portable to
              systems that do not, copy their definitions from the
              an-ext.tmac file of a groff installation.
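A rough sketch of "ignore the macro, keep the content", in Python for illustration (the name `tag_examples` is hypothetical). It assumes the caller wants to know which lines sit inside an example so it can, say, avoid rewrapping them; that use is my guess, not something po4a is known to do.

```python
import re

# ".EX" or ".EE" as a whole macro name, with or without arguments.
_EX_EE = re.compile(r"^[.'][ \t]*(EX|EE)(?:[ \t]|$)")


def tag_examples(lines):
    """Yield (line, in_example) pairs, dropping the .EX/.EE calls themselves."""
    in_example = False
    for line in lines:
        m = _EX_EE.match(line)
        if m:
            # The macro call is consumed; only the flag changes.
            in_example = m.group(1) == "EX"
            continue
        yield line, in_example
```

For the input `.PP`, `.EX`, `ls -l`, `.EE`, `Done.` this yields `.PP` and `Done.` flagged as ordinary text and `ls -l` flagged as example content.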

.hw is a roff request that defines hyphenation exceptions. I would ignore it.

.na is a roff request for disabling adjustment. I would ignore it.

.nh is a roff request for disabling hyphenation. I would ignore it.

.ns is a roff request that enables no-space mode. I would ignore it.

.REq might be a typo when attempting to invoke the man RE macro. AT&T troff limits macro names to two characters, so it would have interpreted the line as a call of RE with the argument q, without an evident diagnostic, but might not have behaved as desired. groff will interpret it as an undefined macro, except when in compatibility mode. I would warn about it.

.rn is a roff request that renames a request, macro, string, or diversion. Unless you track definitions of any such objects, I would ignore it.

.UC is a man(7) extension from 3BSD in 1980. I would ignore it.

groff_man(7):

   Deprecated features
       Use of the following in man pages for public distribution is
       discouraged.
...
       .UC [version]
              Alter the footer for use with legacy BSD man pages,
              overriding any definition of the footer‐inside argument to
              .TH.  This macro exists only to render man pages from
              historical systems.

              version can be any of the following.

                     3      3rd Berkeley Distribution (default)

                     4      4th Berkeley Distribution

                     5      4.2 Berkeley Distribution

                     6      4.3 Berkeley Distribution

                     7      4.4 Berkeley Distribution

.ul is a roff request that turns underlining of output text on and off. I would ignore it.

.Vb is a page-local macro defined in man pages generated by Pod::Man. I would ignore it.

.zZ is a page-local macro used by the bash(1) man page. I would warn about it. Some day maybe I will talk Chet into using the so request to achieve the same thing he does with registers and no-op macros. :)
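Pulling the recommendations above into one place, a lookup table might look like the following. This is a Python sketch for illustration, not a proposal for po4a's actual Perl structure; the names `macro_name` and `classify` and the category labels are hypothetical. `..`, `.do`, and `.EX` are left out on purpose because they need the dedicated handling described earlier.

```python
# Names are case-sensitive: ".NF" is suspicious while ".nf" is a real request.
IGNORE = {"AT", "ce", "hw", "na", "nh", "ns", "rn", "UC", "ul", "Vb"}
WARN = {
    '"', ".br", "REq", "zZ",
    "bank", "BE", "Bu", "BUGS", "BY", "dbmmanage", "EP", "Fi", "Id", "l",
    "LO", "mf", "N", "NF", "nl", "NXR", "OPTIONS", "PB", "PR", "PRE", "PU",
    "RH", "S<", "SI", "splitfont", "T", "TF", "The", "TT",
}
ME_MACROS = {"b", "i", "pp", "sh"}    # me(7), wrong package for a man page
MDOC_MACROS = {"En", "Nm", "Sx"}      # mdoc(7)


def macro_name(line):
    """Return the name called on a control line, or None for a text line."""
    if not line or line[0] not in ".'":
        return None
    parts = line[1:].lstrip(" \t").split(None, 1)
    return parts[0] if parts else ""


def classify(line):
    """Map a line to 'ignore', 'warn', 'me', 'mdoc', 'other', or None."""
    name = macro_name(line)
    if name is None:
        return None  # not a control line
    if name in IGNORE:
        return "ignore"
    if name in WARN:
        return "warn"
    if name in ME_MACROS:
        return "me"
    if name in MDOC_MACROS:
        return "mdoc"
    return "other"  # supported macros, or names this table does not cover
```

For example, `classify(".ce 3")` gives `"ignore"`, `classify("..br")` gives `"warn"`, and `classify(".SH NAME")` falls through to `"other"`.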

Please follow up with any questions or disagreements.
