17. Correct version of some examples
Some of the examples in this manual are buggy or not very robust, for
demonstration purposes. Improved versions of these composite macros are
presented here.
17.1 Solution for exch
The exch macro (see section Arguments to macros) as presented requires
clients to double quote their arguments. A nicer definition, which lets
clients follow the rule of thumb of one level of quoting per level of
parentheses, involves adding quotes in the definition of exch, as
follows:
| define(`exch', ``$2', `$1'')
⇒
define(exch(`expansion text', `macro'))
⇒
macro
⇒expansion text
|
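To see why the extra quoting matters, the following sketch contrasts
the two styles when the macro name being defined already has a
definition. The names old_exch and new_exch are hypothetical, and
old_exch assumes the earlier definition had the unquoted form
`$2, $1'.

```m4
define(`old_exch', `$2, $1')dnl
define(`new_exch', ``$2', `$1'')dnl
define(`name', `other')dnl
define(old_exch(`value one', `name'))dnl
define(new_exch(`value two', `name'))dnl
defn(`other') / defn(`name')
```

With single-quoted arguments, old_exch lets name expand to other
before define sees it, so the first call defines other instead of
name. The last line should therefore print `value one / value two'.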
17.2 Solution for forloop
The forloop macro (see section Iteration by counting) as presented
earlier can go into an infinite loop if given an iterator that is not
parsed as a macro name. It does not do any sanity checking on its
numeric bounds, and only permits decimal numbers for bounds. Here is an
improved version, shipped as ‘m4-1.4.11/examples/forloop2.m4’; this
version also optimizes based on the fact that the starting bound does
not need to be passed to the helper _forloop.
| $ m4 -I examples
undivert(`forloop2.m4')dnl
⇒divert(`-1')
⇒# forloop(var, from, to, stmt) - improved version:
⇒# works even if VAR is not a strict macro name
⇒# performs sanity check that FROM is larger than TO
⇒# allows complex numerical expressions in TO and FROM
⇒define(`forloop', `ifelse(eval(`($3) >= ($2)'), `1',
⇒ `pushdef(`$1', eval(`$2'))_$0(`$1',
⇒ eval(`$3'), `$4')popdef(`$1')')')
⇒define(`_forloop',
⇒ `$3`'ifelse(indir(`$1'), `$2', `',
⇒ `define(`$1', incr(indir(`$1')))$0($@)')')
⇒divert`'dnl
include(`forloop2.m4')
⇒
forloop(`i', `2', `1', `no iteration occurs')
⇒
forloop(`', `1', `2', ` odd iterator name')
⇒ odd iterator name odd iterator name
forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
⇒ 0xa 0xb 0xc
forloop(`i', `a', `b', `non-numeric bounds')
error-->m4:stdin:6: bad expression in eval (bad input): (b) >= (a)
⇒
|
One other change to notice is that the improved version used ‘_$0’
rather than ‘_forloop’ to invoke the helper routine. In general,
this is a good practice to follow, because then the set of macros can be
uniformly transformed. The following example shows a transformation
that doubles the current quoting and appends a suffix ‘2’ to each
transformed macro. If forloop refers to the literal ‘_forloop’, then
forloop2 invokes _forloop instead of the intended _forloop2, and the
mixing of quoting paradigms leads to an infinite recursion loop in this
example.
| $ m4 -d -L 9 -I examples
define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
⇒
define(`double', `define(`$1'`2',
arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
⇒
double(`forloop')double(`_forloop')defn(`forloop2')
⇒ifelse(eval(``($3) >= ($2)''), ``1'',
⇒ ``pushdef(``$1'', eval(``$2''))_$0(``$1'',
⇒ eval(``$3''), ``$4'')popdef(``$1'')'')
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
⇒
changequote(`[', `]')changequote([``], [''])
⇒
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
⇒
changequote`'include(`forloop.m4')
⇒
double(`forloop')double(`_forloop')defn(`forloop2')
⇒pushdef(``$1'', ``$2'')_forloop($@)popdef(``$1'')
forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
⇒
changequote(`[', `]')changequote([``], [''])
⇒
forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
error-->m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
|
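The benefit of ‘_$0’ can also be seen in a much smaller setting. In
this sketch, all four macro names are hypothetical; the body of greet
is copied to hail with defn, and each copy still finds its own helper
because the helper name is derived from ‘$0’.

```m4
define(`greet', `_$0(`$1')')dnl
define(`_greet', `hello, $1')dnl
define(`hail', defn(`greet'))dnl
define(`_hail', `salutations, $1')dnl
greet(`world') / hail(`world')
```

The last line should print `hello, world / salutations, world'. Had
greet named _greet literally, the copy hail would have called _greet
as well.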
Of course, it is possible to make even more improvements, such as
adding an optional step argument, or allowing iteration through
descending sequences. GNU Autoconf provides some of these
additional bells and whistles in its m4_for macro.
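As a rough illustration of the step idea, here is a minimal sketch of
an ascending loop with a positive step. The macro forloop_step and its
helper are hypothetical names, not part of the shipped examples, and
the argument order (var, from, to, step, stmt) is an assumption made
for this sketch. It does not handle descending sequences.

```m4
divert(`-1')
# forloop_step(var, from, to, step, stmt) - hypothetical variant
# expands stmt while var <= to, adding a positive step each time
define(`forloop_step', `ifelse(eval(`($4) > 0 && ($3) >= ($2)'), `1',
  `pushdef(`$1', eval(`$2'))_$0(`$1',
    eval(`$3'), eval(`$4'), `$5')popdef(`$1')')')
# helper arguments: var, upper bound, step, stmt
define(`_forloop_step',
  `$4`'ifelse(eval(indir(`$1') + $3 > $2), `1', `',
    `define(`$1', eval(indir(`$1') + $3))$0($@)')')
divert`'dnl
forloop_step(`i', `1', `10', `3', ` i')
```

The final line should print ` 1 4 7 10'. Requiring the step to be
positive in the outer sanity check is what keeps the helper from
recursing forever.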
17.3 Solution for foreach
The foreach and foreachq macros (see section Iteration by list
contents) as presented earlier each have flaws. First, we will examine
and fix the quadratic behavior of foreachq:
| $ m4 -I examples
include(`foreachq.m4')
⇒
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error-->m4trace: -3- shift(`1', `2', `3', `4')
error-->m4trace: -2- shift(`1', `2', `3', `4')
⇒2
error-->m4trace: -4- shift(`1', `2', `3', `4')
error-->m4trace: -3- shift(`2', `3', `4')
error-->m4trace: -3- shift(`1', `2', `3', `4')
error-->m4trace: -2- shift(`2', `3', `4')
⇒3
error-->m4trace: -5- shift(`1', `2', `3', `4')
error-->m4trace: -4- shift(`2', `3', `4')
error-->m4trace: -3- shift(`3', `4')
error-->m4trace: -4- shift(`1', `2', `3', `4')
error-->m4trace: -3- shift(`2', `3', `4')
error-->m4trace: -2- shift(`3', `4')
⇒4
error-->m4trace: -6- shift(`1', `2', `3', `4')
error-->m4trace: -5- shift(`2', `3', `4')
error-->m4trace: -4- shift(`3', `4')
error-->m4trace: -3- shift(`4')
|
Each successive iteration was adding more quoted shift invocations, and
the entire list contents were passing through every iteration. In
general, when recursing, it is a good idea to make the recursion use
fewer arguments, rather than adding additional quoted uses of shift.
By doing so, m4 uses less memory, invokes fewer macros, is less likely
to run into machine limits, and most importantly, performs faster. The
fixed version of foreachq can be found in
‘m4-1.4.11/examples/foreachq2.m4’:
| $ m4 -I examples
include(`foreachq2.m4')
⇒
undivert(`foreachq2.m4')dnl
⇒include(`quote.m4')dnl
⇒divert(`-1')
⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt)
⇒# quoted list, improved version
⇒define(`foreachq', `pushdef(`$1')_$0($@)popdef(`$1')')
⇒define(`_arg1q', ``$1'')
⇒define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')')
⇒define(`_foreachq', `ifelse(`$2', `', `',
⇒ `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error-->m4trace: -3- shift(`1', `2', `3', `4')
⇒2
error-->m4trace: -3- shift(`2', `3', `4')
⇒3
error-->m4trace: -3- shift(`3', `4')
⇒4
|
Note that the fixed version calls unquoted helper macros in _foreachq
to trim elements immediately; those helper macros in turn must
re-supply the layer of quotes lost in the macro invocation. Contrast
the use of _arg1q, which quotes the first list element, with _arg1 of
the earlier implementation that returned the first list element
directly. Additionally, by calling the helper macro immediately, the
‘defn(`iterator')’ no longer contains unexpanded macros.
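The difference between the helpers is easiest to see when they are
called directly. The sketch below repeats the helper definitions so
that it runs on its own; dquote is defined inline on the assumption
that it matches the version in ‘quote.m4’, and the macro a exists only
to make premature expansion visible.

```m4
define(`dquote', ``$@'')dnl
define(`_arg1', `$1')dnl
define(`_arg1q', ``$1'')dnl
define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')')dnl
define(`a', `EXPANDED')dnl
_arg1(`a', `b', `c') / _arg1q(`a', `b', `c') / _rest(`a', `b', `c')
```

This should print `EXPANDED / a / `b',`c''. The unquoted result of
_arg1 is rescanned and expanded, while _arg1q and _rest each hand back
their result with one layer of quotes restored.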
The astute m4 programmer might notice that the solution above still uses
more memory, and thus more time, than strictly necessary. Note that
‘$2’, which contains an arbitrarily long quoted list, is expanded
and rescanned three times per iteration of _foreachq. Furthermore,
every iteration of the algorithm effectively unboxes then reboxes the
list, which costs a couple of macro invocations. It is possible to
rewrite the algorithm for a bit more speed by swapping the order of the
arguments to _foreachq in order to operate on an unboxed list in the
first place, and by using the fixed-length ‘$#’ instead of an arbitrary
length list as the key to end recursion. This alternative approach is
available as ‘m4-1.4.11/examples/foreachq3.m4’:
| $ m4 -I examples
include(`foreachq3.m4')
⇒
undivert(`foreachq3.m4')dnl
⇒divert(`-1')
⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt)
⇒# quoted list, alternate improved version
⇒define(`foreachq',
⇒`pushdef(`$1')_$0(`$1', `$3'ifelse(`$2', `', `',
⇒ `, $2'))popdef(`$1')')
⇒define(`_foreachq', `ifelse(`$#', `2', `',
⇒ `define(`$1', `$3')$2`'$0(`$1', `$2'ifelse(`$#', `3', `',
⇒ `, shift(shift(shift($@)))'))')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error-->m4trace: -4- shift(`x', `x
error-->', `1', `2', `3', `4')
error-->m4trace: -3- shift(`x
error-->', `1', `2', `3', `4')
error-->m4trace: -2- shift(`1', `2', `3', `4')
⇒2
error-->m4trace: -4- shift(`x', `x
error-->', `2', `3', `4')
error-->m4trace: -3- shift(`x
error-->', `2', `3', `4')
error-->m4trace: -2- shift(`2', `3', `4')
⇒3
error-->m4trace: -4- shift(`x', `x
error-->', `3', `4')
error-->m4trace: -3- shift(`x
error-->', `3', `4')
error-->m4trace: -2- shift(`3', `4')
⇒4
|
For yet another approach, the improved version of foreach, available in
‘m4-1.4.11/examples/foreach2.m4’, simply overquotes the arguments to
_foreach to begin with, using dquote_elt. Then _foreach can just use
_arg1 to remove the extra layer of quoting that was added up front:
| $ m4 -I examples
include(`foreach2.m4')
⇒
undivert(`foreach2.m4')dnl
⇒include(`quote.m4')dnl
⇒divert(`-1')
⇒# foreach(x, (item_1, item_2, ..., item_n), stmt)
⇒# parenthesized list, improved version
⇒define(`foreach', `pushdef(`$1')_$0(`$1',
⇒ (dquote(dquote_elt$2)), `$3')popdef(`$1')')
⇒define(`_arg1', `$1')
⇒define(`_foreach', `ifelse(`$2', `(`')', `',
⇒ `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreach(`x', `(`1', `2', `3', `4')', `x
')dnl
error-->m4trace: -4- shift(`1', `2', `3', `4')
error-->m4trace: -4- shift(`2', `3', `4')
error-->m4trace: -4- shift(`3', `4')
⇒1
error-->m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
⇒2
error-->m4trace: -3- shift(``2'', ``3'', ``4'')
⇒3
error-->m4trace: -3- shift(``3'', ``4'')
⇒4
error-->m4trace: -3- shift(``4'')
|
In summary, recursion over list elements is trickier than it appeared at
first glance, but provides a powerful idiom within m4 processing.
As a final demonstration, both list styles are now able to handle
several scenarios that would wreak havoc on one or both of the original
implementations. This points out one other difference between the
list styles. foreach evaluates unquoted list elements only once, in
preparation for calling _foreach; the same holds for foreachq as
provided by ‘foreachq3.m4’. But foreachq, as provided by
‘foreachq2.m4’, evaluates unquoted list elements twice while visiting
the first list element, once in _arg1q and once in _rest. When
deciding which list style to use, one must take into account whether
repeating the side effects of unquoted list elements will have any
detrimental effects.
| $ m4 -I examples
include(`foreach2.m4')
⇒
include(`foreachq2.m4')
⇒
dnl 0-element list:
foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
⇒ /
dnl 1-element list of empty element
foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
⇒<> / <>
dnl 2-element list of empty elements
foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
⇒<><> / <><>
dnl 1-element list of a comma
foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
⇒<,> / <,>
dnl 2-element list of unbalanced parentheses
foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
⇒<(><)> / <(><)>
define(`ab', `oops')dnl using defn(`iterator')
foreach(`x', `(`a', `b')', `defn(`x')') /dnl
foreachq(`x', ``a', `b'', `defn(`x')')
⇒ab / ab
define(`active', `ACT, IVE')
⇒
traceon(`active')
⇒
dnl list of unquoted macros; expansion occurs before recursion
foreach(`x', `(active, active)', `<x>
')dnl
error-->m4trace: -4- active -> `ACT, IVE'
error-->m4trace: -4- active -> `ACT, IVE'
⇒<ACT>
⇒<IVE>
⇒<ACT>
⇒<IVE>
foreachq(`x', `active, active', `<x>
')dnl
error-->m4trace: -3- active -> `ACT, IVE'
error-->m4trace: -3- active -> `ACT, IVE'
⇒<ACT>
error-->m4trace: -3- active -> `ACT, IVE'
error-->m4trace: -3- active -> `ACT, IVE'
⇒<IVE>
⇒<ACT>
⇒<IVE>
dnl list of quoted macros; expansion occurs during recursion
foreach(`x', `(`active', `active')', `<x>
')dnl
error-->m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
error-->m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
foreachq(`x', ``active', `active'', `<x>
')dnl
error-->m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
error-->m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
dnl list of double-quoted macro names; no expansion
foreach(`x', `(``active'', ``active'')', `<x>
')dnl
⇒<active>
⇒<active>
foreachq(`x', ```active'', ``active''', `<x>
')dnl
⇒<active>
⇒<active>
|
17.4 Solution for m4wrap
The replacement m4wrap versions presented above, designed to guarantee
FIFO or LIFO order regardless of the underlying M4 implementation, share
a bug when dealing with wrapped text that looks like parameter
expansion. Note how the invocation of m4wrapn interprets these
parameters, while using the builtin preserves them for their intended
use.
| $ m4 -I examples
include(`wraplifo.m4')
⇒
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
')
⇒
builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
')
⇒
^D
⇒bar:-a-a,b-2-
⇒m4wrap0:---0-
|
Additionally, the computation of _m4wrap_level and creation of multiple
m4wrapn placeholders in the original examples is more expensive in time
and memory than strictly necessary. Notice how the improved version
grabs the wrapped text via defn to avoid parameter expansion, then
undefines _m4wrap_text, before stripping a level of quotes with _arg1 to
expand the text. That way, each level of wrapping reuses the single
placeholder, which starts each nesting level in an undefined state.
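The effect of fetching text with defn rather than by invoking the
macro can be shown in isolation. In this small sketch, saved is a
hypothetical placeholder macro holding text that looks like parameter
references.

```m4
define(`saved', `total: $1 of $#')dnl
saved(`x')
defn(`saved')
```

The invocation substitutes the parameters and should print
`total: x of 1', whereas defn returns the stored text untouched and
should print `total: $1 of $#'.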
Finally, it is worth emulating the GNU M4 extension of saving all
arguments to m4wrap, separated by a space, rather than saving just the
first argument. This is done with the joinall macro documented
previously (see section Recursion in m4). The improved LIFO example is
shipped as ‘m4-1.4.11/examples/wraplifo2.m4’, and can easily be
converted to a FIFO solution by swapping the adjacent invocations of
joinall and defn.
| $ m4 -I examples
include(`wraplifo2.m4')
⇒
undivert(`wraplifo2.m4')dnl
⇒dnl Redefine m4wrap to have LIFO semantics, improved example.
⇒include(`join.m4')dnl
⇒define(`_m4wrap', defn(`m4wrap'))dnl
⇒define(`_arg1', `$1')dnl
⇒define(`m4wrap',
⇒`ifdef(`_$0_text',
⇒ `define(`_$0_text', joinall(` ', $@)defn(`_$0_text'))',
⇒ `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
⇒define(`_$0_text', joinall(` ', $@))')')dnl
m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
')
⇒
m4wrap(`lifo text
m4wrap(`nested', `', `$@
')')
⇒
^D
⇒lifo text
⇒foo:-a-a,b-2-
⇒nested  $@
|
17.5 Solution for cleardivert
The cleardivert macro (see section Discarding diverted text) cannot, as
it stands, be called without arguments to clear all pending diversions.
That is because using undivert with an empty string for an argument is
different from using it with no arguments at all. Compare the earlier
definition with one that takes the number of arguments into account:
| define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@)divert(_n)popdef(`_n')')
⇒
divert(`1')one
divert
⇒
cleardivert
⇒
undivert
⇒one
⇒
define(`cleardivert',
`pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
`undivert`'', `undivert($@)')divert(_num)popdef(`_num')')
⇒
divert(`2')two
divert
⇒
cleardivert
⇒
undivert
⇒
|
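The underlying distinction between no arguments and one empty argument
can be observed directly with ‘$#’. In the sketch below, nargs and
pass are hypothetical helper names; pass forwards ‘$@’ inside
parentheses, just as the first definition of cleardivert does.

```m4
define(`nargs', `$#')dnl
nargs / nargs() / nargs(`') / nargs(`', `')
define(`pass', `nargs($@)')dnl
pass / pass(`x')
```

The first output line should be `0 / 1 / 1 / 2' and the second
`1 / 1': once an empty ‘$@’ is placed between parentheses, the inner
macro receives a single empty argument, which is why the improved
cleardivert tests ‘$#’ before choosing how to call undivert.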
17.6 Solution for capitalize
The capitalize macro (see section Substituting text by regular
expression) as presented earlier does not allow clients to follow the
quoting rule of thumb. Consider the three macros active, Active, and
ACTIVE, and the difference between calling capitalize with the
expansion of a macro, expanding the result of a case change, and
changing the case of a double-quoted string:
| $ m4 -I examples
include(`capitalize.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
upcase(active)
⇒ACT1,IVE
upcase(`active')
⇒ACT3, IVE
upcase(``active'')
⇒ACTIVE
downcase(ACTIVE)
⇒act3,ive
downcase(`ACTIVE')
⇒act1, ive
downcase(``ACTIVE'')
⇒active
capitalize(active)
⇒Act1
capitalize(`active')
⇒Active
capitalize(``active'')
⇒_capitalize(`active')
define(`A', `OOPS')
⇒
capitalize(active)
⇒OOPSct1
capitalize(`active')
⇒OOPSctive
|
First, when capitalize is called with more than one argument, it
throws away the later arguments, whereas upcase and downcase use ‘$*’
to collect them all. The fix is simple: use ‘$*’ consistently.
Next, with single-quoting, capitalize outputs a single character, a set
of quotes, then the rest of the characters, making it impossible to
invoke Active after the fact, and allowing the alternate macro A to
interfere. Here, the solution is to use additional quoting in the
helper macros, then pass the final over-quoted output string through
_arg1 to remove the extra quoting and finally invoke the concatenated
portions as a single string.
Finally, when passed a double-quoted string, the nested macro
_capitalize is never invoked because it ended up nested inside quotes.
This one is the toughest to fix. In short, we have no idea how many
levels of quotes are in effect on the substring being altered by
patsubst. If the replacement string cannot be expressed entirely in
terms of literal text and backslash substitutions, then we need a
mechanism to guarantee that the helper macros are invoked outside of
quotes. In other words, this sounds like a job for changequote (see
section Changing the quote characters). By changing the active quoting
characters, we can guarantee that replacement text injected by patsubst
always occurs in the middle of a string that has exactly one level of
over-quoting using alternate quotes; so the replacement text closes the
quoted string, invokes the helper macros, then reopens the quoted
string. In turn, that means the replacement text has unbalanced quotes,
necessitating another round of changequote.
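Before reading the full solution, it may help to see how the same text
is treated under each quoting scheme. This is only a small sketch of
changequote itself, using the same alternate quote strings; the macro
name is a hypothetical stand-in for any defined macro.

```m4
define(`name', `EXPANDED')dnl
changequote(`<<[', `]>>')dnl
<<[name]>> / `name' / name
changequote`'dnl
`name' / name
```

While the alternate quotes are active, the first output line should be
`name / `EXPANDED' / EXPANDED', since the original quote characters
have become ordinary text and no longer protect the macro name. After
the default quotes are restored, the last line should print
`name / EXPANDED'.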
In the fixed version below (also shipped as
‘m4-1.4.11/examples/capitalize2.m4’), capitalize uses the alternate
quotes of ‘<<[’ and ‘]>>’ (the longer strings are chosen so as to be
less likely to appear in the text being converted). The helpers _to_alt
and _from_alt merely reduce the number of characters required to
perform a changequote, since the definition changes twice. The
outermost pair means that patsubst and _capitalize_alt are invoked with
alternate quoting; the innermost pair is used so that the third argument
to patsubst can contain an unbalanced ‘]>>’/‘<<[’ pair. Note that
upcase and downcase must be redefined as _upcase_alt and _downcase_alt,
since they contain nested quotes but are invoked with the alternate
quoting scheme in effect.
| $ m4 -I examples
include(`capitalize2.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
define(`A', `OOPS')dnl
capitalize(active)
⇒Act1,Ive
capitalize(`active')
⇒Act2, Ive
capitalize(``active'')
⇒Active
capitalize(```actIVE''')
⇒`Active'
undivert(`capitalize2.m4')dnl
⇒divert(`-1')
⇒# upcase(text)
⇒# downcase(text)
⇒# capitalize(text)
⇒# change case of text, improved version
⇒define(`upcase', `translit(`$*', `a-z', `A-Z')')
⇒define(`downcase', `translit(`$*', `A-Z', `a-z')')
⇒define(`_arg1', `$1')
⇒define(`_to_alt', `changequote(`<<[', `]>>')')
⇒define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
⇒define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
⇒define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
⇒define(`_capitalize_alt',
⇒ `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
⇒ <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
⇒define(`capitalize',
⇒ `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
⇒ _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
⇒divert`'dnl
|
17.7 Solution for fatal_error
The fatal_error macro (see section Exiting from m4) is not robust to
versions of GNU M4 earlier than 1.4.8, where invoking __file__ (see
section Printing current location) inside m4wrap would result in an
empty string, and __line__ resulted in ‘0’ even though all files start
at line 1. Furthermore, versions earlier than 1.4.6 did not support the
__program__ macro. If you want fatal_error to work across the entire
1.4.x release series, a better implementation would be:
| define(`fatal_error',
`errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
`:ifelse(__line__, `0', `',
`__file__:__line__:')` fatal error: $*
')m4exit(`1')')
⇒
m4wrap(`divnum(`demo of internal message')
fatal_error(`inside wrapped text')')
⇒
^D
error-->m4:stdin:6: Warning: excess arguments to builtin `divnum' ignored
⇒0
error-->m4:stdin:6: fatal error: inside wrapped text
|
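The ifdef fallback used for __program__ can be tried out on its own.
In this sketch, prog_label, my_program, and missing_macro are
hypothetical names chosen for illustration; the double-quoted ‘m4’
mirrors the quoting in the definition above, so that the fallback is
rescanned as a quoted string.

```m4
define(`prog_label', `ifdef(`$1', `indir(`$1')', ``m4'')')dnl
define(`my_program', `custom-m4')dnl
prog_label(`my_program') / prog_label(`missing_macro')
```

This should print `custom-m4 / m4': the defined macro supplies its own
text, and the undefined one falls back to the literal string.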
This document was generated by root on April, 3 2008 using texi2html 1.78.