ABNFGEN(1) ABNFGEN(1)
NAME
abnfgen - ABNF-based test case generator
SYNOPSIS
abnfgen [ -7hvclux ] [ -d output-directory ]
[ -y depth ] [ -n ncases ]
[ -p filename-pattern ] [ -w prefix ] [ -r seed ]
[ -s start-symbol ] [ -t tentative-file ]
[ files... ]
DESCRIPTION
ABNF ("Augmented Backus-Naur Form") grammars are frequently
used in RFCs to describe protocols or presentation formats
for Internet Standards. The abnfgen program produces text
composed according to the rules of an ABNF grammar, given
the grammar. Such text can then be used to test a parser
that claims to implement the grammar.
OPTIONS
-h Print a brief usage message.
-v Verbose mode. Print a trace of the expanded gram-
mar rules to standard error.
-w Write seed. Begin each generated file with prefix,
followed by the seed as a decimal number and a new-
line.
-c Try for complete coverage. Rather than randomly
picking productions, try to cover each leaf and
branch of the grammar.
For repetitions with * where maximum and minimum
are less than 100 iterations apart, the repetition
counts as fully covered when both maximum and mini-
mum have been produced. (The stages in between are
only tried once the extremes have been covered.)
For repetitions with * where maximum and minimum
are more than 100 apart, only the minimum and some
small number of repetitions are tried; doubling the
-c removes that limit and causes the extremes to be
tried even if the maximum is very large.
For character ranges with up to 256 elements, each
element must be produced for full coverage. For
(Unicode) character ranges with more than 256 elements,
every possible value modulo 256 must be produced
at least once for complete coverage.
A case-insensitive string counts as fully covered
once both an all-uppercase and an all-lowercase
version have been produced.
-y Descend freely through at most depth nonterminal
expansions or repetitions. After recursing that
deeply, always pick the branch of the grammar that
terminates most quickly. This can be used to limit
expansion in grammars likely to recurse infinitely,
such as a = a a a a / "x".
The default depth is 100.
-n Rather than generating output on standard output
(the default), generate ncases test files named
####.tst, where #### is replaced with the test's
running number from 1 to ncases.
-p Name test cases using filename-pattern. A sequence
of # characters in the pattern is replaced with the
running number of the test, padded to the specified
size.
-r Initialize the random generator using seed. For
the same grammar and version of the software, the
same seed always creates the same subtree.
-s Start production with start-symbol. (Default: the
first nonterminal defined in the grammar file.)
-t Read the nonterminals defined in tentative-file and
use them if they don't get defined in the regular
input files.
-u Reject any grammar that contains <>-enclosed prose.
-x Exclude the core set of definitions. The RFC that
defines ABNF also defines a core set of
symbols such as CRLF, DIGIT, and SP. Since
version 0.9, abnfgen predefines these symbols as if
their definitions had been passed in with -t.
Specify -x to suppress those predefinitions.
-7 Disable RFC 7405: do not interpret '%i"foo"' and
'%s"foo"' as case-insensitive or case-sensitive lit-
erals, respectively.
-l ("legal") Disable extensions to RFC 4234: do not
allow case-sensitive literals in single quotes, do
not allow branch tagging with {}, and do not convert
character constants from Unicode to UTF-8.
-_ Allow "_" in identifier names.
ABNF doesn't (it allows "-" instead), but many other
grammar systems do.
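
The coverage rules listed under -c for character ranges and
case-insensitive strings can be sketched in Python as follows. This
is only an illustration of the rules as worded above, not abnfgen's
actual code; the function names range_is_covered and
string_is_covered are hypothetical.

```python
def range_is_covered(lo, hi, produced):
    """Check the -c rule for a character range lo..hi (inclusive).

    produced is an iterable of the character values generated so far.
    Hypothetical helper; not part of abnfgen.
    """
    seen = {v for v in produced if lo <= v <= hi}
    size = hi - lo + 1
    if size <= 256:
        # Up to 256 elements: every element must have been produced.
        return len(seen) == size
    # More than 256 elements: every value modulo 256 must have appeared.
    return {v % 256 for v in seen} == set(range(256))


def string_is_covered(literal, produced):
    """Check the -c rule for a case-insensitive string literal.

    Covered once both the all-uppercase and the all-lowercase
    version are among the produced strings.
    """
    produced = set(produced)
    return literal.upper() in produced and literal.lower() in produced
```

Under this reading, a range such as %x100-2FF (512 elements) is
covered after any 256 values with distinct residues modulo 256,
whereas %x30-39 needs all ten digits.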
GRAMMAR
The input grammar is a slight extension of ABNF with pro-
visions for literal strings and control of the chance of
descent into specific branches of the grammar.
nonterminal = expression
The nonterminal expands to expression.
nonterminal =/ expression
Alternative to whatever else has been defined, the
nonterminal can also expand to expression.
x / y Either x or y.
x y x followed by y.
"abc" The case-insensitive string abc. That is, one of
ABC, ABc, AbC, Abc, aBC, aBc, abC, or abc. There
is no way of quoting " in a string; in a pinch, use
<"> or %x22.
'abc' The case-sensitive string abc. This is an extension
to ABNF and can be disabled using the -l
option. In standard ABNF, a case-sensitive abc must be
specified using the ASCII values of the characters, e.g. as
%x61.62.63. There is no way of quoting ' in a
single-quoted string; use '"'"' instead. (That is, leave the
single-quoted string, enter a double-quoted string, write the
single quote, leave the double-quoted string, and reenter
the single-quoted string.)
%zN The character with base z value N. Bases are: x
(16), d (10), and b (2).
%zM-N Any character with a value between M and N inclusive,
in base z.
M*N expression
Between (inclusive) M and N repetitions of expres-
sion. The default for M is 0, for N infinity.
[expression]
Same as 0*1(expression), that is, an optional
expression.
(expression)
Expands to expression. (Use parentheses for group-
ing.)
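
As an illustration of the two kinds of terminals described above, the
following Python sketch enumerates the spellings of a
case-insensitive string and decodes a numeric terminal. The helper
names case_variants and decode_numeric are hypothetical and are not
part of abnfgen; the dotted form is the one used in the %x61.62.63
example.

```python
from itertools import product

# Base letters as listed above: x (16), d (10), b (2).
BASES = {"x": 16, "d": 10, "b": 2}


def case_variants(text):
    """Return every spelling a case-insensitive "text" may expand to."""
    choices = [sorted({ch.upper(), ch.lower()}) for ch in text]
    return ["".join(combo) for combo in product(*choices)]


def decode_numeric(term):
    """Decode a %zN terminal, or a dotted %zN.N.N sequence, to a string."""
    if not term.startswith("%"):
        raise ValueError("numeric terminals start with %")
    base = BASES[term[1].lower()]
    return "".join(chr(int(part, base)) for part in term[2:].split("."))
```

For example, case_variants("abc") yields the eight spellings listed
for "abc", and decode_numeric("%x22") yields the double quote that
cannot be written inside a double-quoted string.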
BRANCH CONTROL
Each alternative in the grammar has a weight assigned to
it. Normally, these weights are all 1. The higher the
weight, the more likely it is that an alternative is used
when expanding the nonterminal it occurs in. Assign a
specific weight to an alternative by prefixing it with {N},
where N is an integer. For example,
nt = {30} nt x / x
x = {2} 'a' / {3} 'b'
should tend to produce output that is about 2/5 'a' and
about 3/5 'b', and that is fairly lengthy, because the recursive
alternative of nt is 30 times as likely as the terminating one.
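
The weighting in this example can be modelled with a short Python
sketch. It assumes that an alternative is chosen with probability
equal to its weight divided by the sum of the weights, which matches
the 2/5 and 3/5 figures but is not confirmed as abnfgen's exact
method; the names branch_probabilities and pick are hypothetical.

```python
import random
from fractions import Fraction


def branch_probabilities(weights):
    """Map each alternative to weight / total (assumed model)."""
    total = sum(weights.values())
    return {alt: Fraction(w, total) for alt, w in weights.items()}


def pick(weights, rng):
    """Draw one alternative at random, proportionally to its weight."""
    alts = list(weights)
    return rng.choices(alts, weights=[weights[a] for a in alts], k=1)[0]


# The two rules from the example: x = {2} 'a' / {3} 'b'
# and nt = {30} nt x / x.
x_weights = {"a": 2, "b": 3}
nt_weights = {"nt x": 30, "x": 1}
```

Under this model nt recurses with probability 30/31, so the output
would average about 31 characters if the -y depth limit is ignored.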
Repetition can be branch controlled with two chance param-
eters that govern whether to stop or continue on each step
of the repetition. They're placed before the two counts
around the *:
nt = {1}1*{3}10 'X'
will tend to produce somewhere around 3 Xs on average.
This is an extension to RFC 4234 and can be disabled by
specifying -l on the command line.
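
The effect of the two chance parameters can be estimated with a
simple model. The text above does not say which parameter governs
stopping and which governs continuing; the Python sketch below
assumes that the first is the weight for stopping and the second the
weight for continuing, and that the choice is made afresh after each
repetition between the minimum and the maximum. The function
expected_repetitions is hypothetical and is not abnfgen's code.

```python
from fractions import Fraction


def expected_repetitions(stop_weight, minimum, continue_weight, maximum):
    """Expected repetition count for {stop}min*{cont}max (assumed model)."""
    p_continue = Fraction(continue_weight, stop_weight + continue_weight)
    expected = Fraction(minimum)   # the minimum is always produced
    p_reach = Fraction(1)
    for _ in range(minimum, maximum):
        p_reach *= p_continue      # chance of adding one more repetition
        expected += p_reach
    return expected


# nt = {1}1*{3}10 'X' under the assumed reading of the parameters.
average = float(expected_repetitions(1, 1, 3, 10))
```

With these assumptions the average for {1}1*{3}10 comes out at about
3.8, the same order as the "around 3" quoted above; swapping the
roles of the two parameters would give about 1.3 instead.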
BUGS
Please send problems, bugs, questions, desirable enhance-
ments, etc. to:
jutta@pobox.com
4 March 2002 ABNFGEN(1)