Normative Addendum 1 embodies C's reaction to both the limitations
and promises of international character sets.  
Digraphs and the
<iso646.h> header were meant to improve the appearance of C
programs written in national variants of ISO 646 without, e.g., {
or } characters.  
On the other end of the spectrum, the facilities
connected to <wchar.h> and <wctype.h>
extend the old Standard's barely adequate basis into a complete and
consistent set of utilities for handling wide characters and multibyte strings.
This document summarizes Normative Addendum 1. It is intended to quickly inform readers who are already familiar with the Standard; it does not, and cannot, introduce the complex subject matter behind NA1, nor can it replace the original document as a reference manual. (Nevertheless, it tries to be as accurate as possible, and its author would like to hear about any errors or omissions.)
__STDC_VERSION__ shall expand to
199409L.  
(The Normative Addendum was formally registered with ISO in September 1994.)
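As a small illustration of the version macro, code can test for NA1 support at translation time. This is only a sketch; the function name has_na1 is invented for this example.

```c
/* Returns 1 if the implementation claims NA1 or a later revision of
   the Standard, and 0 otherwise. has_na1 is a hypothetical name. */
static int has_na1(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199409L
    return 1;
#else
    return 0;
#endif
}
```

A compiler that predates NA1 leaves the macro undefined, so the defined() test has to come first.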
    <:    :>    <%    %>    %:    %:%:
These tokens behave identically to the tokens and preprocessing tokens:
    [    ]    {    }    #    ##
	
	respectively (except that they are spelled differently,
	and so stringize differently).
 %:  and %:%:.
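The equivalence, and the one visible difference when stringizing, can be sketched as follows. The names THREE, STR, sum_three and spelled are made up for this illustration.

```c
#include <string.h>

%:define THREE 3             /* %: in place of # */
#define STR(x) #x            /* ordinary stringizing helper */

/* An array declared, initialized and indexed using only digraphs. */
static int sum_three(void)
<%
    int a<:THREE:> = <% 1, 2, 3 %>;
    return a<:0:> + a<:1:> + a<:2:>;
%>

/* Stringizing keeps the spelling, so this yields "<:" and not "[". */
static const char *spelled(void)
{
    return STR(<:);
}
```

Apart from the result of spelled(), the translation unit behaves exactly as if the conventional spellings had been used.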
    #define  and     &&
    #define  and_eq  &=
    #define  bitand  &
    #define  bitor   |
    #define  compl   ~
    #define  not     !
    #define  not_eq  !=
    #define  or      ||
    #define  or_eq   |=
    #define  xor     ^
    #define  xor_eq  ^=
	These macro names are reserved for all purposes in translation
	units that include the header,
	but are not reserved in those that do not (this is
	the same as for any other Standard macros).
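A short sketch of how these macros read in practice; the two helper functions are hypothetical and exist only to exercise a few of the names.

```c
#include <iso646.h>

/* True when x lies in [lo, hi] and differs from skip. */
static int in_range_except(int x, int lo, int hi, int skip)
{
    return x >= lo and x <= hi and x not_eq skip;
}

/* Toggles the bits of mask in flags, then clears the lowest bit. */
static unsigned toggle_clear_low(unsigned flags, unsigned mask)
{
    flags xor_eq mask;
    flags and_eq compl 1u;
    return flags;
}
```

Because these are ordinary macros, a translation unit that does not include the header remains free to use and, or, and the rest as identifiers.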
 wchar_t.  
	Not all code values have to represent a character;
	those that do not must not appear in wide
	strings that are converted to multibyte characters.  
	Code value 0 is reserved for the ``end of string''
	indicator.
char).
 	A character can have representations in more than
	one state, and can have more than one representation
	in any given state.  The representation
	in different states can differ.  
	Not all byte sequences are necessarily valid;
	an invalid sequence causes an encoding error
	when interpreted (normally shown by setting errno 
	to EILSEQ).
However, for encodings used by other library functions, there are further restrictions:
fwprintf);
	all these identifiers are declared by <wctype.h>
	or <wchar.h>.  
	These identifiers are reserved with external linkage in
	all the translation units of a program if and only if
	any translation unit includes either of those
	headers (thus changes in one translation unit may cause another
	translation unit to invoke undefined behavior).
EILSEQ is added to the list of error
	conditions (currently this list consists of EDOM
	and ERANGE).
typedef ... wint_t;
 WEOF (described
		below).  
	It can be the same type as wchar_t.
typedef ... wctrans_t;
typedef ... wctype_t;
wctype_t represents a
	classification of characters (like ``is lower case'' or
	``is accented''), while wctrans_t
	represents a character conversion (like ``change to upper case'' or
	``remove any accent'').
 wint_t.  
	It need not be negative nor equal to EOF,
	but it serves the same purpose:
	the value, which must not be a valid wide character, is used to
	represent an end of file or as an error indication.
LC_CTYPE category
	of the current locale.
int iswalnum (wint_t);
int iswalpha (wint_t);
int iswcntrl (wint_t);
int iswdigit (wint_t);
int iswgraph (wint_t);
int iswlower (wint_t);
int iswprint (wint_t);
int iswpunct (wint_t);
int iswspace (wint_t);
int iswupper (wint_t);
int iswxdigit (wint_t);
WEOF or representable as
	a wchar_t.  
	The function will
	return nonzero if and only if the argument is a wide character of the
	appropriate type.  
	The types are the same as for the <ctype.h>
	functions, except that iswprint and iswgraph
	are guaranteed to return false not only for
	space (as their char counterparts do),
	but for any character
	that iswspace() considers white space.  
	Thus isgraph('\t') may be true in some locales,
	but iswgraph(L'\t') is always false.  
	For the remaining nine functions the expression
 (!isXXXXX(wctob(wc)) || iswXXXXX(wc))
	is true for every wide
	 character.  
	 That is, for any wide character which has a corresponding
	 single-byte character (which is what
		 wctob returns),
	 if the latter has the given property, then so does the
	 former.  
	 Note that this is not a symmetric relationship.
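The one-way implication can be checked mechanically for a given property. The sketch below does so for the "alpha" property in whatever locale is current; count_alpha_violations is a name invented here.

```c
#include <ctype.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Counts the wide characters in [0, limit) whose single-byte form
   satisfies isalpha while the wide character fails iswalpha.
   The rule described above requires the count to be zero. */
static int count_alpha_violations(wint_t limit)
{
    int violations = 0;
    wint_t wc;

    for (wc = 0; wc < limit; wc++) {
        int c = wctob(wc);          /* EOF if there is no single-byte form */
        if (!(!isalpha(c) || iswalpha(wc)))
            violations++;
    }
    return violations;
}
```

Passing EOF to isalpha is permitted and yields zero, so wide characters without a single-byte form satisfy the expression trivially.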
wctype_t wctype (const char *);
int iswctype (wint_t, wctype_t);
isXXXXX
	or iswXXXXX functions to
	test for other properties (e.g. ``is a katakana character''),
	it was felt that this cluttered the namespace (though the names are
	all reserved) without being flexible enough for
	future needs.  
	Instead, the committee introduced a mechanism that can be extended
	at run-time.  
	wctype()
	names a category to test for; wctype()
	returns a wctype_t magic cookie that can
	be handed to iswctype to test for the
	named category, or zero if it does not recognize the
	category.  
	The eleven builtin categories "alnum",
	"alpha", ... "xdigit"
	must be recognized by all
	implementations.  
	Thus, iswctype(ch, wctype("punct"))
	is the same as 
	iswpunct(ch).  
	The wctype_t value is only valid for the
	LC_CTYPE category used to create it.
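A minimal sketch of the extensible test, wrapped so that an unrecognized category name is reported separately from a negative answer. The wrapper in_category is hypothetical.

```c
#include <wctype.h>

/* Returns 1 or 0 for membership of wc in the named category, or -1 if
   the current LC_CTYPE locale does not recognize the name. */
static int in_category(wint_t wc, const char *name)
{
    wctype_t desc = wctype(name);

    if (desc == (wctype_t)0)
        return -1;                  /* unknown category */
    return iswctype(wc, desc) != 0;
}
```

The value returned by wctype() should be obtained again after any change to the LC_CTYPE category, which is why this sketch does not cache it.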
wint_t towlower (wint_t);
wint_t towupper (wint_t);
toupper and tolower.        toupper('é') == 'E',      towupper(L'é') == L'É'
wctrans_t wctrans (const char *);
wint_t towctrans (wint_t, wctrans_t);
wctype() and
	iswctype() provide extensible tests.
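The mapping functions can be used in the same style as the classification pair. The sketch below assumes the two mapping names "toupper" and "tolower", which every implementation of NA1 has to recognize; map_char is an invented helper.

```c
#include <wctype.h>

/* Applies the named mapping to wc. If the name is not recognized in the
   current LC_CTYPE locale, wc is returned unchanged. */
static wint_t map_char(wint_t wc, const char *name)
{
    wctrans_t desc = wctrans(name);

    if (desc == (wctrans_t)0)
        return wc;                  /* unknown mapping */
    return towctrans(wc, desc);
}
```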
struct tm;
typedef ... size_t;
typedef ... wchar_t;
typedef ... wint_t;
#define NULL ...
#define WEOF ...
struct tm ;
	it is still necessary to include <time.h>
	before defining a
	variable of this type.
typedef ... mbstate_t;
WCHAR_MAX and
	WCHAR_MIN
wchar_t can hold.  
	They are integral
	constant expressions of type wchar_t,
	but not necessarily valid
	as wide characters.  
	For example, if wchar_t is a typedef for
	unsigned short, then
	WCHAR_MIN will be zero
	and WCHAR_MAX will
	be the same as USHRT_MAX.
mbstate_t, and an orientation;
	it can be byte-oriented, wide-oriented,
	or unoriented.  
	When a stream is opened (including stdin etc.,
	and calls to freopen), it is
	unoriented.  
	The functions ungetc, fgetc,
	fputc, and those defined to work through them,
	change an unoriented stream to byte-oriented, and shall
	not be called on a wide-oriented stream.  
	The functions ungetwc, fgetwc,
	fputwc, and those defined to work through them,
	change an unoriented stream to wide-oriented,
	and shall not be called on a byte-oriented stream.
Wide binary streams shall obey the positioning restrictions of both text and binary streams. Positioning a wide-oriented stream in the middle of an existing character representation and then writing makes all following contents undefined.
	The mbstate_t object associated with a
	stream is saved by fgetpos and restored
	by fsetpos.  
	The object is initialized when the stream is opened as if it were
	an object declared with static lifetime (i.e. all
	zeroes and null pointers).
	The *scanf and *printf
	functions have the ability to handle strings of
	the opposite type to the majority (that is,
	wide strings in fprintf etc.
	and multibyte strings in fwprintf etc.).  
	These strings are converted to the majority form before
	(for *printf) or after (for *scanf)
	any other processing.  
	This conversion is done as if using calls to
	mbrtowc or
	wcrtomb,
	but with an mbstate_t
	object set to the initial state before each
	such conversion.
wint_t fgetwc (FILE *);
mbrtowc
	(using the stream's
	mbstate_t object) until a complete wide
	character has been read, or an error
	occurs.  
	The character or WEOF is returned; the latter can indicate
	end of file (the eof indicator is set), a read error (the error
	indicator is set), or a conversion error (errno is set to
	EILSEQ).  
	All other wide character
	input is done as if via fgetwc.
wint_t fputwc (wchar_t, FILE *);
wcrtomb
	(using the
	stream's mbstate_t object)
	and writes the resulting bytes to the stream.  
	The character or WEOF is
	returned; the latter can indicate a write error (the error
	indicator is set) or a conversion error
	(errno is set to EILSEQ).  
	All other wide character output is done as if via fputwc.
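The two primitives can be exercised together by writing a wide string to a temporary stream and reading it back. This is a sketch only; round_trip is an invented name, and tmpfile() is used purely for convenience.

```c
#include <stdio.h>
#include <wchar.h>

/* Writes ws to a temporary file with fputwc, reads it back with fgetwc,
   and returns the number of wide characters read, or -1 on failure. */
static int round_trip(const wchar_t *ws)
{
    FILE *f = tmpfile();
    int count = 0;
    wint_t wc;

    if (f == NULL)
        return -1;
    for (; *ws != L'\0'; ws++) {
        if (fputwc(*ws, f) == WEOF) {   /* write or conversion error */
            fclose(f);
            return -1;
        }
    }
    rewind(f);
    while ((wc = fgetwc(f)) != WEOF)
        count++;
    if (!feof(f))                        /* WEOF without end of file */
        count = -1;
    fclose(f);
    return count;
}
```

Since WEOF can mean end of file, a read error or a conversion error, the loop checks feof() afterwards instead of assuming a clean end.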
fprintf (and
	printf and sprintf):
%lc,
		which requires a wint_t argument,
	and %ls,
	which requires a wchar_t *
		argument.  
	%lc is equivalent to %ls called with
	a two element array (the argument in
	the first element, and zero in the second).  
	%ls converts the wide characters to bytes;
	the precision indicates the maximum number of bytes
	written (conversion will also stop on a zero wide character);
	a partial multibyte character will not be output,
	though complete trailing shift sequences might be.
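A brief sketch of the two conversions in sprintf, showing that the precision of %ls limits bytes, not wide characters. The helper names are invented, and the caller is assumed to supply a large enough buffer.

```c
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Formats at most max_bytes bytes of the multibyte form of ws into buf.
   buf must have room for max_bytes + 1 bytes. */
static int format_wide(char *buf, int max_bytes, const wchar_t *ws)
{
    return sprintf(buf, "%.*ls", max_bytes, ws);
}

/* Formats a single wide character with %lc. buf must have room for
   MB_CUR_MAX + 1 bytes. */
static int format_wide_char(char *buf, wint_t wc)
{
    return sprintf(buf, "%lc", wc);
}
```

With a multibyte encoding in effect, a precision that falls inside a character simply stops before that character, so fewer than max_bytes bytes may be produced.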
fscanf (and
		scanf and sscanf):
	%lc, %ls, and %l[;
	all take a pointer to wchar_t,
	and convert the input from multibyte representation to
	wide characters after matching.  
	(The qualified and unqualified conversions match the same input.)
int fwprintf (FILE *, const wchar_t *, ...);
int wprintf (const wchar_t *, ...);
int swprintf (wchar_t *, size_t, const wchar_t *, ...);
int vfwprintf (FILE *, const wchar_t *, va_list);
int vwprintf (const wchar_t *, va_list);
int vswprintf (wchar_t *, size_t, const wchar_t *, va_list);
fprintf,
	including the extensions
	above.  
	With %c, the character is converted
	using btowc;
	with %s, the string
	is converted to wide characters before output.  
	With all formats, width and precision are measured in wide
	characters.  
	The second argument of
	swprintf is the number of elements
	of the destination array
	(including the terminating zero which is always written).
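The size argument of swprintf, and the conversion of a multibyte %s argument, can be sketched like this. make_label is a hypothetical helper.

```c
#include <stddef.h>
#include <wchar.h>

/* Builds the wide string "<id>-<name>" in out, which has n elements
   including the one for the terminating zero. name is a multibyte
   string and is converted by %s. Returns the number of wide characters
   written, or a negative value if the result did not fit. */
static int make_label(wchar_t *out, size_t n, int id, const char *name)
{
    return swprintf(out, n, L"%d-%s", id, name);
}
```

Unlike sprintf, swprintf cannot overrun the destination, so a negative return is the only sign that the array was too small.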
int fwscanf (FILE *, const wchar_t *, ...);
int wscanf (const wchar_t *, ...);
int swscanf (const wchar_t *, const wchar_t *, ...);
fscanf,
	including the extensions above.  
	With %c, %s, and %[,
	the accepted input field will be converted
	to its multibyte equivalent after being matched.  
	With all formats, width and precision are
	measured in wide characters.
wchar_t *fgetws (wchar_t *, int, FILE *);
int fputws (const wchar_t *, FILE *);
wint_t getwc (FILE *);
wint_t getwchar (void);
wint_t putwc (wchar_t, FILE *);
wint_t putwchar (wchar_t);
wint_t ungetwc (wint_t, FILE *);
getwc
	and putwc's FILE * argument.)
int fwide (FILE *, int);
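Stream orientation can be observed with fwide. In the C Standard's description of this function, a mode of zero only queries the stream, and the result is positive for wide-oriented, negative for byte-oriented and zero for unoriented streams. The sketch below relies on that; demo_orientation is an invented name.

```c
#include <stdio.h>
#include <wchar.h>

/* Opens a temporary stream, queries its orientation, writes one wide
   character, and queries again. Returns 1 if the stream was unoriented
   before and wide-oriented after, 0 if not, and -1 on failure. */
static int demo_orientation(void)
{
    FILE *f = tmpfile();
    int before, after;

    if (f == NULL)
        return -1;
    before = fwide(f, 0);               /* query only */
    if (fputwc(L'x', f) == WEOF) {
        fclose(f);
        return -1;
    }
    after = fwide(f, 0);
    fclose(f);
    return before == 0 && after > 0;
}
```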
double wcstod (const wchar_t *, wchar_t **);
long int wcstol (const wchar_t *, wchar_t **, int);
unsigned long int wcstoul (const wchar_t *, wchar_t **, int);
wchar_t *wcscpy (wchar_t *, const wchar_t *);
wchar_t *wcsncpy (wchar_t *, const wchar_t *, size_t);
wchar_t *wcscat (wchar_t *, const wchar_t *);
wchar_t *wcsncat (wchar_t *, const wchar_t *, size_t);
int wcscmp (const wchar_t *, const wchar_t *);
int wcscoll (const wchar_t *, const wchar_t *);
int wcsncmp (const wchar_t *, const wchar_t *, size_t);
size_t wcsxfrm (wchar_t *, const wchar_t *, size_t);
wchar_t *wcschr (const wchar_t *, wchar_t);
size_t wcscspn (const wchar_t *, const wchar_t *);
wchar_t *wcspbrk (const wchar_t *, const wchar_t *);
wchar_t *wcsrchr (const wchar_t *, wchar_t);
size_t wcsspn (const wchar_t *, const wchar_t *);
wchar_t *wcsstr (const wchar_t *, const wchar_t *);
size_t wcslen (const wchar_t *);
wchar_t *wmemchr (const wchar_t *, wchar_t, size_t);
int wmemcmp (const wchar_t *, const wchar_t *, size_t);
wchar_t *wmemcpy (wchar_t *, const wchar_t *, size_t);
wchar_t *wmemmove (wchar_t *, const wchar_t *, size_t);
wchar_t *wmemset (wchar_t *, wchar_t, size_t);
size_t wcsftime (wchar_t *, size_t, const wchar_t *, const struct tm *);
wchar_t *wcstok (wchar_t *, const wchar_t *, wchar_t **);
strtok,
	but uses the object pointed to
	by the third argument to keep state, rather than keeping it
	internally as strtok does.  
	This change makes it possible to interleave
	calls to wcstok over different input strings.
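The interleaving that the third argument makes possible can be sketched with a nested tokenization, something strtok cannot do safely. count_cells is a hypothetical helper, and the input must be a modifiable array.

```c
#include <stddef.h>
#include <wchar.h>

/* Counts the fields in text of the form L"a,b;c,d,e": rows are split on
   ';' and fields on ','. The two tokenizations run interleaved, each
   with its own state pointer. text is modified in place. */
static int count_cells(wchar_t *text)
{
    wchar_t *row_state, *field_state;
    wchar_t *row, *field;
    int cells = 0;

    for (row = wcstok(text, L";", &row_state); row != NULL;
         row = wcstok(NULL, L";", &row_state)) {
        for (field = wcstok(row, L",", &field_state); field != NULL;
             field = wcstok(NULL, L",", &field_state))
            cells++;
    }
    return cells;
}
```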
mbstate_t
	object that they keep their conversion state in.  
	Such an object can be set to all zeroes (e.g. by
	assigning to it the value of an mbstate_t
	object with static lifetime which has not been explicitly
	initialized)
	and is then in its initial state.  
	When an object is in the initial state
	(no matter how this occurred),
	it is prepared for conversion in either direction
	(from multibyte to wide characters or vice versa)
	starting in the initial state.  
	Once an object has left its initial state
	(which happens whenever it is used with one
	of the following functions unless the description says otherwise),
	it shall only be used in the same
	LC_CTYPE category [*]
	and same direction as the previous call,
	and shall not be used after a conversion error.  
	If a null pointer is passed, each
	function uses its own internal object
	which is initialized to all zeroes at program startup.
mbstate_t object associated with a stream is bound
to an encoding by the first fgetwc or fputwc
call after the stream is opened, and can then be used with any locale.
wint_t btowc (int);
unsigned char)
	to the corresponding wide character, if any, or else returns
	WEOF.
int wctob (wint_t);
EOF.
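The pair of single-byte conversions can be checked against each other. A minimal sketch, with round_trips as an invented name:

```c
#include <stdio.h>
#include <wchar.h>

/* Returns 1 if the byte c (an unsigned char value, or EOF) has a wide
   character equivalent that converts back to the same byte, else 0. */
static int round_trips(int c)
{
    wint_t wc = btowc(c);

    if (wc == WEOF)
        return 0;                   /* no wide equivalent, or c was EOF */
    return wctob(wc) == c;
}
```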
int mbsinit (const mbstate_t *);
mbstate_t object is
	in the initial state (the object is unaffected).
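Putting an mbstate_t object into its initial state by zeroing it, and confirming that with mbsinit, might look like the following sketch. Both function names are invented here.

```c
#include <string.h>
#include <wchar.h>

/* Puts *ps into the initial conversion state by setting it to all
   zeroes, as described above. */
static void reset_state(mbstate_t *ps)
{
    memset(ps, 0, sizeof *ps);
}

/* Returns 1 if a freshly reset object is reported as being in the
   initial state, and 0 otherwise. */
static int reset_gives_initial(void)
{
    mbstate_t st;

    reset_state(&st);
    return mbsinit(&st) != 0;
}
```

Resetting an object this way is also the usual way to reuse it after a conversion error or for the opposite direction of conversion.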
size_t mbrlen (const char *s, size_t n, mbstate_t *pcs);
mbrtowc(NULL,
	s, n,
	pcs), except
	that it uses its own internal mbstate_t object,
	not that of mbrtowc, when given a null pointer.
size_t mbrtowc (wchar_t *ws, const char *s, size_t n, mbstate_t *pcs);
s (inspecting no
	more than n bytes) to a wide character.  
	If ws is not a null pointer, the wide character
	is stored in *ws.  
	If s is a null pointer, mbrtowc 
	ignores ws and n and acts as if the first
	three arguments are a null pointer, an empty string, and 1 respectively.
  (size_t)-2
           mbstate_t, but no
           complete wide character has been found.
  (size_t)-1
  0
           mbstate_t object has been restored to the initial state.
mbstate_t object has been updated.
mbstate_t object; the inspected
  bytes do not need to be
  passed to the function a second time.
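A typical loop over the return values of mbrtowc is sketched below. It counts the characters in a byte buffer and treats both special values as failure, since the buffer is assumed to hold complete characters; count_chars is a name invented for this example.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Counts the multibyte characters in the first len bytes of s, stopping
   early at a zero byte. Returns -1 on an encoding error or if the
   buffer ends in the middle of a character. */
static long count_chars(const char *s, size_t len)
{
    mbstate_t st;
    long count = 0;

    memset(&st, 0, sizeof st);          /* initial conversion state */
    while (len > 0) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s, len, &st);

        if (r == (size_t)-1 || r == (size_t)-2)
            return -1;                   /* invalid or incomplete input */
        if (r == 0)
            break;                       /* zero wide character: end */
        s += r;
        len -= r;
        count++;
    }
    return count;
}
```

A caller reading from a stream in chunks would instead keep the state object after a (size_t)-2 result and continue with the next chunk.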
size_t wcrtomb (char *, wchar_t, mbstate_t *);
MB_CUR_MAX bytes and
	places them in the array pointed to by the
	first argument; if the wide character is zero,
	the resulting sequence will end in the initial 
	state,
	followed by a zero byte, and the mbstate_t
	object will be in the initial state.
  wcrtomb returns the number of bytes written to the
  character buffer, or (size_t)-1 to indicate an encoding
  error (errno is set to EILSEQ).
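Converting a whole wide string one character at a time with wcrtomb could be sketched as follows. to_multibyte is hypothetical, and the scratch array of MB_LEN_MAX bytes is an assumption that relies on MB_CUR_MAX never exceeding MB_LEN_MAX.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Converts ws, including its terminating zero, into buf of bufsize
   bytes. Returns the number of bytes stored, not counting the final
   zero byte, or (size_t)-1 on an encoding error or if buf is too small. */
static size_t to_multibyte(char *buf, size_t bufsize, const wchar_t *ws)
{
    mbstate_t st;
    char tmp[MB_LEN_MAX];               /* room for any single result */
    size_t used = 0;

    memset(&st, 0, sizeof st);
    for (;;) {
        size_t r = wcrtomb(tmp, *ws, &st);

        if (r == (size_t)-1 || r > bufsize - used)
            return (size_t)-1;
        memcpy(buf + used, tmp, r);
        used += r;
        if (*ws == L'\0')
            return used - 1;            /* exclude the zero byte */
        ws++;
    }
}
```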
size_t mbsrtowcs (wchar_t *ws, const char **ps, size_t n,
mbstate_t *pcs);
 *ps to wide characters.  
	The result is either (size_t)-1 if a
	conversion error occurs (in which case errno is set to
	EILSEQ), or else the number
	of multibyte characters successfully converted.  
	If ws is a null pointer,
	     processing stops at the end of the string
	     (the terminating zero byte is not counted in the returned value),
	     and *pcs will be set to the initial state.
	If ws is not a null pointer,
	     the resulting wide character sequence
	     is stored in the array it points to.  
	     Conversion stops when either:
	     n wide characters have been stored;
		 *pcs will be set to the conversion state
		 after processing the bytes consumed so far,
		 and *ps will point to the first unprocessed byte; or
	     the terminating zero byte has been converted;
		 *pcs will be set to the initial state, 
		 *ps will be set to a null pointer, and a zero
		 wide character will have been stored.
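The two ways conversion can stop are distinguishable through *ps afterwards, as this sketch shows. convert_all is an invented helper.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Converts the multibyte string s into out, which has n elements.
   Returns 1 if the whole string, terminator included, was converted,
   0 if out filled up first, and -1 on an encoding error. */
static int convert_all(wchar_t *out, size_t n, const char *s)
{
    mbstate_t st;
    const char *p = s;
    size_t r;

    memset(&st, 0, sizeof st);          /* initial conversion state */
    r = mbsrtowcs(out, &p, n, &st);
    if (r == (size_t)-1)
        return -1;
    return p == NULL;                   /* null: terminator was reached */
}
```

When the function returns 0, out is not zero-terminated, so a caller would have to either enlarge the array or continue from the updated pointer.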
	      
size_t wcsrtombs (char *s, const wchar_t **pws, size_t n,
mbstate_t *pcs);
pws
	to a multibyte character sequence.  
	The result is either (size_t)-1 if a conversion
	error occurs (in which case errno is set to
	EILSEQ), or else the number of bytes in the
	resulting multibyte string.  
	Processing of the wide string stops either when a zero wide
	character - indicating the end of the wide string - is reached
	(the resulting multibyte string will end with a zero byte
	which is not included in the returned result), or (if s 
	is not a null pointer) when it is not possible to process another wide
	character without placing more than n bytes into the
	array pointed to by
	s.  In the first case, *pcs 
	will be left in the initial state.  
	If s is a null pointer, the value of n 
	is ignored.  Otherwise *pws will
	be set to either a null pointer (if conversion stopped on a
	zero wide character) or a pointer to the first unprocessed
	wide character.  In the latter case, the returned
	value will be at least (n-MB_CUR_MAX+1).
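Because n is ignored when s is a null pointer, wcsrtombs can be called once to size a buffer and a second time to fill it. The following is a sketch under that reading; to_mb_string is an invented name, and the caller owns the returned memory.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Returns a malloc'd multibyte copy of ws, or NULL on an encoding error
   or allocation failure. The caller frees the result. */
static char *to_mb_string(const wchar_t *ws)
{
    mbstate_t st;
    const wchar_t *p = ws;
    size_t need;
    char *s;

    memset(&st, 0, sizeof st);
    need = wcsrtombs(NULL, &p, 0, &st); /* sizing pass: n is ignored */
    if (need == (size_t)-1)
        return NULL;
    s = malloc(need + 1);               /* +1 for the zero byte */
    if (s == NULL)
        return NULL;
    p = ws;
    memset(&st, 0, sizeof st);          /* restart from the initial state */
    if (wcsrtombs(s, &p, need + 1, &st) == (size_t)-1) {
        free(s);
        return NULL;
    }
    return s;
}
```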
<wctype.h> reserves function names beginning
	with is or to followed by a lowercase
	letter.  
	<wchar.h> reserves function names beginning
	with wcs followed by a lowercase letter.  
	Lowercase letters are reserved as conversion
	specifiers for fwprintf and fwscanf.