Files
oldlinux-files/study/Ref-docs/C/lib_scan.html
2024-02-19 00:25:23 -05:00

428 lines
21 KiB
HTML
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<HTML><HEAD><TITLE>Formatted Input</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<H1><A NAME="Formatted Input">Formatted Input</A></H1><HR>
<P><B><A HREF="#Scan Formats">Scan Formats</A>
&#183; <A HREF="#Scan Functions">Scan Functions</A>
&#183; <A HREF="#Scan Conversion Specifiers">Scan Conversion Specifiers</A>
</B></P>
<HR>
<P>Several library functions help you convert data values from
text sequences that are generally readable by people to
encoded internal representations. You provide a
<A HREF="lib_prin.html#format string" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#format string">format string</A> as the value of the
<CODE>format</CODE> argument to each of these functions, hence
the term <B>formatted input</B>. The functions fall
into two categories:</P>
<UL>
<LI>The <B><A NAME="byte scan functions">byte scan functions</A></B>
(declared in
<A HREF="stdio.html#&lt;stdio.h&gt;" tppabs="http://ccs.ucsd.edu/c/stdio.html#&lt;stdio.h&gt;"><CODE>&lt;stdio.h&gt;</CODE></A>)
convert sequences of type <I>char</I> to internal representations,
and help you scan such sequences that you read:
<A HREF="stdio.html#fscanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#fscanf">fscanf</A>,
<A HREF="stdio.html#scanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#scanf">scanf</A>, and
<A HREF="stdio.html#sscanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#sscanf">sscanf</A>.
For these function, a format string is a
<A HREF="lib_over.html#multibyte string" tppabs="http://ccs.ucsd.edu/c/lib_over.html#multibyte string">multibyte string</A>
that begins and ends in the
<A HREF="charset.html#initial shift state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial shift state">initial shift state</A>.
<LI>The <B><A NAME="wide scan functions">wide scan functions</A></B>
(declared in
<A HREF="wchar.html#&lt;wchar.h&gt;" tppabs="http://ccs.ucsd.edu/c/wchar.html#&lt;wchar.h&gt;"><CODE>&lt;wchar.h&gt;</CODE></A>
and hence added with
<B><A HREF="lib_over.html#Amendment 1" tppabs="http://ccs.ucsd.edu/c/lib_over.html#Amendment 1">Amendment 1</A></B>)
convert sequences of type
<A HREF="stddef.html#wchar_t" tppabs="http://ccs.ucsd.edu/c/stddef.html#wchar_t"><CODE>wchar_t</CODE></A>,
to internal representations,
and help you scan such sequences that you read:
<A HREF="wchar.html#fwscanf" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwscanf">fwscanf</A>,
<A HREF="wchar.html#wscanf" tppabs="http://ccs.ucsd.edu/c/wchar.html#wscanf">wscanf</A> and
<A HREF="wchar.html#swscanf" tppabs="http://ccs.ucsd.edu/c/wchar.html#swscanf">swscanf</A>.
For these functions, a format string is a
<A HREF="lib_over.html#wide-character string" tppabs="http://ccs.ucsd.edu/c/lib_over.html#wide-character string">wide-character string</A>.
In the descriptions that follow, a wide character <CODE>wc</CODE>
from a format string or a stream is compared to a specific (byte)
character <CODE>c</CODE> as if by evaluating the expression
<CODE><A HREF="wchar.html#wctob" tppabs="http://ccs.ucsd.edu/c/wchar.html#wctob">wctob</A>(wc) == c</CODE>.
</UL>
<H2><A NAME="Scan Formats">Scan Formats</A></H2>
<P>A format string has the same general
<B><A HREF="lib_prin.html#format string" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#format string">syntax</A></B>
for the scan functions as for the
<A HREF="lib_prin.html#Formatted Output" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#Formatted Output">print functions</A>:
zero or more
<B><A HREF="lib_prin.html#conversion specification" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#conversion specification">
conversion specifications</A></B>,
interspersed with literal text and
<B><A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A></B>.
For the scan functions, however, a conversion specification is one of the
<A HREF="#scan conversion specification">
scan conversion specifications</A> described below.</P>
<P>A scan function scans the format string once from beginning
to end to determine what conversions to perform. Every scan
function accepts a
<A HREF="stdarg.html#varying number of arguments" tppabs="http://ccs.ucsd.edu/c/stdarg.html#varying number of arguments">varying number
of arguments</A>, either directly or under control of an argument of type
<A HREF="stdarg.html#va_list" tppabs="http://ccs.ucsd.edu/c/stdarg.html#va_list"><CODE>va_list</CODE></A>.
Some scan conversion specifications
in the format string use the next argument in the list.
A scan function uses each successive argument no more than
once. Trailing arguments can be left unused.</P>
<P>In the description that follows, the
<A HREF="lib_prin.html#integer conversions" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#integer conversions">integer conversions</A> and
<A HREF="lib_prin.html#floating-point conversions" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#floating-point conversions">
floating-point conversions</A> are the same as for the
<A HREF="lib_prin.html#Formatted Output" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#Formatted Output">print functions</A>.</P>
<H2><A NAME="Scan Functions">Scan Functions</A></H2>
<P>For the scan functions, literal text in a format string must
match the next characters to scan in the input text.
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">White space</A> in
a format string must match the longest possible sequence of the next
zero or more white-space characters in the input. Except for the
<A HREF="#Scan Conversion Specifiers">
scan conversion specifier</A>
<A HREF="#%n"><CODE>%n</CODE></A>
(which consumes no input), each
<B><A NAME="scan conversion specification">
scan conversion specification</A></B>
determines a pattern that one or more of the next characters in the
input must match. And except for the
<A HREF="#Scan Conversion Specifiers">
scan conversion specifiers
<A HREF="#%c"><CODE>c</CODE></A>,
<A HREF="#%n"><CODE>n</CODE></A>, and
<A HREF="#%["><CODE>[</CODE></A>,
every match begins by skipping any
<A HREF="lib_prin.html#%n" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#%n">white space</A> characters in the input.</P>
<P>A scan function returns when:</P>
<UL>
<LI>it reaches the terminating null in the format string
<LI>it cannot obtain additional input characters to scan
(<B><A NAME="input failure">input failure</A></B>)
<LI>a conversion fails
(<B><A NAME="matching failure">matching failure</A></B>)
</UL>
<P>A scan function returns
<A HREF="stdio.html#EOF" tppabs="http://ccs.ucsd.edu/c/stdio.html#EOF"><CODE>EOF</CODE></A>
if an input failure occurs before any conversion. Otherwise it returns
the number of converted values stored. If one or more characters form
a valid prefix but the conversion fails, the valid prefix is consumed
before the scan function returns. Thus:</P>
<PRE>
scanf("%i", &amp;i) <B>consumes 0X from the field 0XZ</B>
scanf("%f", &amp;f) <B>consumes 3.2E from the field 3.2EZ</B></PRE>
<P>A scan conversion specification typically converts the matched input
characters to a corresponding encoded value. The next argument value
must be the address of an object. The conversion converts the encoded
representation (as necessary) and stores its value in the object.
A scan conversion specification has the format:</P>
<P><IMG SRC="scan.gif" tppabs="http://ccs.ucsd.edu/c/gif/scan.gif"></P>
<P>Following the percent character (<B><CODE>%</CODE></B>)
in the format string, you can write an asterisk (<B><CODE>*</CODE></B>)
to indicate that the conversion should not store
the converted value in an object.</P>
<P>Following any <CODE>*</CODE>, you can write a nonzero
<B><A NAME="scan field width">field width</A></B>
that specifies the maximum number of input characters
to match for the conversion (not counting any
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A> that the
pattern can first skip).</P>
<H2><A NAME="Scan Conversion Specifiers">Scan Conversion Specifiers</A></H2>
<P>Following any
<A HREF="#scan field width">field width</A>,
you must write a one-character <B>scan conversion specifier</B>,
either a one-character code or a
<A HREF="#scan set">scan set</A>,
possibly preceded by a one-character qualifier.
Each combination determines the type required of the
next argument (if any) and how
the scan functions interpret the text sequence and converts it to
an encoded value.
The <A HREF="lib_prin.html#integer conversions" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#integer conversions">integer</A> and
<A HREF="lib_prin.html#floating-point conversions" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#floating-point conversions">
floating-point conversions</A> also determine
what base to assume for the text representation. (The base is
the <CODE>base</CODE> argument to the functions
<A HREF="stdlib.html#strtol" tppabs="http://ccs.ucsd.edu/c/stdlib.html#strtol"><CODE>strtol</CODE></A> and
<A HREF="stdlib.html#strtoul" tppabs="http://ccs.ucsd.edu/c/stdlib.html#strtoul"><CODE>strtoul</CODE></A>.)
The following table lists all defined combinations and
their properties.</P>
<PRE><B>
Conversion Argument Conversion
Specifier Type Function Base</B>
%c char x[]
%lc wchar_t x[]
%d int *x strtol 10
%hd short *x strtol 10
%ld long *x strtol 10
%e float *x strtod 10
%le double *x strtod 10
%Le long double *x strtod 10
%E float *x strtod 10
%lE double *x strtod 10
%LE long double *x strtod 10
%f float *x strtod 10
%lf double *x strtod 10
%Lf long double *x strtod 10
%g float *x strtod 10
%lg double *x strtod 10
%Lg long double *x strtod 10
%G float *x strtod 10
%lG double *x strtod 10
%LG long double *x strtod 10
%i int *x strtol 0
%hi short *x strtol 0
%li long *x strtol 0
%n int *x
%hn short *x
%ln long *x
%o unsigned int *x strtoul 8
%ho unsigned short *x strtoul 8
%lo unsigned long *x strtoul 8
%p void **x
%s char x[]
%ls wchar_t x[]
%u unsigned int *x strtoul 10
%hu unsigned short *x strtoul 10
%lu unsigned long *x strtoul 10
%x unsigned int *x strtoul 16
%hx unsigned short *x strtoul 16
%lx unsigned long *x strtoul 16
%X unsigned int *x strtoul 16
%hX unsigned short *x strtoul 16
%lX unsigned long *x strtoul 16
%[...] char x[]
%l[...] wchar_t x[]
%% <B>none</B></PRE>
<P>The scan conversion specifier (or
<A HREF="#scan set">scan set</A>) determines any behavior
not summarized in this table. In the following descriptions,
examples follow each of the scan conversion specifiers.
In each example, the function
<A HREF="stdio.html#sscanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#sscanf"><CODE>sscanf</CODE></A>
matches the <B><CODE>bold</CODE></B> characters.</P>
<P>You write <B><A NAME="%c"><CODE>%c</CODE></A></B>
to store the matched input characters in
an array object. If you specify no field width <I>w,</I> then <I>w</I>
has the value one. The match does not skip leading
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A>. Any
sequence of <I>w</I> characters matches the conversion pattern. For a
<A HREF="lib_file.html#wide stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#wide stream">wide stream</A>,
conversion occurs as if by repeatedly calling
<A HREF="wchar.html#wcrtomb" tppabs="http://ccs.ucsd.edu/c/wchar.html#wcrtomb"><CODE>wcrtomb</CODE></A>,
beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>1</B>29E-2", "%c", &amp;c) <B>stores '1'</B>
sscanf("<B>12</B>9E-2", "%2c", &amp;c[0]) <B>stores '1', '2'</B>
swscanf(L"<B>1</B>29E-2", L"%c", &amp;c) <B>stores '1'</B></PRE>
<P>You write <B><A NAME="%lc"><CODE>%lc</CODE></A></B>
to store the matched input characters in an array object,
with elements of type
<A HREF="stddef.html#wchar_t" tppabs="http://ccs.ucsd.edu/c/stddef.html#wchar_t">wchar_t</A>.
If you specify no field width <I>w,</I> then <I>w</I>
has the value one. The match does not skip leading
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A>. Any
sequence of <I>w</I> characters matches the conversion pattern. For a
<A HREF="lib_file.html#byte stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#byte stream">byte stream</A>,
conversion occurs as if by repeatedly calling
<A HREF="wchar.html#wcrtomb" tppabs="http://ccs.ucsd.edu/c/wchar.html#wcrtomb"><CODE>mbrtowc</CODE></A>, beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>1</B>29E-2", "%lc", &amp;c) <B>stores L'1'</B>
sscanf("<B>12</B>9E-2", "%2lc", &amp;c) <B>stores L'1', L'2'</B>
swscanf(L"<B>1</B>29E-2", L"%lc", &amp;c) <B>stores L'1'</B></PRE>
<P>You write <B><A NAME="%d"><CODE>%d</CODE></A></B>,
<B><A NAME="%i"><CODE>%i</CODE></A></B>,
<B><A NAME="%o"><CODE>%o</CODE></A></B>,
<B><A NAME="%u"><CODE>%u</CODE></A></B>,
<B><A NAME="%x"><CODE>%x</CODE></A></B>, or
<B><A NAME="%X"><CODE>%X</CODE></A></B> to convert
the matched input characters as a signed integer
and store the result in an integer object.</P>
<PRE>
sscanf("<B>129E</B>-2", "%o%d%x", &amp;i, &amp;j, &amp;k) <B>stores 10, 9, 14</B></PRE>
<P>You write <B><A NAME="%e"><CODE>%e</CODE></A></B>,
<B><A NAME="%E"><CODE>%E</CODE></A></B>,
<B><A NAME="%f"><CODE>%f</CODE></A></B>,
<B><A NAME="%g"><CODE>%g</CODE></A></B>, or
<B><A NAME="%G"><CODE>%G</CODE></A></B>
to convert the matched input characters as a signed fraction, with
an optional exponent, and store the result in a floating-point object.</P>
<PRE>
sscanf("<B>129E-2</B>", "%e", &amp;f) <B>stores 1.29</B></PRE>
<P>You write <B><A NAME="%n"><CODE>%n</CODE></A></B>
to store the number of characters
matched (up to this point in the format) in an integer object. The
match does not skip leading
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A>
and does not match any input characters.</P>
<PRE>
sscanf("<B>12</B>9E-2", "12%n", &amp;i) <B>stores 2</B></PRE>
<P>You write <B><A NAME="%p"><CODE>%p</CODE></A></B>
to convert the matched input characters as
an external representation of a <I>pointer to void</I> and store the
result in an object of type <I>pointer to void.</I> The input characters
must match the form generated by the
<A HREF="lib_prin.html#%p" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#%p"><CODE>%p</CODE>
<A HREF="lib_prin.html#print conversion specification" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#print conversion specification">
print conversion specification</A>.</P>
<PRE>
sscanf("<B>129E</B>-2", "%p", &amp;p) <B>stores, e.g. 0x129E</B></PRE>
<P>You write <B><A NAME="%s"><CODE>%s</CODE></A></B>
to store the matched input characters in
an array object, followed by a terminating null character. If you
do not specify a field width <I>w,</I> then <I>w</I> has a large value.
Any sequence of up to <I>w</I> non white-space characters matches
the conversion pattern. For a
<A HREF="lib_file.html#wide stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#wide stream">wide stream</A>,
conversion occurs as if by repeatedly calling
<CODE>wcrtomb</CODE> beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>129E-2</B>", "%s", &amp;s[0]) <B>stores "129E-2"</B>
swscanf(L"<B>129E-2</B>", L"%s", &amp;s[0]) <B>stores "129E-2"</B></PRE>
<P>You write <B><A NAME="%ls"><CODE>%ls</CODE></A></B>
to store the matched input characters in
an array object, with elements of type
<A HREF="stddef.html#wchar_t" tppabs="http://ccs.ucsd.edu/c/stddef.html#wchar_t">wchar_t</A>,
followed by a terminating null wide character. If you
do not specify a field width <I>w,</I> then <I>w</I> has a large value.
Any sequence of up to <I>w</I> non white-space characters matches
the conversion pattern. For a
<A HREF="lib_file.html#byte stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#byte stream">byte stream</A>,
conversion occurs as if by repeatedly calling
<A HREF="wchar.html#mbrtowc" tppabs="http://ccs.ucsd.edu/c/wchar.html#mbrtowc"><CODE>mbrtowc</CODE></A>,
beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>129E-2</B>", "%ls", &amp;s[0]) <B>stores L"129E-2"</B>
swscanf(L"<B>129E-2</B>", L"%ls", &amp;s[0]) <B>stores L"129E-2"</B></PRE>
<P>You write <B><A NAME="%["><CODE>%[</CODE></A></B>
to store the matched input characters in
an array object, followed by a terminating null character. If you
do not specify a field width <I>w,</I> then <I>w</I> has a large value.
The match does not skip leading
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">white space</A>.
A sequence of up to <I>w</I>
characters matches the conversion pattern in the
<B><A NAME="scan set">scan set</A></B> that follows.
To complete the scan set, you follow the left bracket
(<CODE>[</CODE>) in the conversion specification with a sequence
of zero or more <B>match</B> characters, terminated by a right bracket
(<B><CODE>]</CODE></B>).</P>
<P>If you do not write a caret (<B><CODE>^</CODE></B>)
immediately after the <CODE>[</CODE>, then each
input character must match <I>one</I> of the match
characters. Otherwise, each input character must not match <I>any</I>
of the match characters, which begin with the character following
the <CODE>^</CODE>. If you write a <B><CODE>]</CODE></B>
immediately after the <CODE>[</CODE> or <CODE>[^</CODE>,
then the <CODE>]</CODE> is the first match character, not
the terminating <CODE>]</CODE>. If you write a minus
(<B><CODE>-</CODE></B>) as other than the first or last match character,
an implementation can give it special meaning.
It usually indicates a range of characters, in conjunction with the
characters immediately preceding or following, as in
<CODE>0-9</CODE> for all the digits.)
You cannot specify a null match character.</P>
<P> For a <A HREF="lib_file.html#wide stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#wide stream">wide stream</A>,
conversion occurs as if by repeatedly calling
<A HREF="wchar.html#wcrtomb" tppabs="http://ccs.ucsd.edu/c/wchar.html#wcrtomb"><CODE>wcrtomb</CODE></A>,
beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>12</B>9E-2", "[54321]", &amp;s[0]) <B>stores "12"</B>
swscanf(L"<B>12</B>9E-2", L"[54321]", &amp;s[0]) <B>stores "12"</B></PRE>
<P>You write <B><A NAME="%l["><CODE>%l[</CODE></A></B>
to store the matched input characters in
an array object, with elements of type
<A HREF="stddef.html#wchar_t" tppabs="http://ccs.ucsd.edu/c/stddef.html#wchar_t">wchar_t</A>,
followed by a terminating null wide character. If you
do not specify a field width <I>w,</I> then <I>w</I> has a large value.
The match does not skip leading
<A HREF="lib_prin.html#white space" tppabs="http://ccs.ucsd.edu/c/lib_prin.html#white space">.
A sequence of up to <I>w</I>
characters matches the conversion pattern in the
<A HREF="#scan set">scan set</A> that follows.
<P> For a <A HREF="lib_file.html#byte stream" tppabs="http://ccs.ucsd.edu/c/lib_file.html#byte stream">byte stream</A>,
conversion occurs as if by repeatedly calling
<A HREF="wchar.html#mbrtowc" tppabs="http://ccs.ucsd.edu/c/wchar.html#mbrtowc"><CODE>mbrtowc</CODE></A>,
beginning in the
<A HREF="charset.html#initial conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#initial conversion state">
initial conversion state</A>.</P>
<PRE>
sscanf("<B>12</B>9E-2", "l[54321]", &amp;s[0]) <B>stores L"12"</B>
swscanf(L"<B>12</B>9E-2", L"l[54321]", &amp;s[0]) <B>stores L"12"</B></PRE>
<P>You write <B><A NAME="%%"><CODE>%%</CODE></A></B>
to match the percent character (<CODE>%</CODE>).
The function does not store a value.</P>
<PRE>
sscanf("<B>% 0XA</B>", "%% %i") <B>stores 10</B></PRE>
<HR>
<P>See also the
<B><A HREF="index.html#Table of Contents" tppabs="http://ccs.ucsd.edu/c/index.html#Table of Contents">Table of Contents</A></B> and the
<B><A HREF="_index.html" tppabs="http://ccs.ucsd.edu/c/_index.html">Index</A></B>.</P>
<P><I>
<A HREF="crit_pb.html" tppabs="http://ccs.ucsd.edu/c/crit_pb.html">Copyright</A> &#169; 1989-1996
by P.J. Plauger and Jim Brodie. All rights reserved.</I></P>
</BODY></HTML>