<HTML><HEAD><TITLE>Files and Streams</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<H1><A NAME="Files and Streams">Files and Streams</A></H1><HR>
<P><B><A HREF="#Text and Binary Streams">Text and Binary Streams</A>
&#183; <A HREF="#Byte and Wide Streams">Byte and Wide Streams</A>
&#183; <A HREF="#Controlling Streams">Controlling Streams</A>
&#183; <A HREF="#Stream States">Stream States</A>
</B></P>
<HR>
<P>A program communicates with the target environment by reading
and writing
<B><A NAME="files">files</A></B> (ordered sequences of bytes). A file can
be, for example, a data set that you can read and write repeatedly
(such as a disk file), a stream of bytes generated by a program (such
as a pipeline), or a stream of bytes received from or sent to a peripheral
device (such as the keyboard or display). The latter two are
<B><A NAME="interactive files">interactive files</A></B>.
Files are typically the principal means of interacting with a
program.</P>
<P>You manipulate all these kinds of files in much the same way
-- by calling library functions. You include the standard header
<CODE>&lt;stdio.h&gt;</CODE> to declare most of these functions.</P>
<P>Before you can perform many of the operations on a file, the
file must be
<B><A NAME="file open">opened</A></B>.
Opening a file associates it with a
<B><A NAME="stream">stream</A></B>, a data structure within
the Standard C library that glosses over many differences
among files of various kinds.
The library maintains the state of each stream in an object of type
<B><A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A></B>.</P>
<P>The target environment opens three files (standard input, standard output, and standard error) prior to
<A HREF="lib_over.html#program startup" tppabs="http://ccs.ucsd.edu/c/lib_over.html#program startup">program startup</A>.
You can open a file by calling the library function
<A HREF="stdio.html#fopen" tppabs="http://ccs.ucsd.edu/c/stdio.html#fopen"><CODE>fopen</CODE></A> with
two arguments. The first argument is a
<A HREF="lib_over.html#filename" tppabs="http://ccs.ucsd.edu/c/lib_over.html#filename">filename</A>, a
<A HREF="lib_over.html#multibyte string" tppabs="http://ccs.ucsd.edu/c/lib_over.html#multibyte string">multibyte string</A>
that the target environment uses to identify which file you
want to read or write. The second argument is a
<A HREF="lib_over.html#C string" tppabs="http://ccs.ucsd.edu/c/lib_over.html#C string">C string</A> that specifies:</P>
<UL>
<LI>whether you intend to read data from the file or write data
to it or both
<LI>whether you intend to generate new contents for the file (or
create a file if it did not previously exist) or leave the existing
contents in place
<LI>whether writes to a file can alter existing contents or should
only append bytes at the end of the file
<LI>whether you want to manipulate a
<A HREF="#text stream">text stream</A> or a
<A HREF="#binary stream">binary stream</A>
</UL>
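<P>As a rough illustration of how the mode string combines these choices,
the sketch below uses three common modes: <CODE>"wb"</CODE> (new contents,
binary), <CODE>"ab"</CODE> (append only, binary), and <CODE>"rb"</CODE>
(read existing contents, binary). The helper names
<CODE>write_with_mode</CODE>, <CODE>save_bytes</CODE>,
<CODE>append_bytes</CODE>, and <CODE>load_bytes</CODE> are hypothetical
and exist only for this example.</P>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open path with the given mode, write len bytes,
   and close the file. Returns 0 on success, -1 on failure. */
int write_with_mode(const char *path, const char *mode,
                    const void *data, size_t len)
{
    assert(path != NULL && mode != NULL && data != NULL);
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, fp);
    /* fclose flushes buffered output, so its result matters too. */
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* "wb": generate new contents (creating the file if needed), binary. */
int save_bytes(const char *path, const void *data, size_t len)
{
    return write_with_mode(path, "wb", data, len);
}

/* "ab": keep existing contents, every write goes to the end, binary. */
int append_bytes(const char *path, const void *data, size_t len)
{
    return write_with_mode(path, "ab", data, len);
}

/* "rb": read existing contents, binary.
   Returns the number of bytes read, or -1 if the file cannot be opened. */
long load_bytes(const char *path, void *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, cap, fp);
    fclose(fp);
    return (long)n;
}
```

<P>Calling <CODE>save_bytes</CODE> and then <CODE>append_bytes</CODE> on
the same filename should leave both pieces of data in the file, while a
second call to <CODE>save_bytes</CODE> replaces whatever was there.</P>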
<P>Once the file is successfully opened, you can then determine
whether the stream is
<B><A NAME="byte oriented">byte oriented</A></B> (a
<B><A HREF="#byte stream">byte stream</A></B>) or
<B><A NAME="wide oriented">wide oriented</A></B> (a
<B><A HREF="#wide stream">wide stream</A></B>).
Wide-oriented streams are supported only with
<A HREF="lib_over.html#Amendment 1" tppabs="http://ccs.ucsd.edu/c/lib_over.html#Amendment 1">Amendment 1</A>.
A stream is initially
<B><A NAME="unbound stream">unbound</A></B>.
Calling certain functions to operate on the stream makes it byte oriented,
while certain other functions make it wide oriented. Once established,
a stream maintains its orientation until it is closed by a call to
<A HREF="stdio.html#fclose" tppabs="http://ccs.ucsd.edu/c/stdio.html#fclose"><CODE>fclose</CODE></A> or
<A HREF="stdio.html#freopen" tppabs="http://ccs.ucsd.edu/c/stdio.html#freopen"><CODE>freopen</CODE></A>.</P>
<H2><A NAME="Text and Binary Streams">Text and Binary Streams</A></H2>
<P>A
<B><A NAME="text stream">text stream</A></B> consists of one or more
<B><A NAME="text lines">lines</A></B> of text
that can be written to a text-oriented display so that they can
be read. When reading from a text stream, the program reads an
<CODE><I>NL</I></CODE> (newline) at the end of each line.
When writing to a text stream, the program writes an
<CODE><I>NL</I></CODE> to signal the end of a line. To match
differing conventions among target environments for representing text
in files, the library functions can alter the number and representations
of characters transmitted between the program and a text stream.</P>
<P>Thus, positioning within a text stream is limited.
You can obtain the current
<A HREF="#file-position indicator">file-position indicator</A>
by calling <CODE><A HREF="stdio.html#fgetpos" tppabs="http://ccs.ucsd.edu/c/stdio.html#fgetpos">fgetpos</A></CODE> or
<CODE><A HREF="stdio.html#ftell" tppabs="http://ccs.ucsd.edu/c/stdio.html#ftell">ftell</A></CODE>.
You can position a text stream at a position obtained this way,
or at the beginning or end of the stream, by calling
<CODE><A HREF="stdio.html#fsetpos" tppabs="http://ccs.ucsd.edu/c/stdio.html#fsetpos">fsetpos</A></CODE> or
<CODE><A HREF="stdio.html#fseek" tppabs="http://ccs.ucsd.edu/c/stdio.html#fseek">fseek</A></CODE>.
Any other change of position might well not be supported.</P>
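<P>A minimal sketch of the supported pattern: remember a position with
<CODE>ftell</CODE>, read ahead, and return to the remembered position with
<CODE>fseek</CODE>. The function name <CODE>read_line_twice</CODE> is
hypothetical, and the example assumes the stream supports positioning
requests.</P>

```c
#include <assert.h>
#include <stdio.h>

/* Read the line that starts at the current position of a text stream,
   return to the saved position, and read the same line again.
   Returns 0 on success, -1 on failure. */
int read_line_twice(FILE *fp, char *first, char *second, int cap)
{
    assert(fp != NULL && first != NULL && second != NULL && cap > 0);
    long mark = ftell(fp);                 /* position to come back to */
    if (mark == -1L)
        return -1;
    if (fgets(first, cap, fp) == NULL)
        return -1;
    /* For a text stream, only zero or a value obtained from ftell is a
       portable offset to use with SEEK_SET. */
    if (fseek(fp, mark, SEEK_SET) != 0)
        return -1;
    if (fgets(second, cap, fp) == NULL)
        return -1;
    return 0;
}
```

<P>Note that the sketch never computes an offset of its own; it only
replays a value that <CODE>ftell</CODE> returned earlier.</P>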
<P>For maximum portability, the program should not write:</P>
<UL>
<LI>empty files
<LI><CODE><I>space</I></CODE> characters at the end of a line
<LI>partial lines (by omitting the <CODE><I>NL</I></CODE>
at the end of a file)
<LI>characters other than the printable characters,
<CODE><I>NL</I></CODE>, and <CODE><I>HT</I></CODE> (horizontal tab)
</UL>
<P>If you follow these rules, the sequence of characters you read
from a text stream (either as byte or multibyte characters)
will match the sequence of characters you wrote to the text stream
when you created the file. Otherwise, the library functions can remove
a file you create if the file is empty when you close it. Or they
can alter or delete characters you write to the file.</P>
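<P>One way to follow these rules is to funnel all text output through a
single function that enforces them. The sketch below is an assumption
about how such a function might look, not part of the library; the name
<CODE>write_portable_line</CODE> is hypothetical. It does not address the
empty-file rule, which the caller must handle by writing at least one
line.</P>

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write text as one complete line: no trailing spaces, only printable
   characters or HT, and always a final NL.
   Returns 0 on success, -1 if the text is rejected or a write fails. */
int write_portable_line(FILE *fp, const char *text)
{
    assert(fp != NULL && text != NULL);
    size_t len = strlen(text);
    while (len > 0 && text[len - 1] == ' ')
        --len;                                  /* drop trailing spaces */
    for (size_t i = 0; i < len; ++i) {
        unsigned char ch = (unsigned char)text[i];
        if (!isprint(ch) && ch != '\t')
            return -1;                          /* rejects embedded NL too */
    }
    if (len > 0 && fwrite(text, 1, len, fp) != len)
        return -1;
    return fputc('\n', fp) == EOF ? -1 : 0;     /* never a partial line */
}
```

<P>Rejecting bad input before writing anything keeps a failed call from
leaving a partial line in the stream.</P>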
<P>A
<B><A NAME="binary stream">binary stream</A></B> consists of
one or more bytes of arbitrary information.
You can write the value stored in an arbitrary object
to a (byte-oriented) binary stream and read exactly what was stored
in the object when you wrote it. The library functions do not alter
the bytes you transmit between the program and a binary stream. They
can, however, append an arbitrary number of null bytes to the file
that you write with a binary stream. The program must deal with these
additional null bytes at the end of any binary stream.</P>
<P>Thus, positioning within a binary stream is well defined,
except for positioning relative to the end of the stream.
You can obtain and alter the current
<A HREF="#file-position indicator">file-position indicator</A>
the same as for a <A HREF="#text stream">text stream</A>.
Moreover, the offsets used by
<CODE><A HREF="stdio.html#ftell" tppabs="http://ccs.ucsd.edu/c/stdio.html#ftell">ftell</A></CODE> and
<CODE><A HREF="stdio.html#fseek" tppabs="http://ccs.ucsd.edu/c/stdio.html#fseek">fseek</A></CODE>
count bytes from the beginning of the stream (which is byte zero),
so integer arithmetic on these offsets yields predictable results.</P>
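<P>Because offsets in a binary stream count bytes from byte zero, a
program can compute where a fixed-size record begins. The sketch below
illustrates the arithmetic; <CODE>struct record</CODE> and
<CODE>read_record</CODE> are hypothetical names invented for this
example.</P>

```c
#include <assert.h>
#include <stdio.h>

/* A hypothetical fixed-size record, for illustration only. */
struct record {
    int id;
    double value;
};

/* Read record number index (counting from 0) from a binary stream.
   Returns 0 on success, -1 on failure. */
int read_record(FILE *fp, long index, struct record *out)
{
    assert(fp != NULL && out != NULL && index >= 0);
    long offset = index * (long)sizeof *out;   /* bytes from byte zero */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

<P>The sketch positions relative to the beginning of the stream only,
since positioning relative to the end is the one case that is not well
defined.</P>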
<H2><A NAME="Byte and Wide Streams">Byte and Wide Streams</A></H2>
<P>A
<B><A NAME="byte stream">byte stream</A></B>
treats a file as a sequence of bytes. Within the program,
the stream looks like the same sequence of bytes, except
for the possible alterations described above.</P>
<P>By contrast, a
<B><A NAME="wide stream">wide stream</A></B> treats a file as a sequence of
<B><A NAME="generalized multibyte characters">
generalized multibyte characters</A></B>,
which can have a broad range of encoding rules.
(Text and binary files are still read and written as described above.)
Within the program, the stream looks like the corresponding sequence of
<A HREF="charset.html#Wide-Character Encoding" tppabs="http://ccs.ucsd.edu/c/charset.html#Wide-Character Encoding">wide characters</A>.
Conversions between the two representations occur
within the Standard C library. The conversion rules can, in principle,
be altered by a call to
<A HREF="locale.html#setlocale" tppabs="http://ccs.ucsd.edu/c/locale.html#setlocale"><CODE>setlocale</CODE></A>
that alters the category
<A HREF="locale.html#LC_CTYPE" tppabs="http://ccs.ucsd.edu/c/locale.html#LC_CTYPE"><CODE>LC_CTYPE</CODE></A>.
Each wide stream determines its conversion rules
at the time it becomes wide oriented, and retains
these rules even if the category
<A HREF="locale.html#LC_CTYPE" tppabs="http://ccs.ucsd.edu/c/locale.html#LC_CTYPE"><CODE>LC_CTYPE</CODE></A>
subsequently changes.</P>
<P>Positioning within a wide stream suffers the same limitations as for
<A HREF="#text stream">text streams</A>. Moreover, the
<A HREF="#file-position indicator">file-position indicator</A>
may well have to deal with a
<A HREF="charset.html#state-dependent encoding" tppabs="http://ccs.ucsd.edu/c/charset.html#state-dependent encoding">state-dependent encoding</A>.
Typically, it includes both a byte offset within the stream
and an object of type
<CODE><A HREF="wchar.html#mbstate_t" tppabs="http://ccs.ucsd.edu/c/wchar.html#mbstate_t">mbstate_t</A></CODE>. Thus, the only
reliable way to obtain a file position within a wide stream is by calling
<CODE><A HREF="stdio.html#fgetpos" tppabs="http://ccs.ucsd.edu/c/stdio.html#fgetpos">fgetpos</A></CODE>,
and the only reliable way to restore a position
obtained this way is by calling
<CODE><A HREF="stdio.html#fsetpos" tppabs="http://ccs.ucsd.edu/c/stdio.html#fsetpos">fsetpos</A></CODE>.</P>
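<P>A minimal sketch of this save-and-restore pattern on a wide stream
follows. The function name <CODE>peek_wide</CODE> is hypothetical, and
the example assumes an implementation that provides the Amendment 1
functions in <CODE>&lt;wchar.h&gt;</CODE>.</P>

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Read one wide character, then return the stream to where it was.
   Returns the character, or WEOF on failure or end of file. */
wint_t peek_wide(FILE *fp)
{
    assert(fp != NULL);
    fpos_t mark;
    if (fgetpos(fp, &mark) != 0)      /* saves the complete position */
        return WEOF;
    wint_t wc = fgetwc(fp);
    if (fsetpos(fp, &mark) != 0)      /* restores exactly what was saved */
        return WEOF;
    return wc;
}
```

<P>The <CODE>fpos_t</CODE> object is treated as opaque: the sketch only
stores it and hands it back, which is all that a portable program can do
with it.</P>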
<H2><A NAME="Controlling Streams">Controlling Streams</A></H2>
<P><A HREF="stdio.html#fopen" tppabs="http://ccs.ucsd.edu/c/stdio.html#fopen"><CODE>fopen</CODE></A>
returns the address of an object of type
<A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A>.
You use this address as the <CODE>stream</CODE> argument to several library
functions to perform various operations on an open file. For a byte
stream, all input takes place as if each character is read by calling
<A HREF="stdio.html#fgetc" tppabs="http://ccs.ucsd.edu/c/stdio.html#fgetc"><CODE>fgetc</CODE></A>,
and all output takes place as if each character is written by calling
<A HREF="stdio.html#fputc" tppabs="http://ccs.ucsd.edu/c/stdio.html#fputc"><CODE>fputc</CODE></A>. For a wide stream (with
<A HREF="lib_over.html#Amendment 1" tppabs="http://ccs.ucsd.edu/c/lib_over.html#Amendment 1">Amendment 1</A>),
all input takes place as if each character is read by calling
<A HREF="wchar.html#fgetwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#fgetwc"><CODE>fgetwc</CODE></A>,
and all output takes place as if each character is written by calling
<A HREF="wchar.html#fputwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#fputwc"><CODE>fputwc</CODE></A>.</P>
<P>You can
<B><A NAME="file close">close</A></B> a file by calling
<A HREF="stdio.html#fclose" tppabs="http://ccs.ucsd.edu/c/stdio.html#fclose"><CODE>fclose</CODE></A>,
after which the address of the
<A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A> object is invalid.</P>
<P>A <A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A>
object stores the state of a stream, including:</P>
<UL>
<LI>an <B><A NAME="error indicator">error indicator</A></B> --
set nonzero by a function that encounters a read or write error
<LI>an <B><A NAME="end-of-file indicator">end-of-file indicator</A></B> --
set nonzero by a function that
encounters the end of the file while reading
<LI>a <B><A NAME="file-position indicator">file-position indicator</A></B> --
specifies the next byte in the stream to read or write,
if the file can support positioning requests
<LI>a <B><A HREF="#Stream States">stream state</A></B> --
specifies whether the stream will accept reads and/or writes and, with
<A HREF="lib_over.html#Amendment 1" tppabs="http://ccs.ucsd.edu/c/lib_over.html#Amendment 1">Amendment 1</A>, whether the stream is
<A HREF="#unbound stream">unbound</A>,
<A HREF="#byte oriented">byte oriented</A>, or
<A HREF="#wide oriented">wide oriented</A>
<LI>a <B><A HREF="charset.html#conversion state" tppabs="http://ccs.ucsd.edu/c/charset.html#conversion state">conversion state</A></B> --
remembers the state of any partly assembled or generated
<A HREF="#generalized multibyte characters">
generalized multibyte character</A>, as well as
any shift state for the sequence of bytes in the file
<LI>a <B><A NAME="file buffer">file buffer</A></B> --
specifies the address and size of an array object
that library functions can use to improve the performance
of read and write operations to the stream
</UL>
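<P>A program inspects the error and end-of-file indicators through
library functions rather than by looking inside the
<CODE>FILE</CODE> object. The sketch below shows the usual pattern of
reading until <CODE>fgetc</CODE> returns <CODE>EOF</CODE> and then asking
which indicator is set; the function name <CODE>count_bytes</CODE> is
hypothetical.</P>

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining in a stream.
   Returns the count, or -1 if a read error stopped the loop. */
long count_bytes(FILE *fp)
{
    assert(fp != NULL);
    long n = 0;
    while (fgetc(fp) != EOF)
        ++n;
    /* EOF from fgetc means either end of file or a read error;
       the two indicators tell them apart. */
    if (ferror(fp))
        return -1;
    assert(feof(fp));
    return n;
}
```

<P>Both indicators stay set until something clears them, for example a
call to <CODE>clearerr</CODE> or <CODE>rewind</CODE>.</P>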
<P>Do not alter any value stored in a
<A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A> object or in
a file buffer that you specify for use with that object.
You cannot copy a
<A HREF="stdio.html#FILE" tppabs="http://ccs.ucsd.edu/c/stdio.html#FILE"><CODE>FILE</CODE></A> object
and portably use the address of the copy
as a <CODE>stream</CODE> argument to a library function.</P>
<H2><A NAME="Stream States">Stream States</A></H2>
<P>The valid states, and state transitions, for a stream are:</P>
<P><IMG SRC="stream.gif" tppabs="http://ccs.ucsd.edu/c/gif/stream.gif"></P>
<P>Each of the circles denotes a stable
state. Each of the lines denotes a transition that can occur as the
result of a function call that operates on the stream. Five groups
of functions can cause state transitions.</P>
<P>Functions in the first three groups are declared in
<A HREF="stdio.html" tppabs="http://ccs.ucsd.edu/c/stdio.html"><CODE>&lt;stdio.h&gt;</CODE></A>:</P>
<UL>
<LI>the <B><A NAME="byte read functions">byte read functions</A></B> --
<A HREF="stdio.html#fgetc" tppabs="http://ccs.ucsd.edu/c/stdio.html#fgetc"><CODE>fgetc</CODE></A>,
<A HREF="stdio.html#fgets" tppabs="http://ccs.ucsd.edu/c/stdio.html#fgets"><CODE>fgets</CODE></A>,
<A HREF="stdio.html#fread" tppabs="http://ccs.ucsd.edu/c/stdio.html#fread"><CODE>fread</CODE></A>,
<A HREF="stdio.html#fscanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#fscanf"><CODE>fscanf</CODE></A>,
<A HREF="stdio.html#getc" tppabs="http://ccs.ucsd.edu/c/stdio.html#getc"><CODE>getc</CODE></A>,
<A HREF="stdio.html#getchar" tppabs="http://ccs.ucsd.edu/c/stdio.html#getchar"><CODE>getchar</CODE></A>,
<A HREF="stdio.html#gets" tppabs="http://ccs.ucsd.edu/c/stdio.html#gets"><CODE>gets</CODE></A>,
<A HREF="stdio.html#scanf" tppabs="http://ccs.ucsd.edu/c/stdio.html#scanf"><CODE>scanf</CODE></A>, and
<A HREF="stdio.html#ungetc" tppabs="http://ccs.ucsd.edu/c/stdio.html#ungetc"><CODE>ungetc</CODE></A>
<LI>the <B><A NAME="byte write functions">byte write functions</A></B> --
<A HREF="stdio.html#fprintf" tppabs="http://ccs.ucsd.edu/c/stdio.html#fprintf"><CODE>fprintf</CODE></A>,
<A HREF="stdio.html#fputc" tppabs="http://ccs.ucsd.edu/c/stdio.html#fputc"><CODE>fputc</CODE></A>,
<A HREF="stdio.html#fputs" tppabs="http://ccs.ucsd.edu/c/stdio.html#fputs"><CODE>fputs</CODE></A>,
<A HREF="stdio.html#fwrite" tppabs="http://ccs.ucsd.edu/c/stdio.html#fwrite"><CODE>fwrite</CODE></A>,
<A HREF="stdio.html#printf" tppabs="http://ccs.ucsd.edu/c/stdio.html#printf"><CODE>printf</CODE></A>,
<A HREF="stdio.html#putc" tppabs="http://ccs.ucsd.edu/c/stdio.html#putc"><CODE>putc</CODE></A>,
<A HREF="stdio.html#putchar" tppabs="http://ccs.ucsd.edu/c/stdio.html#putchar"><CODE>putchar</CODE></A>,
<A HREF="stdio.html#puts" tppabs="http://ccs.ucsd.edu/c/stdio.html#puts"><CODE>puts</CODE></A>,
<A HREF="stdio.html#vfprintf" tppabs="http://ccs.ucsd.edu/c/stdio.html#vfprintf"><CODE>vfprintf</CODE></A>, and
<A HREF="stdio.html#vprintf" tppabs="http://ccs.ucsd.edu/c/stdio.html#vprintf"><CODE>vprintf</CODE></A>
<LI>the <B><A NAME="position functions">position functions</A></B> --
<A HREF="stdio.html#fflush" tppabs="http://ccs.ucsd.edu/c/stdio.html#fflush"><CODE>fflush</CODE></A>,
<A HREF="stdio.html#fseek" tppabs="http://ccs.ucsd.edu/c/stdio.html#fseek"><CODE>fseek</CODE></A>,
<A HREF="stdio.html#fsetpos" tppabs="http://ccs.ucsd.edu/c/stdio.html#fsetpos"><CODE>fsetpos</CODE></A>, and
<A HREF="stdio.html#rewind" tppabs="http://ccs.ucsd.edu/c/stdio.html#rewind"><CODE>rewind</CODE></A>
</UL>
<P>Functions in the remaining two groups are declared
in <CODE>&lt;wchar.h&gt;</CODE>:</P>
<UL>
<LI>the <B><A NAME="wide read functions">wide read functions</A></B> --
<A HREF="wchar.html#fgetwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#fgetwc"><CODE>fgetwc</CODE></A>,
<A HREF="wchar.html#fgetws" tppabs="http://ccs.ucsd.edu/c/wchar.html#fgetws"><CODE>fgetws</CODE></A>,
<A HREF="wchar.html#fwscanf" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwscanf"><CODE>fwscanf</CODE></A>,
<A HREF="wchar.html#getwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#getwc"><CODE>getwc</CODE></A>,
<A HREF="wchar.html#getwchar" tppabs="http://ccs.ucsd.edu/c/wchar.html#getwchar"><CODE>getwchar</CODE></A>,
<A HREF="wchar.html#ungetwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#ungetwc"><CODE>ungetwc</CODE></A>, and
<A HREF="wchar.html#wscanf" tppabs="http://ccs.ucsd.edu/c/wchar.html#wscanf"><CODE>wscanf</CODE></A>
<LI>the <B><A NAME="wide write functions">wide write functions</A></B> --
<A HREF="wchar.html#fwprintf" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwprintf"><CODE>fwprintf</CODE></A>,
<A HREF="wchar.html#fputwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#fputwc"><CODE>fputwc</CODE></A>,
<A HREF="wchar.html#fputws" tppabs="http://ccs.ucsd.edu/c/wchar.html#fputws"><CODE>fputws</CODE></A>,
<A HREF="wchar.html#putwc" tppabs="http://ccs.ucsd.edu/c/wchar.html#putwc"><CODE>putwc</CODE></A>,
<A HREF="wchar.html#putwchar" tppabs="http://ccs.ucsd.edu/c/wchar.html#putwchar"><CODE>putwchar</CODE></A>,
<A HREF="wchar.html#vfwprintf" tppabs="http://ccs.ucsd.edu/c/wchar.html#vfwprintf"><CODE>vfwprintf</CODE></A>,
<A HREF="wchar.html#vwprintf" tppabs="http://ccs.ucsd.edu/c/wchar.html#vwprintf"><CODE>vwprintf</CODE></A>, and
<A HREF="wchar.html#wprintf" tppabs="http://ccs.ucsd.edu/c/wchar.html#wprintf"><CODE>wprintf</CODE></A>
</UL>
<P>For the stream <CODE>s</CODE>, the call
<CODE><A HREF="wchar.html#fwide" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwide">fwide</A>(s, 0)</CODE>
is always valid and never causes a change of state. Any other call to
<A HREF="wchar.html#fwide" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwide"><CODE>fwide</CODE></A>, or to any of the five
groups of functions described above, causes the state transition shown
in the state diagram. If no such transition is shown, the function
call is invalid.</P>
<P>The state diagram shows how to establish the orientation of
a stream:</P>
<UL>
<LI>A call to
<CODE><A HREF="wchar.html#fwide" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwide">fwide</A>(s, -1)</CODE>,
or to a byte read or byte write function, establishes the stream as
<A HREF="#byte oriented">byte oriented</A>.
<LI>A call to
<CODE><A HREF="wchar.html#fwide" tppabs="http://ccs.ucsd.edu/c/wchar.html#fwide">fwide</A>(s, 1)</CODE>,
or to a wide read or wide write function, establishes the stream as
<A HREF="#wide oriented">wide oriented</A>.
</UL>
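<P>Since <CODE>fwide(s, 0)</CODE> never changes the state, it can serve
as a query. The sketch below wraps that call in a hypothetical helper
named <CODE>orientation</CODE>, assuming an implementation that declares
<CODE>fwide</CODE> in <CODE>&lt;wchar.h&gt;</CODE>.</P>

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Report the orientation of a stream without changing it:
   -1 for byte oriented, 0 for unbound, 1 for wide oriented. */
int orientation(FILE *fp)
{
    assert(fp != NULL);
    int mode = fwide(fp, 0);          /* a mode of 0 only queries */
    return (mode > 0) - (mode < 0);   /* normalize to -1, 0, or 1 */
}
```

<P>For a freshly opened stream the helper should report 0; after a byte
write such as <CODE>fputc</CODE> it should report -1, and after
<CODE>fwide(s, 1)</CODE> on an unbound stream it should report 1.</P>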
<P>The state diagram also shows that you must call one of the position
functions between most write and read operations:</P>
<UL>
<LI>You cannot call a read function if the last operation on the
stream was a write.
<LI>You cannot call a write function if the last operation on the
stream was a read, unless that read operation set the
<A HREF="#end-of-file indicator">end-of-file indicator</A>.
</UL>
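<P>The following sketch shows one way to respect these rules on a stream
opened for both reading and writing. The function name
<CODE>write_then_read</CODE> is hypothetical, and the example assumes the
stream supports positioning requests.</P>

```c
#include <assert.h>
#include <stdio.h>

/* Write text to an update stream, then read the whole stream back from
   the start. Returns the number of bytes read, or -1 on failure. */
long write_then_read(FILE *fp, const char *text, char *buf, size_t cap)
{
    assert(fp != NULL && text != NULL && buf != NULL);
    if (fputs(text, fp) == EOF)
        return -1;
    /* The last operation was a write, so a position function must
       come before any read. */
    rewind(fp);
    size_t n = fread(buf, 1, cap, fp);
    if (ferror(fp))
        return -1;
    /* Position again so that the caller may write next, whether or not
       the read reached end of file. */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    return (long)n;
}
```

<P>The final <CODE>fseek</CODE> is not strictly required when the read
set the end-of-file indicator, but calling it unconditionally keeps the
function correct when the buffer fills before the end of the file.</P>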
<P>Finally, the state diagram shows that a position operation never
<I>decreases</I> the number of valid function calls that can follow.</P>
<HR>
<P>See also the
<B><A HREF="index.html#Table of Contents" tppabs="http://ccs.ucsd.edu/c/index.html#Table of Contents">Table of Contents</A></B> and the
<B><A HREF="_index.html" tppabs="http://ccs.ucsd.edu/c/_index.html">Index</A></B>.</P>
<P><I>
<A HREF="crit_pb.html" tppabs="http://ccs.ucsd.edu/c/crit_pb.html">Copyright</A> &#169; 1989-1996
by P.J. Plauger and Jim Brodie. All rights reserved.</I></P>
</BODY></HTML>