Files and IO
We all know what a file is. We create files when we write code, when we download things from the Internet, when we make creative projects, when we import photos from cameras or phones. But if you think about it, a file is just someplace on your computer where data can be stored, to be processed at a later time. That's all it is. And not only that, but every file is accessed as simply a sequence of bytes; it is written to as a series of bytes, and read as a series of bytes.
We define the four fundamental operations that a program can do with a file: opening, reading, writing, and closing. These operations should be relatively self-explanatory.
The genius of the Unix operating system (and all of its various incarnations: Linux, macOS/Darwin, BSD, Solaris, etc.) is that it tries to abstract away all inputs to a program and all outputs from a program as files, even when these files are not necessarily what we would think of as files. This makes programming in Unix easier to understand and learn; with this ubiquitous interface for programs inputting and outputting information, the programmer need only learn the behavior of these four operations once and can apply it to all other types of input and output.
As mentioned previously, files are essentially just locations where a process can read input or receive output in the form of bytes. However, a process can have multiple files open at the same time, which illustrates the need for some sort of "ID" for each open file. That's where file descriptors come in. File descriptors are non-negative integers that are used to refer to a specific open file. Additionally, these descriptors "remember" the location of last access, i.e. if a program calls read()
successively, starting from the beginning of a file, each read()
will pick up where the previous one left off, never returning the same data twice. File descriptors are how files are represented internally in the operating system.
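To make the "remembered position" concrete, here is a minimal sketch. The function name two_reads_demo and the scratch file name are made up for illustration; the sketch assumes a POSIX system with a writable /tmp. It writes six bytes to a file, reopens it, and reads it back in two separate 3-byte calls on the same descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: fills out (at least 7 bytes) with the result of two
 * consecutive 3-byte reads. Returns 0 on success, -1 on failure. */
int two_reads_demo(char out[7])
{
    char path[] = "/tmp/fd_offset_XXXXXX";  /* scratch file, name is arbitrary */
    int fd = mkstemp(path);                 /* creates and opens the file */
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, "abcdef", 6);
    close(fd);
    if (written != 6) {
        unlink(path);
        return -1;
    }

    fd = open(path, O_RDONLY);              /* a fresh descriptor starts at byte 0 */
    unlink(path);                           /* the file lives on until fd is closed */
    if (fd < 0)
        return -1;
    ssize_t a = read(fd, out, 3);           /* bytes 0..2 */
    ssize_t b = read(fd, out + 3, 3);       /* bytes 3..5, not 0..2 again */
    close(fd);
    if (a != 3 || b != 3)
        return -1;
    out[6] = '\0';
    return 0;
}
```

After a successful call, out holds the whole six-byte string: the second read() continued from where the first one stopped.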
However, the C standard library provides the header stdio.h
, short for "standard input/output". This library is a wrapper around the file handling done with file descriptors. It is usually faster when a program performs many small reads or writes, and it provides more convenient functionality for reading and writing to files (more on that later). Instead of file descriptors, we now work with file pointers, which have a type of pointer to FILE
(FILE *
). Most of the functions that take file pointers have an initial letter f
in front of the name (ex. fread()
, fwrite()
).
There are also functions that allow the program to essentially convert a file descriptor into a file pointer and vice versa.
Next, we dive into the differences between these two representations: file pointers vs. file descriptors.
In one sentence, the major difference between file pointers and file descriptors is that input/output done with file pointers is buffered, while input/output done with file descriptors is unbuffered. The best way to illustrate the difference is with an example:
Suppose you have a file that has 1000 bytes in it, and you want to process these bytes and change any bytes that are 0 (0b 0000 0000
in binary) to 255 (0b 1111 1111
in binary). You must open the file, read from the file one byte at a time, changing any 0s to 255s, and then close the file when you get to the end. The problem is this: the file is stored on disk storage on your computer (because that's where most files reside on computers), and accessing disk storage is EXTREMELY SLOW by computer standards. (For a comparison, a read from disk usually takes on the order of a millisecond, i.e. 1/1000th of a second. Modern CPUs, however, run at about 1-4 GHz, so an addition of two integers in registers takes on the order of a nanosecond, i.e. 1/1000000000th of a second. The read from disk is literally a million times slower than the addition!) In other words, if you read your file from disk one byte at a time, your processing program could take about a second to run. That's crazy! You don't want a program that takes one entire second to process 1000 bytes of data (~1 kB).
The scenario described above is classic unbuffered input/output. The program asks for a byte, so the operating system goes to disk, fetches the next byte, and hands it to the program. But, as discussed before, this is not efficient at all.
Suppose you have the same file and the same processing to do on it. Instead of going to disk 1000 times, asking for a byte each time, let's instead ask the operating system for the entire 1000 bytes in one read and load it into the program's own memory (if you've taken 61C: main memory, with the recently used parts ending up in L1 or L2 cache) where access times are now sub-microsecond (less than 1 millionth of a second). Then, from there, read the data in one byte at a time until the program is done. This program will do the single read from disk in about a millisecond, then execute 1000 sub-microsecond reads from local memory and 1000 nanosecond-scale comparisons and replacement operations, all adding up to a running time of 1 - 2 milliseconds. Compare this to the first version, which took about a second to run. Huge difference!
The scenario described above is classic buffered input/output. You can imagine that the intermediate location that the data is copied into is a "buffer", a staging area between where the data normally rests and where it will be operated on. This staging area is much more easily accessible than the disk, so by minimizing the number of times the program needs to go to disk to fetch new data, and by reading huge chunks of data when we do go, we speed up the program by orders of magnitude.
But are there any downsides to buffered input? Yes! Suppose you have a file that is 1000 bytes and that you have both a file descriptor (unbuffered input) and a file pointer (buffered input) open on the same file, with the file pointer wrapping that descriptor. (This is rarely necessary; the example only illustrates that you must always take the buffer into account when writing this kind of code.) Now suppose that you ask to read 10 bytes using the file pointer. It's possible that the standard library function went to your file and read all 1000 bytes of the file. After processing the first 10 bytes, you now have 990 bytes in your input buffer, and the file offset of the file descriptor is at the end of the file. If you now try to read from the file descriptor, that file descriptor will come back and give you an end of file (since the standard library function called the unbuffered read
to get the 1000 bytes, which moved the file descriptor to the end of the file), even though you've only processed 10 bytes! The reason is that the rest of the file is in the 990 bytes of the input buffer, which the program has no access to except through the file pointer.
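The pitfall can be observed directly. The sketch below uses a hypothetical helper name, offset_after_buffered_read, and assumes a POSIX system with a writable /tmp. It creates a 1000-byte file, reads only 10 bytes through a file pointer, and then asks the underlying descriptor where it is. How far ahead the library reads is an implementation detail, so the exact offset is not guaranteed; on common C libraries it will be 1000 rather than 10.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns the descriptor's offset after a 10-byte
 * buffered read from a 1000-byte file, or -1 on failure. */
long offset_after_buffered_read(void)
{
    char path[] = "/tmp/buffered_XXXXXX";   /* scratch file, name is arbitrary */
    char data[1000];
    char first[10];
    long offset = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file disappears once fd is closed */
    memset(data, 'x', sizeof data);
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data ||
        lseek(fd, 0, SEEK_SET) != 0) {      /* rewind to the start */
        close(fd);
        return -1;
    }

    FILE *fp = fdopen(fd, "r");             /* buffered view of the same file */
    if (fp == NULL) {
        close(fd);
        return -1;
    }
    if (fread(first, 1, sizeof first, fp) == sizeof first)
        offset = (long)lseek(fd, 0, SEEK_CUR);  /* where is the descriptor now? */
    fclose(fp);                             /* also closes fd */
    return offset;
}
```

Any result larger than 10 means the library read ahead into its buffer, which is exactly the data a direct read() on the descriptor would now miss.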
The first three file descriptors 0
, 1
, and 2
are very special. Whenever you run a program, these three are automatically opened:
- 0: stdin, i.e. standard in, is the file descriptor pointing at the default input location of the program; typically this is the keyboard attached to your terminal.
- 1: stdout, i.e. standard out, is the file descriptor pointing at the default output location of the program; typically this is your terminal screen. printf(), puts(), and putchar() all direct output here.
- 2: stderr, i.e. standard error, is the file descriptor pointing at the default output location of errors reported by the program; typically this is your terminal screen. perror() directs output here.
Since it is ugly to refer to 0
, 1
, and 2
in your code, the following names are preferred to reference the file descriptors and file pointers to standard in, standard out, and standard error:
- File descriptors: STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO for 0, 1, and 2
- File pointers: stdin, stdout, and stderr are pointers to FILE for the three locations
Next, we take a closer look at the functions used when working with files.
The open
and fopen
functions have the following definition:
int open(const char *path, int oflag, ... );
FILE *fopen(const char *path, const char *mode);
These two functions open a file with certain options and permissions; open()
returns a file descriptor to the newly opened file, and fopen()
returns a file pointer to the newly opened file.
For both functions, the first argument is a path to the file to be opened.
The second argument of the open()
function is a bitwise OR of exactly one access mode (O_RDONLY, O_WRONLY, or O_RDWR) and any number of the other flags below (there are more, but we only list the most common flags here):
- O_RDONLY: open the file for read only
- O_WRONLY: open the file for write only
- O_RDWR: open the file for reading and writing
- O_APPEND: prior to each write, set the location of the write to be the end of file location (this flag is combined with O_WRONLY or O_RDWR; it does not select an access mode by itself)
- O_CREAT: create the file if it doesn't exist
The optional argument is to specify the permissions of the new file if the O_CREAT
flag is set, otherwise it is not provided. The permissions are specified numerically (e.g. 0660
to specify read and write permissions to user and group, but no permissions to other) or as bitwise OR of the various permissions constants.
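As an illustration of combining flags and passing the optional permissions argument, here is a small sketch. The helper name append_record is hypothetical. It opens a file write-only in append mode, creating it with mode 0660 if needed (the process umask may remove some of those permission bits), and writes one string.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: appends text to the file at path, creating the file
 * if needed. Returns the number of bytes written, or -1 on failure. */
ssize_t append_record(const char *path, const char *text)
{
    /* O_WRONLY is the access mode; O_CREAT and O_APPEND are modifiers.
     * The third argument is only consulted when the file gets created. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0660);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, text, strlen(text));
    close(fd);
    return n;
}
```

Calling append_record() twice on the same path leaves both strings in the file, one after the other, because O_APPEND moves every write to the end.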
The second argument of the fopen()
function is one of the following strings:
- "r": open the file for read only
- "w": open the file for write only; truncate the file to zero length, or create the file if it doesn't exist
- "a": open the file for appending, creating it if it doesn't exist; prior to each write, set the location of the write to the end of file location
- "r+": open the file for read and write
- "w+": open the file for reading and writing; truncate the file to zero length, or create the file if it doesn't exist
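A short sketch of three of these mode strings in action. The function name write_then_append is invented for this example. It writes a line with "w", adds a second line with "a", and then counts the bytes in the file with "r".

```c
#include <stdio.h>

/* Hypothetical demo: returns the number of bytes in the file after one
 * "w" write and one "a" write, or -1 if the file cannot be opened. */
long write_then_append(const char *path)
{
    FILE *fp = fopen(path, "w");        /* creates the file or truncates it */
    if (fp == NULL)
        return -1;
    fputs("first\n", fp);               /* 6 bytes */
    fclose(fp);

    fp = fopen(path, "a");              /* keeps existing contents */
    if (fp == NULL)
        return -1;
    fputs("second\n", fp);              /* 7 more bytes, placed at the end */
    fclose(fp);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long count = 0;
    while (fgetc(fp) != EOF)            /* count every byte until end of file */
        count++;
    fclose(fp);
    return count;
}
```

Running the function a second time on the same path gives the same count, since the "w" open throws away whatever was there before.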
The read
and fread
functions have the following definition:
ssize_t read(int fildes, void *buf, size_t nbyte);
size_t fread(void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
These two functions both read data from the given file descriptor / file pointer. Both functions block if there is no input available yet but the end of file has not been reached. read() returns the number of bytes read, 0 at end of file, or -1 on error. fread() returns the number of complete elements read, which equals the number of bytes only when size is 1.
For read()
, the first argument is the file descriptor to read from. The second argument is a pointer to a memory location into which the read data will be copied. The third argument is the number of bytes to read from the file descriptor. Because of how unbuffered reads work, a successful call to read()
is actually any number of bytes read between 1 byte and the number of bytes you specified. In other words, there's no guarantee that you'll get the exact number of bytes you asked for; it could be less, but not more. A return value of 0 means end of file, and a return value of -1 means an error occurred.
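Because read() may legitimately return fewer bytes than requested, code that needs an exact amount usually loops. The helper below is a sketch of that pattern, not a standard function; the name read_full is our own.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until count bytes have arrived,
 * end of file is reached, or an error occurs. Returns the number of bytes
 * actually stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t total = 0;
    while (total < count) {
        ssize_t n = read(fd, (char *)buf + total, count - total);
        if (n == 0)                 /* end of file: return what we have */
            break;
        if (n < 0) {
            if (errno == EINTR)     /* interrupted by a signal: just retry */
                continue;
            return -1;
        }
        total += (size_t)n;         /* short read: loop for the rest */
    }
    return (ssize_t)total;
}
```

A return value smaller than count therefore always means end of file was reached, never that the data was merely slow to arrive.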
For fread()
, the first argument is a pointer to a memory location into which the read data will be copied. The second argument is the size of each element of the incoming data. The third argument is how many elements to read from the file (so the total number of bytes requested is size * nitems
). The fourth argument is the file pointer to read from. fread() keeps reading, blocking if necessary, until it has collected size * nitems bytes, reaches end of file, or hits an error. It returns the number of complete elements read; this is smaller than nitems only at end of file or on error, and feof() and ferror() tell the two cases apart.
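The element-based return value of fread() is easy to get wrong, so here is a small sketch. The function name count_whole_records is made up, and tmpfile() is used only to get a scratch file. The file holds three 4-byte integers followed by two stray bytes, and we ask for up to ten elements.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical demo: returns how many complete int32_t elements fread()
 * reports for a 14-byte file, or 0 if anything goes wrong. */
size_t count_whole_records(void)
{
    int32_t values[3] = {10, 20, 30};
    int32_t in[10] = {0};
    FILE *fp = tmpfile();                   /* anonymous read/write scratch file */
    if (fp == NULL)
        return 0;
    fwrite(values, sizeof values[0], 3, fp);    /* 12 bytes: three elements */
    fwrite("zz", 1, 2, fp);                     /* 2 bytes: not a full element */
    rewind(fp);

    /* Ask for ten elements; only three complete ones exist. */
    size_t n = fread(in, sizeof in[0], 10, fp);
    int hit_eof = feof(fp);                 /* short count was caused by EOF */
    fclose(fp);
    return (hit_eof && in[2] == 30) ? n : 0;
}
```

The two leftover bytes do not count as an element, so the function reports three items even though fourteen bytes were available.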
The write
and fwrite
functions have the following definitions:
ssize_t write(int fildes, const void *buf, size_t nbyte);
size_t fwrite(const void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
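These two functions both write data to the specified file descriptor / file pointer. write() returns the number of bytes actually written, which can be fewer than requested, or -1 on error; it can also block, for example when the buffer of a pipe or socket is full. fwrite() returns the number of complete elements written.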
For write()
, the first argument is the file descriptor to write to, the second argument is a pointer to the data that we wish to write, and the third argument is the number of bytes to write.
For fwrite()
, the first argument is a pointer to the data that we wish to write. The second argument is the size, in bytes, of one element of the data that we wish to write. The third argument is how many elements we wish to write (a total of size * nitems
bytes written). The fourth argument is the file pointer to write to. Because the output is buffered, a call to fwrite()
is not guaranteed to write through to the file immediately. Rather, the data is copied into an intermediate buffer and is only written to the file when that buffer fills up, when the file pointer is closed, or when the buffer is intentionally flushed by the program using the fflush()
function (explained later).
The close
and fclose
functions have the following definitions:
int close(int fildes);
int fclose(FILE *stream);
These two functions both close the specified file descriptor / file pointer. The file descriptor / file descriptor associated with the file pointer is freed up for use by another file opened later.
For close()
, the argument is the file descriptor to close.
For fclose()
, the argument is the file pointer to close; the contents of the output buffer associated with the file pointer are flushed before the file pointer is closed.
The following functions provide extra utility or convenience when handling files with the buffered I/O with the standard library.
The fgets
function has the following definition:
char *fgets(char *restrict s, int n, FILE *restrict stream);
This function is used to return the next line from a file pointed to by the specified file pointer. A next line is denoted by a sequence of characters up to and including the final newline (\n
) character. The first argument is a pointer to the buffer into which the data will be read. The second argument, n, is the size of that buffer; at most n - 1 characters are read so that there is always room for the terminating null byte. The last argument is the file pointer to read from. Once a newline is encountered, n - 1
bytes are read, or the end of file is reached, the function stops reading and places a null byte (\0
) after the last character that was read. It returns the first argument on success, and NULL if end of file or an error occurs before any characters are read.
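A typical fgets() loop looks like the sketch below. The function name count_lines and the 256-byte buffer size are arbitrary choices for this example. It counts the lines in an open file, treating a final line without a trailing newline as a line and stitching together lines that are longer than the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines readable from fp. */
int count_lines(FILE *fp)
{
    char line[256];
    int lines = 0;
    int pending = 0;    /* 1 while we are in the middle of an unfinished line */

    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n') {
            lines++;            /* fgets stopped at a newline: line complete */
            pending = 0;
        } else {
            pending = 1;        /* buffer filled up or the file ended early */
        }
    }
    if (pending)
        lines++;                /* last line had no trailing newline */
    return lines;
}
```

For a file containing two newline-terminated lines and a third line without a newline, the function returns 3.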
The fflush
function has the following definition:
int fflush(FILE *stream);
This function takes any data that is still left in the output buffer between the program and the actual file and flushes it out of the buffer. After a successful call to this function, all data previously written through the file pointer has been handed to the operating system, so anything else that reads the file will see it (this does not by itself force the data onto the physical disk). The argument is the file pointer to the open file. Flushing after every write gives up the speed advantage of buffering, but it is necessary when a file is written out and read in simultaneously, because the reader needs to see the contents of the latest writes.
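The effect of fflush() can be observed by checking the size of the file before and after the call. In this sketch the function name flush_demo is hypothetical and a writable /tmp is assumed. It requests full buffering explicitly with setvbuf() so that a 5-byte write stays in the buffer; the size before the flush is normally 0, but that is library behavior rather than a guarantee.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: stores the on-disk file size before and after
 * fflush() in *before and *after. Returns 0 on success, -1 on failure. */
int flush_demo(long *before, long *after)
{
    char path[] = "/tmp/flush_XXXXXX";      /* scratch file, name is arbitrary */
    struct stat st;
    int rc = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    FILE *fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);      /* full buffering, library-owned buffer */

    fwrite("hello", 1, 5, fp);              /* lands in the buffer */
    if (stat(path, &st) == 0) {
        *before = (long)st.st_size;         /* usually still 0 here */
        fflush(fp);                         /* hand the 5 bytes to the OS */
        if (stat(path, &st) == 0) {
            *after = (long)st.st_size;      /* now 5 */
            rc = 0;
        }
    }
    fclose(fp);
    unlink(path);
    return rc;
}
```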
The lseek
and fseek
functions have the following definitions:
off_t lseek(int fildes, off_t offset, int whence);
int fseek(FILE *stream, long offset, int whence);
These functions move the file descriptor / file pointer for a given open file. The first argument is the file descriptor (for lseek()
) / file pointer (for fseek()
) to the open file. The new position of the file descriptor / file pointer is specified by a combination of the second and third arguments (which play the same role in both functions; off_t
is a signed integer type that is often, but not always, the same size as long
). The new position is given by adding offset
bytes to the position specified by whence
. whence
can be one of three values: SEEK_CUR
, SEEK_SET
, or SEEK_END
for the current position of the file pointer / descriptor, the beginning of the file, and the end of the file, respectively. For example, to move the file descriptor 10 bytes forward from its current position, you would make a call like lseek(fd, 10, SEEK_CUR);
.
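Two common uses of lseek() are finding the size of a file and jumping to a position relative to its end. The sketch below, with the invented name seek_demo and a scratch file in /tmp, does both on a 10-byte file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: writes "0123456789" to a scratch file, stores its last
 * three bytes in out, and returns the file size, or -1 on failure. */
long seek_demo(char out[4])
{
    char path[] = "/tmp/seek_XXXXXX";       /* scratch file, name is arbitrary */
    long size = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file disappears once fd is closed */

    if (write(fd, "0123456789", 10) == 10) {
        /* Offset 0 from the end: lseek() returns the new position,
         * which is the size of the file. */
        off_t end = lseek(fd, 0, SEEK_END);
        /* A negative offset from the end moves backwards. */
        if (end >= 0 &&
            lseek(fd, -3, SEEK_END) == end - 3 &&
            read(fd, out, 3) == 3) {
            out[3] = '\0';
            size = (long)end;
        }
    }
    close(fd);
    return size;
}
```

fseek() works the same way on a file pointer, except that it returns 0 on success instead of the new position; ftell() reports the position in that case.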
The fdopen
and fileno
functions have the following definitions:
FILE *fdopen(int fildes, const char *mode);
int fileno(FILE *stream);
These functions return a file pointer from a file descriptor (fdopen()
) or a file descriptor from a file pointer (fileno()
). The file descriptor / file pointer provided in the first argument of both functions must refer to a valid, currently open file descriptor / file pointer on a file. After a call to either function, both the file descriptor and the file pointer are valid ways of performing I/O on the file; the programmer can choose to perform the I/O using unbuffered or buffered I/O.
The mode
argument to fdopen()
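is a string that specifies whether the file pointer will be for reading only, writing only, etc. with the same meaning as the strings described in the description of fopen()
. The mode must be compatible with the way the file descriptor was opened, and, unlike with fopen(), a mode of "w" does not truncate the file.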
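As a sketch of both conversions, the function below (the name fdopen_demo is ours) wraps the write end of a pipe in a file pointer so that fprintf() can be used on it, checks that fileno() hands back the original descriptor, and then reads the formatted text from the other end with plain read().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if everything behaved as described, else 0. */
int fdopen_demo(void)
{
    int fds[2];
    char buf[32];

    if (pipe(fds) < 0)
        return 0;
    FILE *out = fdopen(fds[1], "w");        /* descriptor -> file pointer */
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return 0;
    }
    int same_fd = (fileno(out) == fds[1]);  /* file pointer -> descriptor */

    fprintf(out, "value=%d\n", 42);         /* buffered, formatted output */
    fclose(out);                            /* flushes, then closes fds[1] too */

    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    return same_fd && n == 9 && memcmp(buf, "value=42\n", 9) == 0;
}
```

Note that fclose() on the file pointer also closes the descriptor underneath it, so the descriptor must not be closed a second time.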
These functions allow for some advanced file descriptor manipulation (these do not work for file pointers, and there are no equivalent functions for file pointers).
The dup
function has the following definition:
int dup(int fildes);
This function takes an open, valid file descriptor and duplicates it. The returned value is the second file descriptor on the file specified by the argument. For example, if you wish to duplicate standard output, you can write int new_stdout = dup(STDOUT_FILENO);
and now both new_stdout
and STDOUT_FILENO
can be used to write to standard output.
The dup2
function has the following definition:
int dup2(int fildes, int fildes2);
This function takes an open file descriptor and duplicates it to another specific file descriptor. If the specified file descriptor to be duplicated to is currently open, it is first closed (unless the duplication fails for some reason). The first argument is the file descriptor that will be duplicated, and the second argument is the file descriptor that the first descriptor will be duplicated to. After the function returns, both the first and second arguments will be valid ways of performing I/O on the file described by the first descriptor. This function is useful for redirecting I/O to unusual locations, especially the three special file descriptors STDOUT_FILENO
, STDIN_FILENO
, and STDERR_FILENO
. For example, suppose you have a file open with file descriptor fd
. If you wish for standard output to instead go to this file, you can write dup2(fd, STDOUT_FILENO);
and now both STDOUT_FILENO
and fd
will be valid ways of performing I/O on that file.
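Putting dup() and dup2() together gives the usual recipe for temporarily redirecting standard output. This is a sketch under a few assumptions: the function name print_to_file is hypothetical, the file is created with mode 0644, and stdout is flushed before and after the switch so buffered text ends up at the intended destination.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: prints msg with printf() while standard output is
 * redirected to the file at path, then restores standard output.
 * Returns 0 on success, -1 on failure. */
int print_to_file(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    fflush(stdout);                         /* drain text meant for the old target */
    int saved = dup(STDOUT_FILENO);         /* remember where stdout pointed */
    if (saved < 0 || dup2(fd, STDOUT_FILENO) < 0) {
        if (saved >= 0)
            close(saved);
        close(fd);
        return -1;
    }
    close(fd);                              /* descriptor 1 now refers to the file */

    printf("%s", msg);                      /* goes to the file */
    fflush(stdout);                         /* push it out before switching back */

    dup2(saved, STDOUT_FILENO);             /* restore the original stdout */
    close(saved);
    return 0;
}
```

After the function returns, printf() writes to the terminal (or wherever standard output originally went) again.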
Now that we know more about the functions that are used for manipulating file descriptors and what file descriptors / file pointers are, we return to a more in-depth discussion about different I/O objects and creating descriptors. See the corresponding wiki pages for a more detailed look at these topics.
The most basic I/O object is the ordinary file, something like foo.txt
or bar.config
that exists on your disk. For these, the open()
and fopen()
functions are the ones that give you the file descriptors / file pointers you need for reading and writing to the file.
Next, there are pipes and FIFOs. These are "files" that essentially do not retain information about what has been written to them after they have been read out of them. In other words, once some data gets written to them, they store it until the data is read out, at which point that data is no longer available in the pipe / FIFO. For these, the function pipe()
gives you two file descriptors to refer to the two ends of the pipe; and the functions mkfifo()
followed by a call to open()
gives you a file descriptor to refer to the FIFO.
Next, there are sockets. These are the foundational networking communications objects that essentially allow you to communicate with another process, be it on the same machine or on a different machine connected to the Internet. For these, the functions socket()
and accept()
give you the file descriptor to use to perform I/O using sockets.
Next, there are serial ports. These are ports that allow programs to communicate with a device that is physically attached to the computer via a wire (typically things like USB devices). Serial ports operate as streams of bytes that are sent between the two devices, and have their origins in telegraph communications and Morse code. For these, the open()
function called on a device file under the directory /dev
will give you the file descriptor to use to perform I/O using serial ports.
Finally, there's shared memory. This is a mechanism to ask the operating system itself to allocate some memory that can then be accessed by multiple processes to communicate. The memory that the operating system allocates can be thought of as a file, and indeed, the function that opens the shared memory, shm_open()
, returns a file descriptor that refers to the shared memory (however, programs normally do not call read()
and write()
on that descriptor; instead they pass it to mmap()
to map the shared memory into their address space and then access it like ordinary memory).
EOF
, or "end of file", is relatively intuitive to understand when we are talking about an actual, ordinary file like foo.txt
. If you perform a read()
on a file descriptor that is pointing at an ordinary file and it returns with EOF
, which read() signals by returning 0, it means that you have reached, well, the end of the file.
However, if the file descriptor is referring to a socket, pipe, or serial port, the meaning of EOF
is not as clear. Consider the socket, for example. A connection between two machines over the internet could last for an arbitrary amount of time, and neither side knows where the "end of the file" is. So, a read()
operation performed on a socket, pipe, FIFO, or serial port will actually block until something becomes available to read on the given I/O object. Once that data becomes available, then read()
will return with that newly available data. For those types of I/O that have some notion of a "connection", i.e. the process that is receiving data has some way of determining whether the process that is sending data still exists, a read()
on the receiving end can return with an EOF
, which represents the fact that the sending end has closed the connection and no more data will be read from the process. This explanation was intentionally "hand-wavy", since the precise meaning of EOF
is slightly different for each of these I/O objects and is described in more detail in their corresponding wiki pages.
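For a pipe, the "connection closed" flavor of end of file can be demonstrated in a few lines. In this sketch (the function name pipe_eof_demo is invented) the only writer sends three bytes and closes its end; the reader first gets the data and then gets a return value of 0 from read().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: stores the results of two consecutive read() calls in
 * *first and *second. Returns 0 on success, -1 on failure. */
int pipe_eof_demo(ssize_t *first, ssize_t *second)
{
    int fds[2];
    char buf[16];

    if (pipe(fds) < 0)
        return -1;
    if (write(fds[1], "bye", 3) != 3) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                              /* the only writer goes away */

    *first = read(fds[0], buf, sizeof buf);     /* the 3 bytes still in the pipe */
    *second = read(fds[0], buf, sizeof buf);    /* 0: no data and no writers left */
    close(fds[0]);
    return 0;
}
```

If the write end were still open, the second read() would block instead of returning 0, because more data could still arrive.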
Imagine a scenario where you have a program that loops and is supposed to process data that becomes available on more than one file descriptor. For example, suppose you have 3 network connections in a single process, and you want to process incoming data from any of the 3 network connections. Each network connection is one file descriptor that you have to monitor, and a call to read()
on any of the three file descriptors will block until data is available from that descriptor. We don't know anything about which order the network connections will send data to us. We might imagine trying to write some code like:
while (1) {
    // read from first file descriptor
    // process the data
    // read from second file descriptor
    // process the data
    // read from third file descriptor
    // process the data
}
But this won't work! What if we are blocking on waiting for data on the first file descriptor and it takes a really long time, and while we are blocked, urgent messages arrive on the second and third file descriptors? Those urgent messages won't get processed until the first connection decides to send a message, at which point we process the message and then go to the second file descriptor and find the urgent messages. This is not what we want.
One potential solution is to open all three file descriptors in nonblocking mode by specifying the flag O_NONBLOCK
in the oflag
argument to open()
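when we opened these descriptors (or, for descriptors such as sockets that are not created by open(), by setting the flag afterward with fcntl()). That will cause calls to read()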
to return immediately (i.e. they won't block), and the above while loop will work as intended. However, this solution is a waste of CPU, because the while loop will run as fast as the computer can go, without regard as to whether there is actually data available on those descriptors. Assuming that the connections send data at a relatively slow rate (maybe a few times per second), the while loop could spin millions of times, not processing any data, between messages.
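Here is a sketch of what nonblocking reads look like in code. The helper names set_nonblocking and try_read, and the -2 return convention, are our own. The flag is set on an already open descriptor with fcntl(), and a read() with nothing to deliver fails right away with errno set to EAGAIN (or EWOULDBLOCK) instead of blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: adds O_NONBLOCK to an open descriptor.
 * Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* fetch the current status flags */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: like read(), but returns -2 when the descriptor is
 * nonblocking and no data is available right now. */
ssize_t try_read(int fd, void *buf, size_t n)
{
    ssize_t r = read(fd, buf, n);
    if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                          /* nothing to read yet */
    return r;
}
```

A loop built on try_read() never gets stuck on one descriptor, but as described above it spins at full speed whenever all descriptors are idle.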
What we need is to somehow specify to the operating system to block until data becomes available on one or more of the three file descriptors; we want our actions to be event-driven, meaning that we only act when an event (data arriving on a descriptor) occurs. This is where the select()
function comes in. The select()
function has the following definition:
int select(int nfds, fd_set *readset, fd_set *writeset, fd_set *exceptset, struct timeval *timeout);
This function allows us to specify to the operating system a set of file descriptors (the "read set") to wait on for data to become available for reading; a set of file descriptors (the "write set") to wait on for space on a descriptor to become available for writing; a set of file descriptors (the "exception set") to wait on for errors/exceptions to occur; and a time to wait for any of these events to occur before timing out. The function returns when any of the above conditions are true (if data becomes available for reading on one or more of the file descriptors in the read set, if space becomes available for writing on one or more of the file descriptors in the write set, if an error occurred on one or more of the file descriptors in the exception set, or if the timeout time has passed since the function was called).
For our purposes (and in the scenario described prior to the function definition being given), we only make use of the read set. The write set and the exception set are both set to NULL
in calls to select()
. If we don't want the function to time out, the timeout
argument is also set to NULL
.
In order to tell select()
which file descriptors to check and to test if a certain file descriptor triggered select()
to return, we first create a variable of type fd_set
. We then use the following three macros to manipulate this variable:
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);
The FD_ISSET()
macro takes two arguments: the first argument is the file descriptor to test, and the second argument is the file descriptor set to examine. This macro is called after select()
returns to test if the specified file descriptor triggered select()
to return. If the macro returns nonzero (true), then we know that the specified descriptor triggered select()
to return and we need to handle that file descriptor. In the case where we are examining the read set, if a file descriptor tested against the provided read set after select()
returns true, this indicates the tested file descriptor has something available on it to read, and we can read the new data from that file descriptor and process it.
The FD_SET()
macro takes two arguments: the first argument is the file descriptor to set, and the second argument is the file descriptor set that we want to set the specified file descriptor in. This macro is called before select()
is called in order to tell select()
to wait for something to happen on the specified file descriptor. For example, in order to specify to select()
to return if data becomes available on standard in, we would write FD_SET(STDIN_FILENO, &readset);
, assuming that the variable readset
is of type fd_set
and that it is subsequently provided as the second argument to select()
.
The FD_ZERO()
macro takes one argument: a file descriptor set to clear, and is usually called before all calls to FD_SET()
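and after all processing has been done on the previous iteration of a loop. This macro clears the specified file descriptor set to prepare it for the next call to select()
. Rebuilding the set on every iteration is necessary because select() overwrites each set it is given, leaving only the descriptors that are ready.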
Finally, we discuss the first argument to select()
: nfds
. This argument specifies the maximum number of file descriptors that select()
may potentially need to look at to query all of the file descriptors that have been set across all three provided file descriptor sets. For example, if in the read set you specify file descriptors 0 and 1; in the write set you specify file descriptors 7 and 10, and in the exception set you specify file descriptor 4, the value of nfds
would be 11, since select()
needs to potentially look at 11 descriptors (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) in its processing. Another way to remember this argument is to find the largest integer value across all of the file descriptors that you'll set, and then add 1 to it. In the example above, we see that the maximum of 0, 1, 7, 10, and 4 is 10, so the value of nfds
is going to be 11.
Example "pseudocode" of the original scenario, programmed using select()
:
// set up
int fd1 = open(...);
int fd2 = open(...);
int fd3 = open(...);
fd_set readset;
// nfds is the largest descriptor value plus one
int nfds = (fd1 > fd2 && fd1 > fd3) ? fd1 : ((fd2 > fd1 && fd2 > fd3) ? fd2 : fd3);
nfds += 1;
// enter event-handling loop
while (1) {
    // select() overwrites readset, so rebuild it on every iteration
    FD_ZERO(&readset);
    FD_SET(fd1, &readset);
    FD_SET(fd2, &readset);
    FD_SET(fd3, &readset);
    // blocks until something becomes available to read on fd1, fd2, or fd3
    if (select(nfds, &readset, NULL, NULL, NULL) < 0) {
        perror("select");
        break;
    }
    // if something available on fd1, handle it
    if (FD_ISSET(fd1, &readset)) {
        handle_fd1();
    }
    // if something available on fd2, handle it
    if (FD_ISSET(fd2, &readset)) {
        handle_fd2();
    }
    // if something available on fd3, handle it
    if (FD_ISSET(fd3, &readset)) {
        handle_fd3();
    }
}
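For a version that actually compiles and runs, the sketch below replaces the three network connections with three pipes. The names wait_readable and select_demo are hypothetical. Unlike the pseudocode it passes a timeout to select(), so the call cannot hang forever, and it assumes every descriptor is smaller than FD_SETSIZE.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for any of the n descriptors
 * to become readable. Returns the index of the first ready descriptor, or
 * -1 on timeout or error. */
int wait_readable(const int *fds, int n, int timeout_ms)
{
    fd_set readset;
    struct timeval tv;
    int maxfd = -1;
    int i;

    FD_ZERO(&readset);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the largest descriptor value plus one */
    if (select(maxfd + 1, &readset, NULL, NULL, &tv) <= 0)
        return -1;                      /* 0 means timeout, < 0 means error */
    for (i = 0; i < n; i++) {
        if (FD_ISSET(fds[i], &readset))
            return i;
    }
    return -1;
}

/* Hypothetical demo: opens three pipes, writes to the second one only, and
 * returns the index that wait_readable() reports. */
int select_demo(void)
{
    int pipes[3][2];
    int read_ends[3];
    int made, i, ready_index = -1;

    for (made = 0; made < 3; made++) {
        if (pipe(pipes[made]) < 0)
            break;
        read_ends[made] = pipes[made][0];
    }
    if (made == 3 && write(pipes[1][1], "x", 1) == 1)
        ready_index = wait_readable(read_ends, 3, 1000);
    for (i = 0; i < made; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    return ready_index;
}
```

Only the second pipe has data waiting, so select() returns immediately and the demo reports index 1 without ever blocking on the idle pipes.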
Files and file descriptors are used literally everywhere in Runtime. It would be exhausting and pointless to list them all here. See the pages on the various Runtime components and the other systems pages for a more refined list of where certain types of I/O are used in Runtime.