Terminally confused (part six)

In part four, I described terminal ioctls in detail, but I skipped over a few calls that have to do with job control, a topic so important that it deserves its own part.

Job control describes a number of features that are provided by the combination of the shell and the terminal. The best example is perhaps the ability to suspend jobs by pressing Ctrl+Z and then later resume them using fg or bg. (These are shell builtins, so see your shell’s manual page for documentation.) You’ll most likely be familiar with many of the features I will describe below.

The process is not the largest organizational unit of execution in a Unix-like system. Processes are grouped into process groups, often abbreviated as “pgrp”. A process group is identified by its process group ID, abbreviated PGID. A PGID has the same type in C as a PID does (that is, pid_t).

The process init(8), the initial process of the system, starts out with PID=PGID=1. It is the process group leader for process group 1. A process always initially inherits the PGID of its parent. However, a process that is not a process group leader can join another process group using setpgid(2), or it can create an entirely new process group containing only itself, becoming the process group leader of the new process group. (This system call can also be used on a child of the calling process.) Note that if a process group leader dies, then the process group will be without a leader until it is dissolved (i.e., until all other processes in the group either die or move to other process groups). A process can discover the PGID of itself or any other process by calling getpgid(2).
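A minimal sketch of these calls in C, assuming Linux or another POSIX system. The helper name child_can_start_own_group is made up for illustration; only fork(2), setpgid(2), getpgid(2) and waitpid(2) are real.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from any library.
 * Forks a child and checks two claims from the text:
 *   1. the child starts out with its parent's PGID;
 *   2. after setpgid(0, 0) the child leads a new group, so PGID == PID.
 * Returns 1 if both hold, 0 if not, -1 if fork or wait failed. */
int child_can_start_own_group(void)
{
    pid_t parent_pgid = getpgid(0);   /* 0 means "the calling process" */
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int inherited = (getpgid(0) == parent_pgid);
        int moved = (setpgid(0, 0) == 0);   /* new group, named after our PID */
        int leads = (getpgid(0) == getpid());
        _exit(inherited && moved && leads ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The check runs in a child because a process that already leads its group would learn nothing from calling setpgid(0, 0) on itself.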

The process group is the low-level implementation of the high-level concept of a job. Whenever you start a process from the shell, the shell will place it in a different process group from itself. That is, the shell will fork(2), the child process will call setpgid(2) to separate itself from the shell’s process group, and then it will call exec(2) to launch the requested program. This makes the command you launched from the shell into a job. If you start a pipeline of processes, then they will all be placed in the same process group (but different from the shell’s), so that they all form a single job. You can kill an entire process group by giving kill a negative number in place of a PID; for example, kill -TERM -13447 (or kill -- -13447) sends a SIGTERM to all processes in group 13447. (A bare kill -13447 does not work, because the -13447 is parsed as a signal specification.) The shell will also artificially generate a job ID for each job you start; the first job you start from a given shell will be 1, and so on. You can use the shell’s builtin kill (as opposed to /bin/kill) with these job IDs, preceding them with a percent sign; so kill %1 will kill the job with ID 1. The shell builtin jobs prints out all jobs associated with a shell. The shell builtin disown %J removes job J from the shell’s job table, but does not kill the job.
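The fork, setpgid, exec sequence and the negative-PID form of kill(2) can be sketched roughly as follows. This is an illustration under assumptions, not how any particular shell is written: launch_job and kill_job are hypothetical names, and a real shell does a good deal more (pipelines, terminal handling, signal dispositions).

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv as a one-process "job" in its own
 * process group. Returns the job's PGID (equal to the child's PID),
 * or -1 if fork failed. */
pid_t launch_job(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);              /* child: leave the parent's group */
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }
    /* The parent makes the same call, so the group exists no matter
     * which of the two processes runs first. */
    setpgid(pid, pid);
    return pid;
}

/* Hypothetical helper: send SIGTERM to every process in the group, then
 * reap the group leader. Returns the signal that terminated the leader,
 * 0 if it exited normally, or -1 on error. */
int kill_job(pid_t pgid)
{
    int status;

    if (kill(-pgid, SIGTERM) < 0)   /* negative PID: the whole group */
        return -1;
    if (waitpid(pgid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling launch_job with an argument vector such as {"sleep", "30", NULL} and then passing the result to kill_job mirrors what happens when you start sleep 30 & and then run kill %1.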

To see the PGID of a process, use something like ps -o s,ppid,pid,pgid,args. (Note that POSIX is unclear on whether you can specify s, but it works on my system.) The output looks something like:

S  4541 13243 13243 /bin/bash
T 13243 13336 13336 vim foo.c
S 13243 13447 13447 strace -p 931
S 13243 13448 13447 grep --color=auto ioctl
R 13243 18348 18348 ps -o s,ppid,pid,pgid,args

(Note that ps(1) prints out itself while it is running.)

Observe that these five processes belong to four different process groups. All these processes, including the shell, have their own process group, except for strace and grep, which both belong to process group 13447. This is because I started them in a pipeline, strace -p 931 | grep ioctl > ~/931_dbg &. Notice that all the processes other than the shell have the shell’s PID as their PPID, because I started them from the shell. By convention, when a process starts a new process group, the PGID equals the PID. Notice that bash’s PGID equals its PID, and the same is true of vim and ps. For strace and grep, which both belong to pgrp 13447, the former’s PID is used as the PGID. This is because bash presumably starts strace first and creates a new process group for it, making its PGID match its PID, and then it starts grep, making it join the previously created new process group.

The S column shows you the status of the process. Since this printout was generated while ps was running, obviously it should have the “R” status (for “running”). The bash process is in the “S” state (“sleeping”) because ps is in the foreground and it is wait(2)ing for it to exit. strace is sleeping because it is ptrace(2)ing process 931 and waiting for it to make a system call, and grep is sleeping because it is blocked on a read(2) or something like that, waiting for strace to produce output. The vim process was stopped when I pressed Ctrl+Z, so it has status code T, for “terminal stop” or something similar (more on this soon).

It turns out that the process group is also not the largest organizational unit of execution on a Unix-like system. Process groups are further grouped into sessions, in the sense that two processes in different sessions can’t be in the same process group. When the system starts up, init(8) is initially the only process and is therefore the session leader of the session it’s in. When a process fork(2)s, the child always ends up in the same session as the parent. A process may move itself into a new session using the setsid(2) call, becoming the session leader of the new session. It should be clear that any session leader is also a process group leader (because, when its session was first created, it was the only process in the only process group in that session), but the reverse is not necessarily true. A process is not allowed to move to a new session if it is already a process group leader, because this would change its process group as well. (This implies that session leaders cannot move into new sessions; after all, a session leader is also a process group leader.) When a session leader dies, the session will be without a leader until it is dissolved.
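The rule that a process group leader may not call setsid(2) can be checked with a short sketch like the one below. The function names are hypothetical; each check runs in a forked child so that the calling process is left in its original session.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run check() in a forked child; return 1 if it reported success,
 * 0 if it did not, -1 if fork or wait failed. */
static int run_in_child(int (*check)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(check() ? 0 : 1);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Become a process group leader first, then expect setsid() to fail
 * with EPERM. */
static int leader_is_refused(void)
{
    if (setpgid(0, 0) != 0)
        return 0;
    return setsid() == (pid_t)-1 && errno == EPERM;
}

/* A freshly forked child has a new PID but its parent's PGID, so it is
 * not a group leader and setsid() should succeed. */
static int non_leader_is_allowed(void)
{
    return setsid() != (pid_t)-1;
}

int group_leader_cannot_setsid(void) { return run_in_child(leader_is_refused); }
int fresh_child_can_setsid(void)     { return run_in_child(non_leader_is_allowed); }
```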

So far, based on what I’ve said, it seems that sessions are really quite similar to process groups. However, there are some differences.

First, in POSIX, there is no such thing as a session ID, or SID. You might think that setsid(2) returns the ID of the new session that the process is placed into, but POSIX actually says that it returns the process group ID of the new process group that the process is placed into. On the other hand, in Linux, sessions do get IDs, and the setsid(2) call does return the SID of the new session. This complies with POSIX because SIDs have type pid_t just like PGIDs and after a successful setsid(2) has occurred, PID=PGID=SID for the calling process. In POSIX, the getsid(2) function actually returns the PGID of the leader of the session of the target process; in Linux, it returns the SID, and again, this conforms because PGID=SID for a session leader.
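The identity PID=PGID=SID after a successful setsid(2) is easy to confirm. The sketch below assumes a system where getsid(2) and getpgid(2) are available; the function name is invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, have it call setsid(), and check
 * that the value returned by setsid() equals the child's PID, PGID and
 * SID. Returns 1 if all four agree, 0 if not, -1 on fork/wait failure. */
int new_session_ids_coincide(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pid_t sid = setsid();
        int same = sid != (pid_t)-1
                && sid == getpid()
                && sid == getpgid(0)
                && sid == getsid(0);
        _exit(same ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```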

Second, a process can’t move its children (or any other processes) into new sessions. (Nothing stops it from fork(2)ing and having the child call setsid(2), but if the child has already exec(2)ed another program, the parent is out of luck, unless it does something sneaky like ptrace(2)ing the child and forcing it to call setsid(2).) As a matter of fact, if a process tries to use setpgid(2) to move itself or one of its children into a process group that is not in its session, then it will fail. (Incidentally, if a process tries to move one of its children into a different process group, but that child has already moved itself into a different session, the result will be failure in this case as well.)
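The last case (a parent trying to move a child that has already put itself into another session) can be reproduced with the sketch below. The function name is hypothetical, and the pipe is there only so that the parent waits until the child has really called setsid(2).

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno from setpgid() when the target
 * child is already in a different session, 0 if setpgid() unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int setpgid_across_sessions_errno(void)
{
    int ready[2];
    char byte;
    int err = 0;
    pid_t pid;

    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(ready[0]);
        if (setsid() == (pid_t)-1)
            _exit(1);
        if (write(ready[1], "x", 1) != 1)   /* tell the parent we moved */
            _exit(1);
        for (;;)
            pause();                        /* wait to be killed */
    }
    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1)      /* child failed before setsid */
        err = -1;
    close(ready[0]);

    /* Try to pull the child back into our own process group. */
    if (err == 0 && setpgid(pid, getpgid(0)) < 0)
        err = errno;

    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return err;
}
```

On Linux the expected result is EPERM.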

Third, sessions have a different meaning from process groups. I said that the meaning of a process group is a job. Well, the meaning of a session is… a session; that is, the informal concept of a session, and the formal concept (a set of processes with the same SID), coincide. Think of the informal meaning of “session”: when you log in to a system, say, on a virtual terminal or through ssh, you’re “starting a session” for yourself, and when you log out, you’re “ending a session”. Well, it turns out that getty(8) is a session leader, and recall that after you’ve logged in, it becomes your login shell (and is still the session leader); so you’ve been granted access to that session (in the technical sense), and you can now launch programs from the shell, which will inherit the shell’s session ID. When you log out, the shell might kill the other processes in the session, if any, then exit, thus ending your login session, as well as dissolving the session (in the technical sense). (It might also just kill its own jobs, or nothing at all; for example, bash appears not to kill anything when you log out.) So we can infer that processes have to call setsid(2) in order to let you log in; for example, getty(8) does so shortly after it starts, and sshd(8) must fork(2) and the child calls setsid(2) to start a session. Furthermore, if a process wants to make sure that it isn’t killed by the shell when the session it was started from ends, then it should call setsid(2). Programs that are designed to start daemons, such as apache2ctl when called with start as its argument, will do this. Also, you can use setsid(1) (non-POSIX) to force a process to be started in a new session (see footnote 1). (Once a process leaves a session, it can never return.)

So far, it might seem that sessions are just a tool that shells use to keep track of their descendants, and you may be wondering why the kernel bothers to keep track of session IDs itself. But that’s because I haven’t talked about terminals in this part yet, and it is terminals that really give purpose to sessions. You see, every session either has a controlling tty (abbreviated ctty) or none, and a given tty controls at most one session. The terminal driver in Linux remembers somewhere which SID each terminal is controlling, and a process can find out the PGID of the leader of the session controlled by a terminal using tcgetsid(fd), where fd is a file descriptor to the tty. In Linux, the implementation of this function calls ioctl(2) with request code TIOCGSID. (This function is actually useless, because it only works for the process’s own ctty, but in that case it will just return the same thing as getsid(0), where 0 refers to the calling process.) The Linux kernel also remembers the device number (which has type dev_t) of the terminal controlling each process, if any, and exposes this as the tty_nr field of /proc/pid/stat. (Zero indicates that the process is a daemon, that is, has no controlling terminal.) The init(8) process’s session starts out with no ctty.
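As a rough sketch of reading that device number on Linux: the controlling terminal is the seventh field (tty_nr) of /proc/pid/stat. The two function names below are invented for this example, and the code assumes the field layout described in proc(5).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract tty_nr (the 7th field) from one line of
 * /proc/<pid>/stat. The 2nd field, the command name, is parenthesised
 * and may itself contain spaces or parentheses, so parsing starts after
 * the LAST ')'. Returns -1 if the line is malformed. */
long parse_tty_nr(const char *stat_line)
{
    const char *p = strrchr(stat_line, ')');
    char state;
    long ppid, pgrp, session, tty_nr;

    if (p == NULL)
        return -1;
    if (sscanf(p + 1, " %c %ld %ld %ld %ld",
               &state, &ppid, &pgrp, &session, &tty_nr) != 5)
        return -1;
    return tty_nr;
}

/* Hypothetical helper: tty_nr of the calling process, or -1 if /proc
 * is unavailable or unreadable. */
long tty_nr_of_self(void)
{
    char line[1024];
    long result = -1;
    FILE *f = fopen("/proc/self/stat", "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) != NULL)
        result = parse_tty_nr(line);
    fclose(f);
    return result;
}
```

For a shell on /dev/pts/0 the value is typically 34816, which is major number 136 in bits 8 to 15 and minor number 0.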

POSIX explicitly declines to specify how a session can acquire a ctty. It says that if a session leader that currently has no ctty successfully open(2)s a tty, and does not specify the O_NOCTTY flag, then that tty might become the ctty for that session. But if a session already has a ctty, then opening further ttys will not change the ctty; if the tty already controls a session, then it will not control another; if a non-session leader opens a tty, then it does not become the ctty; and if the O_NOCTTY flag is used, then the opened terminal does not become the ctty. POSIX also says nothing about whether a session can ever free itself from the ctty’s control. (But a process can always free itself from a ctty’s control by moving itself into another session, though maybe it has to fork(2) first.)

In Linux, when a session leader for a session with no ctty opens a tty with no session, and does not use the O_NOCTTY flag, that tty does become the ctty for that session, subject to some restrictions. In particular, opening the system console does not make it into your ctty, nor will the foreground virtual terminal become your ctty if you open /dev/tty0. This latter restriction does not apply if a process opens a numbered virtual terminal that simply happens to be the foreground virtual terminal at the time. (See kernel source.)

When a process opens /dev/tty, it receives a file descriptor for its session’s ctty. This will always succeed as long as a ctty exists, because the permissions on /dev/tty are rw-rw-rw-. It will succeed even if, say, you logged in as root and then started a process as an unprivileged user and that process called open("/dev/tty", ...). If a process’s session has no ctty, this call fails. (POSIX doesn’t specify which error you get, but it’s ENXIO on Linux.)
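The failure case can be observed without touching your own session by doing it in a child that has just called setsid(2). This is a sketch with a made-up function name; it assumes /dev/tty exists as a device node.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno that open("/dev/tty") produces
 * in a brand-new session (which has no ctty), 0 if the open unexpectedly
 * succeeded, or -1 if fork/wait/setsid failed. */
int open_dev_tty_errno_without_ctty(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd;
        if (setsid() == (pid_t)-1)
            _exit(255);
        fd = open("/dev/tty", O_RDWR);
        /* Small errno values fit comfortably in an exit status. */
        _exit(fd >= 0 ? 0 : errno);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);
}
```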

In Linux, a session leader can also acquire a ctty by calling ioctl(fd, TIOCSCTTY, force), where fd is a file descriptor to the terminal. Again, this only works if the session currently has no ctty. This is not subject to the restriction of open(2) wherein you cannot get the system console or /dev/tty0 as your ctty. (I tried this on my system, and it appears that if a session does acquire either the system console or /dev/tty0 as its ctty in this way, then the effect will be the same as if it had asked for whichever numbered virtual terminal was foreground at the time as its ctty.) The force argument can be set to 1 in order to steal a ctty from another session. Only a process with root privilege (actually, the CAP_SYS_ADMIN capability) can do this.
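A sketch of acquiring a ctty with TIOCSCTTY, using a freshly created pseudoterminal as a tty that is guaranteed not to control any session yet. Assumptions: Linux, and a usable /dev/ptmx. The function name is hypothetical, and pseudoterminals are used here only as a convenient spare tty.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a new session leader can adopt a
 * pty as its ctty with TIOCSCTTY, 0 if not, -1 if the setup failed. */
int acquire_ctty_with_ioctl(void)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    const char *slave_name;
    int status;
    pid_t pid;

    if (master < 0 || grantpt(master) < 0 || unlockpt(master) < 0)
        return -1;
    slave_name = ptsname(master);
    if (slave_name == NULL)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = 0;
        if (setsid() != (pid_t)-1) {
            /* O_NOCTTY: the open itself must not hand us a ctty. */
            int tty = open(slave_name, O_RDWR | O_NOCTTY);
            int before = open("/dev/tty", O_RDWR);   /* should fail */
            if (tty >= 0 && before < 0 && ioctl(tty, TIOCSCTTY, 0) == 0) {
                int after = open("/dev/tty", O_RDWR); /* should work now */
                ok = (after >= 0);
            }
        }
        _exit(ok ? 0 : 1);
    }
    /* Keep the master side open until the child has finished. */
    if (waitpid(pid, &status, 0) < 0) {
        close(master);
        return -1;
    }
    close(master);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```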

In Linux, furthermore, a process can get rid of its ctty with ioctl(fd, TIOCNOTTY), where fd must be a file descriptor to the current ctty. Curiously, a process that is not a session leader can do this, and it will stay in the same session nevertheless; so the former ctty will continue to control that session, except for that process. This basically has no effect, other than that that process will no longer be able to open /dev/tty and will no longer be allowed to make system calls that require it to have a file descriptor to its own ctty (see below). If a session leader does this, on the other hand, the tty relinquishes control of the entire session, that is, all processes in the session lose their controlling terminal.

If a process successfully calls setsid(2), it will have no ctty, since it is in a new session and that new session has not yet acquired a ctty.

Okay, so I’ve told you a lot about how processes and sessions acquire cttys and divest themselves of cttys, but I still haven’t told you what it means for a tty to be controlling a session. First of all, though, I’ll say that it appears to be entirely meaningless (on Linux, at least) for a tty to control a single process. That’s why a non-session-leader using TIOCNOTTY to get rid of its ctty isn’t really accomplishing anything.

Anyway, when a terminal begins to control a session, it selects the process group that the session leader is in, and designates it as the foreground process group for that session, with all others, if any, being background process groups. A process can find out the foreground PGID of its own ctty by calling tcgetpgrp(fd), which, on Linux, internally calls ioctl(2) with the TIOCGPGRP request code. A process can also change the foreground process group of its own ctty by calling tcsetpgrp(fd, pgid), which, on Linux, internally calls ioctl(2) with the TIOCSPGRP request code. When it does this, it must select a process group that is in the session controlled by that tty. Note that if all the processes in the foreground group of a session die, then the session will have no foreground group until some other process in that session explicitly sets the foreground group.
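The following sketch exercises tcgetpgrp(3) and tcsetpgrp(3) on a spare pseudoterminal. It assumes Linux behaviour (a session leader without a ctty that opens a pty slave without O_NOCTTY gets it as its ctty) and a usable /dev/ptmx; both function names are invented.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in a session leader whose ctty is `tty`. Returns 1 if the
 * foreground group starts out as the leader's own group and then
 * follows tcsetpgrp() to a newly created "job". */
static int check_foreground(int tty)
{
    int initially_leader = (tcgetpgrp(tty) == getpgrp());
    int moved;
    pid_t job = fork();

    if (job < 0)
        return 0;
    if (job == 0) {
        setpgid(0, 0);          /* the job gets its own process group */
        alarm(10);              /* safety net if nobody kills us */
        for (;;)
            pause();
    }
    setpgid(job, job);          /* same call in the parent avoids a race */
    moved = (tcsetpgrp(tty, job) == 0 && tcgetpgrp(tty) == job);

    kill(job, SIGKILL);
    waitpid(job, NULL, 0);
    return initially_leader && moved;
}

/* Hypothetical helper: 1 on success, 0 on failure, -1 on setup error. */
int foreground_group_follows_tcsetpgrp(void)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    const char *name;
    int status;
    pid_t pid;

    if (master < 0 || grantpt(master) < 0 || unlockpt(master) < 0)
        return -1;
    name = ptsname(master);
    if (name == NULL)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int tty;
        if (setsid() == (pid_t)-1)
            _exit(1);
        tty = open(name, O_RDWR);   /* no O_NOCTTY: becomes our ctty */
        _exit(tty >= 0 && check_foreground(tty) ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0) {
        close(master);
        return -1;
    }
    close(master);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```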

The foreground process group of a ctty corresponds to the concept of a foreground job in your shell. In particular:

  • When you start a job from the shell without the & character, the shell places the job’s pgrp into the foreground, which means that the shell’s process group is relegated to the background. The shell wait(2)s for the job to die or be stopped, and then it places itself back into the foreground. If you start a job from the shell with & at the end, then it runs in the background and the shell stays in the foreground. This, incidentally, is why you can’t specify & for individual processes in a pipeline, but must put it at the end, after the last process in the pipeline—it applies to the entire job.
  • When you press Ctrl+C, all processes in the foreground pgrp receive SIGINT. The default action is to terminate them. (Note that you can change the character that does this or you can disable this feature entirely, as discussed in part four.) The shell will be notified via wait(2).
  • When you press Ctrl+\, all processes in the foreground pgrp receive SIGQUIT. Again, the default action is to terminate. The shell will be notified via wait(2).
  • When you press Ctrl+Z, all processes in the foreground pgrp receive SIGTSTP. The default action is to stop them. If a process is stopped in this way, its status in the output of ps(1) will be “T” instead of “S”. The shell will be notified via wait(2), and probably print out a message telling you that the job has stopped.
  • The shell builtin fg will move a stopped job back into the foreground and resume it. The shell builtin bg will resume a stopped job, but will not put it into the foreground; the shell then stays in the foreground and the resumed job will be in the background.
  • When a process in the background tries to read from the terminal, it gets a SIGTTIN signal. The default action is to stop the process. When a process in the background tries to write to the terminal, it gets a SIGTTOU signal if the TOSTOP flag is set for the terminal. (If this flag is clear, the write succeeds.)
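How the shell learns that a job has stopped, and how it resumes one, can be sketched without a terminal by sending the signals by hand. Assumptions: the function name is made up, kill(2) stands in for the terminal driver generating SIGTSTP, and SIGCONT is the signal that resuming a stopped job relies on.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: stop a one-process job with SIGTSTP, observe the
 * stop through waitpid(), resume it with SIGCONT, observe that too, and
 * clean up. Returns 1 if both notifications arrived, 0 if not, -1 if
 * fork failed. */
int stop_and_resume_job(void)
{
    int status, stopped = 0, continued = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);          /* jobs live in their own process group */
        alarm(10);              /* safety net if nobody kills us */
        for (;;)
            pause();
    }
    setpgid(pid, pid);          /* same call in the parent avoids a race */

    kill(-pid, SIGTSTP);        /* what the terminal does on Ctrl+Z */
    if (waitpid(pid, &status, WUNTRACED) == pid
            && WIFSTOPPED(status) && WSTOPSIG(status) == SIGTSTP)
        stopped = 1;

    kill(-pid, SIGCONT);        /* resume the whole job */
    if (waitpid(pid, &status, WCONTINUED) == pid && WIFCONTINUED(status))
        continued = 1;

    kill(-pid, SIGKILL);
    waitpid(pid, &status, 0);
    return stopped && continued;
}
```

The WUNTRACED flag is what makes waitpid(2) report stops as well as exits; without it the parent would keep waiting while the job sat in state T.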

Furthermore, if a session leader dies, the session loses its controlling tty. When a session loses its ctty, whether because of this, or because the session leader used TIOCNOTTY, or because another session stole the ctty, the entire foreground process group receives SIGHUP. POSIX states that after a process loses its controlling tty, the system may deny it access to that tty (unless it reopens it as a different file descriptor), without specifying how. On my system the result is an EIO if a process tries to read or write the ctty using a file descriptor that was opened before it lost its ctty. So if I start a process like cat(1) in the background and then close the shell, it will probably exit after it notices that a read or write failed. On the other hand, if I start something like sleep 100 &, then this process will survive if the shell dies abruptly, or if I log out of a bash shell. And if I start setsid sleep 100 & and subsequently disown that job, the process will survive however the shell dies (unless 100 seconds pass and it exits normally first).

If we use the command line ps -o tty,stat,ppid,pid,pgid,sid,args, we get output like this, showing us session IDs:

pts/0    Ss    4541 13243 13243 13243 /bin/bash
pts/0    T    13243 13336 13336 13243 vim foo.c
pts/0    S    13243 13447 13447 13243 strace -p 931
pts/0    S    13243 13448 13447 13243 grep --color=auto ioctl
pts/0    R+   13243 28620 28620 13243 ps -o tty,stat,ppid,pid,pgid,sid,args

Note that the “STAT” column is non-POSIX, and is a long form of “S”. If the STAT has “s” as its second character, that process is the leader of its session; as expected, this is the shell. A “+” indicates that a process is in the foreground. I started ps(1) in the foreground, so obviously it should report itself as being in the foreground when it runs. All five processes belong to the same session, whose ID is, predictably, that of the shell, the initial process of the session. The controlling tty is shown in the “TT” column. (The tty /dev/pts/0 is a pseudoterminal. I will discuss pseudoterminals in the next part.)

Thus we see that job control is not a feature of the shell alone, or the terminal alone, but rather the shell and the terminal working together. If you start a shell on the system console, it will advise you that there will be no job control, since it won’t have any ctty. (On my system, I can’t do this simply by putting the system in runlevel 1, but I can if I force it to get stuck booting, which drops me into the initramfs busybox prompt.) This is an unfortunate state of affairs, because if you accidentally start a job that won’t stop by itself, then you can’t stop it by pressing Ctrl+C, Ctrl+\, or Ctrl+Z. If you aren’t logged in on any other terminals, you’re screwed and have to reboot the system. (Well, actually, you might be able to kill it with the magic SysRq key ;) )

A daemon, by the way, is a process that has no controlling tty. The usual code snippet you find floating around the internet for daemonizing a process looks like this:

if (fork()) exit(0);
setsid();
if (fork()) exit(0);

You should now understand how this code works. We know that setsid(2) will get rid of the process’s ctty by moving it into a new session. The initial fork(2) and exit(2) ensure that setsid(2) will succeed, by guaranteeing that the surviving process is not a process group leader. (This, of course, is because it is the child: it has a fresh PID but inherits the parent’s PGID and SID, so it cannot possibly be the leader of either.) The second fork(2) and exit(2) ensure that the surviving process after the setsid(2) is not a session leader. This guarantees that it will never accidentally acquire a new ctty if it should happen to open(2) a tty that isn’t currently controlling any session. The library function daemon(3) (non-POSIX) does most of this for you, although the glibc implementation forks only once, so the resulting daemon is still a session leader.
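A slightly fuller sketch of the same idea with error checking follows. The function name daemonize is hypothetical, and the housekeeping at the end (changing directory, pointing the standard descriptors at /dev/null) is a common convention rather than something the snippet above does.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper. Returns 0 in the resulting daemon; the original
 * process and the intermediate session leader exit inside this function.
 * Returns -1 if a step failed, in whichever process hit the failure. */
int daemonize(void)
{
    pid_t pid;
    int devnull;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);               /* original process is finished */

    if (setsid() == (pid_t)-1)  /* new session, therefore no ctty */
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);               /* the session leader exits */

    /* Conventional housekeeping, not part of the short snippet. */
    if (chdir("/") < 0)
        return -1;
    devnull = open("/dev/null", O_RDWR);
    if (devnull < 0)
        return -1;
    dup2(devnull, STDIN_FILENO);
    dup2(devnull, STDOUT_FILENO);
    dup2(devnull, STDERR_FILENO);
    if (devnull > STDERR_FILENO)
        close(devnull);
    return 0;
}
```

A program would call daemonize() once near the start of main and carry on with its real work only where the return value is 0.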

That sums up part six. In part seven, I’ll talk about pseudoterminals, two-sided terminal-like devices that I’ve been mentioning here and there.

1 The commands setsid(1), disown, and nohup(1) all differ subtly. setsid(1) runs a program in a new session, and that’s it. disown removes a job from the shell’s job table, and that’s it. nohup(1) does neither, but it arranges for the process not to die when it receives a SIGHUP. See also answer by Michael Mrozek and comment by Gilles on Stack Overflow.


About Brian

Hi! I'm Brian Bi. As of November 2014 I live in Sunnyvale, California, USA and I'm a software engineer at Google. Besides code, I also like math, physics, chemistry, and some other miscellaneous things.