4.  The Jail Partitioning Solution

      Jail neatly side-steps the majority of these problems through partitioning. Rather than introduce an additional fine-grained access control mechanism, we partition a FreeBSD environment (processes, file system, network resources) into a management environment and, optionally, subset Jail environments. In doing so, we simultaneously maintain the existing UNIX security model, allowing multiple users and a privileged root user in each jail, while limiting the scope of root's activities to that jail. Consequently, the administrator of a FreeBSD machine can partition the machine into separate jails and provide access to the super-user account in each of them without losing control of the overall environment.

      A process in a partition is referred to as ``in jail''. When a FreeBSD system is booted after a fresh install, no processes are in jail. When a process is placed in a jail, it and any descendants created after the jail creation are in that jail. A process may be in only one jail. Jails are created when a privileged process calls the jail(2) syscall with a description of the jail as its argument. Each call to jail(2) creates a new jail; the only way for a new process to enter the jail is by inheriting access to it from another process already in that jail. Processes may never leave the jail they created, or were created in.
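
      The membership rules above can be illustrated with a small model. This is only a sketch of the semantics described in the text, not the kernel implementation: the Process class, its fields, and the jail identifiers are hypothetical names introduced for illustration, and the choice to reject jail(2) from an already jailed process is an assumption derived from the rule that a process may be in only one jail.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

_next_jail_id = count(1)  # hypothetical jail identifiers, used only by this model


@dataclass
class Process:
    pid: int
    privileged: bool = False
    jail_id: Optional[int] = None  # None means "not in jail"

    def fork(self, child_pid: int) -> "Process":
        # Descendants inherit the jail of their parent.
        return Process(child_pid, self.privileged, self.jail_id)

    def jail(self) -> int:
        # Model of a jail(2) call: only a privileged process may create a jail.
        if not self.privileged:
            raise PermissionError("jail(2) requires a privileged process")
        # Assumption: a jailed process cannot move into another jail.
        if self.jail_id is not None:
            raise PermissionError("process is already in a jail")
        # Each successful call creates a new jail and places the caller in it.
        self.jail_id = next(_next_jail_id)
        return self.jail_id


init = Process(pid=1, privileged=True)  # after boot, nothing is in jail
a = init.fork(100)
b = init.fork(200)
a.jail()                # creates one jail
b.jail()                # creates a second, distinct jail
a_child = a.fork(101)   # enters the first jail by inheritance
```

      In this model there is deliberately no operation that clears or changes jail_id once it has been set, which mirrors the rule that a process never leaves its jail.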

Fig. 1 -- Schematic diagram of machine with two configured jails

      Membership in a jail involves a number of restrictions: access to the file name-space is restricted in the style of chroot(2), the ability to bind network resources is limited to a specific IP address, the ability to manipulate system resources and perform privileged operations is sharply curtailed, and the ability to interact with other processes is limited to only processes inside the same jail.

      Jail takes advantage of the existing chroot(2) behaviour to limit access to the file system name-space for jailed processes. When a jail is created, it is bound to a particular file system root. Processes are unable to manipulate files that they cannot address, and as such the integrity and confidentiality of files outside of the jail file system root are protected. Traditional mechanisms for breaking out of chroot(2) have been blocked. In the expected and documented configuration, each jail is provided with its own exclusive file system root and a standard FreeBSD directory layout, but this is not mandated by the implementation.
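
      The effect of binding a jail to a file system root can be sketched as a purely lexical path mapping. The function name resolve_in_jail and the example root /jails/www are hypothetical; the sketch treats the jail root as the current directory for relative paths and ignores symbolic links, so it illustrates the name-space restriction rather than the kernel's actual lookup.

```python
import posixpath


def resolve_in_jail(jail_root: str, path: str) -> str:
    """Map a path as named by a jailed process to a path on the host (model only)."""
    # Anchor the path at the jail's "/" and normalise it. For an absolute
    # path, normpath discards ".." components that would climb above "/",
    # so names outside the jail root cannot be addressed.
    inside = posixpath.normpath(posixpath.join("/", path))
    # Re-anchor the confined path beneath the jail's file system root.
    return posixpath.normpath(jail_root + inside)


inside_view = resolve_in_jail("/jails/www", "/etc/passwd")
escape_attempt = resolve_in_jail("/jails/www", "/../../etc/passwd")
```

      Both calls in the sketch name the same file beneath the jail root: the attempt to climb out with ".." resolves to the jail's own /etc/passwd rather than the host's.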

      Each jail is bound to a single IP address: processes within the jail may not make use of any other IP address for outgoing or incoming connections; this also makes it possible to restrict what network services a particular jail may offer. As FreeBSD distinguishes attempts to bind all IP addresses from attempts to bind a particular address, bind requests for all IP addresses are redirected to the individual Jail address. Some network functionality associated with privileged calls is disabled wholesale due to the nature of the functionality offered; in particular, facilities which would allow ``spoofing'' of IP numbers or the generation of disruptive traffic have been disabled.
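
      The bind redirection described above can be sketched as a small address-rewriting function. The name effective_bind_address is hypothetical, the addresses are documentation examples, and the use of PermissionError for a refused request is a placeholder assumption: the text states only that other addresses may not be used, not which error is returned.

```python
import ipaddress
from typing import Optional

INADDR_ANY = ipaddress.IPv4Address("0.0.0.0")  # "bind all addresses"


def effective_bind_address(requested: str, jail_ip: Optional[str]) -> ipaddress.IPv4Address:
    """Return the address a bind request ends up using (model only)."""
    addr = ipaddress.IPv4Address(requested)
    if jail_ip is None:
        # Not in jail: the request is left untouched.
        return addr
    jail_addr = ipaddress.IPv4Address(jail_ip)
    if addr == INADDR_ANY:
        # A bind to all addresses is redirected to the jail's single address.
        return jail_addr
    if addr == jail_addr:
        return addr
    # Any other address is off limits to the jail (error type is an assumption).
    raise PermissionError(f"{addr} is not the address of this jail")


wildcard_in_jail = effective_bind_address("0.0.0.0", "192.0.2.10")
wildcard_on_host = effective_bind_address("0.0.0.0", None)
```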

      Processes running without root privileges will notice few, if any, differences between a jailed and an un-jailed environment. Processes running with root privileges will find that many restrictions apply to the privileged calls they may make. Some calls will now return an access error -- for example, an attempt to create a device node will now fail. Others will have a more limited scope than normal -- attempts to bind a reserved port number on all available addresses will result in binding only the address associated with the jail. Other calls will succeed as normal: root may read a file owned by any uid, as long as it is accessible through the jail file system name-space.

      Processes within the jail will find that they are unable to interact with, or even verify the existence of, processes outside the jail -- processes within the jail are prevented from delivering signals to processes outside the jail, from connecting to those processes with debuggers, and even from seeing them in the sysctl or process file system monitoring mechanisms. Jail does not prevent, nor is it intended to prevent, the use of covert channels or communications mechanisms via accepted interfaces -- for example, two processes may communicate via sockets over the IP network interface. Nor does it attempt to provide scheduling services based on the partition; however, it does prevent calls that interfere with normal process operation.
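
      The process isolation rule can be sketched as a single visibility predicate that every inter-process operation (signals, debugging, process listings) would consult. The names visible_to and list_processes and the process table are hypothetical. The sketch also assumes that a process outside any jail can see jailed processes; the text above describes only the restriction on jailed processes.

```python
from typing import Dict, List, Optional


def visible_to(observer_jail: Optional[int], target_jail: Optional[int]) -> bool:
    """Decide whether a process may see or act on another process (model only)."""
    if observer_jail is None:
        # Assumption: an un-jailed process is not restricted by this check.
        return True
    # A jailed process may interact only with processes in the same jail.
    return observer_jail == target_jail


def list_processes(observer_jail: Optional[int], table: Dict[int, Optional[int]]) -> List[int]:
    # Process listings are filtered with the same predicate, so processes
    # outside the jail do not appear to exist at all.
    return sorted(pid for pid, jail in table.items() if visible_to(observer_jail, jail))


# Hypothetical process table: pid -> jail identifier (None = not in jail).
table = {1: None, 100: 1, 101: 1, 200: 2}
seen_from_jail_1 = list_processes(1, table)
seen_from_host = list_processes(None, table)
```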

      As a result of these attempts to retain the standard FreeBSD API and framework, almost all applications will run unaffected. Standard system services such as Telnet, FTP, and SSH all behave normally, as do most third-party applications, including the popular Apache web server.