Saturday, April 23, 2022

1.8 Distributed Systems

 A distributed system is a collection of physically separate, possibly heterogeneous computer systems that are networked to provide users with access to the various resources that the system maintains. Access to a shared resource increases computation speed, functionality, data availability, and reliability. Some operating systems generalize network access as a form of file access, with the details of networking contained in the network interface's device driver.

Others make users specifically invoke network functions. Generally, systems contain a mix of the two modes: for example, FTP and NFS. The protocols that create a distributed system can greatly affect that system's utility and popularity.

A network, in the simplest terms, is a communication path between two or more systems. Distributed systems depend on networking for their functionality. Networks vary by the protocols used, the distances between nodes, and the transport media. TCP/IP is the most common network protocol, and it provides the fundamental architecture of the Internet. Most operating systems support TCP/IP, including all general-purpose ones. Some systems support proprietary protocols to suit their needs. For an operating system, it is necessary only that a network protocol have an interface device (a network adapter, for example) with a device driver to manage it, as well as software to handle data. These concepts are discussed throughout this book.
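
As a minimal sketch of a communication path between two endpoints, the Python example below sends a message over a connected socket pair from the standard library. Both endpoints live in one process here, which is an assumption made only to keep the example self-contained; real distributed systems would typically connect separate machines over TCP/IP. The function name `exchange` is hypothetical.

```python
import socket


def exchange(message: bytes) -> bytes:
    """Send a small message over a connected socket pair and return what the peer receives."""
    # socketpair() returns two endpoints that are already connected to each other.
    left, right = socket.socketpair()
    try:
        # Suitable for small messages only: sender and receiver share one thread.
        left.sendall(message)
        left.shutdown(socket.SHUT_WR)  # tell the peer that no more data will follow
        chunks = []
        while True:
            data = right.recv(4096)
            if not data:  # an empty result means the sender closed its side
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        left.close()
        right.close()
```

Calling `exchange(b"hello")` returns the same bytes after they have travelled from one endpoint to the other.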

Networks are characterized based on the distances between their nodes. A local-area network (LAN) connects computers within a room, a building, or a campus. A wide-area network (WAN) usually links buildings, cities, or countries. A global company may have a WAN to connect its offices worldwide, for example. These networks may run one protocol or several protocols. The continuing advent of new technologies brings about new forms of networks. For example, a metropolitan-area network (MAN) could link buildings within a city. Bluetooth and 802.11 devices use wireless technology to communicate over a distance of several feet, in essence creating a personal-area network (PAN) between a phone and a headset or a smartphone and a desktop computer.

The media to carry networks are equally varied. They include copper wires, fiber strands, and wireless transmissions between satellites, microwave dishes, and radios. When computing devices are connected to cellular phones, they create a network. Even very short-range infrared communication can be used for networking. At a rudimentary level, whenever computers communicate, they use or create a network. These networks also vary in their performance and reliability.

Some operating systems have taken the concept of networks and distributed systems further than the notion of providing network connectivity. A network operating system is an operating system that provides features such as file sharing across the network, along with a communication scheme that allows different processes on different computers to exchange messages. A computer running a network operating system acts autonomously from all other computers on the network, although it is aware of the network and is able to communicate with other networked computers. A distributed operating system provides a less autonomous environment. The different computers communicate closely enough to provide the illusion that only a single operating system controls the network. We cover computer networks and distributed systems in Chapter 19.

1.7 Virtualization

 Virtualization is a technology that allows us to abstract the hardware of a single computer (the CPU, memory, disk drives, network interface cards, and so forth) into several different execution environments, thereby creating the illusion that each separate environment is running on its own private computer. These environments can be viewed as different individual operating systems (for example, Windows and UNIX) that may be running at the same time and may interact with each other. A user of a virtual machine can switch among the various operating systems in the same way a user can switch among the various processes running concurrently in a single operating system.

Virtualization allows operating systems to run as applications within other operating systems. At first blush, there seems to be little reason for such functionality. But the virtualization industry is vast and growing, which is a testament to its utility and importance.

Broadly speaking, virtualization software is one member of a class that also includes emulation. Emulation, which involves simulating computer hardware in software, is typically used when the source CPU type is different from the target CPU type. For example, when Apple switched from the IBM Power CPU to the Intel x86 CPU for its desktop and laptop computers, it included an emulation facility called "Rosetta," which allowed applications compiled for the IBM CPU to run on the Intel CPU. That same concept can be extended to allow an entire operating system written for one platform to run on another. Emulation comes at a heavy price, however. Every machine-level instruction that runs natively on the source system must be translated to the equivalent function on the target system, frequently resulting in several target instructions. If the source and target CPUs have similar performance levels, the emulated code may run much more slowly than the native code.
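
The cost of emulation can be illustrated with a toy translator. The sketch below is not how Rosetta or any real emulator works; the instruction names and the translation table are hypothetical, chosen only to show one source instruction expanding into several target instructions.

```python
# Hypothetical instruction sets, invented for illustration only.
# Each source instruction maps to one or more target instructions.
TRANSLATION_TABLE = {
    "INC": ["LOAD", "ADD1", "STORE"],
    "SWAP": ["PUSH", "MOVE", "POP"],
    "NOP": ["NOP"],
}


def translate(source_program):
    """Translate a list of source instructions into target instructions."""
    target_program = []
    for instruction in source_program:
        target_program.extend(TRANSLATION_TABLE[instruction])
    return target_program


def expansion_factor(source_program):
    """Return the average number of target instructions per source instruction."""
    return len(translate(source_program)) / len(source_program)
```

With similar CPU speeds on both sides, an expansion factor above 1 is one simple way to see why emulated code tends to run more slowly than native code.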

With virtualization, in contrast, an operating system that is natively compiled for a particular CPU architecture runs within another operating system also native to that CPU. Virtualization first came about on IBM mainframes as a method for multiple users to run tasks concurrently. Running multiple virtual machines allowed (and still allows) many users to run tasks on a system designed for a single user. Later, in response to problems with running multiple Microsoft Windows applications on the Intel x86 CPU, VMware created a new virtualization technology in the form of an application that ran on Windows. That application ran one or more guest copies of Windows or other native x86 operating systems, each running its own applications. (See Figure 1.16.)

Windows was the host operating system, and the VMware application was the virtual machine manager (VMM). The VMM runs the guest operating systems, manages their resource use, and protects each guest from the others.

Even though modern operating systems are fully capable of running multiple applications reliably, the use of virtualization continues to grow. On laptops and desktops, a VMM allows the user to install multiple operating systems for exploration or to run applications written for operating systems other than the native host. For example, an Apple laptop running macOS on the x86 CPU can run a Windows 10 guest to allow execution of Windows applications. Companies writing software for multiple operating systems can use virtualization to run all of those operating systems on a single physical server for development, testing, and debugging. Within data centers, virtualization has become a common method of executing and managing computing environments. VMMs like VMware ESX and Citrix XenServer no longer run on host operating systems but rather are the host operating systems, providing services and resource management to virtual machine processes.

With this text, we provide a Linux virtual machine that allows you to run Linux (as well as the development tools we provide) on your personal system regardless of your host operating system. Full details of the features and implementation of virtualization can be found in Chapter 18.

1.6 Security and Protection

 If a computer system has multiple users and allows the concurrent execution of multiple processes, then access to data must be regulated. For that purpose, mechanisms ensure that files, memory segments, CPU, and other resources can be operated on by only those processes that have gained proper authorization from the operating system. For example, memory-addressing hardware ensures that a process can execute only within its own address space. The timer ensures that no process can gain control of the CPU without eventually relinquishing control. Device-control registers are not accessible to users, so the integrity of the various peripheral devices is protected.

Protection, then, is any mechanism for controlling the access of processes or users to the resources defined by a computer system. This mechanism must provide means to specify the controls to be imposed and to enforce the controls.

Protection can improve reliability by detecting latent errors at the interfaces between component subsystems. Early detection of interface errors can often prevent contamination of a healthy subsystem by another subsystem that is malfunctioning. Furthermore, an unprotected resource cannot defend against use (or misuse) by an unauthorized or incompetent user. A protection-oriented system provides a means to distinguish between authorized and unauthorized usage, as we discuss in Chapter 17.

A system can have adequate protection but still be prone to failure and allow inappropriate access. Consider a user whose authentication information (her means of identifying herself to the system) is stolen. Her data could be copied or deleted, even though file and memory protection are working. It is the job of security to defend a system from external and internal attacks. Such attacks spread across a huge range and include viruses and worms, denial-of-service attacks (which use all of a system's resources and so keep legitimate users out of the system), identity theft, and theft of service (unauthorized use of a system). Prevention of some of these attacks is considered an operating-system function on some systems, while other systems leave it to policy or additional software. Due to the alarming rise in security incidents, operating-system security features are a fast-growing area of research and implementation. We discuss security in Chapter 16.

Protection and security require the system to be able to distinguish among all its users. Most operating systems maintain a list of user names and associated user identifiers (user IDs). In Windows parlance, this is a security ID (SID). These numerical IDs are unique, one per user. When a user logs in to the system, the authentication stage determines the appropriate user ID for the user. That user ID is associated with all of the user's processes and threads. When an ID needs to be readable by a user, it is translated back to the user name via the user name list.

In some circumstances, we wish to distinguish among sets of users rather than individual users. For example, the owner of a file on a UNIX system may be allowed to issue all operations on that file, whereas a selected set of users may be allowed only to read the file. To accomplish this, we need to define a group name and the set of users belonging to that group. Group functionality can be implemented as a system-wide list of group names and group identifiers. A user can be in one or more groups, depending on operating-system design decisions. The user's group IDs are also included in every associated process and thread.
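
The owner-versus-group distinction can be sketched as a small permission check. This is a simplified model under stated assumptions, not the UNIX implementation: the class names `User` and `FileEntry`, the operation strings, and the rule that users who are neither owner nor group member get no access are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    uid: int                                 # numeric user ID
    name: str                                # human-readable user name
    gids: set = field(default_factory=set)   # IDs of the groups the user belongs to


@dataclass
class FileEntry:
    owner_uid: int    # user ID of the file's owner
    group_gid: int    # group ID associated with the file
    owner_ops: set    # operations the owner may perform
    group_ops: set    # operations group members may perform


def is_allowed(user: User, entry: FileEntry, operation: str) -> bool:
    """Decide whether `user` may perform `operation` on `entry`."""
    if user.uid == entry.owner_uid:
        return operation in entry.owner_ops
    if entry.group_gid in user.gids:
        return operation in entry.group_ops
    # Assumption: everyone else is denied in this simplified model.
    return False
```

For example, with a file owned by user ID 1000 and readable by group 200, a member of group 200 passes the check for "read" but not for "write".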

In the course of normal system use, the user ID and group ID for a user are sufficient. However, a user sometimes needs to escalate privileges to gain extra permissions for an activity. The user may need access to a device that is restricted, for example. Operating systems provide various methods to allow privilege escalation. On UNIX, for instance, the setuid attribute on a program causes that program to run with the user ID of the owner of the file, rather than the current user's ID. The process runs with this effective UID until it turns off the extra privileges or terminates.
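
A minimal sketch of the setuid rule follows, using the `stat.S_ISUID` constant from the Python standard library to test the mode bits. The function `effective_uid` is a hypothetical helper that models only the rule described above; it does not launch a process or cover the other checks a real kernel performs.

```python
import stat


def effective_uid(file_mode: int, file_owner_uid: int, caller_uid: int) -> int:
    """Return the UID a new process would run with in this simplified model."""
    if file_mode & stat.S_ISUID:
        # setuid bit set: run with the user ID of the file's owner
        return file_owner_uid
    # otherwise the process keeps the caller's user ID
    return caller_uid
```

A program with mode `0o4755` owned by user ID 0 would therefore run with effective UID 0 even when started by user ID 1000, whereas the same program with mode `0o755` would run as 1000.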


1.5.6 I/O System Management

 One of the purposes of an operating system is to hide the peculiarities of specific hardware devices from the user. For example, in UNIX, the peculiarities of I/O devices are hidden from the bulk of the operating system itself by the I/O subsystem. The I/O subsystem consists of several components:

- A memory-management component that includes buffering, caching, and spooling

- A general device-driver interface

- Drivers for specific hardware devices

Only the device driver knows the peculiarities of the specific device to which it is assigned.
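
The split between a general interface and device-specific drivers can be sketched with an abstract base class. The class names `DeviceDriver`, `RamDiskDriver`, and `NullDriver` and the two-method interface are assumptions made for illustration; real driver interfaces are considerably richer.

```python
from abc import ABC, abstractmethod


class DeviceDriver(ABC):
    """General device-driver interface seen by the rest of the system."""

    @abstractmethod
    def read(self, count: int) -> bytes:
        """Return up to `count` bytes from the device."""

    @abstractmethod
    def write(self, data: bytes) -> int:
        """Write `data` to the device and return the number of bytes accepted."""


class RamDiskDriver(DeviceDriver):
    """Hypothetical driver backed by an in-memory buffer."""

    def __init__(self, contents: bytes = b""):
        self._buffer = bytearray(contents)
        self._position = 0  # read offset into the buffer

    def read(self, count: int) -> bytes:
        chunk = bytes(self._buffer[self._position:self._position + count])
        self._position += len(chunk)
        return chunk

    def write(self, data: bytes) -> int:
        self._buffer.extend(data)
        return len(data)


class NullDriver(DeviceDriver):
    """Hypothetical driver that discards writes and never returns data."""

    def read(self, count: int) -> bytes:
        return b""

    def write(self, data: bytes) -> int:
        return len(data)


def copy_bytes(source: DeviceDriver, destination: DeviceDriver, count: int) -> int:
    """Copy up to `count` bytes using only the general interface."""
    return destination.write(source.read(count))
```

`copy_bytes` never inspects which concrete driver it was given, which mirrors the idea that device peculiarities stay inside the driver.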

We discussed earlier in this chapter how interrupt handlers and device drivers are used in the construction of efficient I/O subsystems. In Chapter 12, we discuss how the I/O subsystem interfaces to the other system components, manages devices, transfers data, and detects I/O completion.


1.5.5 Cache Management

 Caching is an important principle of computer systems. Here's how it works. Information is normally kept in some storage system (such as main memory). As it is used, it is copied into a faster storage system (the cache) on a temporary basis. When we need a particular piece of information, we first check whether it is in the cache. If it is, we use the information directly from the cache.

If it is not, we use the information from the source, putting a copy in the cache under the assumption that we will need it again soon.
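
The check-the-cache-first procedure can be sketched in a few lines. This is a minimal illustration, assuming a dictionary stands in for the slower storage system; the class name `ReadThroughCache` and its hit and miss counters are hypothetical additions.

```python
class ReadThroughCache:
    """Look in the cache first; on a miss, fetch from the source and keep a copy."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the slower storage system
        self.cache = {}                     # stands in for the faster storage system
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: use the information from the source and cache a copy,
        # assuming it will be needed again soon.
        self.misses += 1
        value = self.backing_store[key]
        self.cache[key] = value
        return value
```

Reading the same key twice produces one miss followed by one hit, since the first access leaves a copy in the cache.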

In addition, internal programmable registers provide a high-speed cache for main memory. The programmer (or compiler) implements the register-allocation and register-replacement algorithms to decide which information to keep in registers and which to keep in main memory.

Other caches are implemented totally in hardware. For instance, most systems have an instruction cache to hold the instructions expected to be executed next. Without this cache, the CPU would have to wait several cycles while an instruction was fetched from main memory. For similar reasons, most systems have one or more high-speed data caches in the memory hierarchy. We are not concerned with these hardware-only caches in this text, since they are outside the control of the operating system.

Because caches have limited size, cache management is an important design problem. Careful selection of the cache size and of a replacement policy can result in greatly increased performance, as you can see by examining Figure 1.14. Replacement algorithms for software-controlled caches are discussed in Chapter 10.
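
To make the idea of a replacement policy concrete, the sketch below implements least-recently-used (LRU) eviction with `collections.OrderedDict`. The text above does not name a policy, so LRU is an assumption chosen as one common example, and the class name `LRUCache` is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self.entries:
            return default
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
```

Both choices named in the text appear here as parameters of the design: `capacity` is the cache size, and the eviction step in `put` is the replacement policy.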

The movement of information between levels of a storage hierarchy may be either explicit or implicit, depending on the hardware design and the controlling operating-system software. For instance, data transfer from cache to CPU and registers is usually a hardware function, with no operating-system intervention. In contrast, transfer of data from disk to memory is usually controlled by the operating system.

In a hierarchical storage structure, the same data may appear in different levels of the storage system. For example, suppose that an integer A that is to be incremented by 1 is located in file B, and file B resides on hard disk. The increment operation proceeds by first issuing an I/O operation to copy the disk block on which A resides to main memory. This operation is followed by copying A to the cache and to an internal register. Thus, the copy of A appears in several places: on the hard disk, in main memory, in the cache, and in an internal register (see Figure 1.15). Once the increment takes place in the internal register, the value of A differs in the various storage systems. The value of A becomes the same only after the new value of A is written from the internal register back to the hard disk.
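
The way copies of A diverge and then converge can be traced with a small simulation. This is a sketch only: the four level names and the function `increment_a` are hypothetical, and each level is modeled as a single dictionary entry rather than real storage.

```python
def increment_a(disk_value: int):
    """Trace integer A through a storage hierarchy during an increment.

    Returns two snapshots: the levels right after the increment in the
    register, and the levels after the new value has been written back.
    """
    levels = {"disk": disk_value}
    levels["memory"] = levels["disk"]     # I/O copies the value into main memory
    levels["cache"] = levels["memory"]    # then into the cache
    levels["register"] = levels["cache"]  # then into an internal register

    levels["register"] += 1               # the increment happens in the register
    after_increment = dict(levels)        # copies of A now disagree

    # Write the new value back down the hierarchy.
    levels["cache"] = levels["register"]
    levels["memory"] = levels["cache"]
    levels["disk"] = levels["memory"]
    return after_increment, levels
```

Starting from a disk value of 5, the first snapshot holds 6 in the register and 5 everywhere else, and the second snapshot holds 6 at every level.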

In a computing environment where only one process executes at a time, this arrangement poses no difficulties, since an access to integer A will always be to the copy at the highest level of the hierarchy. However, in a multitasking environment, where the CPU is switched back and forth among various processes, extreme care must be taken to ensure that, if several processes wish to access A, then each of these processes will obtain the most recently updated value of A.

The situation becomes more complicated in a multiprocessor environment where, in addition to maintaining internal registers, each of the CPUs also contains a local cache (refer back to Figure 1.8). In such an environment, a copy of A may exist simultaneously in several caches. Since the various CPUs can all execute in parallel, we must make sure that an update to the value of A in one cache is immediately reflected in all other caches where A resides. This situation is called cache coherency, and it is usually a hardware issue (handled below the operating-system level).
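
One way to picture the coherency requirement is a software model in which a write by one CPU invalidates the copies held by the others. This is an illustrative assumption, not a description of real hardware: actual coherency protocols are implemented in hardware and are more involved, and the class name `CoherentSystem` and its write-through behavior are hypothetical.

```python
class CoherentSystem:
    """Toy model: per-CPU caches kept coherent by invalidating on write."""

    def __init__(self, cpu_count: int, memory: dict):
        self.memory = dict(memory)
        self.caches = [dict() for _ in range(cpu_count)]

    def read(self, cpu: int, address):
        cache = self.caches[cpu]
        if address not in cache:
            cache[address] = self.memory[address]  # miss: fill from main memory
        return cache[address]

    def write(self, cpu: int, address, value):
        self.caches[cpu][address] = value
        self.memory[address] = value               # write through to main memory
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(address, None)           # invalidate stale copies elsewhere
```

After CPU 0 writes a new value of A, the next read by CPU 1 misses in its own cache and fetches the updated value instead of returning a stale one.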

In a distributed environment, the situation becomes even more complex. In this environment, several copies (or replicas) of the same file can be kept on different computers. Since the various replicas may be accessed and updated concurrently, some distributed systems ensure that, when a replica is updated in one place, all other replicas are brought up to date as soon as possible. There are various ways to achieve this guarantee, as we discuss in Chapter 19.


1.5.4 Mass-Storage Management

 As we have already seen, the computer system must provide secondary storage to back up main memory. Most modern computer systems use HDDs and NVM devices as the principal on-line storage media for both programs and data. Most programs (including compilers, web browsers, word processors, and games) are stored on these devices until loaded into memory. The programs then use the devices as both the source and the destination of their processing. Hence, the proper management of secondary storage is of central importance to a computer system. The operating system is responsible for the following activities in connection with secondary storage management:

- Mounting and unmounting

- Free-space management

- Storage allocation

- Disk scheduling

- Partitioning

- Protection

Because secondary storage is used frequently and extensively, it must be used efficiently. The entire speed of operation of a computer may hinge on the speeds of the secondary storage subsystem and the algorithms that manipulate that subsystem.
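
As one illustration of the free-space management and storage allocation activities listed above, the sketch below tracks free blocks with one flag per block. The text does not specify a technique, so this bitmap-style approach, the class name `FreeSpaceBitmap`, and the first-fit search are all assumptions.

```python
class FreeSpaceBitmap:
    """Track which storage blocks are in use, one flag per block."""

    def __init__(self, block_count: int):
        self.used = [False] * block_count

    def allocate(self) -> int:
        """Mark the first free block as used and return its index."""
        for index, in_use in enumerate(self.used):
            if not in_use:
                self.used[index] = True
                return index
        raise RuntimeError("no free blocks")

    def free(self, index: int) -> None:
        """Return a previously allocated block to the free pool."""
        if not self.used[index]:
            raise ValueError("block is already free")
        self.used[index] = False

    def free_count(self) -> int:
        return self.used.count(False)
```

The linear scan in `allocate` is the simplest possible search; how quickly a free block can be found is one example of an algorithm whose cost affects the storage subsystem as a whole.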

At the same time, there are many uses for storage that is slower and lower in cost (and sometimes higher in capacity) than secondary storage. Backups of disk data, storage of seldom-used data, and long-term archival storage are some examples. Magnetic tape drives and their tapes and CD, DVD, and Blu-ray drives and platters are typical tertiary storage devices.

Tertiary storage is not crucial to system performance, but it still must be managed. Some operating systems take on this task, while others leave tertiary-storage management to application programs. Some of the functions that operating systems can provide include mounting and unmounting media in devices, allocating and freeing the devices for exclusive use by processes, and migrating data from secondary to tertiary storage.

Techniques for secondary storage and tertiary storage management are discussed in Chapter 11.

Friday, April 22, 2022

1.5.3 File-System Management

 To make the computer system convenient for users, the operating system provides a uniform, logical view of information storage. The operating system abstracts from the physical properties of its storage devices to define a logical storage unit, the file. The operating system maps files onto physical media and accesses these files via the storage devices.

File management is one of the most visible components of an operating system. Computers can store information on several different types of physical media. Secondary storage is the most common, but tertiary storage is also possible. Each of these media has its own characteristics and physical organization. Most are controlled by a device, such as a disk drive, that also has its own unique characteristics. These properties include access speed, capacity, data-transfer rate, and access method (sequential or random).

A file is a collection of related information defined by its creator. Commonly, files represent programs (both source and object forms) and data. Data files may be numeric, alphabetic, alphanumeric, or binary. Files may be free-form (for example, text files), or they may be formatted rigidly (for example, fixed fields such as an mp3 music file). Clearly, the concept of a file is an extremely general one.

The operating system implements the abstract concept of a file by managing mass storage media and the devices that control them. In addition, files are normally organized into directories to make them easier to use. Finally, when multiple users have access to files, it may be desirable to control which user may access a file and how that user may access it (for example, read, write, append).

The operating system is responsible for the following activities in connection with file management:

- Creating and deleting files

- Creating and deleting directories to organize files

- Supporting primitives for manipulating files and directories

- Mapping files onto mass storage

- Backing up files on stable (nonvolatile) storage media
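
From a program's point of view, several of the activities listed above surface as ordinary library calls. The sketch below uses Python's standard `pathlib` and `tempfile` modules to create a directory, create and append to a file, and then delete both; the function name `demo_file_operations` and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def demo_file_operations():
    """Create, modify, read, and delete a file inside a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        directory = Path(root) / "notes"
        directory.mkdir()                                   # create a directory

        file_path = directory / "chapter1.txt"
        file_path.write_text("caching\n", encoding="utf-8")  # create a file
        with file_path.open("a", encoding="utf-8") as handle:
            handle.write("virtualization\n")                # append to the file

        lines = file_path.read_text(encoding="utf-8").splitlines()

        file_path.unlink()                                  # delete the file
        directory.rmdir()                                   # delete the directory
        still_exists = directory.exists()
    return lines, still_exists
```

Each call is eventually carried out by the operating system, which maps the file onto the underlying storage device on the program's behalf.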

File-management techniques are discussed in Chapter 13, Chapter 14, and Chapter 15.