Computer multitasking

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In computing, multitasking is the concurrent execution of multiple tasks (also known as processes) over a certain period of time. New tasks can interrupt already started ones before they finish, instead of waiting for them to end. As a result, a computer executes segments of multiple tasks in an interleaved manner, while the tasks share common processing resources such as central processing units (CPUs) and main memory. Multitasking automatically interrupts the running program, saves its state (partial results, memory contents and CPU register contents), loads the saved state of another program, and transfers control to it. This "context switch" may be initiated at fixed time intervals (pre-emptive multitasking), or the running program may be coded to signal to the supervisory software when it can be interrupted (cooperative multitasking).
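The distinction can be illustrated with a minimal sketch (not part of the original article): in the C program below, each task performs one small step and then returns, voluntarily yielding to a round-robin loop that stands in for the supervisory software, which is the cooperative style. Under pre-emptive multitasking the switch would instead be forced by a timer interrupt. The task names and the number of steps are illustrative.

/* Minimal sketch of cooperative multitasking: each task performs one
 * small step and then returns (yields). A round-robin loop plays the
 * role of the supervisory software. */
#include <stdio.h>

#define STEPS 3

static void task_print(int step) { printf("print task, step %d\n", step); }
static void task_count(int step) { printf("count task, step %d\n", step); }

typedef void (*task_fn)(int);

int main(void) {
    task_fn tasks[] = { task_print, task_count };
    int ntasks = sizeof tasks / sizeof tasks[0];

    /* Interleave the tasks: every return from a task is a voluntary yield point. */
    for (int step = 0; step < STEPS; step++)
        for (int i = 0; i < ntasks; i++)
            tasks[i](step);

    return 0;
}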

126-400: Multitasking does not require parallel execution of multiple tasks at exactly the same time; instead, it allows more than one task to advance over a given period of time. Even on multiprocessor computers, multitasking allows many more tasks to be run than there are CPUs. Multitasking is a common feature of computer operating systems since at least the 1960s. It allows more efficient use of

252-435: A swap file or swap partition is a way for the operating system to provide more memory than is physically available by keeping portions of the primary memory in secondary storage . While multitasking and memory swapping are two completely unrelated techniques, they are very often used together, as swapping memory allows more tasks to be loaded at the same time. Typically, a multitasking system allows another process to run when

378-428: A system call to perform a block I/O write operation, then the system call might execute the following instructions: While the writing takes place, the operating system will context switch to other processes as normal. When the device finishes writing, the device will interrupt the currently running process by asserting an interrupt request . The device will also place an integer onto the data bus. Upon accepting
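The kernel-side instruction sequence referred to above is not preserved in this snapshot. As a hedged, user-space counterpart, the C sketch below issues a block I/O write through the POSIX write() call; the context switching and the completion interrupt described in the text happen inside the kernel while the operation is outstanding. The file name is illustrative.

/* User-space view of a block-I/O write on a POSIX system. While the
 * device carries out the write, the kernel is free to run other
 * processes; this process continues once the call returns. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    const char msg[] = "hello, disk\n";
    int fd = open("out.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (write(fd, msg, strlen(msg)) < 0)   /* may block until the data is accepted */
        perror("write");

    close(fd);
    return 0;
}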

504-403: A certain point. Amdahl's Law has limitations, including assumptions of fixed workload, neglecting inter-process communication and synchronization overheads, primarily focusing on computational aspect and ignoring extrinsic factors such as data persistence, I/O operations, and memory access overheads. Gustafson's law and Universal Scalability Law give a more realistic assessment of

630-645: A computer even if they are not compatible with the base operating system. A library operating system (libOS) is one in which the services that a typical operating system provides, such as networking, are provided in the form of libraries and composed with a single application and configuration code to construct a unikernel : a specialized (only the absolutely necessary pieces of code are extracted from libraries and bound together), single address space , machine image that can be deployed to cloud or embedded environments. The operating system code and application code are not executed in separate protection domains (there

756-446: A constant value for large numbers of processing elements. The maximum potential speedup of an overall system can be calculated by Amdahl's law . Amdahl's Law indicates that optimal performance improvement is achieved by balancing enhancements to both parallelizable and non-parallelizable components of a task. Furthermore, it reveals that increasing the number of processors yields diminishing returns, with negligible speedup gains beyond

882-585: A development of MULTICS for a single user. Because UNIX's source code was available, it became the basis of other, incompatible operating systems, of which the most successful were AT&T 's System V and the University of California 's Berkeley Software Distribution (BSD). To increase compatibility, the IEEE released the POSIX standard for operating system application programming interfaces (APIs), which

1008-484: A large legal settlement was paid. In the twenty-first century, Windows continues to be popular on personal computers but has less market share of servers. UNIX operating systems, especially Linux, are the most popular on enterprise systems and servers but are also used on mobile devices and many other computer systems. On mobile devices, Symbian OS was dominant at first, being usurped by BlackBerry OS (introduced 2002) and iOS for iPhones (from 2007). Later on,

1134-442: A library with no protection between applications, such as eCos . A hypervisor is an operating system that runs a virtual machine . The virtual machine is unaware that it is an application and operates as if it had its own hardware. Virtual machines can be paused, saved, and resumed, making them useful for operating systems research, development, and debugging. They also enhance portability by enabling applications to be run on

1260-445: A mainstream programming task. In 2012 quad-core processors became standard for desktop computers , while servers have 10+ core processors. From Moore's law it can be predicted that the number of cores per processor will double every 18–24 months. This could mean that after 2020 a typical processor will have dozens or hundreds of cores, however in reality the standard is somewhere in the region of 4 to 16 cores, with some designs having

1386-447: A malformed machine instruction . However, the most common error conditions are division by zero and accessing an invalid memory address . Users can send messages to the kernel to modify the behavior of a currently running process. For example, in the command-line environment , pressing the interrupt character (usually Control-C ) might terminate the currently running process. To generate software interrupts for x86 CPUs,

1512-574: A mix of performance and efficiency cores (such as ARM's big.LITTLE design) due to thermal and design constraints. An operating system can ensure that different tasks and user programs are run in parallel on the available cores. However, for a serial software program to take full advantage of the multi-core architecture the programmer needs to restructure and parallelize the code. A speed-up of application software runtime will no longer be achieved through frequency scaling, instead programmers will need to parallelize their software code to take advantage of

1638-520: A multi-core processor can issue multiple instructions per clock cycle from multiple instruction streams. IBM 's Cell microprocessor , designed for use in the Sony PlayStation 3 , is a prominent multi-core processor. Each core in a multi-core processor can potentially be superscalar as well—that is, on every clock cycle, each core can issue multiple instructions from one thread. Simultaneous multithreading (of which Intel's Hyper-Threading

1764-403: A node), or n-dimensional mesh . Parallel computers based on interconnected networks need to have some kind of routing to enable the passing of messages between nodes that are not directly connected. The medium used for communication between the processors is likely to be hierarchical in large multiprocessor machines. Parallel computers can be roughly classified according to the level at which

1890-410: A parallel program are often called threads . Some parallel computer architectures use smaller, lightweight versions of threads known as fibers , while others use bigger versions known as processes . However, "threads" is generally accepted as a generic term for subtasks. Threads will often need synchronized access to an object or other resource , for example when they must update a variable that

2016-455: A particular application's memory is stored, or even whether or not it has been allocated yet. In modern operating systems, memory which is accessed less frequently can be temporarily stored on a disk or other media to make that space available for use by other programs. This is called swapping , as an area of memory can be used by multiple programs, and what that memory area contains can be swapped or exchanged on demand. Virtual memory provides

2142-467: A peripheral, the central processing unit (CPU) would have to stop executing program instructions while the peripheral processed the data. This was usually very inefficient. Multiprogramming is a computing technique that enables multiple programs to be concurrently loaded and executed into a computer's memory, allowing the CPU to switch between them swiftly. This optimizes CPU utilization by keeping it engaged with

2268-402: A peripheral. As there were no users waiting at an interactive terminal, this was no problem: users handed in a deck of punched cards to an operator, and came back a few hours later for printed results. Multiprogramming greatly reduced wait times when multiple batches were being processed. Early multitasking systems used applications that voluntarily ceded time to one another. This approach, which

2394-543: A pipelined processor is a RISC processor, with five stages: instruction fetch (IF), instruction decode (ID), execute (EX), memory access (MEM), and register write back (WB). The Pentium 4 processor had a 35-stage pipeline. Most modern processors also have multiple execution units . They usually combine this feature with pipelining and thus can issue more than one instruction per clock cycle ( IPC > 1 ). These processors are known as superscalar processors. Superscalar processors differ from multi-core processors in that

2520-421: A problem, an algorithm is constructed and implemented as a serial stream of instructions. These instructions are executed on a central processing unit on one computer. Only one instruction may execute at a time—after that instruction is finished, the next one is executed. Parallel computing, on the other hand, uses multiple processing elements simultaneously to solve a problem. This is accomplished by breaking

2646-503: A program does not interfere with memory already in use by another program. Since programs time share, each program must have independent access to memory. Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel 's memory manager, and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If

2772-408: A program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaved program to crash the system. Memory protection enables the kernel to limit a process' access to

2898-440: A program tries to access memory that is not accessible memory, but nonetheless has been allocated to it, the kernel is interrupted (see § Memory management ) . This kind of interrupt is typically a page fault . When the kernel detects a page fault it generally adjusts the virtual memory range of the program which triggered it, granting it access to the memory requested. This gives the kernel discretionary power over where

3024-450: A result from instruction 2. It violates condition 1, and thus introduces a flow dependency. In this example, there are no dependencies between the instructions, so they can all be run in parallel. Bernstein's conditions do not allow memory to be shared between different processes. For that, some means of enforcing an ordering between accesses is necessary, such as semaphores , barriers or some other synchronization method . Subtasks in

3150-471: A result, shared memory computer architectures do not scale as well as distributed memory systems do. Processor–processor and processor–memory communication can be implemented in hardware in several ways, including via shared (either multiported or multiplexed ) memory, a crossbar switch , a shared bus or an interconnect network of a myriad of topologies including star , ring , tree , hypercube , fat hypercube (a hypercube with more than one processor at

3276-421: A significant amount of CPU time. Direct memory access (DMA) is an architecture feature to allow devices to bypass the CPU and access main memory directly. (Separate from the architecture, a device may perform direct memory access to and from main memory either directly or via a bus.) When a computer user types a key on the keyboard, typically the character appears immediately on the screen. Likewise, when

3402-412: A single address space ), or distributed memory (in which each processing element has its own local address space). Distributed memory refers to the fact that the memory is logically distributed, but often implies that it is physically distributed as well. Distributed shared memory and memory virtualization combine the two approaches, where the processing element has its own local memory and access to

3528-595: A single machine, while clusters , MPPs , and grids use multiple computers to work on the same task. Specialized parallel computer architectures are sometimes used alongside traditional processors, for accelerating specific tasks. In some cases parallelism is transparent to the programmer, such as in bit-level or instruction-level parallelism, but explicitly parallel algorithms , particularly those that use concurrency, are more difficult to write than sequential ones, because concurrency introduces several new classes of potential software bugs , of which race conditions are

3654-402: A specific moment in time. Hard real-time systems require exact timing and are common in manufacturing , avionics , military, and other similar uses. With soft real-time systems, the occasional missed event is acceptable; this category often includes audio or multimedia systems, as well as smartphones. In order for hard real-time systems be sufficiently exact in their timing, often they are just

3780-500: A sufficient amount of memory bandwidth exists. A distributed computer (also known as a distributed memory multiprocessor) is a distributed memory computer system in which the processing elements are connected by a network. Distributed computers are highly scalable. The terms " concurrent computing ", "parallel computing", and "distributed computing" have a lot of overlap, and no clear distinction exists between them. The same system may be characterized both as "parallel" and "distributed";

3906-521: A task independently. On the other hand, concurrency enables a program to deal with multiple tasks even on a single CPU core; the core switches between tasks (i.e. threads ) without necessarily completing each one. A program can have both, neither or a combination of parallelism and concurrency characteristics. Parallel computers can be roughly classified according to the level at which the hardware supports parallelism, with multi-core and multi-processor computers having multiple processing elements within

4032-423: A task runs until it must wait for an external event or until the operating system's scheduler forcibly swaps the running task out of the CPU. Real-time systems such as those designed to control industrial robots, require timely processing; a single processor might be shared between calculations of machine movement, communications, and user interface. Often multitasking operating systems include measures to change

4158-512: A time from multiple threads. A symmetric multiprocessor (SMP) is a computer system with multiple identical processors that share memory and connect via a bus . Bus contention prevents bus architectures from scaling. As a result, SMPs generally do not comprise more than 32 processors. Because of the small size of the processors and the significant reduction in the requirements for bus bandwidth achieved by large caches, such symmetric multiprocessors are extremely cost-effective, provided that

4284-417: A user moves a mouse , the cursor immediately moves across the screen. Each keystroke and mouse movement generates an interrupt called Interrupt-driven I/O . An interrupt-driven I/O occurs when a process causes an interrupt for every character or word transmitted. Devices such as hard disk drives , solid-state drives , and magnetic tape drives can transfer data at a rate high enough that interrupting

4410-499: A variant to threads, named fibers , that are scheduled cooperatively. On operating systems that do not provide fibers, an application may implement its own fibers using repeated calls to worker functions. Fibers are even more lightweight than threads, and somewhat easier to program with, although they tend to lose some or all of the benefits of threads on machines with multiple processors . Some systems directly support multithreading in hardware . Essential to any multitasking system

4536-453: A variation of the classic reader/writer problem . The writer receives a pipe from the shell for its output to be sent to the reader's input stream. The command-line syntax is alpha | bravo . alpha will write to the pipe when its computation is ready and then sleep in the wait queue. bravo will then be moved to the ready queue and soon will read from its input stream. The kernel will generate software interrupts to coordinate

4662-562: Is remote direct memory access , which enables each CPU to access memory belonging to other CPUs. Multicomputer operating systems often support remote procedure calls where a CPU can call a procedure on another CPU, or distributed shared memory , in which the operating system uses virtualization to generate shared memory that does not physically exist. A distributed system is a group of distinct, networked computers—each of which might have their own operating system and file system. Unlike multicomputers, they may be dispersed anywhere in

4788-484: Is a change away from the currently running process. Similarly, both hardware and software interrupts execute an interrupt service routine . Software interrupts may be normally occurring events. It is expected that a time slice will occur, so the kernel will have to perform a context switch . A computer program may set a timer to go off after a few seconds in case too much data causes an algorithm to take too long. Software interrupts may be error conditions, such as

4914-416: Is a programming language construct that allows one thread to take control of a variable and prevent other threads from reading or writing it, until that variable is unlocked. The thread holding the lock is free to execute its critical section (the section of a program that requires exclusive access to some variable), and to unlock the data when it is finished. Therefore, to guarantee correct program execution,

5040-429: Is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level , instruction-level , data , and task parallelism . Parallelism has long been employed in high-performance computing , but has gained broader interest due to

5166-422: Is difficult to define, but has been called "the layer of software that manages a computer's resources for its users and their applications ". Operating systems include the software that is always running, called a kernel —but can include other software as well. The two other types of programs that can run on a computer are system programs —which are associated with the operating system, but may not be part of

5292-516: Is equivalent to an entirely sequential program. The single-instruction-multiple-data (SIMD) classification is analogous to doing the same operation repeatedly over a large data set. This is commonly done in signal processing applications. Multiple-instruction-single-data (MISD) is a rarely used classification. While computer architectures to deal with this were devised (such as systolic arrays ), few applications that fit this class materialized. Multiple-instruction-multiple-data (MIMD) programs are by far

5418-465: Is known as burst buffer , which is typically built from arrays of non-volatile memory physically distributed across multiple I/O nodes. Computer architectures in which each element of main memory can be accessed with equal latency and bandwidth are known as uniform memory access (UMA) systems. Typically, that can be achieved only by a shared memory system, in which the memory is not physically distributed. A system that does not have this property

5544-429: Is known as a non-uniform memory access (NUMA) architecture. Distributed memory systems have non-uniform memory access. Computer systems make use of caches —small and fast memories located close to the processor which store temporary copies of memory values (nearby in both the physical and logical sense). Parallel computer systems have difficulties with caches that may store the same value in more than one location, with

5670-443: Is only a single application running, at least conceptually, so there is no need to prevent interference between applications) and OS services are accessed via simple library calls (potentially inlining them based on compiler thresholds), without the usual overhead of context switches , in a way similarly to embedded and real-time OSes. Note that this overhead is not negligible: to the direct cost of mode switching it's necessary to add

5796-419: Is shared between them. Without synchronization, the instructions between the two threads may be interleaved in any order. For example, consider the following program: If instruction 1B is executed between 1A and 3A, or if instruction 1A is executed between 1B and 3B, the program will produce incorrect data. This is known as a race condition . The programmer must use a lock to provide mutual exclusion . A lock

5922-399: Is still used today on RISC OS systems. As a cooperatively multitasked system relies on each process regularly giving up time to other processes on the system, one poorly designed program can consume all of the CPU time for itself, either by performing extensive calculations or by busy waiting ; both would cause the whole system to hang . In a server environment, this is a hazard that makes

6048-499: Is supported by most UNIX systems. MINIX was a stripped-down version of UNIX, developed in 1987 for educational uses, that inspired the commercially available, free software Linux . Since 2008, MINIX is used in controllers of most Intel microchips , while Linux is widespread in data centers and Android smartphones. The invention of large scale integration enabled the production of personal computers (initially called microcomputers ) from around 1980. For around five years,

6174-473: Is that they do not load user-installed software. Consequently, they do not need protection between different applications, enabling simpler designs. Very small operating systems might run in less than 10 kilobytes , and the smallest are for smart cards . Examples include Embedded Linux , QNX , VxWorks , and the extra-small systems RIOT and TinyOS . A real-time operating system is an operating system that guarantees to process events or data by or at

6300-429: Is the best known) was an early form of pseudo-multi-coreism. A processor capable of concurrent multithreading includes multiple execution units in the same processing unit—that is it has a superscalar architecture—and can issue multiple instructions per clock cycle from multiple threads. Temporal multithreading on the other hand includes a single execution unit in the same processing unit and can issue one instruction at

6426-529: Is the characteristic of a parallel program that "entirely different calculations can be performed on either the same or different sets of data". This contrasts with data parallelism, where the same calculation is performed on the same or different sets of data. Task parallelism involves the decomposition of a task into sub-tasks and then allocating each sub-task to a processor for execution. The processors would then execute these sub-tasks concurrently and often cooperatively. Task parallelism does not usually scale with

6552-435: Is the part of the operating system that provides protection between different applications and users. This protection is key to improving reliability by keeping errors isolated to one program, as well as security by limiting the power of malicious software and protecting private data, and ensuring that one program cannot monopolize the computer's resources. Most operating systems have two modes of operation: in user mode ,

6678-400: Is to safely and effectively share access to system resources. Access to memory must be strictly managed to ensure that no process can inadvertently or deliberately read or write to memory locations outside the process's address space. This is done for the purpose of general system stability and data integrity, as well as data security. In general, memory access management is a responsibility of

6804-542: The CP/M (Control Program for Microcomputers) was the most popular operating system for microcomputers. Later, IBM bought the DOS (Disk Operating System) from Microsoft . After modifications requested by IBM, the resulting system was called MS-DOS (MicroSoft Disk Operating System) and was widely used on IBM microcomputers. Later versions increased their sophistication, in part by borrowing features from UNIX. Apple 's Macintosh

6930-681: The Classic Mac OS . In 2001 Apple switched to the NeXTSTEP -influenced Mac OS X . A similar model is used in Windows 9x and the Windows NT family , where native 32-bit applications are multitasked preemptively. 64-bit editions of Windows, both for the x86-64 and Itanium architectures, no longer support legacy 16-bit applications, and thus provide preemptive multitasking for all supported applications. Another reason for multitasking

7056-504: The INT assembly language instruction is available. The syntax is INT X , where X is the offset number (in hexadecimal format) to the interrupt vector table . To generate software interrupts in Unix-like operating systems, the kill(pid,signum) system call will send a signal to another process. pid is the process identifier of the receiving process. signum is

7182-480: The Sinclair QL followed in 1984, but it was not a big success. Commodore's Amiga was released the following year, offering a combination of multitasking and multimedia capabilities. Microsoft made preemptive multitasking a core feature of their flagship operating system in the early 1990s when developing Windows NT 3.1 and then Windows 95 . In 1988 Apple offered A/UX as a UNIX System V -based alternative to

7308-498: The personal computer market, as of September 2024 , Microsoft Windows holds a dominant market share of around 73%. macOS by Apple Inc. is in second place (15%), Linux is in third place (5%), and ChromeOS is in fourth place (2%). In the mobile sector (including smartphones and tablets ), as of September 2023 , Android's share is 68.92%, followed by Apple's iOS and iPadOS with 30.42%, and other operating systems with .66%. Linux distributions are dominant in

7434-420: The transistor in the mid-1950s, mainframes began to be built. These still needed professional operators who manually do what a modern operating system would do, such as scheduling programs to run, but mainframes still had rudimentary operating systems such as Fortran Monitor System (FMS) and IBSYS . In the 1960s, IBM introduced the first series of intercompatible computers ( System/360 ). All of them ran

7560-423: The 1970s until about 1986, speed-up in computer architecture was driven by doubling computer word size —the amount of information the processor can manipulate per cycle. Increasing the word size reduces the number of instructions the processor must execute to perform an operation on variables whose sizes are greater than the length of the word. For example, where an 8-bit processor must add two 16-bit integers ,

7686-429: The CPU (" CPU bound "). In primitive systems, the software would often " poll ", or " busywait " while waiting for requested input (such as disk, keyboard or network input). During this time, the system was not performing useful work. With the advent of interrupts and preemptive multitasking, I/O bound processes could be "blocked", or put on hold, pending the arrival of the necessary data, allowing other processes to utilize

7812-410: The CPU for every byte or word transferred, and having the CPU transfer the byte or word between the device and memory, would require too much CPU time. Data is, instead, transferred between the device and memory independently of the CPU by hardware such as a channel or a direct memory access controller; an interrupt is delivered only when all the data is transferred. If a computer program executes

7938-474: The CPU to re-enter supervisor mode , placing the kernel in charge. This is called a segmentation violation or Seg-V for short, and since it is both difficult to assign a meaningful result to such an operation, and because it is usually a sign of a misbehaving program, the kernel generally resorts to terminating the offending program, and reports the error. Windows versions 3.1 through ME had some level of memory protection, but programs could easily circumvent

8064-566: The CPU. As the arrival of the requested data would generate an interrupt, blocked processes could be guaranteed a timely return to execution. Possibly the earliest preemptive multitasking OS available to home users was Microware 's OS-9 , available for computers based on the Motorola 6809 such as the TRS-80 Color Computer 2 , with the operating system supplied by Tandy as an upgrade for disk-equipped systems. Sinclair QDOS on

8190-484: The above program can be rewritten to use locks: One thread will successfully lock variable V, while the other thread will be locked out —unable to proceed until V is unlocked again. This guarantees correct execution of the program. Locks may be necessary to ensure correct program execution when threads must serialize access to resources, but their use can greatly slow a program and may affect its reliability . Locking multiple variables using non-atomic locks introduces

8316-538: The application program, which then interacts with the user and with hardware devices. However, in some systems an application can request that the operating system execute another application within the same process, either as a subroutine or in a separate thread, e.g., the LINK and ATTACH facilities of OS/360 and successors . An interrupt (also known as an abort , exception , fault , signal , or trap ) provides an efficient way for most operating systems to react to

8442-418: The average time per instruction. Maintaining everything else constant, increasing the clock frequency decreases the average time it takes to execute an instruction. An increase in frequency thus decreases runtime for all compute-bound programs. However, power consumption P by a chip is given by the equation P = C × V × F , where C is the capacitance being switched per clock cycle (proportional to

8568-462: The computer hardware; when a program is waiting for some external event such as a user input or an input/output transfer with a peripheral to complete, the central processor can still be used with another program. In a time-sharing system, multiple human operators use the same processor as if it was dedicated to their use, while behind the scenes the computer is serving many users by multitasking their individual programs. In multiprogramming systems,

8694-453: The computer's memory. Various methods of memory protection exist, including memory segmentation and paging . All methods require some level of hardware support (such as the 80286 MMU), which does not exist in all computers. In both segmentation and paging, certain protected mode registers specify to the CPU what memory address it should allow a running program to access. Attempts to access other addresses trigger an interrupt, which causes

8820-444: The costs associated with merging data from multiple processes. Specifically, inter-process communication and synchronization can lead to overheads that are substantially higher—often by two or more orders of magnitude—compared to processing the same data on a single thread. Therefore, the overall improvement should be carefully evaluated. From the advent of very-large-scale integration (VLSI) computer-chip fabrication technology in

8946-471: The details of how interrupt service routines behave vary from operating system to operating system. However, several interrupt functions are common. The architecture and operating system must: A software interrupt is a message to a process that an event has occurred. This contrasts with a hardware interrupt — which is a message to the central processing unit (CPU) that an event has occurred. Software interrupts are similar to hardware interrupts — there

9072-457: The easiest to parallelize. Michael J. Flynn created one of the earliest classification systems for parallel (and sequential) computers and programs, now known as Flynn's taxonomy . Flynn classified programs and computers by whether they were operating using a single set or multiple sets of instructions, and whether or not those instructions were using a single set or multiple sets of data. The single-instruction-single-data (SISD) classification

9198-489: The entire environment unacceptably fragile. Preemptive multitasking allows the computer system to more reliably guarantee to each process a regular "slice" of operating time. It also allows the system to deal rapidly with important external events like incoming data, which might require the immediate attention of one or another process. Operating systems were developed to take advantage of these hardware capabilities and run multiple processes preemptively. Preemptive multitasking

9324-422: The environment. Interrupts cause the central processing unit (CPU) to have a control flow change away from the currently running program to an interrupt handler , also known as an interrupt service routine (ISR). An interrupt service routine may cause the central processing unit (CPU) to have a context switch . The details of how a computer processes an interrupt vary from architecture to architecture, and

9450-467: The execution of tasks, particularly useful when one program is waiting for I/O operations to complete. The Bull Gamma 60 , initially designed in 1957 and first released in 1960, was the first computer designed with multiprogramming in mind. Its architecture featured a central memory and a Program Distributor feeding up to twenty-five autonomous processing units with code and data, and allowing concurrent operation of multiple clusters. Another such computer

9576-410: The hardware checks that the software is only executing legal instructions, whereas the kernel has unrestricted powers and is not subject to these checks. The kernel also manages memory for other processes and controls access to input/output devices. The operating system provides an interface between an application program and the computer hardware, so that an application program can interact with

9702-493: The hardware only by obeying rules and procedures programmed into the operating system. The operating system is also a set of services which simplify development and execution of application programs. Executing an application program typically involves the creation of a process by the operating system kernel , which assigns memory space and other resources, establishes a priority for the process in multi-tasking systems, loads program binary code into memory, and initiates execution of

9828-547: The hardware supports parallelism. This classification is broadly analogous to the distance between basic computing nodes. These are not mutually exclusive; for example, clusters of symmetric multiprocessors are relatively common. A multi-core processor is a processor that includes multiple processing units (called "cores") on the same chip. This processor differs from a superscalar processor, which includes multiple execution units and can issue multiple instructions per clock cycle from one instruction stream (thread); in contrast,

9954-481: The idea that the most efficient way for cooperating processes to exchange data would be to share their entire memory space. Thus, threads are effectively processes that run in the same memory context and share other resources with their parent processes , such as open files. Threads are described as lightweight processes because switching between threads does not involve changing the memory context. While threads are scheduled preemptively, some operating systems provide

10080-454: The increasing computing power of multicore architectures. Main article: Amdahl's law Optimally, the speedup from parallelization would be linear—doubling the number of processing elements should halve the runtime, and doubling it a second time should again halve the runtime. However, very few parallel algorithms achieve optimal speedup. Most of them have a near-linear speedup for small numbers of processing elements, which flattens out into

10206-418: The indirect pollution of important processor structures (like CPU caches , the instruction pipeline , and so on) which affects both user-mode and kernel-mode performance. The first computers in the late 1940s and 1950s were directly programmed either with plugboards or with machine code inputted on media such as punch cards , without programming languages or operating systems. After the introduction of

10332-404: The interrupt request, the operating system will: When the writing process has its time slice expired, the operating system will: With the program counter now reset, the interrupted process will resume its time slice. Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by the programs. This ensures that

10458-608: The introduction of 32-bit processors, which has been a standard in general-purpose computing for two decades. Not until the early 2000s, with the advent of x86-64 architectures, did 64-bit processors become commonplace. A computer program is, in essence, a stream of instructions executed by a processor. Without instruction-level parallelism, a processor can only issue less than one instruction per clock cycle ( IPC < 1 ). These processors are known as subscalar processors. These instructions can be re-ordered and combined into groups which are then executed in parallel without changing

10584-431: The kernel—and applications—all other software. There are three main purposes that an operating system fulfills: With multiprocessors multiple CPUs share memory. A multicomputer or cluster computer has multiple CPUs, each of which has its own memory . Multicomputers were developed because large multiprocessors are difficult to engineer and prohibitively expensive; they are universal in cloud computing because of

10710-400: The memory allocated to a different one. Around the same time, teleprinters began to be used as terminals so multiple users could access the computer simultaneously. The operating system MULTICS was intended to allow hundreds of users to access a large computer. Despite its limited adoption, it can be considered the precursor to cloud computing . The UNIX operating system originated as

10836-511: The memory on non-local processors. Accesses to local memory are typically faster than accesses to non-local memory. On the supercomputers , distributed shared memory space can be implemented using the programming model such as PGAS . This model allows processes on one compute node to transparently access the remote memory of another compute node. All compute nodes are also connected to an external shared memory system via high-speed interconnect, such as Infiniband , this external shared memory system

10962-442: The most common type of parallel programs. According to David A. Patterson and John L. Hennessy , "Some machines are hybrids of these categories, of course, but this classic model has survived because it is simple, easy to understand, and gives a good first approximation. It is also—perhaps because of its understandability—the most widely used scheme." Parallel computing can incur significant overhead in practice, primarily due to

11088-498: The most common. Communication and synchronization between the different subtasks are typically some of the greatest obstacles to getting optimal parallel program performance. A theoretical upper bound on the speed-up of a single program as a result of parallelization is given by Amdahl's law , which states that it is limited by the fraction of time for which the parallelization can be utilised. Traditionally, computer software has been written for serial computation . To solve

11214-408: The need to use it. A general protection fault would be produced, indicating a segmentation violation had occurred; however, the system would often crash anyway. The use of virtual memory addressing (such as paging or segmentation) means that the kernel can choose what memory each program may use at any given time, allowing the operating system to use the same memory locations for multiple tasks. If

11340-447: The number of transistors whose inputs change), V is voltage , and F is the processor frequency (cycles per second). Increases in frequency increase the amount of power used in a processor. Increasing processor power consumption led ultimately to Intel 's May 8, 2004 cancellation of its Tejas and Jayhawk processors, which is generally cited as the end of frequency scaling as the dominant computer architecture paradigm. To deal with

11466-408: The open-source Android operating system (introduced 2008), with a Linux kernel and a C library ( Bionic ) partially based on BSD code, became most popular. The components of an operating system are designed to ensure that various parts of a computer function cohesively. With the de facto obsoletion of DOS , all user software must interact with the operating system to access hardware. The kernel

11592-420: The operating system acts as an intermediary between programs and the computer hardware, although the application code is usually executed directly by the hardware and frequently makes system calls to an OS function or is interrupted by it. Operating systems are found on many devices that contain a computer – from cellular phones and video game consoles to web servers and supercomputers . In

11718-471: The operating system kernel, in combination with hardware mechanisms that provide supporting functionalities, such as a memory management unit (MMU). If a process attempts to access a memory location outside its memory space, the MMU denies the request and signals the kernel to take appropriate actions; this usually results in forcibly terminating the offending process. Depending on the software and kernel design and

11844-824: The overhead from resource contention or communication dominates the time spent on other computation, further parallelization (that is, splitting the workload over even more threads) increases rather than decreases the amount of time required to finish. This problem, known as parallel slowdown , can be improved in some cases by software analysis and redesign. Applications are often classified according to how often their subtasks need to synchronize or communicate with each other. An application exhibits fine-grained parallelism if its subtasks must communicate many times per second; it exhibits coarse-grained parallelism if they do not communicate many times per second, and it exhibits embarrassing parallelism if they rarely or never have to communicate. Embarrassingly parallel applications are considered

11970-589: The parallel performance. Understanding data dependencies is fundamental in implementing parallel algorithms . No program can run more quickly than the longest chain of dependent calculations (known as the critical path ), since calculations that depend upon prior calculations in the chain must be executed in order. However, most algorithms do not consist of just a long chain of dependent calculations; there are usually opportunities to execute independent calculations in parallel. Let P i and P j be two program segments. Bernstein's conditions describe when

12096-441: The physical constraints preventing frequency scaling . As power consumption (and consequently heat generation) by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture , mainly in the form of multi-core processors . In computer science , parallelism and concurrency are two different things: a parallel program uses multiple CPU cores , each core performing

12222-421: The piping. Signals may be classified into 7 categories. The categories are: Input/output (I/O) devices are slower than the CPU. Therefore, it would slow down the computer if the CPU had to wait for each I/O to finish. Instead, a computer may implement interrupts for I/O completion, avoiding the need for polling or busy waiting. Some computers require an interrupt for each character or word, costing

12348-452: The possibility of incorrect program execution. These computers require a cache coherency system, which keeps track of cached values and strategically purges them, thus ensuring correct program execution. Bus snooping is one of the most common methods for keeping track of which values are being accessed (and thus should be purged). Designing large, high-performance cache coherence systems is a very difficult problem in computer architecture. As

12474-495: The possibility of program deadlock . An atomic lock locks multiple variables all at once. If it cannot lock all of them, it does not lock any of them. If two threads each need to lock the same two variables using non-atomic locks, it is possible that one thread will lock one of them and the second thread will lock the second variable. In such a case, neither thread can complete, and deadlock results. Many parallel programs require that their subtasks act in synchrony . This requires

12600-503: The priority of individual tasks, so that important jobs receive more processor time than those considered less significant. Depending on the operating system, a task might be as large as an entire application program, or might be made up of smaller threads that carry out portions of the overall program. A processor intended for use with multitasking operating systems may include special hardware to securely support multiple tasks, such as memory protection , and protection rings that ensure

12726-404: The problem into independent parts so that each processing element can execute its part of the algorithm simultaneously with the others. The processing elements can be diverse and include resources such as a single computer with multiple processors, several networked computers, specialized hardware, or any combination of the above. Historically parallel computing was used for scientific computing and

12852-462: The problem of power consumption and overheating the major central processing unit (CPU or processor) manufacturers started to produce power efficient processors with multiple cores. The core is the computing unit of the processor and in multi-core processors each core is independent and can access the same memory concurrently. Multi-core processors have brought parallel computing to desktop computers . Thus parallelization of serial programs has become

12978-564: The processor must first add the 8 lower-order bits from each integer using the standard addition instruction, then add the 8 higher-order bits using an add-with-carry instruction and the carry bit from the lower order addition; thus, an 8-bit processor requires two instructions to complete a single operation, where a 16-bit processor would be able to complete the operation with a single instruction. Historically, 4-bit microprocessors were replaced with 8-bit, then 16-bit, then 32-bit microprocessors. This trend generally came to an end with

13104-566: The processors in a typical distributed system run concurrently in parallel. Operating system An operating system ( OS ) is system software that manages computer hardware and software resources, and provides common services for computer programs . Time-sharing operating systems schedule tasks for efficient use of the system and may also include accounting software for cost allocation of processor time , mass storage , peripherals, and other resources. For hardware functions such as input and output and memory allocation ,

13230-629: The result of the program. This is known as instruction-level parallelism. Advances in instruction-level parallelism dominated computer architecture from the mid-1980s until the mid-1990s. All modern processors have multi-stage instruction pipelines . Each stage in the pipeline corresponds to a different action the processor performs on that instruction in that stage; a processor with an N -stage pipeline can have up to N different instructions at different stages of completion and thus can issue one instruction per clock cycle ( IPC = 1 ). These processors are known as scalar processors. The canonical example of

13356-516: The running process hits a point where it has to wait for some portion of memory to be reloaded from secondary storage. Processes that are entirely independent are not much trouble to program in a multitasking environment. Most of the complexity in multitasking systems comes from the need to share computer resources between tasks and to synchronize the operation of co-operating tasks. Various concurrent computing techniques are used to avoid potential problems caused by multiple tasks attempting to access

13482-418: The same operating system— OS/360 —which consisted of millions of lines of assembly language that had thousands of bugs . The OS/360 also was the first popular operating system to support multiprogramming , such that the CPU could be put to use on one job while another was waiting on input/output (I/O). Holding multiple jobs in memory necessitated memory partitioning and safeguards against one job accessing

13608-445: The same resource. Bigger systems were sometimes built with a central processor(s) and some number of I/O processors , a kind of asymmetric multiprocessing . Over the years, multitasking systems have been refined. Modern operating systems generally include detailed mechanisms for prioritizing processes, while symmetric multiprocessing has introduced new complexities and capabilities. Parallel computing Parallel computing

13734-449: The second segment produces a variable needed by the first segment. The third and final condition represents an output dependency: when two segments write to the same location, the result comes from the logically last executed segment. Consider the following functions, which demonstrate several kinds of dependencies: In this example, instruction 3 cannot be executed before (or even in parallel with) instruction 2, because instruction 3 uses

13860-619: The server and supercomputing sectors. Other specialized classes of operating systems (special-purpose operating systems), such as embedded and real-time systems, exist for many applications. Security-focused operating systems also exist. Some operating systems have low system requirements (e.g. light-weight Linux distribution ). Others may have higher system requirements. Some operating systems require installation or may come pre-installed with purchased computers ( OEM -installation), whereas others may run directly from media (i.e. live CD ) or flash memory (i.e. USB stick). An operating system

13986-473: The several execution units are not entire processors (i.e. processing units). Instructions can be grouped together only if there is no data dependency between them. Scoreboarding and the Tomasulo algorithm (which is similar to scoreboarding but makes use of register renaming ) are two of the most common techniques for implementing out-of-order execution and instruction-level parallelism. Task parallelism

14112-400: The signal number (in mnemonic format) to be sent. (The abrasive name of kill was chosen because early implementations only terminated the process.) In Unix-like operating systems, signals inform processes of the occurrence of asynchronous events. To communicate asynchronously, interrupts are required. One reason a process needs to asynchronously communicate to another process solves

14238-417: The simulation of scientific problems, particularly in the natural and engineering sciences , such as meteorology . This led to the design of parallel hardware and software, as well as high performance computing . Frequency scaling was the dominant reason for improvements in computer performance from the mid-1980s until 2004. The runtime of a program is equal to the number of instructions multiplied by

14364-432: The size of a problem. Superword level parallelism is a vectorization technique based on loop unrolling and basic block vectorization. It is distinct from loop vectorization algorithms in that it can exploit parallelism of inline code , such as manipulating coordinates, color channels or in loops unrolled by hand. Main memory in a parallel computer is either shared memory (shared between all processing elements in

14490-400: The size of the machine needed. The different CPUs often need to send and receive messages to each other; to ensure good performance, the operating systems for these machines need to minimize this copying of packets . Newer systems are often multiqueue —separating groups of users into separate queues —to reduce the need for packet copying and support more concurrent users. Another technique

14616-805: The specific error in question, the user may receive an access violation error message such as "segmentation fault". In a well designed and correctly implemented multitasking system, a given process can never directly access memory that belongs to another process. An exception to this rule is in the case of shared memory; for example, in the System V inter-process communication mechanism the kernel allocates memory to be mutually shared by multiple processes. Such features are often used by database management software such as PostgreSQL. Inadequate memory protection mechanisms, either due to flaws in their design or poor implementations, allow for security vulnerabilities that may be potentially exploited by malicious software. Use of

14742-412: The supervisory software cannot be damaged or subverted by user-mode program errors. The term "multitasking" has become an international term, as the same word is used in many other languages such as German, Italian, Dutch, Romanian, Czech, Danish and Norwegian. In the early days of computing, CPU time was expensive, and peripherals were very slow. When the computer ran a program that needed access to

14868-445: The two are independent and can be executed in parallel. For P i , let I i be all of the input variables and O i the output variables, and likewise for P j . P i and P j are independent if they satisfy Violation of the first condition introduces a flow dependency, corresponding to the first segment producing a result used by the second segment. The second condition represents an anti-dependency, when

14994-576: The use of a barrier . Barriers are typically implemented using a lock or a semaphore . One class of algorithms, known as lock-free and wait-free algorithms , altogether avoids the use of locks and barriers. However, this approach is generally difficult to implement and requires correctly designed data structures. Not all parallelization results in speed-up. Generally, as a task is split up into more and more threads, those threads spend an ever-increasing portion of their time communicating with each other or waiting on each other for access to resources. Once

15120-473: The world. Middleware , an additional software layer between the operating system and applications, is often used to improve consistency. Although it functions similarly to an operating system, it is not a true operating system. Embedded operating systems are designed to be used in embedded computer systems , whether they are internet of things objects or not connected to a network. Embedded systems include many household appliances. The distinguishing factor

15246-419: Was enhanced by the arrival of virtual memory and virtual machine technology, which enabled individual programs to make use of memory and operating system resources as if other concurrently running programs were, for all practical purposes, nonexistent. Multiprogramming gives no guarantee that a program will run in a timely manner. Indeed, the first program may very well run for hours without needing access to

15372-473: Was eventually supported by many computer operating systems , is known today as cooperative multitasking. Although it is now rarely used in larger systems except for specific applications such as CICS or the JES2 subsystem, cooperative multitasking was once the only scheduling scheme employed by Microsoft Windows and classic Mac OS to enable multiple applications to run simultaneously. Cooperative multitasking

15498-696: Was implemented in the PDP-6 Monitor and Multics in 1964, in OS/360 MFT in 1967, and in Unix in 1969, and was available in some operating systems for computers as small as DEC's PDP-8; it is a core feature of all Unix-like operating systems, such as Linux , Solaris and BSD with its derivatives , as well as modern versions of Windows. At any specific time, processes can be grouped into two categories: those that are waiting for input or output (called " I/O bound "), and those that are fully utilizing

15624-723: Was in the design of real-time computing systems, where there are a number of possibly unrelated external activities needed to be controlled by a single processor system. In such systems a hierarchical interrupt system is coupled with process prioritization to ensure that key activities were given a greater share of available process time . As multitasking greatly improved the throughput of computers, programmers started to implement applications as sets of cooperating processes (e. g., one process gathering input data, one process processing input data, one process writing out results on disk). This, however, required some tools to allow processes to efficiently exchange data. Threads were born from

15750-484: Was the LEO III , first released in 1961. During batch processing , several different programs were loaded in the computer memory, and the first one began to run. When the first program reached an instruction waiting for a peripheral, the context of this program was stored away, and the second program in memory was given a chance to run. The process continued until all programs finished running. The use of multiprogramming

15876-406: Was the first popular computer to use a graphical user interface (GUI). The GUI proved much more user friendly than the text-only command-line interface earlier operating systems had used. Following the success of Macintosh, MS-DOS was updated with a GUI overlay called Windows . Windows later was rewritten as a stand-alone operating system, borrowing so many features from another ( VAX VMS ) that
