The program status word (PSW) is a register that performs the function of a status register and program counter, and sometimes more. The term is also applied to a copy of the PSW in storage. This article only discusses the PSW in the IBM System/360 and its successors, and follows the IBM convention of numbering bits starting with 0 as the leftmost (most significant) bit.
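As an illustration of this bit-numbering convention, the following minimal Python sketch extracts a single bit from a word, counting from 0 at the most significant end. The function name `ibm_bit` and the default width of 64 bits are assumptions made for the example and are not IBM terminology.

```python
def ibm_bit(word: int, bit: int, width: int = 64) -> int:
    """Return the bit numbered `bit`, where bit 0 is the leftmost
    (most significant) bit of a `width`-bit word."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    # Bit 0 sits at shift distance width-1; the last bit sits at distance 0.
    return (word >> (width - 1 - bit)) & 1
```

Under this convention, bit 0 of a 64-bit word is the one with numeric weight 2⁶³, and bit 63 is the one with weight 1.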
Although certain fields within the PSW may be tested or set by using non-privileged instructions, testing or setting the remaining fields may only be accomplished by using privileged instructions. Contained within the PSW are the two-bit condition code, representing zero, positive, negative, overflow, and similar flags of other architectures' status registers. Conditional branch instructions test this encoded as
A with b and Jump to c if Equal. The result of the test is not saved for subsequent instructions. Another alternative to the status register is for processor instructions to deposit status information in a general-purpose register when the program requests it. MIPS, AMD 29000, DEC Alpha, and RISC-V are examples of architectures that provide comparison instructions that store the comparison result in
A calculation that has not yet occurred. Various processors may stall, may attempt branch prediction, and may be able to begin to execute two different program sequences (eager execution), each assuming the branch is or is not taken, discarding all work that pertains to the incorrect guess. A processor with an implementation of branch prediction that usually makes correct predictions can minimize
A four-bit value, with each bit representing a test of one of the four condition code values, 2³ + 2² + 2¹ + 2⁰. (Since IBM uses big-endian bit numbering, mask value 8 selects code 0, mask value 4 selects code 1, mask value 2 selects code 2, and mask value 1 selects code 3.) The 64-bit PSW describes (among other things) In the early instances of the architecture (System/360 and early System/370),
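The mask test described above can be sketched in Python as follows. The function name `branch_taken` is hypothetical; the sketch only models how a four-bit mask selects condition codes and does not represent any actual instruction encoding.

```python
def branch_taken(mask: int, cc: int) -> bool:
    """Return True if a four-bit branch mask selects condition code `cc`.

    Mask value 8 selects code 0, 4 selects code 1, 2 selects code 2,
    and 1 selects code 3, as described in the text.
    """
    if not 0 <= mask <= 0xF:
        raise ValueError("mask must fit in four bits")
    if not 0 <= cc <= 3:
        raise ValueError("condition code must be 0..3")
    # Shift the leftmost mask bit (value 8) right by the condition code.
    return bool(mask & (8 >> cc))
```

A mask of 15 selects every condition code and therefore always matches, while a mask of 0 never matches.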
A general-purpose register, as a single bit or a numeric value of 0 or 1. Conditional branches act based on the value in the general-purpose register. Usually, comparison instructions test equality or signed/unsigned magnitude. To test for other conditions, a program uses an equivalence formula. For example, MIPS has no "carry bit" but a program performing multiple-word addition can test whether
A like way, it might use more total energy, while using less energy per instruction. Out-of-order CPUs can usually do more instructions per second because they can do several instructions at once. In a pipelined computer, the control unit arranges for the flow to start, continue, and stop as a program commands. The instruction data is usually passed in pipeline registers from one stage to the next, with
A series of sequential steps (the eponymous "pipeline") performed by different processor units with different parts of instructions processed in parallel. In a pipelined computer, instructions flow through the central processing unit (CPU) in stages. For example, it might have one stage for each step of the von Neumann cycle: Fetch the instruction, fetch the operands, do the instruction, write
A single-word addition of registers overflowed by testing whether the sum is lower than an operand: The sltu instruction sets tmp to 1 or 0 based on the specified comparison of its two other operands. (Here, the general-purpose register tmp is not used as a status register to govern a conditional jump; rather, the possible value of 1, indicating carry from the low-order addition, is added to
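The carry test described here can be sketched in Python. The 32-bit word size, the helper names, and the two-word (high, low) layout are assumptions of this example and are not taken from the text; the comparison `s < a` plays the role that the text attributes to the sltu instruction.

```python
MASK32 = 0xFFFFFFFF  # assumed word size: 32 bits


def add_with_carry_out(a: int, b: int) -> tuple[int, int]:
    """Add two unsigned 32-bit words; return (sum, carry)."""
    s = (a + b) & MASK32
    # The wrapped sum is lower than an operand exactly when a carry occurred.
    tmp = 1 if s < a else 0
    return s, tmp


def add_double_word(a_hi: int, a_lo: int, b_hi: int, b_lo: int) -> tuple[int, int]:
    """Add two 64-bit values held as (high, low) 32-bit word pairs."""
    lo, tmp = add_with_carry_out(a_lo, b_lo)
    # tmp is used as an ordinary operand here, not as a branch condition.
    hi = (a_hi + b_hi + tmp) & MASK32
    return hi, lo
```

For instance, adding 1 to the low word 0xFFFFFFFF wraps the low word to 0 and propagates a carry of 1 into the high word.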
A somewhat separated piece of control logic for each stage. The control unit also ensures that the instruction in each stage does not harm the operation of instructions in other stages. For example, if two stages must use the same piece of data, the control logic ensures that the uses are done in the correct sequence. When operating efficiently, a pipelined computer will have an instruction in each stage. It
Is in use most of the time. In contrast, out-of-order computers usually have large amounts of idle logic at any given instant. Similar calculations usually show that a pipelined computer uses less energy per instruction. However, a pipelined computer is usually more complex and more costly than a comparable multicycle computer. It typically has more logic gates, more registers, and a more complex control unit. In
Is stalled for one cycle, as is the red instruction after it. Because of the bubble (the blue ovals in the illustration), the processor's Decode circuitry is idle during cycle 3. Its Execute circuitry is idle during cycle 4 and its Write-back circuitry is idle during cycle 5. When the bubble moves out of the pipeline (at cycle 6), normal execution resumes. But everything now is one cycle late. It will take 8 cycles (cycle 1 through 8) rather than 7 to completely execute
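The cycle counts quoted here follow from a simple relationship, sketched below in Python. The function name `total_cycles` is hypothetical, and the sketch assumes an idealized pipeline in which one instruction enters per cycle, with four stages and four instructions (green, purple, blue and red) as in the illustration being described.

```python
def total_cycles(instructions: int, stages: int, stall_cycles: int = 0) -> int:
    """Cycles needed to finish `instructions` on an idealized pipeline.

    The first instruction needs `stages` cycles; each later instruction
    finishes one cycle after its predecessor; each bubble adds one cycle.
    """
    if instructions < 1 or stages < 1 or stall_cycles < 0:
        raise ValueError("invalid pipeline parameters")
    return stages + (instructions - 1) + stall_cycles
```

With four stages and four instructions the result is 7 cycles without a stall and 8 cycles with a single bubble, matching the figures in the text.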
Is the list of instructions waiting to be executed, the bottom gray box is the list of instructions that have had their execution completed, and the middle white box is the pipeline. The execution is as follows: A pipelined processor may deal with hazards by stalling and creating a bubble in the pipeline, resulting in one or more cycles in which nothing useful happens. In the illustration at right, in cycle 3,
Is then working on all of those instructions at the same time. It can finish about one instruction for each cycle of its clock. But when a program switches to a different sequence of instructions, the pipeline sometimes must discard the data in process and restart. This is called a "stall." Much of the design of a pipelined computer prevents interference between the stages and reduces stalls. The number of dependent steps varies with
The machine code instructions executing on the processor. The status register lets an instruction take action contingent on the outcome of a previous instruction. Typically, flags in the status register are modified as effects of arithmetic and bit manipulation operations. For example, a Z bit may be set if the result of the operation is zero and cleared if it is nonzero. Other classes of instructions may also modify
The x86 architecture, flags in the program status word (PSW) register in the IBM System/360 architecture through z/Architecture, and the application program status register (APSR) in the ARM Cortex-A architecture. The status register is a hardware register that contains information about the state of the processor. Individual bits are implicitly or explicitly read and/or written by
The Extract PSW instruction (EPSW). On all but 360/20, the PSW has the following formats. S/360 Extended PSW format only applies to the 360/67 with bit 8 of control register 6 set.

Condition code register

A status register, flag register, or condition code register (CCR) is a collection of status flag bits for a processor. Examples of such registers include FLAGS register in
The XMP line of supercomputers, using pipelining for both multiply and add/subtract functions. Later, Star Technologies added parallelism (several pipelined functions working in parallel), developed by Roger Chen. In 1984, Star Technologies added the pipelined divide circuit developed by James Bradley. By the mid-1980s, pipelining was used by many different companies around the world. Pipelining
The compiler could be designed to generate machine code that avoids hazards. In some early DSP and RISC processors, the documentation advises programmers to avoid such dependencies in adjacent and nearly adjacent instructions (called delay slots), or declares that the second instruction uses an old value rather than the desired value (in the example above, the processor might counter-intuitively copy
The earlier instruction. Some CPU architectures, such as the MIPS and Alpha, do not use a dedicated flag register. Others do not implicitly set and/or read flags. Such machines either do not pass implicit status information between instructions at all, or they pass it in an explicitly selected general-purpose register. A status register may often have other fields as well, such as more specialized flags, interrupt enable bits, and similar types of information. During an interrupt,
The flags to indicate status. For example, a string instruction may do so to indicate whether the instruction terminated because it found a match/mismatch or because it found the end of the string. The flags are read by a subsequent conditional instruction so that the specified action (depending on the processor, a jump, call, return, or so on) occurs only if the flags indicate a specified result of
The following two register instructions to a hypothetical processor: If the processor has the 5 steps listed in the initial illustration (the 'Basic five-stage pipeline' at the start of the article), instruction 1 would be fetched at time t₁ and its execution would be complete at t₅. Instruction 2 would be fetched at t₂ and would be complete at t₆. The first instruction might deposit
The fourth clock cycle (the green column), the earliest instruction is in MEM stage, and the latest instruction has not yet entered the pipeline. In computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into
The high-order word.) This scheme becomes less convenient when adding three or more words, as there are two additions when computing b + c + tmp, either of which may generate a carry, which must be detected with two sltu instructions. Fortunately, those two carries may be added to each other without risk of overflow, so the situation stabilizes at five instructions per word added.

Instruction pipeline

In
The incremented number into R5 as its fifth step (register write back) at t₅. But the second instruction might get the number from R5 (to copy to R6) in its second step (instruction decode and register fetch) at time t₃. It seems that the first instruction would not have incremented the value by then. The above code invokes a hazard. Writing computer programs in a compiled language might not raise these concerns, as
The instruction address was 24 bits; in later instances (XA/370), the instruction address was 31 bits plus a mode bit (24-bit addressing mode if zero; 31-bit addressing mode if one) for a total of 32 bits. In the present instances of the architecture (z/Architecture), the instruction address is 64 bits and the PSW itself is 128 bits. The PSW may be loaded by the LOAD PSW instruction (LPSW or LPSWE). Its contents may be examined with
The machine architecture. For example: As the pipeline is made "deeper" (with a greater number of dependent steps), a given step can be implemented with simpler circuitry, which may let the processor clock run faster. Such pipelines may be called superpipelines. A processor is said to be fully pipelined if it can fetch an instruction on every cycle. Thus, if some instructions or conditions require delays that inhibit fetching new instructions,
The next one begins: A branch out of the normal instruction sequence often involves a hazard. Unless the processor can give effect to the branch in a single time cycle, the pipeline will continue fetching instructions sequentially. Such instructions cannot be allowed to take effect because the programmer has diverted control to another part of the program. A conditional branch is even more problematic. The processor may or may not branch, depending on
The other edge. This allows more CPU throughput than a multicycle computer at a given clock rate, but may increase latency due to the added overhead of the pipelining process itself. Also, even though the electronic logic has a fixed maximum speed, a pipelined computer can be made faster or slower by varying the number of stages in the pipeline. With more stages, each stage does less work, and so
The performance penalty from branching. However, if branches are predicted poorly, it may create more work for the processor, such as flushing from the pipeline the incorrect code path that has begun execution before resuming execution at the correct location. Programs written for a pipelined processor deliberately avoid branching to minimize possible loss of speed. For example, the programmer can handle
The processor cannot decode the purple instruction, perhaps because the processor determines that decoding depends on results produced by the execution of the green instruction. The green instruction can proceed to the Execute stage and then to the Write-back stage as scheduled, but the purple instruction is stalled for one cycle at the Fetch stage. The blue instruction, which was due to be fetched during cycle 3,
The processor is not fully pipelined. Seminal uses of pipelining were in the ILLIAC II project and the IBM Stretch project, though a simple version was used earlier in the Z1 in 1939 and the Z3 in 1941. Pipelining began in earnest in the late 1970s in supercomputers such as vector processors and array processors. One of the early supercomputers was the Cyber series built by Control Data Corporation. Its main architect, Seymour Cray, later headed Cray Research. Cray developed
The result of a previous instruction. In pipelined processors, such as superscalar and speculative processors, this can create hazards that slow processing or require extra hardware to work around them. Some very long instruction word processors dispense with the status flags. A single instruction both performs a test and indicates on which outcome of that test to take an action, such as Compare
The results. A pipelined computer usually has "pipeline registers" after each stage. These store information from the instruction and calculations so that the logic gates of the next stage can do the next step. This arrangement lets the CPU complete an instruction on each clock cycle. It is common for even-numbered stages to operate on one edge of the square-wave clock, while odd-numbered stages operate on
The stage has fewer delays from the logic gates and could run at a higher clock rate. A pipelined model of computer is often the most economical, when cost is measured as logic gates per instruction per second. At each instant, an instruction is in only one pipeline stage, and on average, a pipeline stage is less costly than a multicycle computer. Also, when made well, most of the pipelined computer's logic
The status of the thread currently executing can be preserved (and later recalled) by storing the current value of the status register along with the program counter and other active registers into the machine stack or some other reserved area of memory. This is a list of the most common CPU status register flags, implemented in almost all modern processors. On some processors, the status register also contains flags such as these: Status flags enable an instruction to act based on
The unincremented value), or declares that the value it uses is undefined. The programmer may have unrelated work that the processor can do in the meantime; or, to ensure correct results, the programmer may insert NOPs into the code, partly negating the advantages of pipelining. Pipelined processors commonly use three techniques to work as expected when the programmer assumes that each instruction completes before
The usual case with sequential execution and branch only on detecting unusual cases. Using programs such as gcov to analyze code coverage lets the programmer measure how often particular branches are actually executed and gain insight with which to optimize the code. In some cases, a programmer can handle both the usual case and unusual case with branch-free code. To the right is a generic pipeline with four stages: fetch, decode, execute and write-back. The top gray box
Was not limited to supercomputers. In 1976, the Amdahl Corporation's 470 series general purpose mainframe had a 7-step pipeline, and a patented branch prediction circuit. The model of sequential execution assumes that each instruction completes before the next one begins; this assumption is not true on a pipelined processor. A situation where the expected result is problematic is known as a hazard. Imagine