Am386 - Misplaced Pages

The Am386 CPU is a 100%-compatible clone of the Intel 80386 design released by AMD in March 1991. It sold millions of units, positioning AMD as a legitimate competitor to Intel , rather than being merely a second source for x86 CPUs (then termed 8086-family ).

#565434

101-554: While the AM386 CPU was essentially ready to be released prior to 1991, Intel kept it tied up in court. Intel learned of the Am386 when both companies hired employees with the same name who coincidentally stayed at the same hotel, which accidentally forwarded a package for AMD to Intel's employee. AMD had previously been a second-source manufacturer of Intel's Intel 8086 , Intel 80186 and Intel 80286 designs, and AMD's interpretation of

202-618: A 64 KB (one segment) stack in memory supported by computer hardware . Only words (two bytes) can be pushed to the stack. The stack grows toward numerically lower addresses, with SS:SP pointing to the most recently pushed item. There are 256 interrupts , which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return address . The original Intel 8086 and 8088 have fourteen 16- bit registers. Four of them (AX, BX, CX, DX) are general-purpose registers (GPRs), although each may have an additional purpose; for example, only CX can be used as

303-579: A backward compatible version of this functionality on the same microprocessor as the main processor. In addition to this, modern x86 designs also contain a SIMD -unit (see SSE below) where instructions can work in parallel on (one or two) 128-bit words, each containing two or four floating-point numbers (each 64 or 32 bits wide respectively), or alternatively, 2, 4, 8 or 16 integers (each 64, 32, 16 or 8 bits wide respectively). The presence of wide SIMD registers means that existing x86 processors can load or store up to 128 bits of memory data in

404-403: A compatible design) and the scalability of x86 chips in the form of modern multi-core CPUs, is underlining x86 as an example of how continuous refinement of established industry standards can resist the competition from completely new architectures. The table below lists processor models and model series implementing various architectures in the x86 family, in chronological order. Each line item

505-539: A counter with the loop instruction. Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and low byte as BL). Two pointer registers have special roles: SP (stack pointer) points to the "top" of the stack , and BP (base pointer) is often used to point at some other place in the stack, typically above the local variables (see frame pointer ). The registers SI, DI, BX and BP are address registers , and may also be used for array indexing. One of four possible 'segment registers' (CS, DS, SS and ES)

606-430: A fully static CMOS version for battery powered devices, manufactured using Intel's CHMOS processes. The original chip measured 33 mm² and minimum feature size was 3.2 μm. The MUL and DIV instructions were very slow due to being microcoded so x86 programmers usually just used the bit shift instructions for multiplying and dividing instead. The 8086 was die-shrunk to 2 μm in 1981; this version also corrected

707-476: A major change to the architecture referred to as X86S (formerly known as X86-S). The S in X86S stands for "simplification", which aims to remove support for legacy execution modes and instructions. A processor implementing this proposal would start execution directly in long mode and would only support 64-bit operating systems. 32-bit code would only be supported for user applications running in ring 3, and would use

808-436: A mathematical coprocessor to add hardware/microcode-based floating-point performance. The Intel 8087 was the standard math coprocessor for the 8086 and 8088, operating on 80-bit numbers. Manufacturers like Cyrix (8087-compatible) and Weitek ( not 8087-compatible) eventually came up with high-performance floating-point coprocessors that competed with the 8087. The clock frequency was originally limited to 5 MHz, but

909-547: A memory location. However, this memory operand may also be the destination (or a combined source and destination), while the other operand, the source, can be either register or immediate. Among other factors, this contributes to a code size that rivals eight-bit machines and enables efficient use of instruction cache memory. The relatively small number of general registers (also inherited from its 8-bit ancestors) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on

1010-560: A more complex micro-op which fits the execution model better and thus can be executed faster or with fewer machine resources involved. Another way to try to improve performance is to cache the decoded micro-operations, so the processor can directly access the decoded micro-operations from a special cache, instead of decoding them again. Intel followed this approach with the Execution Trace Cache feature in their NetBurst microarchitecture (for Pentium 4 processors) and later in

1111-436: A plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186 , 80286 , 80386 and 80486 . Colloquially, their names were "186", "286", "386" and "486". The term is not synonymous with IBM PC compatibility , as this implies a multitude of other computer hardware . Embedded systems and general-purpose computers used x86 chips before

SECTION 10

#1732772608566

1212-411: A program small enough to fit in one segment. Far pointers are 32-bit segment:offset pairs resolving to 20-bit external addresses. Some compilers also support huge pointers, which are like far pointers except that pointer arithmetic on a huge pointer treats it as a linear 20-bit pointer, while pointer arithmetic on a far pointer wraps around within its 16-bit offset without touching the segment part of

1313-496: A response to the successful 8080-compatible Zilog Z80 , the x86 line soon grew in features and processing power. Today, x86 is ubiquitous in both stationary and portable personal computers, and is also used in midrange computers , workstations , servers, and most new supercomputer clusters of the TOP500 list. A large amount of software , including a large list of x86 operating systems are using x86-based hardware. Modern x86

1414-535: A second source production of the Intel chip, but as a reverse engineered pin compatible version. In fact, it was AMD's first entry in the x86 market other than as a second source for Intel. AMD 386SX processors were available at higher clock speeds at the time they were introduced and still cheaper than the Intel 386SX. Produced in 0.8 μm technology and using a static core, their clock speed could be dropped down to 0 MHz, consuming just some mWatts. Power consumption

1515-591: A single ALU cycle (instead of two, via internal carry, as in the 8080 and 8085), speeding up such instructions considerably. Combined with orthogonalizations of operations versus operand types and addressing modes , as well as other enhancements, this made the performance gain over the 8080 or 8085 fairly significant, despite cases where the older chips may be faster (see below). As can be seen from these tables, operations on registers and immediates were fast (between 2 and 4 cycles), while memory-operand instructions and jumps were quite slow; jumps took more cycles than on

1616-670: A single instruction and also perform bitwise operations (although not integer arithmetic ) on full 128-bits quantities in parallel. Intel's Sandy Bridge processors added the Advanced Vector Extensions (AVX) instructions, widening the SIMD registers to 256 bits. The Intel Initial Many Core Instructions implemented by the Knights Corner Xeon Phi processors, and the AVX-512 instructions implemented by

1717-457: A single segment, just as in most 8-bit based processors, and can be used to build .com files for instance. Precompiled libraries often come in several versions compiled for different memory models. According to Morse et al.,. the designers actually contemplated using an 8-bit shift (instead of 4-bit), in order to create a 16 MB physical address space. However, as this would have forced segments to begin on 256-byte boundaries, and 1 MB

1818-542: A small 18-pin "memory package", which ruled out the use of a separate address bus (Intel was primarily a DRAM manufacturer at the time). Two years later, Intel launched the 8080 , employing the new 40-pin DIL packages originally developed for calculator ICs to enable a separate address bus. It had an extended instruction set that is source-compatible (not binary compatible ) with the 8008 and also included some 16-bit instructions to make programming easier. The 8080 device

1919-423: A small program (less than 64 KB) can be loaded starting at a fixed offset (such as 0000) in its own segment, avoiding the need for relocation , with at most 15 bytes of alignment waste. Compilers for the 8086 family commonly support two types of pointer , near and far . Near pointers are 16-bit offsets implicitly associated with the program's code or data segment and so can be used only within parts of

2020-422: A stack register bug in the original 3.5 μm chips. Later 1.5 μm and CMOS variants were outsourced to other manufacturers and not developed in-house. The architecture was defined by Stephen P. Morse with some help from Bruce Ravenel (the architect of the 8087) in refining the final revisions. Logic designer Jim McKevitt and John Bayliss were the lead engineers of the hardware-level development team and Bill Pohlman

2121-470: Is allowed for almost all instructions. The largest native size for integer arithmetic and memory addresses (or offsets ) is 16, 32 or 64 bits depending on architecture generation (newer processors include direct support for smaller integers as well). Multiple scalar values can be handled simultaneously via the SIMD unit present in later generations, as described below. Immediate addressing offsets and immediate data may be expressed as 8-bit quantities for

SECTION 20

#1732772608566

2222-675: Is assumed, specifically, that the DS and ES segments address the same region of memory. Although partly shadowed by other design choices in this particular chip, the multiplexed address and data buses limit performance slightly; transfers of 16-bit or 8-bit quantities are done in a four-clock memory access cycle, which is faster on 16-bit, although slower on 8-bit quantities, compared to many contemporary 8-bit based CPUs. As instructions vary from one to six bytes, fetch and execution are made concurrent and decoupled into separate units (as it remains in today's x86 processors): The bus interface unit feeds

2323-691: Is characterized by significantly improved or commercially successful processor microarchitecture designs. At various times, companies such as IBM , VIA , NEC , AMD , TI , STM , Fujitsu , OKI , Siemens , Cyrix , Intersil , C&T , NexGen , UMC , and DM&P started to design or manufacture x86 processors (CPUs) intended for personal computers and embedded systems. Other companies that designed or manufactured x86 or x87 processors include ITT Corporation , National Semiconductor , ULSI System Technology, and Weitek . Such x86 implementations were seldom simple copies but often employed different internal microarchitectures and different solutions at

2424-453: Is copied one byte (8-bit character) at a time. The example code uses the BP (base pointer) register to establish a call frame , an area on the stack that contains all of the parameters and local variables for the execution of the subroutine. This kind of calling convention supports reentrant and recursive code and has been used by Algol-like languages since the late 1950s. A flat memory model

2525-425: Is limited to 64 KB, simply because internal address/index registers are only 16 bits wide. Programming over 64 KB memory boundaries involves adjusting the segment registers (see below); this difficulty existed until the 80386 architecture introduced wider (32-bit) registers (the memory management hardware in the 80286 did not help in this regard, as its registers are still only 16 bits wide). Some of

2626-463: Is one of the two modes only available in long mode . The addressing modes were not dramatically changed from 32-bit mode, except that addressing was extended to 64 bits, virtual addresses are now sign extended to 64 bits (in order to disallow mode bits in virtual addresses), and other selector details were dramatically reduced. In addition, an addressing mode was added to allow memory references relative to RIP (the instruction pointer ), to ease

2727-587: Is relatively uncommon in embedded systems , however, and small low power applications (using tiny batteries), and low-cost microprocessor markets, such as home appliances and toys, lack significant x86 presence. Simple 8- and 16-bit based architectures are common here, as well as simpler RISC architectures like RISC-V , although the x86-compatible VIA C7 , VIA Nano , AMD 's Geode , Athlon Neo and Intel Atom are examples of 32- and 64-bit designs used in some relatively low-power and low-cost segments. There have been several attempts, including by Intel, to end

2828-476: Is supported in hardware ; 16-bit words are pushed onto the stack, and the top of the stack is pointed to by SS:SP. There are 256 interrupts , which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return addresses . The 8086 has 64 K of 8-bit (or alternatively 32 K of 16-bit word) I/O port space. The 8086 has a 16-bit flags register . Nine of these condition code flags are active, and indicate

2929-515: Is that the 8086 also introduced some new instructions (not present in the 8080 and 8085) to better support stack-based high-level programming languages such as Pascal and PL/M ; some of the more useful instructions are push mem-op , and ret size , supporting the "Pascal calling convention " directly. (Several others, such as push immed and enter , were added in the subsequent 80186, 80286, and 80386 processors.) A 64 KB (one segment) stack growing towards lower addresses

3030-491: Is used to form a memory address. In the original 8086 / 8088 / 80186 / 80188 every address was built from a segment register and one of the general purpose registers. For example ds:si is the notation for an address formed as [16 * ds + si] to allow 20-bit addressing rather than 16 bits, although this changed in later processors. At that time only certain combinations were supported. The FLAGS register contains flags such as carry flag , overflow flag and zero flag . Finally,

3131-458: The fstsw instruction, and it is common to simply use some of its bits for branching by copying it into the normal FLAGS. In the Intel 80286 , to support protected mode , three special registers hold descriptor table addresses (GDTR, LDTR, IDTR ), and a fourth task register (TR) is used for task switching. The 80287 is the floating-point coprocessor for the 80286 and has the same registers as

Am386 - Misplaced Pages Continue

3232-525: The 6x86 was significantly faster than the Pentium on integer code. AMD later managed to grow into a serious contender with the K6 set of processors, which gave way to the very successful Athlon and Opteron . There were also other contenders, such as Centaur Technology (formerly IDT ), Rise Technology , and Transmeta . VIA Technologies ' energy efficient C3 and C7 processors, which were designed by

3333-496: The 80486 and all subsequent x86 models, the floating-point processing unit (FPU) is integrated on-chip. The Pentium MMX added eight 64-bit MMX integer vector registers (MM0 to MM7, which share lower bits with the 80-bit-wide FPU stack). With the Pentium III , Intel added a 32-bit Streaming SIMD Extensions (SSE) control/status register (MXCSR) and eight 128-bit SSE floating-point registers (XMM0 to XMM7). Starting with

3434-418: The 8086 family ) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088 . The 8086 was introduced in 1978 as a fully 16-bit extension of 8-bit Intel's 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by

3535-406: The 8088 and 80286 were still in common use, the term x86 usually represented any 8086-compatible CPU. Today, however, x86 usually implies binary compatibility with the 32-bit instruction set of the 80386 . This is due to the fact that this instruction set has become something of a lowest common denominator for many modern operating systems and also probably because the term became common after

3636-479: The ALGOL -family of languages, including Pascal and PL/M . According to principal architect Stephen P. Morse , this was a result of a more software-centric approach. Other enhancements included microcode instructions for the multiply and divide assembly language instructions. Designers also anticipated coprocessors , such as 8087 and 8089 , so the bus structure was designed to be flexible. The first revision of

3737-573: The AMD Opteron processor, the x86 architecture extended the 32-bit registers into 64-bit registers in a way similar to how the 16 to 32-bit extension took place. An R -prefix (for "register") identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP), and eight additional 64-bit general registers (R8–R15) were also introduced in the creation of x86-64 . Also, eight more SSE vector registers (XMM8–XMM15) were added. However, these extensions are only usable in 64-bit mode, which

3838-653: The Centaur company, were sold for many years following their release in 2005. Centaur's 2008 design, the VIA Nano , was their first processor with superscalar and speculative execution . It was introduced at about the same time (in 2008) as Intel introduced the Intel Atom , its first "in-order" processor after the P5 Pentium . Many additions and extensions have been added to the original x86 instruction set over

3939-461: The machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa. The 80386 had an optional floating-point coprocessor, the 80387 ; it had eight 80-bit wide registers: st(0) to st(7), like the 8087 and 80287. The 80386 could also use an 80287 coprocessor. With

4040-488: The "16-bit microprocessor" identity of the 8086. A 20-bit external address bus provides a 1 MiB physical address space (2 = 1,048,576 x 1 byte ). This address space is addressed by means of internal memory "segmentation". The data bus is multiplexed with the address bus in order to fit all of the control lines into a standard 40-pin dual in-line package . It provides a 16-bit I/O address bus, supporting 64 KB of separate I/O space. The maximum linear address space

4141-545: The 1998–1999 Lunar Prospector . For the packaging, the Intel 8086 was available both in ceramic and plastic DIP packages. Compatible—and, in many cases, enhanced—versions were manufactured by Fujitsu , Harris / Intersil , OKI , Siemens , Texas Instruments , NEC , Mitsubishi , and AMD . For example, the NEC V20 and NEC V30 pair were hardware-compatible with the 8088 and 8086 even though NEC made original Intel clones μPD8088D and μPD8086D respectively, but incorporated

Am386 - Misplaced Pages Continue

4242-429: The 386DX-40 performs nearly on par with a 25 MHz 486 due to the 486 needing fewer clock cycles per instruction, thanks to its tighter pipelining (more overlapping of internal processing) in combination with an on-chip CPU cache . However, its 32-bit 40 MHz data bus gave the 386DX-40 comparatively good memory and I/O performance. In 1991 AMD also introduced advanced versions of the 386SX processor – again not as

4343-484: The 40th anniversary of the Intel 8086, called the Intel Core i7-8086K . In 1972, Intel launched the 8008 , Intel's first 8-bit microprocessor. It implemented an instruction set designed by Datapoint Corporation with programmable CRT terminals in mind, which also proved to be fairly general-purpose. The device needed several additional ICs to produce a functional computer, in part due to it being packaged in

4444-464: The 8086 itself. The 8086 has eight more-or-less general 16-bit registers (including the stack pointer but excluding the instruction pointer, flag register and segment registers). Four of them, AX, BX, CX, DX, can also be accessed as 8-bit register pairs (see figure) while the other four, SI, DI, BP, SP, are 16-bit only. Due to a compact encoding inspired by 8-bit processors, most instructions are one-address or two-address operations, which means that

4545-631: The 8086 through both industrial espionage and reverse engineering . The resulting chip, K1810VM86 , was binary and pin-compatible with the 8086. i8086 and i8088 were respectively the cores of the Soviet-made PC-compatible EC1831 and EC1832 desktops. (EC1831 is the EC identification of IZOT 1036C and EC1832 is the EC identification of IZOT 1037C, developed and manufactured in Bulgaria. EC stands for Единая Система.) However,

4646-471: The 8087 with the same data formats. With the advent of the 32-bit 80386 processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and FLAGS register , but not the segment registers, were expanded to 32 bits. The nomenclature represented this by prefixing an " E " (for "extended") to the register names in x86 assembly language . Thus, the AX register corresponds to

4747-481: The Am386 were sold well into the mid-1990s; at the end as budget motherboards for those who were only interested in running MS-DOS or Windows 3.1x applications. The Am386 and its low-power successors were also popular choices for embedded systems , for a much longer period than their life span as PC processors. Intel 8086 The 8086 (also called iAPX 86 ) is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it

4848-634: The Decoded Stream Buffer (for Core-branded processors since Sandy Bridge). Transmeta used a completely different method in their Crusoe x86 compatible CPUs. They used just-in-time translation to convert x86 instructions to the CPU's native VLIW instruction set. Transmeta argued that their approach allows for more power efficient designs since the CPU can forgo the complicated decode step of more traditional x86 implementations. Addressing modes for 16-bit processor modes can be summarized by

4949-589: The EC1831 computer (IZOT 1036C) had significant hardware differences from the IBM PC prototype. The EC1831 was the first PC-compatible computer with dynamic bus sizing (US Pat. No 4,831,514). Later some of the EC1831 principles were adopted in PS/2 (US Pat. No 5,548,786) and some other machines (UK Patent Application, Publication No. GB-A-2211325, Published June 28, 1989). X86 x86 (also known as 80x86 or

5050-877: The Knights Landing Xeon Phi processors and by Skylake-X processors, use 512-bit wide SIMD registers. During execution , current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces called micro-operations. These are then handed to a control unit that buffers and schedules them in compliance with x86-semantics so that they can be executed, partly in parallel, by one of several (more or less specialized) execution units . These modern x86 designs are thus pipelined , superscalar , and also capable of out of order and speculative execution (via branch prediction , register renaming , and memory dependence prediction ), which means they may execute multiple (partial or complete) x86 instructions simultaneously, and not necessarily in

5151-540: The PC-compatible market started , some of them before the IBM PC (1981) debut. As of June 2022 , most desktop and laptop computers sold are based on the x86 architecture family, while mobile categories such as smartphones or tablets are dominated by ARM . At the high end, x86 continues to dominate computation-intensive workstation and cloud computing segments. In the 1980s and early 1990s, when

SECTION 50

#1732772608566

5252-498: The address. To avoid the need to specify near and far on numerous pointers, data structures, and functions, compilers also support "memory models" which specify default pointer sizes. The tiny (max 64K), small (max 128K), compact (data > 64K), medium (code > 64K), large (code,data > 64K), and huge (individual arrays > 64K) models cover practical combinations of near, far, and huge pointers for code and data. The tiny model means that code and data are shared in

5353-434: The advanced but delayed 5k86 ( K5 ), which, internally, was closely based on AMD's earlier 29K RISC design; similar to NexGen 's Nx586 , it used a strategy such that dedicated pipeline stages decode x86 instructions into uniform and easily handled micro-operations , a method that has remained the basis for most x86 designs to this day. Some early versions of these microprocessors had heat dissipation problems. The 6x86

5454-415: The art, had been planned for 2021; as of March 2022 the release had not taken place, however. The instruction set architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 80386 (later known as i386) which gradually replaced the earlier 16-bit chips in computers (although typically not in embedded systems ) during the following years; this extended programming model

5555-486: The contract, made up in 1982, was that it covered all derivatives of them. Intel, however, claimed that the contract only covered the 80286 and prior processors and forbade AMD the right to manufacture 80386 CPUs in 1987. After a few years in the courtrooms, AMD finally won the case and the right to sell their Am386 in March 1991. This also paved the way for competition in the 80386 -compatible 32-bit CPU market and so lowered

5656-457: The control pins, which carry essential signals for all external operations, have more than one function depending upon whether the device is operated in min or max mode. The former mode is intended for small single-processor systems, while the latter is for medium or large systems using more than one processor (a kind of multiprocessor mode). Maximum mode is required when using an 8087 or 8089 coprocessor. The voltage on pin 33 (MN/ MX ) determines

5757-496: The cost of owning a PC. While Intel's 386 CPUs had topped out at 33 MHz in 1989, AMD introduced 40 MHz versions of both its 386DX and 386SX out of the gate, extending the lifespan of the architecture. In the following two years the AMD 386DX-40 saw popularity with small manufacturers of PC clones and with budget-minded computer enthusiasts because it offered near- 80486 performance at a much lower price than an actual 486. Generally

5858-483: The current state of the processor: Carry flag (CF), Parity flag (PF), Auxiliary carry flag (AF), Zero flag (ZF), Sign flag (SF), Trap flag (TF), Interrupt flag (IF), Direction flag (DF), and Overflow flag (OF). Also referred to as the status word, the layout of the flags register is as follows: There are also four 16-bit segment registers (see figure) that allow the 8086 CPU to access one megabyte of memory in an unusual way. Rather than concatenating

5959-569: The electronic and physical levels. Quite naturally, early compatible microprocessors were 16-bit, while 32-bit designs were developed much later. For the personal computer market, real quantities started to appear around 1990 with i386 and i486 compatible processors, often named similarly to Intel's original chips. After the fully pipelined i486 , in 1993 Intel introduced the Pentium brand name (which, unlike numbers, could be trademarked ) for their new set of superscalar x86 designs. With

6060-489: The equivalence of different segment:offset pairs. In practice the use of "huge" pointers and similar mechanisms was widespread and the flat 32-bit addressing made possible with the 32-bit offset registers in the 80386 eventually extended the limited addressing range in a more general way. The instruction stream is fetched from memory as words and is addressed internally by the processor to the byte level as necessary. An instruction stream queuing mechanism allows up to 6 bytes of

6161-401: The execution units with the decode steps opens up possibilities for more analysis of the (buffered) code stream, and therefore permits detection of operations that can be performed in parallel, simultaneously feeding more than one execution unit. The latest processors also do the opposite when appropriate; they combine certain x86 sequences (such as a compare followed by a conditional jump) into

SECTION 60

#1732772608566

6262-549: The first two actively produce modern 64-bit designs, leading to what has been called a "duopoly" of Intel and AMD in x86 processors. However, in 2014 the Shanghai-based Chinese company Zhaoxin , a joint venture between a Chinese company and VIA Technologies, began designing VIA based x86 processors for desktops and laptops. The release of its newest "7" family of x86 processors (e.g. KX-7000), which are not quite as fast as AMD or Intel chips but are still state of

6363-528: The formula: Addressing modes for 32-bit x86 processor modes can be summarized by the formula: Addressing modes for the 64-bit processor mode can be summarized by the formula: Instruction relative addressing in 64-bit code (RIP + displacement, where RIP is the instruction pointer register ) simplifies the implementation of position-independent code (as used in shared libraries in some operating systems). The 8086 had 64 KB of eight-bit (or alternatively 32 K-word of 16-bit ) I/O space, and

6464-399: The frequently occurring cases or contexts where a −128..127 range is enough. Typical instructions are therefore 2 or 3 bytes in length (although some are much longer, and some are single-byte). To further conserve encoding space, most registers are expressed in opcodes using three or four bits, the latter via an opcode prefix in 64-bit mode, while at most one operand to an instruction can be

6565-501: The implementation of position-independent code , used in shared libraries in some operating systems. SIMD registers XMM0–XMM15 (XMM0–XMM31 when AVX-512 is supported). SIMD registers YMM0–YMM15 (YMM0–YMM31 when AVX-512 is supported). Lower half of each of the YMM registers maps onto the corresponding XMM register. SIMD registers ZMM0–ZMM31. Lower half of each of the ZMM registers maps onto

6666-408: The instruction pointer (IP) points to the next instruction that will be fetched from memory and then executed; this register cannot be directly accessed (read or written) by a program. The Intel 80186 and 80188 are essentially an upgraded 8086 or 8088 CPU, respectively, with on-chip peripherals added, and they have the same CPU registers as the 8086 and 8088 (in addition to interface registers for

6767-420: The instruction set and high level architecture was ready after about three months, and as almost no CAD tools were used, four engineers and 12 layout people were simultaneously working on the chip. The 8086 took a little more than two years from idea to working product, which was considered fast for a complex design in the 1970s. The 8086 was sequenced using a mixture of random logic and microcode and

6868-523: The instruction set of the 80186 along with some (but not all) of the 80186 speed enhancements, providing a drop-in capability to upgrade both instruction set and processing speed without manufacturers having to modify their designs. Such relatively simple and low-power 8086-compatible processors in CMOS are still used in embedded systems. The electronics industry of the Soviet Union was able to replicate

6969-517: The instruction stream to be queued while waiting for decoding and execution. The queue acts as a First-In-First-Out (FIFO) buffer, from which the Execution Unit (EU) extracts instruction bytes as required. Whenever there is space for at least two bytes in the queue, the BIU will attempt a word fetch memory cycle. If the queue is empty (following a branch instruction, for example), the first byte into

7070-452: The instruction stream to the execution unit through a 6-byte prefetch queue (a form of loosely coupled pipelining ), speeding up operations on registers and immediates , while memory operations became slower (four years later, this performance problem was fixed with the 80186 and 80286 ). However, the full (instead of partial) 16-bit architecture with a full width ALU meant that 16-bit arithmetic instructions could now be performed with

7171-441: The introduction of the 80386 in 1985. A few years after the introduction of the 8086 and 8088, Intel added some complexity to its naming scheme and terminology as the "iAPX" of the ambitious but ill-fated Intel iAPX 432 processor was tried on the more successful 8086 family of chips, applied as a kind of system-level prefix. An 8086 system, including coprocessors such as 8087 and 8089 , and simpler Intel-specific system chips,

7272-520: The last versions in HMOS were specified for 10 MHz. HMOS-III and CMOS versions were manufactured for a long time (at least a while into the 1990s) for embedded systems , although its successor, the 80186 / 80188 (which includes some on-chip peripherals), has been more popular for embedded use. The 80C86, the CMOS version of the 8086, was used in the GRiDPad , Toshiba T1200 , HP 110 , and finally

7373-447: The lower 16 bits of the new 32-bit EAX register, SI corresponds to the lower 16 bits of ESI, and so on. The general-purpose registers, base registers, and index registers can all be used as the base in addressing modes, and all of those registers except for the stack pointer can be used as the index in addressing modes. Two new segment registers (FS and GS) were added. With a greater number of registers, instructions and operands,

7474-580: The manager for the project. The legacy of the 8086 is enduring in the basic instruction set of today's personal computers and servers; the 8086 also lent its last two digits to later extended versions of the design, such as the Intel 286 and the Intel 386 , all of which eventually became known as the x86 family. (Another reference is that the PCI Vendor ID for Intel devices is 8086 h .) All internal registers, as well as internal and external data buses, are 16 bits wide, which firmly established

7575-604: The market dominance of the "inelegant" x86 architecture designed directly from the first simple 8-bit microprocessors. Examples of this are the iAPX 432 (a project originally named the Intel 8800 ), the Intel 960 , Intel 860 and the Intel/Hewlett-Packard Itanium architecture. However, the continuous refinement of x86 microarchitectures , circuitry and semiconductor manufacturing would make it hard to replace x86 in many segments. AMD's 64-bit extension of x86 (which Intel eventually responded to with

7676-399: The mode. Changing the state of pin 33 changes the function of certain other pins, most of which have to do with how the CPU handles the (local) bus. The mode is usually hardwired into the circuit and therefore cannot be changed by software. The workings of these modes are described in terms of timing diagrams in Intel datasheets and manuals. In minimum mode, all control signals are generated by

7777-473: The name EM64T and finally using Intel 64. Microsoft and Sun Microsystems / Oracle also use term "x64", while many Linux distributions , and the BSDs also use the "amd64" term. Microsoft Windows, for example, designates its 32-bit versions as "x86" and 64-bit versions as "x64", while installation files of 64-bit Windows versions are required to be placed into a directory called "AMD64". In 2023, Intel proposed

7878-525: The on-chip FPU of the 486DX. This made the Am386DX a suboptimal choice for scientific applications and CAD using floating point intensive calculations. However, both were niche markets in the early 1990s and the chip sold well, first as a mid-range contender, and then as a budget chip. Although motherboards using the older 386 CPUs often had limited memory expansion possibilities and therefore struggled under Windows 95 's memory requirements, boards using

7979-454: The peripherals). The 8086, 8088, 80186, and 80188 can use an optional floating-point coprocessor, the 8087 . The 8087 appears to the programmer as part of the CPU and adds eight 80-bit wide registers, st(0) to st(7), each of which can hold numeric data in one of seven formats: 32-, 64-, or 80-bit floating point, 16-, 32-, or 64-bit (binary) integer, and 80-bit packed decimal integer. It also has its own 16-bit status register accessible through

8080-479: The queue immediately becomes available to the EU. Small programs could ignore the segmentation and just use plain 16-bit addressing. This allows 8-bit software to be quite easily ported to the 8086. The authors of most DOS implementations took advantage of this by providing an Application Programming Interface very similar to CP/M as well as including the simple .com executable file format, identical to CP/M. This

8181-441: The result is stored in one of the operands. At most one of the operands can be in memory, but this memory operand can also be the destination , while the other operand, the source , can be either register or immediate . A single memory location can also often be used as both source and destination which, among other factors, further contributes to a code density comparable to (and often better than) most eight-bit machines at

8282-418: The same order as given in the instruction stream. Some Intel CPUs ( Xeon Foster MP , some Pentium 4 , and some Nehalem and later Intel Core processors) and AMD CPUs (starting from Zen ) are also capable of simultaneous multithreading with two threads per core ( Xeon Phi has four threads per core). Some Intel CPUs support transactional memory ( TSX ). When introduced, in the mid-1990s, this method

8383-443: The same simplified segmentation as long mode. The x86 architecture is a variable instruction length, primarily " CISC " design with emphasis on backward compatibility . The instruction set is not typical CISC, however, but basically an extended version of the simple eight-bit 8008 and 8080 architectures. Byte-addressing is enabled and words are stored in memory with little-endian byte order. Memory access to unaligned addresses

8484-546: The segment register with the address register, as in most processors whose address space exceeds their register size, the 8086 shifts the 16-bit segment only four bits left before adding it to the 16-bit offset (16×segment + offset), therefore producing a 20-bit external (or effective or physical) address from the 32-bit segment:offset pair. As a result, each external address can be referred to by 2 = 4096 different segment:offset pairs. Although considered complicated and cumbersome by many programmers, this scheme also has advantages;

8585-555: The simple 8080 and 8085 , and the 8088 (used in the IBM PC) was additionally hampered by its narrower bus. The reasons why most memory related instructions were slow were threefold: However, memory access performance was drastically enhanced with Intel's next generation of 8086 family CPUs. The 80186 and 80286 both had dedicated address calculation hardware, saving many cycles, and the 80286 also had separate (non-multiplexed) address and data buses. The 8086/8088 could be connected to

8686-454: The stack. Much work has therefore been invested in making such accesses as fast as register accesses—i.e., a one cycle instruction throughput, in most circumstances where the accessed data is available in the top-level cache. A dedicated floating-point processor with 80-bit internal registers, the 8087 , was developed for the original 8086 . This microprocessor subsequently developed into the extended 80387 , and later processors incorporated

8787-452: The time such as the PDP-11 , VAX , 68000 , 32016 , etc. On the other hand, being more regular than the rather minimalistic but ubiquitous 8-bit microprocessors such as the 6502 , 6800 , 6809 , 8085 , MCS-48 , 8051 , and other contemporary accumulator-based machines, it is significantly easier to construct an efficient code generator for the 8086 architecture. Another factor for this

8888-411: The time. The degree of generality of most registers is much greater than in the 8080 or 8085. However, 8086 registers were more specialized than in most contemporary minicomputers and are also used implicitly by some instructions. While perfectly sensible for the assembly programmer, this makes register allocation for compilers more complicated compared to more orthogonal 16-bit and 32-bit processors of

8989-490: The x86 naming scheme now legally cleared, other x86 vendors had to choose different names for their x86-compatible products, and initially some chose to continue with variations of the numbering scheme: IBM partnered with Cyrix to produce the 5x86 and then the very efficient 6x86 (M1) and 6x86 MX ( MII ) lines of Cyrix designs, which were the first x86 microprocessors implementing register renaming to enable speculative execution . AMD meanwhile designed and manufactured

9090-484: The years, almost consistently with full backward compatibility . The architecture family has been implemented in processors from Intel, Cyrix , AMD , VIA Technologies and many other companies; there are also open implementations, such as the Zet SoC platform (currently inactive). Nevertheless, of those, only Intel, AMD, VIA Technologies, and DM&P Electronics hold x86 architectural licenses, and from these, only

9191-537: Was also affected by a few minor compatibility problems, the Nx586 lacked a floating-point unit (FPU) and (the then crucial) pin-compatibility, while the K5 had somewhat disappointing performance when it was (eventually) introduced. Customer ignorance of alternatives to the Pentium series further contributed to these designs being comparatively unsuccessful, despite the fact that the K5 had very good Pentium compatibility and

9292-436: Was considered very large for a microprocessor around 1976, the idea was dismissed. Also, there were not enough pins available on a low cost 40-pin package for the additional four address bus pins. In principle, the address space of the x86 series could have been extended in later processors by increasing the shift value, as long as applications obtained their segments from the operating system and did not make assumptions about

9393-485: Was eventually replaced by the depletion-load -based 8085 (1977), which used a single +5 V power supply instead of the three different operating voltages of earlier chips. Other well known 8-bit microprocessors that emerged during these years are Motorola 6800 (1974), General Instrument PIC16X (1975), MOS Technology 6502 (1975), Zilog Z80 (1976), and Motorola 6809 (1978). The 8086 project started in May 1976 and

9494-399: Was implemented using depletion-load nMOS circuitry with approximately 20,000 active transistors (29,000 counting all ROM and PLA sites). It was soon moved to a new refined nMOS manufacturing process called HMOS (for High performance MOS) that Intel originally developed for manufacturing of fast static RAM products. This was followed by HMOS-II, HMOS-III versions, and, eventually,

9595-423: Was important when the 8086 and MS-DOS were new, because it allowed many existing CP/M (and other) applications to be quickly made available, greatly easing acceptance of the new platform. The following 8086 assembly source code is for a subroutine named _strtolower that copies a null-terminated ASCIIZ character string from one location to another, converting all alphabetic characters to lower case. The string

9696-699: Was originally intended as a temporary substitute for the ambitious and delayed iAPX 432 project. It was an attempt to draw attention from the less-delayed 16-bit and 32-bit processors of other manufacturers — Motorola , Zilog , and National Semiconductor . Whereas the 8086 was a 16-bit microprocessor, it used the same microarchitecture as Intel's 8-bit microprocessors (8008, 8080, and 8085). This allowed assembly language programs written in 8-bit to seamlessly migrate . New instructions and features — such as signed integers, base+offset addressing, and self-repeating operations — were added. Instructions were added to assist source code compilation of nested functions in

9797-403: Was originally referred to as the i386 architecture (like its first implementation) but Intel later dubbed it IA-32 when introducing its (unrelated) IA-64 architecture. In 1999–2003, AMD extended this 32-bit architecture to 64 bits and referred to it as x86-64 in early documents and later as AMD64 . Intel soon adopted AMD's architectural extensions under the name IA-32e, later using

9898-428: Was released. The Intel 8088 , released July 1, 1979, is a slightly modified chip with an external 8-bit data bus (allowing the use of cheaper and fewer supporting ICs ), and is notable as the processor used in the original IBM PC design. The 8086 gave rise to the x86 architecture, which eventually became Intel's most successful line of processors. On June 5, 2018, Intel released a limited-edition CPU celebrating

9999-436: Was sometimes referred to as a "RISC core" or as "RISC translation", partly for marketing reasons, but also because these micro-operations share some properties with certain types of RISC instructions. However, traditional microcode (used since the 1950s) also inherently shares many of the same properties; the new method differs mainly in that the translation to micro-operations now occurs asynchronously. Not having to synchronize

10100-478: Was thereby described as an iAPX 86 system. There were also terms iRMX (for operating systems), iSBC (for single-board computers), and iSBX (for multimodule boards based on the 8086-architecture), all together under the heading Microsystem 80 . However, this naming scheme was quite temporary, lasting for a few years during the early 1980s. Although the 8086 was primarily developed for embedded systems and small multi-user or single-user computers, largely as

10201-416: Was up to 35% lower than with Intel's design and even lower than the 386SL's, making the AMD 386SX the ideal chip for both desktop and mobile computers. The SXL versions featured advanced power management functions and used even less power. Floating point performance of the Am386 could be boosted with the addition of a 80387DX or 80387SX coprocessor , although performance would still not approach that of

#565434