Intel X58

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license. Give it a read and then ask your questions in the chat. We can research this topic together.

The Intel X58 (codenamed Tylersburg) is an Intel chip designed to connect Intel processors with the Intel QuickPath Interconnect (QPI) interface to peripheral devices. Supported processors implement the Nehalem microarchitecture and therefore have an integrated memory controller (IMC), so the X58 does not have a memory interface. The initially supported processors were the Core i7, but the chip also supports Nehalem- and Westmere-based Xeon processors.

49-536: The QuickPath architecture differs considerably from earlier Intel architectures, and is much closer to AMD's HyperTransport architecture. Except for the lack of a memory interface, the X58 is similar to the traditional northbridge: it communicates with the processor(s) via the high-bandwidth QuickPath Interconnect, it communicates with the southbridge via the Direct Media Interface (DMI), and it communicates with high-bandwidth peripherals via PCI Express (PCIe). The X58

98-582: A backplane. Many of the computers were based on the First Draft of a Report on the EDVAC, published in 1945. In what became known as the Von Neumann architecture, a central control unit and arithmetic logic unit (ALU, which he called the central arithmetic part) were combined with computer memory and input and output functions to form a stored-program computer. The Report presented

147-468: A control bus to determine its operation. The technique was developed to reduce costs and improve modularity, and although popular in the 1970s and 1980s, more modern computers use a variety of separate buses adapted to more specific needs. The system level bus (as distinct from a CPU's internal datapath busses) connects the CPU to memory and I/O devices. Typically a system level bus is designed for use as

196-468: A bidirectional bus, often implemented as a three-state bus. To prevent bus contention on the address bus, a bus arbiter selects which particular bus master is allowed to drive the address bus during this bus cycle. Intel has used the term Dual Independent Bus (DIB) for two different purposes. The first one came when Intel changed from a single local bus to the DIB, using the external front-side bus to
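
To illustrate the arbitration step mentioned above, here is a minimal round-robin arbiter sketch (the class and master names are hypothetical, not from any real chipset):

```python
from itertools import cycle

class BusArbiter:
    """Toy round-robin arbiter: at most one bus master may drive the
    shared address bus in any given bus cycle, preventing bus contention."""
    def __init__(self, masters):
        self._masters = list(masters)
        self._order = cycle(self._masters)

    def grant(self, requests):
        """Return the next requesting master in round-robin order, or None."""
        for _ in range(len(self._masters)):
            candidate = next(self._order)
            if candidate in requests:
                return candidate
        return None

arbiter = BusArbiter(["cpu", "dma", "gpu"])
print(arbiter.grant({"dma", "gpu"}))  # exactly one master is granted the address bus
```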

245-531: A four-port, 1000 Mbit/s Ethernet router needs a maximum of 8000 Mbit/s of internal bandwidth (1000 Mbit/s × 4 ports × 2 directions)—HyperTransport greatly exceeds the bandwidth this application requires. However, a 4 + 1 port 10 Gbit/s router would require 100 Gbit/s of internal bandwidth. Add to that eight-antenna 802.11ac and the 60 GHz WiGig standard (802.11ad), and HyperTransport becomes more feasible (with anywhere between 20 and 24 lanes used for
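
A quick check of the router arithmetic above, as a minimal sketch (the function name and port lists are illustrative):

```python
def internal_bandwidth_mbps(port_rates_mbps):
    """Aggregate internal bandwidth a non-blocking router needs:
    every port may send and receive at line rate simultaneously,
    so each port counts twice (once per direction)."""
    return sum(rate * 2 for rate in port_rates_mbps)

# Four 1000 Mbit/s ports -> 8000 Mbit/s, well within HyperTransport's reach.
print(internal_bandwidth_mbps([1000] * 4))      # 8000
# A 4 + 1 port 10 Gbit/s router -> 100 000 Mbit/s (100 Gbit/s).
print(internal_bandwidth_mbps([10_000] * 5))    # 100000
```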

294-426: A general organization and theoretical model of the computer, however, not the implementation of that model. Soon designs integrated the control unit and ALU into what became known as the central processing unit (CPU). Computers in the 1950s and 1960s were generally constructed in an ad-hoc fashion. For example, the CPU, memory, and input/output units were each one or more cabinets connected by cables. Engineers used

343-500: A microprocessor using a HyperTransport interface was released by the HyperTransport Consortium. It is known as HyperTransport eXpansion (HTX). Using a reversed instance of the same mechanical connector as a 16-lane PCI Express slot (plus an ×1 connector for power pins), HTX allows development of plug-in cards that support direct access to a CPU and DMA to the system RAM. The initial card for this slot

392-467: A module that allows FPGAs to plug directly into the Opteron socket. AMD started an initiative named Torrenza on September 21, 2006, to further promote the usage of HyperTransport for plug-in cards and coprocessors . This initiative opened their "Socket F" to plug-in boards such as those from XtremeData and DRC. A connector specification that allows a slot-based peripheral to have direct connection to

441-403: A more complex motherboard layout but with fewer bottlenecks. HTX 3.1 at 26 GB/s can serve as a unified bus for as many as four DDR4 sticks running at the fastest proposed speeds. Beyond that, DDR4 RAM may require two or more HTX 3.1 buses, diminishing its value as a unified transport. HyperTransport comes in four versions—1.x, 2.0, 3.0, and 3.1—which run from 200 MHz to 3.2 GHz. It

490-525: A multi-socket motherboard or form a ring-like connection (processor 1 to X58 to processor 2 back to processor 1). When used with the Intel Core i7, the second QPI is usually unused (though, in principle, a second X58 might be daisy-chained on the board). When used with the "Gainestown" DP processor, which has two QPIs, the X58 and the two processors may be connected in a triangle or ring. For MP processors such as "Beckton" with more than two QPIs,

539-434: A packet always contains a command field. Many packets contain a 40-bit address. An additional 32-bit control packet is prepended when 64-bit addressing is required. The data payload is sent after the control packet. Transfers are always padded to a multiple of 32 bits, regardless of their actual length. HyperTransport packets enter the interconnect in segments known as bit times. The number of bit times required depends on
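
The missing dependency here is the link width: one bit time moves as many bits as the link is wide. A rough sketch of how the 32-bit-word framing and the link width combine (the two-word base header is an assumption of this model, not the exact HyperTransport packet format):

```python
def packet_bit_times(payload_bytes, link_width_bits, use_64bit_address=False):
    """Rough size of a HyperTransport request in 'bit times'.

    Simplified model based on the description above: a control packet made of
    32-bit words (assumed here to be 2 words, carrying the command and a 40-bit
    address), an optional extra 32-bit word for 64-bit addressing, and a data
    payload padded up to a multiple of 32 bits.  One bit time moves
    link_width_bits bits, so a 32-bit word costs 32 / link_width_bits bit times.
    """
    control_words = 2 + (1 if use_64bit_address else 0)  # assumed 2-word base header
    payload_words = -(-payload_bytes // 4)               # round payload up to 32-bit words
    return (control_words + payload_words) * 32 // link_width_bits

# A 64-byte cache-line write over an 8-bit link:
print(packet_bit_times(64, link_width_bits=8))   # (2 + 16) * 32 / 8 = 72 bit times
```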

588-594: A response from the receiver in the form of a "target done" response. Reads also require a response, containing the read data. HyperTransport supports the PCI consumer/producer ordering model. HyperTransport also facilitates power management as it is compliant with the Advanced Configuration and Power Interface specification. This means that changes in processor sleep states (C states) can signal changes in device states (D states), e.g. powering off disks when

637-602: A single system bus, starting with the S-100 bus in the Altair 8800 computer system in about 1975. The IBM PC used the Industry Standard Architecture (ISA) bus as its system bus in 1981. The passive backplanes of early models were replaced with the standard of putting the CPU and RAM on a motherboard , with only optional daughterboards or expansion cards in system bus slots. The Multibus became

686-460: A single system configuration as in one 16-bit link to another CPU and one 8-bit link to a peripheral device, which allows for a wider interconnect between CPUs , and a lower bandwidth interconnect to peripherals as appropriate. It also supports link splitting, where a single 16-bit link can be divided into two 8-bit links. The technology also typically has lower latency than other solutions due to its lower overhead. Electrically, HyperTransport

735-607: A standard of the Institute of Electrical and Electronics Engineers as IEEE standard 796 in 1983. Sun Microsystems developed the SBus in 1989 to support smaller expansion cards. The easiest way to implement symmetric multiprocessing was to plug in more than one CPU into the shared system bus, which was used through the 1980s. However, the shared bus quickly became the bottleneck and more sophisticated connection techniques were explored. Even in very simple systems, at various times

784-475: Is a bidirectional serial/parallel high-bandwidth, low-latency point-to-point link that was introduced on April 2, 2001. The HyperTransport Consortium is in charge of promoting and developing HyperTransport technology. HyperTransport is best known as the system bus architecture of AMD central processing units (CPUs) from the Athlon 64 through AMD FX and the associated motherboard chipsets. HyperTransport has also been used by IBM and Apple for

833-492: Is allowed to drive the data bus during this bus cycle. In very simple systems, every instruction cycle starts with a READ memory cycle where program memory drives the instruction onto the data bus while the instruction register latches that instruction from the data bus. Some instructions continue with a WRITE memory cycle where the memory data register drives data onto the data bus into the chosen RAM or I/O device. Other instructions continue with another READ memory cycle where
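
A toy model of these READ and WRITE memory cycles, with hypothetical names (real hardware uses the memory address register, memory data register, and an address decoder rather than method calls):

```python
class SimpleBus:
    """Toy single-master system bus: a flat memory behind one data bus.
    Only one device ever drives the data bus in a given cycle."""
    def __init__(self, memory_size=256):
        self.memory = [0] * memory_size   # program memory / RAM, flat for simplicity

    def read_cycle(self, address):
        """READ cycle: the address selects a location; the addressed device
        drives the data bus and the latching register captures the value."""
        return self.memory[address]

    def write_cycle(self, address, data):
        """WRITE cycle: the address selects a location; the memory data
        register drives the data bus and the addressed device latches it."""
        self.memory[address] = data

bus = SimpleBus()
bus.write_cycle(0x10, 0x42)          # store an operand
value = bus.read_cycle(0x10)         # fetch it back over the same bus
print(hex(value))                    # 0x42
```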

882-479: Is also a DDR or "double data rate" connection, meaning it sends data on both the rising and falling edges of the clock signal. This allows for a maximum data rate of 6400 MT/s when running at 3.2 GHz. The operating frequency is autonegotiated with the motherboard chipset (northbridge) in current computing. HyperTransport supports an autonegotiated bit width, ranging from 2 to 32 bits per link; there are two unidirectional links per HyperTransport bus. With

931-588: Is capable of 32-bit-wide links, that width is not currently utilized by any AMD processors. Some chipsets, though, do not even utilize the 16-bit width used by the processors. Those include the Nvidia nForce3 150, nForce3 Pro 150, and the ULi M1689, which use a 16-bit HyperTransport downstream link but limit the HyperTransport upstream link to 8 bits. There has been some marketing confusion between

980-610: Is defined to enable standardized functional test system interconnection. AMD Athlon 64, Athlon 64 FX, Athlon 64 X2, Athlon X2, Athlon II, Phenom, Phenom II, Sempron, Turion series and later use one 16-bit HyperTransport link. AMD Athlon 64 FX (1207) and Opteron use up to three 16-bit HyperTransport links. Common clock rates for these processor links are 800 MHz to 1 GHz (older single- and multi-socket systems on 754/939/940 links) and 1.6 GHz to 2.0 GHz (newer single-socket systems on AM2+/AM3 links—most newer CPUs using 2.0 GHz). While HyperTransport itself

1029-533: Is not a memory controller hub (MCH), because it has no memory interface, so Intel calls it an I/O hub. This should not be confused with the similar term I/O controller hub (ICH), which has traditionally been used to refer to the southbridge chips. Intel documentation now refers to the southbridge as the Legacy I/O Controller Hub. The X58 has 36 PCIe lanes, arranged as two ×16 links and a "spare" ×4 link, alongside the DMI link. When used with

1078-459: Is obsolete in modern personal computers and servers, which instead use higher-performance interconnection technologies such as HyperTransport and Intel QuickPath Interconnect, while the system bus architecture continues to be used on simpler embedded microprocessors. The system bus can even be internal to a single integrated circuit, producing a system-on-a-chip. Examples include AMBA, CoreConnect, and Wishbone. Direct Media Interface

1127-421: Is similar to low-voltage differential signaling (LVDS) operating at 1.2 V. HyperTransport 2.0 added post-cursor transmitter deemphasis. HyperTransport 3.0 added scrambling and receiver phase alignment, as well as optional transmitter precursor deemphasis. HyperTransport is packet-based, where each packet consists of a set of 32-bit words, regardless of the physical width of the link. The first word in

1176-427: Is still possible to run more than two video cards in an SLI configuration at reduced PCIe lane widths. The X58 chipset itself supports up to 36 PCI Express 2.0 lanes, so it is possible to have two PCIe ×16 slots and one PCIe ×4 slot on the same motherboard. HyperTransport (HT), formerly known as Lightning Data Transport, is a technology for interconnection of computer processors. It

1225-477: The EPYC server CPUs is a superset of HyperTransport. The HORUS interconnect from Newisys extends this concept to larger clusters. The Aqua device from 3Leaf Systems virtualizes and interconnects CPUs, memory, and I/O. HyperTransport can also be used as a bus in routers and switches. Routers and switches have multiple network interfaces and must forward data between these ports as fast as possible. For example,

1274-478: The ICH10 I/O Controller Hub with its ×4 DMI connection, the "spare" supports a separate ×4 PCIe connection. Future southbridge chips may support a wider DMI. Each X58 QuickPath Interconnect uses 21 unidirectional differential pairs in each direction, for a total of 84 pins per QPI. At the highest bandwidth, each QPI can transfer up to 12.8 GB/s of usable data in each direction simultaneously, using

1323-571: The Power Mac G5 machines, as well as a number of modern MIPS systems. The current specification, HTX 3.1, remained competitive in 2014 with high-speed DDR4 RAM (2666 and 3200 MT/s, or about 10.4 GB/s and 12.8 GB/s) and with slower technology (around 1 GB/s [1], similar to high-end PCIe SSDs and ULLtraDIMM flash RAM)—a wider range of RAM speeds on a common CPU bus than any Intel front-side bus. Intel technologies require each speed range of RAM to have its own interface, resulting in

1372-588: The Zen-based CPUs and Vega GPUs which were subsequently released in 2017. On Zen and Zen+ CPUs, the "SDF" data interconnects run at the same frequency as the DRAM memory clock (MEMCLK), a decision made to remove the latency caused by different clock speeds. As a result, using a faster RAM module makes the entire bus faster. The links are 32 bits wide, as in HT, but 8 transfers are done per cycle (128-bit packets) compared to

1421-517: The cache coherence of shared data located in different caches has to be sent in broadcast (snooped) to check the cache state of the CPUs on the other FSB, reducing the available bandwidth. To reduce the coherency traffic, a snoop filter was included in the higher-end chipsets, in order to have cache state information available on-chipset. In 2007 Intel extended the idea of multiple buses in the 7300 chipset with four independent FSBs, calling it dedicated high-speed interconnects (DHSI). The system bus approach

1470-501: The front-side bus in their Opteron, Athlon 64, Athlon II, Sempron 64, Turion 64, Phenom, Phenom II and FX families of microprocessors. Another use for HyperTransport is as an interconnect for NUMA multiprocessor computers. AMD used HyperTransport with a proprietary cache coherency extension as part of their Direct Connect Architecture in their Opteron and Athlon 64 FX (Dual Socket Direct Connect (DSDC) Architecture) line of processors. Infinity Fabric used with

1519-491: The CPU goes to sleep. HyperTransport 3.0 added further capabilities to allow a centralized power management controller to implement power management policies. The primary use for HyperTransport is to replace the Intel-defined front-side bus , which is different for every type of Intel processor. For instance, a Pentium cannot be plugged into a PCI Express bus directly, but must first go through an adapter to expand

1568-402: The DRAM, to allow the higher clock speeds that DDR5 is capable of. UALink will utilize Infinity Fabric as the primary shared-memory protocol. A system bus is a single computer bus that connects the major components of a computer system, combining the functions of a data bus to carry information, an address bus to determine where it should be sent or read from, and

1617-501: The QPI protocol. The protocol transfers information in 80-bit flits, which contain 8 bits of error correction, 8 bits of QPI routing information, and 64 bits of data. X58 PCIe ports support full PCIe 2.0 bandwidth (e.g., up to 8 GB/s per ×16 link, including overheads), and each ×16 link may be divided into a total of 16 lanes in any combination of ×8, ×4, ×2 or ×1 ports. They also support all features of line-reserved wiring, which means that in
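
As an aside on the QPI flit layout at the start of this fragment: the 12.8 GB/s "usable" figure quoted for QPI earlier follows from it, since only 64 of every 80 bits are payload. A small sketch, assuming a 6.4 GT/s QPI link in which 20 of the 21 differential pairs per direction carry data (treating the remaining pair as a forwarded clock is an assumption of this sketch):

```python
def qpi_usable_bandwidth_gbs(transfer_rate_gts=6.4, data_lanes=20,
                             flit_bits=80, data_bits_per_flit=64):
    """Usable QPI bandwidth per direction, derived from the flit layout above.

    Of every 80-bit flit, only 64 bits are payload; the remaining 16 bits
    carry CRC and routing information, so the raw link rate is scaled by 64/80.
    """
    raw_gbs = transfer_rate_gts * data_lanes / 8          # GB/s of raw bits
    return raw_gbs * data_bits_per_flit / flit_bits       # scale by flit efficiency

# 6.4 GT/s x 20 lanes = 16 GB/s raw, times 64/80 -> 12.8 GB/s usable per direction.
print(qpi_usable_bandwidth_gbs())   # 12.8
```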

1666-588: The X58 is either connected to two processors, which in turn are connected in a "mesh" of QPIs to other processors, or attached "in pairs" to two different processors. I/O for "remote" processors is relayed via the inter-processor QPI. X58 board manufacturers can build SLI-compatible Intel chipset boards by submitting their designs to Nvidia for validation. However, users wishing to run more than two Nvidia video cards in PCIe ×16 will still need to purchase motherboards equipped with one or more Nvidia nForce chipsets. It

1715-537: The advent of version 3.1, using full 32-bit links and the full HyperTransport 3.1 specification's operating frequency, the theoretical transfer rate is 25.6 GB/s (3.2 GHz × 2 transfers per clock cycle × 32 bits per link) per direction, or 51.2 GB/s aggregated throughput, making it faster than most existing bus standards for PC workstations and servers, as well as most bus standards for high-performance computing and networking. Links of various widths can be mixed together in
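
The 25.6 GB/s figure is just clock rate × transfers per clock × link width; a minimal helper (names chosen for this sketch):

```python
def ht_link_bandwidth_gbs(clock_ghz, link_width_bits, ddr=True):
    """Peak HyperTransport bandwidth per direction, in GB/s:
    clock rate x transfers per clock (2 for DDR) x link width, in bytes."""
    transfers_per_clock = 2 if ddr else 1
    return clock_ghz * transfers_per_clock * link_width_bits / 8

# HyperTransport 3.1, full 32-bit link at 3.2 GHz:
per_direction = ht_link_bandwidth_gbs(3.2, 32)   # 25.6 GB/s per direction
aggregate = 2 * per_direction                    # 51.2 GB/s over both unidirectional links
print(per_direction, aggregate)

# A typical 16-bit, 2.0 GHz link on an AM3 CPU: 8.0 GB/s per direction.
print(ht_link_bandwidth_gbs(2.0, 16))
```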

1764-407: The chosen RAM, program memory, or I/O device drives data onto the data bus while the memory data register latches that data from the data bus. More complex systems have a multi-master bus —not only do they have many devices that each drive the data bus, but also have many bus masters that each drive the address bus. The address bus as well as the data bus in bus snooping systems is required to be

1813-473: The combinations of (×16 + ×1/×8) slots often used on motherboards, not only ×1 or ×8 cards may be installed into the ×1/×8 slot, but ×4 cards should work as well (if not disallowed by the motherboard BIOS). Unlike the front-side bus (FSB), QPI is a point-to-point interface and supports not only the processor-chipset interface, but also processor-to-processor and chip-to-chip connections. The X58 has two QPIs and can directly connect to two processors on

1862-633: The common techniques of standardized bundles of wires and extended the concept as backplanes were used to hold printed circuit boards in these early machines. The name "bus" was already used for "bus bars" that carried electrical power to the various parts of electric machines, including early mechanical calculators. The advent of integrated circuits vastly reduced the size of each computer unit, and buses became more standardized. Standard modules could be interconnected in more uniform ways and were easier to develop and maintain. To provide even more modularity with reduced cost, memory and I/O buses (and

1911-447: The data bus is driven by the program memory, by RAM, and by I/O devices. To prevent bus contention on the data bus, at any one instant only one device drives the data bus. In very simple systems, only the data bus is required to be a bidirectional bus. In very simple systems, the memory address register always drives the address bus, the control unit always drives the control bus, and an address decoder selects which particular device

1960-478: The link width. HyperTransport also supports system management messaging, signaling interrupts, issuing probes to adjacent devices or processors, I/O transactions, and general data transactions. There are two kinds of write commands supported: posted and non-posted. Posted writes do not require a response from the target. This is usually used for high bandwidth devices such as uniform memory access traffic or direct memory access transfers. Non-posted writes require
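
Putting the posted/non-posted distinction and the response rules from the "target done" fragment earlier in this article side by side, as a sketch (the names below are illustrative, not taken from the HyperTransport specification):

```python
from enum import Enum

class Completion(Enum):
    NONE = "no response expected"
    TARGET_DONE = "receiver returns a 'target done' response"
    READ_RESPONSE = "receiver returns the read data"

# Simplified summary of the response rules described in the surrounding text.
RESPONSE_REQUIRED = {
    "posted write": Completion.NONE,            # fire-and-forget, e.g. DMA or UMA streams
    "non-posted write": Completion.TARGET_DONE,
    "read": Completion.READ_RESPONSE,
}

for command, completion in RESPONSE_REQUIRED.items():
    print(f"{command:>16}: {completion.value}")
```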

2009-576: The main system memory and I/O devices, and the internal back-side bus to the L2 CPU cache. This was introduced in the Pentium Pro in 1995. In 2005 and 2006, Intel introduced the 8500 and 5000 chipsets, where DIB referred to the two front-side buses on a chipset, which doubles the system bandwidth compared to having just one FSB shared by all the CPUs. However, the information needed to guarantee

2058-593: The needed bandwidth). The issue of latency and bandwidth between CPUs and co-processors has usually been the major stumbling block to their practical implementation. Co-processors such as FPGAs have appeared that can access the HyperTransport bus and become integrated on the motherboard. Current-generation FPGAs from both main manufacturers (Altera and Xilinx) directly support the HyperTransport interface and have IP cores available. Companies such as XtremeData, Inc. and DRC take these FPGAs (Xilinx in DRC's case) and create

2107-468: The original 2. Electrical changes are made for higher power efficiency. On Zen 2 and Zen 3 CPUs, the IF bus is on a separate clock, either in a 1:1 or a 2:1 ratio to the DRAM clock. This avoids a limitation on desktop platforms where maximum DRAM speeds were in practice limited by the IF speed. The bus width has also been doubled. On Zen 4 and later CPUs, the IF bus is able to run at an asynchronous clock to
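
A small sketch of the memory-clock-to-fabric-clock relationship described in this fragment (the function name and ratio labels are illustrative):

```python
def fabric_clock_mhz(dram_mts, ratio="1:1"):
    """Infinity Fabric clock implied by a DRAM transfer rate.

    DDR memory performs two transfers per clock, so MEMCLK = MT/s / 2.
    On Zen 2 and Zen 3 the fabric runs either 1:1 with MEMCLK or, for very
    fast DIMMs, at half the memory clock (the 2:1 mode mentioned above).
    """
    memclk = dram_mts / 2
    return memclk if ratio == "1:1" else memclk / 2

print(fabric_clock_mhz(3600))            # DDR4-3600 in 1:1 mode -> fabric at 1800 MHz
print(fabric_clock_mhz(5200, "2:1"))     # DDR4-5200 in 2:1 mode -> fabric at 1300 MHz
```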

2156-446: The required control and power buses) were sometimes combined into a single unified system bus. Modularity and cost became important as computers became small enough to fit in a single cabinet (and customers expected similar price reductions). Digital Equipment Corporation (DEC) further reduced cost for mass-produced minicomputers, and mapped I/O into the memory bus, so that the devices appeared to be memory locations. This

2205-495: The system. The proprietary front-side bus must connect through adapters for the various standard buses, like AGP or PCI Express. These are typically included in the respective controller functions, namely the northbridge and southbridge . In contrast, HyperTransport is an open specification, published by a multi-company consortium. A single HyperTransport adapter chip will work with a wide spectrum of HyperTransport enabled microprocessors. AMD used HyperTransport to replace

2254-401: The use of HT referring to HyperTransport and the later use of HT to refer to Intel's Hyper-Threading feature on some Pentium 4-based and the newer Nehalem- and Westmere-based Intel Core microprocessors. Hyper-Threading is officially known as Hyper-Threading Technology (HTT) or HT Technology. Because of this potential for confusion, the HyperTransport Consortium always uses

2303-513: The written-out form: "HyperTransport." Infinity Fabric (IF) is a superset of HyperTransport announced by AMD in 2016 as an interconnect for its GPUs and CPUs. It is also usable as an interchip interconnect for communication between CPUs and GPUs (for Heterogeneous System Architecture), an arrangement known as Infinity Architecture. The company said the Infinity Fabric would scale from 30 GB/s to 512 GB/s, and be used in

2352-594: Was implemented in the Unibus of the PDP-11 around 1969, eliminating the need for a separate I/O bus. Even computers such as the PDP-8 without memory-mapped I/O were soon implemented with a system bus, which allowed modules to be plugged into any slot. Some authors called this a new streamlined "model" of computer architecture. Many early microcomputers (with a CPU generally on a single integrated circuit) were built with

2401-531: Was the QLogic InfiniPath InfiniBand HCA. IBM and HP, among others, have released HTX-compliant systems. The original HTX standard is limited to 16 bits and 800 MHz. In August 2008, the HyperTransport Consortium released HTX3, which extends the clock rate of HTX to 2.6 GHz (5.2 GT/s) and retains backwards compatibility. The "DUT" test connector
