
HyperTransport

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

In computing and computer science, a processor or processing unit is an electrical component (digital circuit) that performs operations on an external data source, usually memory or some other data stream. It typically takes the form of a microprocessor, which can be implemented on a single or a few tightly integrated metal–oxide–semiconductor integrated circuit chips. In the past, processors were constructed using multiple individual vacuum tubes, multiple individual transistors, or multiple integrated circuits.


31-401: HyperTransport (HT), formerly known as Lightning Data Transport, is a technology for interconnection of computer processors. It is a bidirectional serial/parallel high-bandwidth, low-latency point-to-point link that was introduced on April 2, 2001. The HyperTransport Consortium is in charge of promoting and developing HyperTransport technology. HyperTransport is best known as

62-837: A control unit (CU), an arithmetic logic unit (ALU), and processor registers. In practice, CPUs in personal computers are usually also connected, through the motherboard, to a main memory bank, hard drive or other permanent storage, and peripherals, such as a keyboard and mouse. Graphics processing units (GPUs) are present in many computers and designed to efficiently perform computer graphics operations, including linear algebra. They are highly parallel, and CPUs usually perform better on tasks requiring serial processing. Although GPUs were originally intended for use in graphics, over time their application domains have expanded, and they have become an important piece of hardware for machine learning. There are several forms of processors specialized for machine learning. These fall under

93-545: A common CPU bus than any Intel front-side bus. Intel technologies require each speed range of RAM to have its own interface, resulting in a more complex motherboard layout but with fewer bottlenecks. HTX 3.1 at 26 GB/s can serve as a unified bus for as many as four DDR4 sticks running at the fastest proposed speeds. Beyond that, DDR4 RAM may require two or more HTX 3.1 buses, diminishing its value as unified transport. HyperTransport comes in four versions (1.x, 2.0, 3.0, and 3.1), which run from 200 MHz to 3.2 GHz. It

124-531: A four-port, 1000 Mbit/s Ethernet router needs a maximum of 8000 Mbit/s of internal bandwidth (1000 Mbit/s × 4 ports × 2 directions), which HyperTransport greatly exceeds. However, a 4 + 1 port 10 Gbit/s router would require 100 Gbit/s of internal bandwidth. Add to that 802.11ac with 8 antennas and the WiGig 60 GHz standard (802.11ad), and HyperTransport becomes more feasible (with anywhere between 20 and 24 lanes used for
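The router arithmetic above can be reproduced with a short sketch. The function name and its parameters are hypothetical, chosen only for illustration; the model simply multiplies port count, per-port rate, and the two traffic directions, as the text does.

```python
def required_internal_bandwidth_mbit(ports: int, port_rate_mbit: int,
                                     directions: int = 2) -> int:
    """Worst-case internal bandwidth in Mbit/s for a router.

    Illustrative model only: every port is assumed to be saturated
    in every direction at the same time.
    """
    return ports * port_rate_mbit * directions


# Four-port 1000 Mbit/s Ethernet router from the text.
gigabit_router = required_internal_bandwidth_mbit(ports=4, port_rate_mbit=1000)

# "4 + 1" port router at 10 Gbit/s (10,000 Mbit/s) per port.
ten_gig_router = required_internal_bandwidth_mbit(ports=5, port_rate_mbit=10_000)

print(gigabit_router)          # Mbit/s
print(ten_gig_router // 1000)  # Gbit/s
```

Under these assumptions the two results match the 8000 Mbit/s and 100 Gbit/s figures quoted in the text.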

155-500: A microprocessor using a HyperTransport interface was released by the HyperTransport Consortium. It is known as HyperTransport eXpansion (HTX). Using a reversed instance of the same mechanical connector as a 16-lane PCI Express slot (plus an x1 connector for power pins), HTX allows development of plug-in cards that support direct access to a CPU and DMA to the system RAM. The initial card for this slot

186-467: A module that allows FPGAs to plug directly into the Opteron socket. AMD started an initiative named Torrenza on September 21, 2006, to further promote the usage of HyperTransport for plug-in cards and coprocessors. This initiative opened their "Socket F" to plug-in boards such as those from XtremeData and DRC. A connector specification that allows a slot-based peripheral to have direct connection to

217-434: A packet always contains a command field. Many packets contain a 40-bit address. An additional 32-bit control packet is prepended when 64-bit addressing is required. The data payload is sent after the control packet. Transfers are always padded to a multiple of 32 bits, regardless of their actual length. HyperTransport packets enter the interconnect in segments known as bit times. The number of bit times required depends on
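As a rough illustration of the padding rule and of how the number of bit times depends on link width, the sketch below rounds a payload up to whole 32-bit words and divides by the link width. The function names are hypothetical, and the model is a simplification: it ignores control packets, CRC, and any other protocol overhead, and it assumes link widths of 2, 4, 8, 16, or 32 bits.

```python
WORD_BITS = 32                      # packets are built from 32-bit words
LINK_WIDTHS = (2, 4, 8, 16, 32)     # assumed set of negotiable link widths


def padded_bits(payload_bits: int) -> int:
    """Round a payload up to the next multiple of 32 bits."""
    if payload_bits < 0:
        raise ValueError("payload_bits must be non-negative")
    words = -(-payload_bits // WORD_BITS)   # ceiling division
    return words * WORD_BITS


def bit_times(payload_bits: int, link_width_bits: int) -> int:
    """Bit times needed to move a padded payload across one link.

    Each bit time carries `link_width_bits` bits, so narrower links
    need proportionally more bit times for the same payload.
    """
    if link_width_bits not in LINK_WIDTHS:
        raise ValueError(f"unsupported link width: {link_width_bits}")
    return padded_bits(payload_bits) // link_width_bits


# A 40-bit payload is padded to two 32-bit words (64 bits).
print(padded_bits(40))      # 64
print(bit_times(40, 8))     # 8 bit times on an 8-bit link
print(bit_times(40, 32))    # 2 bit times on a 32-bit link
```

Because every assumed link width divides 32 evenly, the padded length always maps to a whole number of bit times.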

248-460: A particular application domain during manufacturing. The Synergistic Processing Element or Unit (SPE or SPU) is a component in the Cell microprocessor. Processors based on different circuit technology have been developed. One example is quantum processors, which use quantum physics to enable algorithms that are impossible on classical computers (those using traditional circuitry). Another example

279-594: A response from the receiver in the form of a "target done" response. Reads also require a response, containing the read data. HyperTransport supports the PCI consumer/producer ordering model. HyperTransport also facilitates power management as it is compliant with the Advanced Configuration and Power Interface specification. This means that changes in processor sleep states (C states) can signal changes in device states (D states), e.g. powering off disks when

310-460: A single system configuration as in one 16-bit link to another CPU and one 8-bit link to a peripheral device, which allows for a wider interconnect between CPUs, and a lower bandwidth interconnect to peripherals as appropriate. It also supports link splitting, where a single 16-bit link can be divided into two 8-bit links. The technology also typically has lower latency than other solutions due to its lower overhead. Electrically, HyperTransport

341-479: Is also a DDR or "double data rate" connection, meaning it sends data on both the rising and falling edges of the clock signal. This allows for a maximum data rate of 6400 MT/s when running at 3.2 GHz. The operating frequency is autonegotiated with the motherboard chipset (North Bridge) in current computing. HyperTransport supports an autonegotiated bit width, ranging from 2 to 32 bits per link; there are two unidirectional links per HyperTransport bus. With


372-587: Is capable of 32-bit width links, that width is not currently utilized by any AMD processors. Some chipsets, though, do not even utilize the 16-bit width used by the processors. Those include the Nvidia nForce3 150, nForce3 Pro 150, and the ULi M1689, which use a 16-bit HyperTransport downstream link but limit the HyperTransport upstream link to 8 bits. There has been some marketing confusion between

403-610: Is defined to enable standardized functional test system interconnection. AMD Athlon 64, Athlon 64 FX, Athlon 64 X2, Athlon X2, Athlon II, Phenom, Phenom II, Sempron, Turion series and later use one 16-bit HyperTransport link. AMD Athlon 64 FX (1207) and Opteron use up to three 16-bit HyperTransport links. Common clock rates for these processor links are 800 MHz to 1 GHz (older single and multi socket systems on 754/939/940 links) and 1.6 GHz to 2.0 GHz (newer single socket systems on AM2+/AM3 links, with most newer CPUs using 2.0 GHz). While HyperTransport itself

434-420: Is similar to low-voltage differential signaling (LVDS) operating at 1.2 V. HyperTransport 2.0 added post-cursor transmitter deemphasis. HyperTransport 3.0 added scrambling and receiver phase alignment as well as optional transmitter precursor deemphasis. HyperTransport is packet-based, where each packet consists of a set of 32-bit words, regardless of the physical width of the link. The first word in

465-477: The EPYC server CPUs is a superset of HyperTransport. The HORUS interconnect from Newisys extends this concept to larger clusters. The Aqua device from 3Leaf Systems virtualizes and interconnects CPUs, memory, and I/O. HyperTransport can also be used as a bus in routers and switches. Routers and switches have multiple network interfaces, and must forward data between these ports as fast as possible. For example,

496-588: The Zen-based CPUs and Vega GPUs which were subsequently released in 2017. On Zen and Zen+ CPUs, the "SDF" data interconnects are run at the same frequency as the DRAM memory clock (MEMCLK), a decision made to remove the latency caused by different clock speeds. As a result, using a faster RAM module makes the entire bus faster. The links are 32 bits wide, as in HT, but 8 transfers are done per cycle (128-bit packets) compared to

527-501: The front-side bus in their Opteron, Athlon 64, Athlon II, Sempron 64, Turion 64, Phenom, Phenom II and FX families of microprocessors. Another use for HyperTransport is as an interconnect for NUMA multiprocessor computers. AMD used HyperTransport with a proprietary cache coherency extension as part of their Direct Connect Architecture in their Opteron and Athlon 64 FX (Dual Socket Direct Connect (DSDC) Architecture) line of processors. Infinity Fabric used with

558-410: The periodic table. Transistors made of a single sheet of silicon atoms one atom tall and other 2D materials have been researched for use in processors. Quantum processors have been created; they use quantum superposition to represent bits (called qubits) instead of only an on or off state. Moore's law, named after Gordon Moore, is the observation and projection via historical trend that

589-637: The system bus architecture of AMD central processing units (CPUs) from Athlon 64 through AMD FX and the associated motherboard chipsets. HyperTransport has also been used by IBM and Apple for the Power Mac G5 machines, as well as a number of modern MIPS systems. The current specification HTX 3.1 remained competitive for 2014 high-speed (2666 and 3200 MT/s or about 10.4 GB/s and 12.8 GB/s) DDR4 RAM and slower (around 1 GB/s [1], similar to high-end PCIe SSDs, ULLtraDIMM flash RAM) technology, a wider range of RAM speeds on

620-540: The CPU goes to sleep. HyperTransport 3.0 added further capabilities to allow a centralized power management controller to implement power management policies. The primary use for HyperTransport is to replace the Intel-defined front-side bus, which is different for every type of Intel processor. For instance, a Pentium cannot be plugged into a PCI Express bus directly, but must first go through an adapter to expand

651-651: The DRAM, to allow the higher clock speeds that DDR5 is capable of. UALink will utilize Infinity Fabric as the primary shared memory protocol. Processor (computing) The term is frequently used to refer to the central processing unit (CPU), the main processor in a system. However, it can also refer to other coprocessors, such as a graphics processing unit (GPU). Traditional processors are typically based on silicon; however, researchers have developed experimental processors based on alternative materials such as carbon nanotubes, graphene, diamond, and alloys made of elements from groups three and five of


682-537: The advent of version 3.1, using full 32-bit links and utilizing the full HyperTransport 3.1 specification's operating frequency, the theoretical transfer rate is 25.6 GB/s (3.2 GHz × 2 transfers per clock cycle × 32 bits per link) per direction, or 51.2 GB/s aggregated throughput, making it faster than most existing bus standards for PC workstations and servers as well as most bus standards for high-performance computing and networking. Links of various widths can be mixed together in
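The 25.6 GB/s figure follows directly from the three factors in the parenthesis. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and decimal gigabytes (10^9 bytes) are assumed.

```python
def ht_bandwidth_gb_per_s(clock_ghz: float, link_width_bits: int,
                          transfers_per_clock: int = 2) -> float:
    """Theoretical one-direction link bandwidth in GB/s (10^9 bytes/s).

    transfers_per_clock defaults to 2 because the link is double data
    rate: data moves on both the rising and falling clock edges.
    """
    transfers_per_second = clock_ghz * 1e9 * transfers_per_clock
    bytes_per_transfer = link_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# HyperTransport 3.1 at 3.2 GHz on a full 32-bit link.
per_direction = ht_bandwidth_gb_per_s(clock_ghz=3.2, link_width_bits=32)

# Two unidirectional links per bus give the aggregate figure.
aggregate = 2 * per_direction

print(f"{per_direction:.1f} GB/s per direction")
print(f"{aggregate:.1f} GB/s aggregate")
```

The same function can be called with a narrower width, for example 16 bits, to estimate the links actually used by the processors discussed in this article.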

713-809: The category of AI accelerators (also known as neural processing units, or NPUs) and include vision processing units (VPUs) and Google's Tensor Processing Unit (TPU). Sound chips and sound cards are used for generating and processing audio. Digital signal processors (DSPs) are designed for processing digital signals. Image signal processors are DSPs specialized for processing images in particular. Deep learning processors, such as neural processing units, are designed for efficient deep learning computation. Physics processing units (PPUs) are built to efficiently make physics-related calculations, particularly in video games. Field-programmable gate arrays (FPGAs) are specialized circuits that can be reconfigured for different purposes, rather than being locked into

744-478: The link width. HyperTransport also supports system management messaging, signaling interrupts, issuing probes to adjacent devices or processors, I/O transactions, and general data transactions. There are two kinds of write commands supported: posted and non-posted. Posted writes do not require a response from the target. This is usually used for high bandwidth devices such as uniform memory access traffic or direct memory access transfers. Non-posted writes require

775-593: The needed bandwidth). The issue of latency and bandwidth between CPUs and co-processors has usually been the major stumbling block to their practical implementation. Co-processors such as FPGAs have appeared that can access the HyperTransport bus and become integrated on the motherboard. Current generation FPGAs from both main manufacturers (Altera and Xilinx) directly support the HyperTransport interface, and have IP Cores available. Companies such as XtremeData, Inc. and DRC take these FPGAs (Xilinx in DRC's case) and create

806-440: The number of transistors in integrated circuits, and therefore processors by extension, doubles every two years. The progress of processors has followed Moore's law closely. Central processing units (CPUs) are the primary processors in most computers. They are designed to handle a wide variety of general computing tasks rather than only a few domain-specific tasks. If based on the von Neumann architecture, they contain at least

837-466: The original 2. Electrical changes are made for higher power efficiency. On Zen 2 and Zen 3 CPUs, the IF bus is on a separate clock, either in a 1:1 or a 2:1 ratio to the DRAM clock. This avoids a limitation on desktop platforms where maximum DRAM speeds were in practice limited by the IF speed. The bus width has also been doubled. On Zen 4 and later CPUs, the IF bus is able to run at an asynchronous clock to

868-495: The system. The proprietary front-side bus must connect through adapters for the various standard buses, like AGP or PCI Express. These are typically included in the respective controller functions, namely the northbridge and southbridge. In contrast, HyperTransport is an open specification, published by a multi-company consortium. A single HyperTransport adapter chip will work with a wide spectrum of HyperTransport enabled microprocessors. AMD used HyperTransport to replace

899-401: The use of HT referring to HyperTransport and the later use of HT to refer to Intel's Hyper-Threading feature on some Pentium 4-based and the newer Nehalem and Westmere-based Intel Core microprocessors. Hyper-Threading is officially known as Hyper-Threading Technology (HTT) or HT Technology. Because of this potential for confusion, the HyperTransport Consortium always uses

930-510: The written-out form: "HyperTransport." Infinity Fabric (IF) is a superset of HyperTransport announced by AMD in 2016 as an interconnect for its GPUs and CPUs. It is also usable as interchip interconnect for communication between CPUs and GPUs (for Heterogeneous System Architecture), an arrangement known as Infinity Architecture. The company said the Infinity Fabric would scale from 30 GB/s to 512 GB/s, and be used in

961-529: Was the QLogic InfiniPath InfiniBand HCA. IBM and HP, among others, have released HTX compliant systems. The original HTX standard is limited to 16 bits and 800 MHz. In August 2008, the HyperTransport Consortium released HTX3, which extends the clock rate of HTX to 2.6 GHz (5.2 GT/s) and retains backwards compatibility. The "DUT" test connector
