Misplaced Pages

Cray XK7

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

The Cray XK7 is a supercomputing platform produced by Cray and launched on October 29, 2012. It is the second platform from Cray to use a combination of central processing units (CPUs) and graphics processing units (GPUs) for computing; the hybrid architecture requires a different approach to programming from that of CPU-only supercomputers. Laboratories that host XK7 machines run workshops to train researchers in the new programming languages needed for XK7 machines. The platform is used in Titan, the world's second-fastest supercomputer on the November 2013 list as ranked by the TOP500 organization. Other customers include the Swiss National Supercomputing Centre, which has a 272-node machine, and the National Center for Supercomputing Applications, whose Blue Waters machine combines Cray XE6 and XK7 nodes and performs at approximately 1 petaFLOPS (10^15 floating-point operations per second).
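The capacity and benchmark figures quoted in this article can be checked with a little arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, and the numbers are the ones stated in the article (500 cabinets, 24 blades per cabinet, 4 nodes per blade, and the LINPACK versus theoretical peak figures for Titan and Todi).

```python
# Illustrative arithmetic only; names are hypothetical and the figures
# are the ones quoted in this article.
MAX_CABINETS = 500        # stated scaling limit of the XK7 platform
BLADES_PER_CABINET = 24
NODES_PER_BLADE = 4       # each node pairs 1 CPU with 1 GPU


def max_nodes(cabinets: int = MAX_CABINETS) -> int:
    """Number of nodes in a system with the given number of full cabinets."""
    return cabinets * BLADES_PER_CABINET * NODES_PER_BLADE


def linpack_efficiency(measured: float, peak: float) -> float:
    """Fraction of theoretical peak achieved in the LINPACK benchmark.

    Both arguments must use the same unit (e.g. both in petaFLOPS).
    """
    return measured / peak


if __name__ == "__main__":
    print("Nodes in a full 500-cabinet system:", max_nodes())
    # Titan: 17.59 petaFLOPS measured, 27.1 petaFLOPS theoretical peak
    print(f"Titan LINPACK efficiency: {linpack_efficiency(17.59, 27.1):.1%}")
    # Todi: 274 teraFLOPS measured, 393 teraFLOPS theoretical peak
    print(f"Todi LINPACK efficiency: {linpack_efficiency(274, 393):.1%}")
```

Under these figures a fully built-out system would hold 48,000 nodes, and both Titan and Todi reach roughly two thirds of their theoretical peak in LINPACK.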


47-570: XK7 is scalable up to 500 cabinets, each containing 24 blades; each blade contains 4 nodes (1 CPU and 1 GPU per node). The CPUs available are of the 16-core AMD Opteron 6200 Interlagos series and the GPUs are of the Nvidia Tesla K20 Kepler series. Each CPU can be paired with either 16 or 32 GB of error-correcting code (ECC) memory, while the GPUs have either 5 or 6 GB of ECC memory depending on

94-464: A motherboard – the dimensions, power supply type, location of mounting holes, number of ports on the back panel, etc. Specifically, in the IBM PC compatible industry, standard form factors ensure that parts are interchangeable across competing vendors and generations of technology, while in enterprise computing, form factors ensure that server modules fit into existing rackmount systems. Traditionally,

141-530: A storage area network (SAN) allows for an entirely disk-free blade, an example of which implementation is the Intel Modular Server System . Since blade enclosures provide a standard method for delivering basic services to computer devices, other types of devices can also utilize blade enclosures. Blades providing switching, routing, storage, SAN and fibre-channel access can slot into the enclosure to provide these services to all members of

188-757: A K20X GPU with 6 GB. The computer has a theoretical peak performance of 27.1 petaFLOPS but in the LINPACK benchmark used by the TOP500 organisation to rank supercomputers it performed at 17.59 petaFLOPS, enough to take first place on the November 2012 list. Titan uses 8.2 MW of electricity and is third on the Green500 list which ranks supercomputers by their energy efficiency. The National Center for Supercomputing Applications (NCSA) in Illinois has

235-416: A complete server, with its operating system and applications, on a single card/board/blade. These blades could then operate independently within a common chassis, doing the work of multiple separate server boxes more efficiently. In addition to the most obvious benefit of this packaging (less space consumption), additional efficiency benefits have become clear in power, cooling, management, and networking due to

282-446: A computer to be used in a multimedia system may need to be optimized for heat and size, with additional plug-in cards being less common. The smallest motherboards may sacrifice CPU flexibility in favor of a fixed manufacturer's choice. The E-ATX form factor is not standardized and may vary according to the motherboard manufacturer. Processor is placed closest to the fan. May contain a CNR board. (6.89 × 9.65 in) List

329-454: A full PC on them, including application oriented interfaces like audio, analog, or digital I/O in many cases. Also it's much easier to fit Pentium CPUs, whereas it's a tight squeeze (or expensive) to do so on a PC/104 SBC. Typically, EBX SBCs contain: the CPU; upgradeable RAM subassemblies (e.g., DIMM); Flash memory for solid state drive; multiple USB, serial, and parallel ports; onboard expansion via

376-429: A machine, Blue Waters, using a combination of Cray XE6 and XK7 nodes. The machine has 3072 XK7 nodes and 22,752 XE6 nodes. Each XE6 node has two Opteron 6276 CPUs and 32 GB of memory per CPU. The XK7 nodes also have Opteron 6276 CPUs with 32 GB of memory and a K20X GPU with 6 GB. Blue Waters has performed at over 1 petaFLOPS in benchmarks; however, the project managers do not believe in

423-605: A range of programming languages. The hybrid architecture requires different programming to conventional CPU-only supercomputers; Oak Ridge National Laboratory and the Swiss National Supercomputing Centre hold workshops to educate researchers on the new programming approach. The XK7 platform was announced on October 29, 2012 to coincide with the completion of Titan at Oak Ridge National Laboratory (ORNL). Titan has 18,688 XK7 nodes, each containing an Opteron 6274 CPU with 32 GB of memory and

470-533: A single function with a small real-time executive . The VMEbus architecture ( c.  1981 ) defined a computer interface that included implementation of a board-level computer installed in a chassis backplane with multiple slots for pluggable boards to provide I/O, memory, or additional computing. In the 1990s, the PCI Industrial Computer Manufacturers Group PICMG developed a chassis/blade structure for

517-602: Is 42U high, which limits the number of discrete computer devices directly mountable in a rack to 42 components. Blades do not have this limitation. As of 2014 , densities of up to 180 servers per blade system (or 1440 servers per rack) are achievable with blade systems. The enclosure (or chassis) performs many of the non-core computing services found in most computers. Non-blade systems typically use bulky, hot and space-inefficient components, and may duplicate these across many computers that may or may not perform at capacity. By locating these services in one place and sharing them among


564-548: Is a slower process, form factors do evolve regularly in response to changing demands. IBM's long-standing standard, AT (Advanced Technology), was superseded in 1995 by the current industry standard ATX (Advanced Technology Extended), which still governs the size and design of the motherboard in most modern PCs. The latest update to the ATX standard was released in 2007. A divergent standard by chipset manufacturer VIA called EPIA (also known as ITX, and not to be confused with EPIC)

611-507: Is based upon smaller form factors and its own standards. Differences between form factors are most apparent in terms of their intended market sector, and involve variations in size, design compromises and typical features. Most modern computers have very similar requirements, so form factor differences tend to be based upon subsets and supersets of these. For example, a desktop computer may require more sockets for maximum flexibility and many optional connectors and other features on board, whereas

658-416: Is because one can fit up to 128 blade servers in the same rack that will only hold 42 1U rack-mount servers. Blade servers generally include integrated or optional network interface controllers for Ethernet or host adapters for Fibre Channel storage systems or converged network adapter to combine storage and data via one Fibre Channel over Ethernet interface. In many blades, at least one interface

705-646: Is embedded on the motherboard and extra interfaces can be added using mezzanine cards. A blade enclosure can provide individual external ports to which each network interface on a blade will connect. Alternatively, a blade enclosure can aggregate network interfaces into interconnect devices (such as switches) built into the blade enclosure or in networking blades. While computers typically use hard disks to store operating systems, applications and data, these are not necessarily required locally. Many storage connection methods (e.g. FireWire, SATA, E-SATA, SCSI, SAS DAS, FC and iSCSI) are readily moved outside

752-412: Is incomplete ATX case compatible: PC/104 is an embedded computer standard which defines both a form factor and computer bus. PC/104 is intended for embedded computing environments. Single-board computers built to this form factor are often sold by COTS vendors, which benefits users who want a customized rugged system without months of design and paperwork. The PC/104 form factor was standardized by

799-521: The heating, ventilation, and air conditioning problems that affect large conventional server farms. Developers first placed complete microcomputers on cards and packaged them in standard 19-inch racks in the 1970s, soon after the introduction of 8-bit microprocessors . This architecture was used in the industrial process control industry as an alternative to minicomputer -based control systems. Early models stored programs in EPROM and were limited to

846-831: The Networld+Interop show in May 2000. Patents were awarded for the Ketris blade server architecture . In October 2000 Ziatech was acquired by Intel Corp and the Ketris Blade Server systems would become a product of the Intel Network Products Group. PICMG expanded the CompactPCI specification with the use of standard Ethernet connectivity between boards across the backplane. The PICMG 2.16 CompactPCI Packet Switching Backplane specification

893-447: The November 2012 TOP500 list taking 91st place. Todi consumes 122 kW and is ranked fourth, one behind Titan, on the November 2012 Green500 list. Blade server A blade server is a stripped-down server computer with a modular design optimized to minimize the use of physical space and energy. Blade servers have many components removed to save space, minimize power consumption and other considerations, while still having all

940-593: The PC/104 Consortium in 1992. An IEEE standard corresponding to PC/104 was drafted as IEEE P996.1, but never ratified. The 5.75 × 8.0 in Embedded Board eXpandable (EBX) specification, which was derived from Ampro's proprietary Little Board form-factor, resulted from a collaboration between Ampro and Motorola Computer Group . As compared with PC/104 modules, these larger (but still reasonably embeddable) SBCs tend to have everything of

987-505: The ability to provision servers (power up, install operating systems and application software), e.g. Web servers, remotely from a Network Operations Center (NOC). When it was announced, the system architecture was called Ketris, named after the Ketri Sword, worn by nomads in such a way as to be drawn very quickly as needed. First envisioned by Dave Bottom and developed by an engineering team at Ziatech Corp in 1999 and demonstrated at


1034-417: The backplane (where server blades would plug in), eliminating more than 160 cables in a single 84 Rack Unit high 19" rack. For a large data center, tens of thousands of Ethernet cables, prone to failure, would be eliminated. Further, this architecture provided the capability to inventory the modules installed in each system chassis remotely, without the blade servers operating. This architecture enabled

1081-404: The blade computers, the overall utilization becomes higher. The specifics of which services are provided varies by vendor. Computers operate over a range of DC voltages, but utilities deliver power as AC , and at higher voltages than required within computers. Converting this current requires one or more power supply units (or PSUs). To ensure that the failure of one power source does not affect

1128-420: The blade itself, and in the blade system as a whole. In a standard server-rack configuration, one rack unit or 1U —19 inches (480 mm) wide and 1.75 inches (44 mm) tall—defines the minimum possible size of any equipment. The principal benefit and justification of blade computing relates to lifting this restriction so as to reduce size requirements. The most common computer rack form-factor

1175-469: The c3000 which holds up to 8 half-height ProLiant line blades (also available in tower form), and the c7000 ( 10U ) which holds up to 16 half-height ProLiant blades. Dell 's product, the M1000e is a 10U modular enclosure and holds up to 16 half-height PowerEdge blade servers or 32 quarter-height blades. Motherboard form factor In computing , the motherboard form factor is the specification of

1222-454: The emerging Internet Data Centers, where the manpower simply didn't exist to keep pace, a new server architecture was needed. In 1998 and 1999 this new Blade Server Architecture was developed at Ziatech based on their CompactPCI platform to house as many as 14 "blade servers" in a standard 19" 9U high rack-mounted chassis, allowing in this configuration as many as 84 servers in a standard 84 Rack Unit 19" rack. What this new architecture brought to

1269-570: The enclosure. Systems administrators can use storage blades where a requirement exists for additional local storage. Blade servers function well for specific purposes such as web hosting , virtualization , and cluster computing . Individual blades are typically hot-swappable . As users deal with larger and more diverse workloads, they add more processing power, memory and I/O bandwidth to blade servers. Although blade-server technology in theory allows for open, cross-vendor systems, most users buy modules, enclosures, racks and management tools from

1316-444: The functional components to be considered a computer . Unlike a rack-mount server, a blade server fits inside a blade enclosure , which can hold multiple blade servers, providing services such as power, cooling, networking, various interconnects and management. Together, blades and the blade enclosure form a blade system, which may itself be rack-mounted. Different blade providers have differing principles regarding what to include in

1363-602: The introduction of AGP and, more recently, PCI Express have influenced motherboard design. However, the standardized size and layout of motherboards have changed much more slowly and are controlled by their own standards. The list of components required on a motherboard changes far more slowly than the components themselves. For example, north bridge microchips have changed many times since their introduction with many manufacturers bringing out their own versions, but in terms of form factor standards, provisions for north bridges have remained fairly static for many years. Although it

1410-449: The latter sold its x86 server business to Lenovo in 2014 after selling its consumer PC line to Lenovo in 2005. In 2009, Cisco announced blades in its Unified Computing System product line, consisting of 6U high chassis, up to 8 blade servers in each chassis. It had a heavily modified Nexus 5K switch, rebranded as a fabric interconnect, and management software for the whole system. HP's initial line consisted of two chassis models,

1457-651: The model of GPU used. The nodes communicate with each other via the Gemini Interconnect; each Gemini chip services 2 nodes with a capacity of 160 GB/s. Depending on the components used, a full cabinet will consume between 45 and 54.1 kW of electricity, which is converted into heat; thus the cabinets need cooling, either by air or water. XK7-based machines run the Cray Linux Environment, which incorporates SUSE Linux Enterprise Server. Code to run on an XK7 machine can be written in


1504-441: The most significant specification is for that of the motherboard, which generally dictates the overall size of the case . Small form factors have been developed and implemented. A PC motherboard is the main circuit board within a typical desktop computer , laptop or server . Its main functions are as follows: As new generations of components have been developed, the standards of motherboards have changed too. For example,

1551-487: The number of PSUs required to provide a resilient power supply. The popularity of blade servers, and their own appetite for power, has led to an increase in the number of rack-mountable uninterruptible power supply (or UPS) units, including units targeted specifically towards blade servers (such as the BladeUPS ). During operation, electrical and mechanical components produce heat, which a system must dissipate to ensure

1598-411: The operation of the computer, even entry-level servers often have redundant power supplies, again adding to the bulk and heat output of the design. The blade enclosure's power supply provides a single power source for all blades within the enclosure. This single power source may come as a power supply in the enclosure or as a dedicated separate PSU supplying DC to multiple enclosures. This setup reduces

1645-447: The pooling or sharing of common infrastructure to support the entire chassis, rather than providing each of these on a per-server-box basis. In 2011, research firm IDC identified the major players in the blade market as HP, IBM, Cisco, and Dell. Other companies selling blade servers include Supermicro and Hitachi. The prominent brands in the blade server market are Supermicro, Cisco Systems, HPE, Dell and IBM, though

1692-557: The proper functioning of its components. Most blade enclosures, like most computing systems, remove heat by using fans . A frequently underestimated problem when designing high-performance computer systems involves the conflict between the amount of heat a system generates and the ability of its fans to remove the heat. The blade's shared power and cooling means that it does not generate as much heat as traditional servers. Newer blade-enclosures feature variable-speed fans and control logic, or even liquid cooling systems that adjust to meet

1739-525: The real-world implementation in Internet Data Centers, where thermal as well as other maintenance and operating costs had become prohibitively expensive, this blade server architecture, with remote automated provisioning, health and performance monitoring, and management, would have a significantly lower operating cost. The first commercialized blade-server architecture was invented by Christopher Hipp and David Kirkeby, and their patent

1786-504: The relevance of the LINPACK benchmark used by the TOP500 organisation and therefore did not submit a benchmark test for ranking. The Swiss National Supercomputing Centre (CSCS) machine named Todi was upgraded to XK7 on October 22, 2012. Todi has 272 nodes with Opteron 6272 CPUs with 32 GB of memory and a K20X GPU with 6 GB. Todi has a theoretical peak performance of 393 teraFLOPS and performed at 274 teraFLOPS in

1833-589: The same vendor. Eventual standardization of the technology might result in more choices for consumers; as of 2009 increasing numbers of third-party software vendors have started to enter this growing field. Blade servers do not, however, provide the answer to every computing problem. One can view them as a form of productized server-farm that borrows from mainframe packaging, cooling, and power-supply technology. Very large computing tasks may still require server farms of blade servers, and because of blade servers' high power density, can suffer even more acutely from

1880-413: The server, though not all are used in enterprise-level installations. Implementing these connection interfaces within the computer presents similar challenges to the networking interfaces (indeed iSCSI runs over the network interface), and similarly these can be removed from the blade and presented individually or aggregated either on the chassis or through other blades . The ability to boot the blade from

1927-414: The system's cooling requirements. At the same time, the increased density of blade-server configurations can still result in higher overall demands for cooling with racks populated at over 50% full. This is especially true with early-generation blades. In absolute terms, a fully populated rack of blade servers is likely to require more cooling capacity than a fully populated rack of standard 1U servers. This


1974-492: The table was a set of new interfaces to the hardware specifically to provide the capability to remotely monitor the health and performance of all major replaceable modules that could be changed/replaced while the system was in operation. The ability to change/replace or add modules within the system while it is in operation is known as Hot-Swap. Unlike any other server system, the Ketris blade servers routed Ethernet across

2021-440: The telecom industry's need for a high-availability and dense computing platform with extended product life (10+ years). While AdvancedTCA systems and boards typically sell for higher prices than blade servers, their operating costs (manpower to manage and maintain) are dramatically lower, and operating costs often dwarf the acquisition cost for traditional servers. AdvancedTCA promote them for telecommunications customers, however in

2068-420: The then emerging Peripheral Component Interconnect bus PCI called CompactPCI . CompactPCI was actually invented by Ziatech Corp of San Luis Obispo, CA and developed into an industry standard. Common among these chassis-based computers was the fact that the entire chassis was a single system. While a chassis might include multiple computing elements to provide the desired level of performance and redundancy, there

2115-537: Was adopted in Sept 2001. This provided the first open architecture for a multi-server chassis. The second generation of Ketris would be developed at Intel as an architecture for the telecommunications industry to support the build-out of IP-based telecom services and in particular the LTE (Long Term Evolution) cellular network build-out. PICMG followed with this larger and more feature-rich AdvancedTCA specification, targeting

2162-443: Was always one master board in charge, or two redundant fail-over masters coordinating the operation of the entire system. Moreover, this system architecture provided management capabilities not present in typical rack mount computers, much more like in ultra-high reliability systems, managing power supplies, cooling fans as well as monitoring health of other internal components. Demands for managing hundreds and thousands of servers in

2209-456: Was assigned to Houston-based RLX Technologies . RLX, which consisted primarily of former Compaq Computer Corporation employees, including Hipp and Kirkeby, shipped its first commercial blade server in 2001. RLX was acquired by Hewlett-Packard in 2005. The name blade server appeared when a card included the processor, memory, I/O and non-volatile program storage ( flash memory or small hard disk (s)). This allowed manufacturers to package
