
Worldwide LHC Computing Grid

Article snapshot taken from Wikipedia, licensed under the Creative Commons Attribution-ShareAlike license.

The Worldwide LHC Computing Grid (WLCG), formerly (until 2006) the LHC Computing Grid (LCG), is an international collaborative project that consists of a grid-based computer network infrastructure incorporating over 170 computing centers in 42 countries, as of 2017. It was designed by CERN to handle the prodigious volume of data produced by Large Hadron Collider (LHC) experiments.


88-574: By 2012, data from over 300 trillion (3×10¹⁴) LHC proton-proton collisions had been analyzed, and LHC collision data was being produced at approximately 25 petabytes per year. As of 2017 the LHC Computing Grid is the world's largest computing grid, comprising over 170 computing facilities in a worldwide network across 42 countries that together form a massive distributed computing infrastructure with about 1,000,000 CPU cores, providing more than 10,000 physicists around

176-491: A real-time environment and fail if an operation is not completed in a specified amount of time. For example, computer-controlled anti-lock brakes must begin braking within a predictable and limited time period after the brake pedal is sensed or else failure of the brake will occur. Benchmarking takes all these factors into account by measuring the time a computer takes to run through a series of test programs. Although benchmarking shows strengths, it should not be how you choose

264-400: A 60-bit word without having to split a byte between one word and the next. If longer bytes were needed, 60 bits would, of course, no longer be ideal. With present applications, 1, 4, and 6 bits are the really important cases.     With 64-bit words, it would often be necessary to make some compromises, such as leaving 4 bits unused in a word when dealing with 6-bit bytes at

352-467: A 64-bit word length for Stretch. It also supports NSA's requirement for 8-bit bytes. Werner's term "Byte" was first popularized in this memo.     NB. This timeline erroneously specifies the birth date of the term "byte" as July 1956, while Buchholz actually used the term as early as June 1956.     [...] 60 is a multiple of 1, 2, 3, 4, 5, and 6. Hence bytes of length from 1 to 6 bits can be packed efficiently into

440-505: A Stretch designer, opened Chapter 2 of a book called Planning a Computer System: Project Stretch by stating, "Computer architecture, like other architecture, is the art of determining the needs of the user of a structure and then designing to meet those needs as effectively as possible within economic and technological constraints." Brooks went on to help develop the IBM System/360 line of computers, in which "architecture" became

528-465: A birth certificate. But I am sure that "byte" is coming of age in 1977 with its 21st birthday.     Many have assumed that byte, meaning 8 bits, originated with the IBM System/360, which spread such bytes far and wide in the mid-1960s. The editor is correct in pointing out that the term goes back to the earlier Stretch computer (but incorrect in that Stretch was the first, not

616-415: A computer capable of running a virtual machine needs virtual memory hardware so that the memory of different virtual computers can be kept separated. Computer organization and features also affect power consumption and processor cost. Once an instruction set and microarchitecture have been designed, a practical machine must be developed. This design process is called the implementation . Implementation

704-490: A computer system. The case of instruction set architecture can be used to illustrate the balance of these competing factors. More complex instruction sets enable programmers to write more space efficient programs, since a single instruction can encode some higher-level abstraction (such as the x86 Loop instruction ). However, longer and more complex instructions take longer for the processor to decode and can be more costly to implement effectively. The increased complexity from

792-427: A computer. Often the measured machines split on different measures. For example, one system might handle scientific applications quickly, while another might render video games more smoothly. Furthermore, designers may target and add special features to their products, through hardware or software, that permit a specific benchmark to execute quickly but do not offer similar advantages to general tasks. Power efficiency

880-476: A convenience, because 1024 is approximately 1000. This definition was popular in early decades of personal computing, with products like the Tandon 5¼-inch DD floppy format (holding 368 640 bytes) being advertised as "360 KB", following the 1024-byte convention. It was not universal, however. The Shugart SA-400 5¼-inch floppy disk held 109,375 bytes unformatted, and
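
A quick arithmetic sketch in Python, using the figures quoted above, shows how the same raw capacity yields different advertised numbers depending on whether a kilobyte is taken as 1024 or 1000 bytes:

    # Raw capacity of the Tandon 5¼-inch DD floppy mentioned above
    raw_bytes = 368_640

    print(raw_bytes / 1024)   # 360.0  -> advertised as "360 KB" under the 1024-byte convention
    print(raw_bytes / 1000)   # 368.64 -> roughly "369 KB" under the decimal (SI) convention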

968-469: A detailed analysis of the computer's organization. For example, in an SD card , the designers might need to arrange the card so that the most data can be processed in the fastest possible way. Computer organization also helps plan the selection of a processor for a particular project. Multimedia projects may need very rapid data access, while virtual machines may need fast interrupts. Sometimes certain tasks need additional components as well. For example,


1056-484: A full transmission unit usually additionally includes a start bit, 1 or 2 stop bits, and possibly a parity bit, and thus its size may vary from seven to twelve bits for five to eight bits of actual data. For synchronous communication the error checking usually uses bytes at the end of a frame.     Terms used here to describe the structure imposed by the machine design, in addition to bit, are listed below.      Byte denotes

1144-475: A group of bits used to encode a character, or the number of bits transmitted in parallel to and from input-output units. A term other than character is used here because a given character may be represented in different applications by more than one code, and different codes may use different numbers of bits (i.e., different byte sizes). In input-output transmission the grouping of bits may be completely arbitrary and have no relation to actual characters. (The term

1232-407: A higher clock rate may not necessarily have greater performance. As a result, manufacturers have moved away from clock speed as a measure of performance. Other factors influence speed, such as the mix of functional units , bus speeds, available memory, and the type and order of instructions in the programs. There are two main types of speed: latency and throughput . Latency is the time between

1320-407: A large instruction set also creates more room for unreliability when instructions interact in unexpected ways. The implementation involves integrated circuit design , packaging, power , and cooling . Optimization of the design requires familiarity with topics from compilers and operating systems to logic design and packaging. An instruction set architecture (ISA) is the interface between

1408-412: A noun defining "what the user needs to know". The System/360 line was succeeded by several compatible lines of computers, including the current IBM Z line. Later, computer users came to use the term in many less explicit ways. The earliest computer architectures were designed on paper and then directly built into the final hardware form. Later, computer architecture prototypes were physically built in

1496-406: A number of bits, treated as a unit, and usually representing a character or a part of a character.     NOTES:     1 The number of bits in a byte is fixed for a given data processing system.     2 The number of bits in a byte is usually 8.      We received the following from W Buchholz, one of the individuals who

1584-408: A proposed instruction set. Modern emulators can measure size, cost, and speed to determine whether a particular ISA is meeting its goals. Computer organization helps optimize performance-based products. For example, software engineers need to know the processing power of processors . They may need to optimize software in order to gain the most performance for the lowest price. This can require quite

1672-503: A single chip as possible. In the world of embedded computers , power efficiency has long been an important goal next to throughput and latency. Increases in clock frequency have grown more slowly over the past few years, compared to power reduction improvements. This has been driven by the end of Moore's Law and demand for longer battery life and reductions in size for mobile technology . This change in focus from higher clock rates to power consumption and miniaturization can be shown by

1760-548: A unit of logarithmic power ratio named after Alexander Graham Bell , creating a conflict with the IEC specification. However, little danger of confusion exists, because the bel is a rarely used unit. It is used primarily in its decadic fraction, the decibel (dB), for signal strength and sound pressure level measurements, while a unit for one-tenth of a byte, the decibyte, and other fractions, are only used in derived units, such as transmission rates. The lowercase letter o for octet

1848-405: A unit which "contains an unspecified amount of information ... capable of holding at least 64 distinct values ... at most 100 distinct values. On a binary computer a byte must therefore be composed of six bits". He notes that "Since 1975 or so, the word byte has come to mean a sequence of precisely eight binary digits...When we speak of bytes in connection with MIX we shall confine ourselves to


1936-668: Is 1024¹ bytes = 1024 bytes, one mebibyte (1 MiB) is 1024² bytes = 1 048 576 bytes, and so on. In 1999, Donald Knuth suggested calling the kibibyte a "large kilobyte" (KKB). The IEC adopted the IUPAC proposal and published the standard in January 1999. The IEC prefixes are part of the International System of Quantities. The IEC further specified that the kilobyte should only be used to refer to 1000 bytes. Lawsuits arising from alleged consumer confusion over
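
The progression of the IEC binary prefixes described above can be tabulated directly. A minimal Python sketch (only the eight prefixes named in the standard are listed):

    # Each IEC binary prefix is a successive power of 1024
    prefixes = ["Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]
    for i, p in enumerate(prefixes, start=1):
        print(f"1 {p}B = 1024^{i} = {1024**i:,} bytes")
    # 1 KiB = 1024^1 = 1,024 bytes
    # 1 MiB = 1024^2 = 1,048,576 bytes
    # ... up to 1 YiB = 1024^8 = 1,208,925,819,614,629,174,706,176 bytes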

2024-415: Is another important measurement in modern computers. Higher power efficiency can often be traded for lower speed or higher cost. The typical measurement when referring to power consumption in computer architecture is MIPS/W (millions of instructions per second per watt). Modern circuits have less power required per transistor as the number of transistors per chip grows. This is because each transistor that
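
As a back-of-the-envelope illustration of the MIPS/W figure of merit mentioned above (the numbers below are hypothetical, not measured values for any real processor):

    # Hypothetical processor: 50,000 million instructions per second at 25 W
    instructions_per_second = 50e9
    power_watts = 25.0

    mips_per_watt = (instructions_per_second / 1e6) / power_watts
    print(mips_per_watt)   # 2000.0 MIPS/W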

2112-510: Is coined from bite, but respelled to avoid accidental mutation to bit.)     A word consists of the number of data bits transmitted in parallel from or to memory in one memory cycle. Word size is thus defined as a structural property of the memory. (The term catena was coined for this purpose by the designers of the Bull GAMMA 60 computer.)      Block refers to

2200-460: Is defined as eight bits. It is a signed data type, holding values from −128 to 127. .NET programming languages, such as C#, define byte as an unsigned type, and the sbyte as a signed data type, holding values from 0 to 255, and −128 to 127, respectively. In data transmission systems, the byte is used as a contiguous sequence of bits in a serial data stream, representing the smallest distinguished unit of data. For asynchronous communication
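
The frame structure for asynchronous transmission (start bit, data bits, optional parity, stop bits) described in the surrounding fragments determines how much of the line rate actually carries data. A small Python sketch of that arithmetic, assuming common UART-style framing:

    def frame_bits(data_bits, stop_bits=1, parity=False):
        """Total bits on the wire per character: start bit + data + optional parity + stop bit(s)."""
        return 1 + data_bits + (1 if parity else 0) + stop_bits

    print(frame_bits(5))                            # 7  -> smallest frame mentioned in the text
    print(frame_bits(8, stop_bits=2, parity=True))  # 12 -> largest frame mentioned in the text

    total = frame_bits(8)          # common "8N1" framing
    print(total, 8 / total)        # 10 0.8 -> 80% of transmitted bits are data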

2288-455: Is defined as the symbol for octet in IEC 80000-13 and is commonly used in languages such as French and Romanian, and is also combined with metric prefixes for multiples, for example ko and Mo. More than one system exists to define unit multiples based on the byte. Some systems are based on powers of 10, following the International System of Units (SI), which defines for example

2376-672: Is defined to equal 1,000 bytes—is recommended by the International Electrotechnical Commission (IEC). The IEC standard defines eight such multiples, up to 1 yottabyte (YB), equal to 1000⁸ bytes. The additional prefixes ronna- for 1000⁹ and quetta- for 1000¹⁰ were adopted by the International Bureau of Weights and Measures (BIPM) in 2022. This definition is most commonly used for data-rate units in computer networks, internal bus, hard drive and flash media transfer speeds, and for

2464-618: Is equal to 1,024 (i.e., 2¹⁰) bytes is defined by international standard IEC 80000-13 and is supported by national and international standards bodies (BIPM, IEC, NIST). The IEC standard defines eight such multiples, up to 1 yobibyte (YiB), equal to 1024⁸ bytes. The natural binary counterparts to ronna- and quetta- were given in a consultation paper of the International Committee for Weights and Measures' Consultative Committee for Units (CCU) as robi- (Ri, 1024⁹) and quebi- (Qi, 1024¹⁰), but have not yet been adopted by

2552-433: Is just as easy to use all six bits in alphanumeric work, or to handle bytes of only one bit for logical analysis, or to offset the bytes by any number of bits. All this can be done by pulling the appropriate shift diagonals. An analogous matrix arrangement is used to change from serial to parallel operation at the output of the adder. [...]     byte:     A string that consists of

2640-556: Is necessary. The primary configuration for the computers used in the grid is based on CentOS. In 2015, CERN switched away from Scientific Linux to CentOS. Distributed computing resources for analysis by end-user physicists are provided by multiple federations across Europe, Asia Pacific and the Americas. Petabyte The byte is a unit of digital information that most commonly consists of eight bits. Historically,

2728-466: Is often called a nibble, also nybble, which is conveniently represented by a single hexadecimal digit. The term octet unambiguously specifies a size of eight bits. It is used extensively in protocol definitions. Historically, the term octad or octade was used to denote eight bits as well, at least in Western Europe; however, this usage is no longer common. The exact origin of
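
Since a nibble is four bits and maps onto exactly one hexadecimal digit, splitting a byte into its two nibbles is a simple mask-and-shift operation. A short Python sketch:

    value = 0xB7                      # one byte: binary 1011 0111
    high_nibble = (value >> 4) & 0xF  # 0b1011 = 0xB
    low_nibble = value & 0xF          # 0b0111 = 0x7
    print(hex(high_nibble), hex(low_nibble))   # 0xb 0x7 -> the two hex digits of 0xB7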


2816-471: Is put in a new chip requires its own power supply and requires new pathways to be built to power it. However, the number of transistors per chip is starting to increase at a slower rate. Therefore, power efficiency is starting to become as important as, if not more important than, fitting more and more transistors into a single chip. Recent processor designs have shown this emphasis as they put more focus on power efficiency rather than cramming as many transistors into

2904-465: Is the amount of time that it takes for information from one node to travel to the source) and throughput. Sometimes other considerations, such as features, size, weight, reliability, and expandability are also factors. The most common scheme does an in-depth power analysis and figures out how to keep power consumption low while maintaining adequate performance. Modern computer performance is often described in instructions per cycle (IPC), which measures

2992-457: Is used here because a given character may be represented in different applications by more than one code, and different codes may use different numbers of bits (i.e., different byte sizes). In input-output transmission the grouping of bits may be completely arbitrary and have no relation to actual characters. (The term is coined from bite, but respelled to avoid accidental mutation to bit.)      System/360 took over many of

3080-466: Is usually not considered architectural design, but rather hardware design engineering . Implementation can be further broken down into several steps: For CPUs , the entire implementation process is organized differently and is often referred to as CPU design . The exact form of a computer system depends on the constraints and goals. Computer architectures usually trade off standards, power versus performance , cost, memory capacity, latency (latency

3168-597: The IRE Transactions on Electronic Computers , June 1959, page 121. The notions of that paper were elaborated in Chapter 4 of Planning a Computer System (Project Stretch) , edited by W Buchholz, McGraw-Hill Book Company (1962). The rationale for coining the term was explained there on page 40 as follows: Byte denotes a group of bits used to encode a character, or the number of bits transmitted in parallel to and from input-output units. A term other than character

3256-649: The American Standard Code for Information Interchange (ASCII) as the Federal Information Processing Standard , which replaced the incompatible teleprinter codes in use by different branches of the U.S. government and universities during the 1960s. ASCII included the distinction of upper- and lowercase alphabets and a set of control characters to facilitate the transmission of written language as well as printing device functions, such as page advance and line feed, and

3344-626: The International Union of Pure and Applied Chemistry 's (IUPAC) Interdivisional Committee on Nomenclature and Symbols attempted to resolve this ambiguity by proposing a set of binary prefixes for the powers of 1024, including kibi (kilobinary), mebi (megabinary), and gibi (gigabinary). In December 1998, the IEC addressed such multiple usages and definitions by adopting the IUPAC's proposed prefixes (kibi, mebi, gibi, etc.) to unambiguously denote powers of 1024. Thus one kibibyte (1 KiB)

3432-514: The Stretch , an IBM-developed supercomputer for Los Alamos National Laboratory (at the time known as Los Alamos Scientific Laboratory). To describe the level of detail for discussing the luxuriously embellished computer, he noted that his description of formats, instruction types, hardware parameters, and speed enhancements were at the level of "system architecture", a term that seemed more useful than "machine organization". Subsequently, Brooks,

3520-526: The bit endianness . The size of the byte has historically been hardware -dependent and no definitive standards existed that mandated the size. Sizes from 1 to 48 bits have been used. The six-bit character code was an often-used implementation in early encoding systems, and computers using six-bit and nine-bit bytes were common in the 1960s. These systems often had memory words of 12, 18, 24, 30, 36, 48, or 60 bits, corresponding to 2, 3, 4, 5, 6, 8, or 10 six-bit bytes, and persisted, in legacy systems, into

3608-512: The Adder. The Adder may accept all or only some of the bits.     Assume that it is desired to operate on 4 bit decimal digits , starting at the right. The 0-diagonal is pulsed first, sending out the six bits 0 to 5, of which the Adder accepts only the first four (0-3). Bits 4 and 5 are ignored. Next, the 4 diagonal is pulsed. This sends out bits 4 to 9, of which the last two are again ignored, and so on.     It


3696-465: The Grid consisted of some 200,000 processing cores and 150 petabytes of disk space, distributed across 34 countries. The data stream from the detectors provides approximately 300 GByte/s of data, which after filtering for "interesting events", results in a data stream of about 300 MByte/s. The CERN computer center, considered "Tier 0" of the LHC Computing Grid, has a dedicated 10 Gbit/s connection to
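
The rates quoted above can be sanity-checked with a little unit arithmetic (a sketch; decimal units, 1 GByte = 10⁹ bytes, are assumed here):

    detector_rate = 300e9     # ~300 GByte/s out of the detectors
    filtered_rate = 300e6     # ~300 MByte/s after selecting "interesting events"
    link_rate = 10e9 / 8      # a 10 Gbit/s link carries 1.25 GByte/s

    print(detector_rate / filtered_rate)   # 1000.0 -> filtering reduces the volume about a thousandfold
    print(filtered_rate / link_rate)       # 0.24   -> the filtered stream fills about a quarter of one 10 Gbit/s link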

3784-515: The IEC and ISO. An alternative system of nomenclature for the same units (referred to here as the customary convention), in which 1 kilobyte (KB) is equal to 1,024 bytes, 1 megabyte (MB) is equal to 1024² bytes and 1 gigabyte (GB) is equal to 1024³ bytes is mentioned by a 1990s JEDEC standard. Only the first three multiples (up to GB) are mentioned by the JEDEC standard, which makes no mention of TB and larger. While confusing and incorrect,

3872-512: The Shift Matrix to be used to convert a 60-bit word , coming from Memory in parallel, into characters , or 'bytes' as we have called them, to be sent to the Adder serially. The 60 bits are dumped into magnetic cores on six different levels. Thus, if a 1 comes out of position 9, it appears in all six cores underneath. Pulsing any diagonal line will send the six bits stored along that line to

3960-478: The Stretch concepts, including the basic byte and word sizes, which are powers of 2. For economy, however, the byte size was fixed at the 8 bit maximum, and addressing at the bit level was replaced by byte addressing.     Since then the term byte has generally meant 8 bits, and it has thus passed into the general vocabulary.     Are there any other terms coined especially for

4048-631: The System/360 led to the ubiquitous adoption of the eight-bit storage size, while in detail the EBCDIC and ASCII encoding schemes are different. In the early 1960s, AT&T introduced digital telephony on long-distance trunk lines . These used the eight-bit μ-law encoding . This large investment promised to reduce transmission costs for eight-bit data. In Volume 1 of The Art of Computer Programming (first published in 1968), Donald Knuth uses byte in his hypothetical MIX computer to denote

4136-565: The Tier 1 institutions by general-purpose national research and education networks . The data produced by the LHC on all of its distributed computing grid is expected to add up to 200 PB of data each year. In total, the four main detectors at the LHC produced 13 petabytes of data in 2010. The Tier 1 institutions receive specific subsets of the raw data, for which they serve as a backup repository for CERN. They also perform reprocessing when recalibration

4224-591: the binary and decimal definitions of multiples of the byte have generally ended in favor of the manufacturers, with courts holding that the legal definition of gigabyte or GB is 1 GB = 1 000 000 000 (10⁹) bytes (the decimal definition), rather than the binary definition (2³⁰, i.e., 1 073 741 824). Specifically, the United States District Court for the Northern District of California held that "the U.S. Congress has deemed

4312-551: the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures. To disambiguate arbitrarily sized bytes from the common 8-bit definition, network protocol documents such as the Internet Protocol (RFC 791) refer to an 8-bit byte as an octet. Those bits in an octet are usually counted with numbering from 0 to 7 or 7 to 0 depending on

4400-452: the capacities of most storage media, particularly hard drives, flash-based storage, and DVDs. Operating systems that use this definition include macOS, iOS, Ubuntu, and Debian. It is also consistent with the other uses of the SI prefixes in computing, such as CPU clock speeds or measures of performance. A system of units based on powers of 2 in which 1 kibibyte (KiB)

4488-503: The code is to understand), size of the code (how much code is required to do a specific action), cost of the computer to interpret the instructions (more complexity means more hardware needed to decode and execute the instructions), and speed of the computer (with more complex decoding hardware comes longer decode time). Memory organization defines how instructions interact with the memory, and how memory interacts with itself. During design emulation , emulators can run programs written in


4576-399: The computer field which have found their way into general dictionaries of English language?     1956 Summer: Gerrit Blaauw , Fred Brooks , Werner Buchholz , John Cocke and Jim Pomerene join the Stretch team. Lloyd Hunter provides transistor leadership.     1956 July [ sic ]: In a report Werner Buchholz lists the advantages of

4664-425: The computer's software and hardware and also can be viewed as the programmer's view of the machine. Computers do not understand high-level programming languages such as Java , C++ , or most programming languages used. A processor only understands instructions encoded in some numerical fashion, usually as binary numbers . Software tools, such as compilers , translate those high level languages into instructions that

4752-651: The counting room. The project was expected to generate multiple TB of raw data and event summary data, which represents the output of calculations done by the CPU farm at the CERN data center. This data is sent out from CERN to thirteen Tier 1 academic institutions in Europe, Asia, and North America, via dedicated links with 10 Gbit/s or higher of bandwidth. This is called the LHC Optical Private Network. More than 150 Tier 2 institutions are connected to

4840-635: the customary convention is used by the Microsoft Windows operating system and for random-access memory capacity, such as main memory and CPU cache size, and in marketing and billing by telecommunication companies, such as Vodafone, AT&T, Orange and Telstra. For storage capacity, the customary convention was used by macOS and iOS through Mac OS X 10.6 Snow Leopard and iOS 10, after which they switched to units based on powers of 10. Various computer vendors have coined terms for data of various sizes, sometimes with different sizes for

4928-413: The data. A design report was published in 2005. It was announced to be ready for data on 3 October 2008. A popular 2008 press article predicted "the internet could soon be made obsolete" by its technology. CERN had to publish its own articles trying to clear up the confusion. It incorporates both private fiber-optic cable links and existing high-speed portions of the public Internet . At the end of 2010,

5016-454: The decimal definition of gigabyte to be the 'preferred' one for the purposes of 'U.S. trade and commerce' [...] The California Legislature has likewise adopted the decimal system for all 'transactions in this state. ' " Earlier lawsuits had ended in settlement with no court ruling on the question, such as a lawsuit against drive manufacturer Western Digital . Western Digital settled the challenge and added explicit disclaimers to products that

5104-452: The description may include the instruction set architecture design, microarchitecture design, logic design , and implementation . The first documented computer architecture was in the correspondence between Charles Babbage and Ada Lovelace , describing the analytical engine . While building the computer Z1 in 1936, Konrad Zuse described in two patent applications for his future projects that machine instructions could be stored in

5192-448: The efficiency of the architecture at any clock frequency; a faster IPC rate means the computer is faster. Older computers had IPC counts as low as 0.1 while modern processors easily reach nearly 1. Superscalar processors may reach three to five IPC by executing several instructions per clock cycle. Counting machine-language instructions would be misleading because they can do varying amounts of work in different ISAs. The "instruction" in
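
IPC ties together instruction count, cycle count and clock frequency. A minimal Python sketch of that relationship, using hypothetical workload numbers:

    instructions = 8_000_000_000   # instructions retired by a hypothetical program
    cycles = 2_000_000_000         # clock cycles the run took
    clock_hz = 4e9                 # a 4 GHz clock

    ipc = instructions / cycles    # 4.0 -> superscalar territory, several instructions per cycle
    runtime = cycles / clock_hz    # 0.5 -> seconds of execution time
    print(ipc, runtime)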

5280-406: the final hardware form. The discipline of computer architecture has three main subcategories: There are other technologies in computer architecture. The following technologies are used in bigger companies like Intel, and were estimated in 2002 to account for 1% of all of computer architecture: Computer architecture is concerned with balancing the performance, efficiency, cost, and reliability of

5368-475: The form of a transistor–transistor logic (TTL) computer—such as the prototypes of the 6800 and the PA-RISC —tested, and tweaked, before committing to the final hardware form. As of the 1990s, new computer architectures are typically "built", tested, and tweaked—inside some other computer architecture in a computer architecture simulator ; or inside a FPGA as a soft microprocessor ; or both—before committing to


5456-473: The former sense of the word, harking back to the days when bytes were not yet standardized." The development of eight-bit microprocessors in the 1970s popularized this storage size. Microprocessors such as the Intel 8080 , the direct predecessor of the 8086 , could also perform a small number of operations on the four-bit pairs in a byte, such as the decimal-add-adjust (DAA) instruction. A four-bit quantity

5544-566: The input and output. However, the LINK Computer can be equipped to edit out these gaps and to permit handling of bytes which are split between words. [...]     [...] The maximum input-output byte size for serial operation will now be 8 bits, not counting any error detection and correction bits. Thus, the Exchange will operate on an 8-bit byte basis, and any input-output units with less than 8 bits per byte will leave

5632-428: The instruction. It is a deliberate respelling of bite to avoid accidental mutation to bit . Another origin of byte for bit groups smaller than a computer's word size, and in particular groups of four bits , is on record by Louis G. Dooley, who claimed he coined the term while working with Jules Schwartz and Dick Beeler on an air defense system called SAGE at MIT Lincoln Laboratory in 1956 or 1957, which

5720-523: The instructions. The names can be recognized by a software development tool called an assembler . An assembler is a computer program that translates a human-readable form of the ISA into a computer-readable form. Disassemblers are also widely available, usually in debuggers and software programs to isolate and correct malfunctions in binary computer programs. ISAs vary in quality and completeness. A good ISA compromises between programmer convenience (how easy

5808-418: The integral data type unsigned char must hold at least 256 different values, and is represented by at least eight bits (clause 5.2.4.2.1). Various implementations of C and C++ reserve 8, 9, 16, 32, or 36 bits for the storage of a byte. In addition, the C and C++ standards require that there are no gaps between two bytes. This means every bit in memory is part of a byte. Java's primitive data type byte
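
The practical difference between a signed byte (as in Java) and an unsigned one (as in C#'s byte) is only how the same 8-bit pattern is interpreted. A minimal Python sketch using the struct module to show both views of the pattern 0xFF:

    import struct

    raw = bytes([0xFF])                  # the bit pattern 1111 1111
    print(struct.unpack("B", raw)[0])    # 255 -> unsigned interpretation (range 0..255)
    print(struct.unpack("b", raw)[0])    # -1  -> signed two's-complement interpretation (range -128..127)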

5896-465: The last, of IBM's second-generation transistorized computers to be developed).     The first reference found in the files was contained in an internal memo written in June 1956 during the early days of developing Stretch . A byte was described as consisting of any number of parallel bits from one to six. Thus a byte was assumed to have a length appropriate for the occasion. Its first use

5984-455: The number of words transmitted to or from an input-output unit in response to a single input-output instruction. Block size is a structural property of an input-output unit; it may have been fixed by the design or left to be varied by the program.     [...] Most important, from the point of view of editing, will be the ability to handle any characters or digits, from 1 to 6 bits long.     Figure 2 shows

6072-460: The physical or logical control of data flow over the transmission media. During the early 1960s, while also active in ASCII standardization, IBM simultaneously introduced in its product line of System/360 the eight-bit Extended Binary Coded Decimal Interchange Code (EBCDIC), an expansion of their six-bit binary-coded decimal (BCDIC) representations used in earlier card punches. The prominence of

6160-458: The potential ambiguity of the term "byte". The symbol for octet, 'o', also conveniently eliminates the ambiguity in the symbol 'B' between byte and bel . The term byte was coined by Werner Buchholz in June 1956, during the early design phase for the IBM Stretch computer, which had addressing to the bit and variable field length (VFL) instructions with a byte size encoded in

6248-453: the prefix kilo as 1000 (10³); other systems are based on powers of 2. Nomenclature for these systems has led to confusion. Systems based on powers of 10 use standard SI prefixes (kilo, mega, giga, ...) and their corresponding symbols (k, M, G, ...). Systems based on powers of 2, however, might use binary prefixes (kibi, mebi, gibi, ...) and their corresponding symbols (Ki, Mi, Gi, ...) or they might use

6336-525: the prefixes K, M, and G, creating ambiguity when the prefixes M or G are used. While the difference between the decimal and binary interpretations is relatively small for the kilobyte (about 2% smaller than the kibibyte), the systems deviate increasingly as units grow larger (the relative deviation grows by 2.4% for each three orders of magnitude). For example, a power-of-10-based terabyte is about 9% smaller than a power-of-2-based tebibyte. Definition of prefixes using powers of 10—in which 1 kilobyte (symbol kB)
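
The growing gap between the decimal and binary interpretations mentioned above can be computed prefix by prefix. A short Python sketch:

    pairs = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]
    for i, name in enumerate(pairs, start=1):
        decimal_unit = 1000 ** i
        binary_unit = 1024 ** i
        shortfall = 1 - decimal_unit / binary_unit
        print(f"{name}: decimal unit is {shortfall:.1%} smaller than the binary unit")
    # kilo/kibi ~2.3%, mega/mebi ~4.6%, giga/gibi ~6.9%, tera/tebi ~9.1%, peta/pebi ~11.2%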

6424-482: The processor can understand. Besides instructions, the ISA defines items in the computer that are available to a program—e.g., data types , registers , addressing modes , and memory . Instructions locate these available items with register indexes (or names) and memory addressing modes. The ISA of a computer is usually described in a small instruction manual, which describes how the instructions are encoded. Also, it may define short (vaguely) mnemonic names for

6512-401: The remaining bits blank. The resultant gaps can be edited out later by programming [...] Computer architecture In computer science and computer engineering , computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a more detailed level,

6600-526: The same storage used for data, i.e., the stored-program concept. Two other early and important examples are: The term "architecture" in computer literature can be traced to the work of Lyle R. Johnson and Frederick P. Brooks, Jr. , members of the Machine Organization department in IBM's main research center in 1959. Johnson had the opportunity to write a proprietary research communication about

6688-450: the same term even within a single vendor. These terms include double word, half word, long word, quad word, slab, superword and syllable. There are also informal terms, e.g., half byte and nybble for 4 bits, octal K for 1000₈ (512). Contemporary computer memory has a binary architecture making a definition of memory units based on powers of 2 most practical. The use of the metric prefix kilo for binary multiples arose as

6776-498: The standard measurements is not a count of the ISA's machine-language instructions, but a unit of measurement, usually based on the speed of the VAX computer architecture. Many people used to measure a computer's speed by the clock rate (usually in MHz or GHz). This refers to the cycles per second of the main clock of the CPU . However, this metric is somewhat misleading, as a machine with

6864-507: The start of a process and its completion. Throughput is the amount of work done per unit time. Interrupt latency is the guaranteed maximum response time of the system to an electronic event (like when the disk drive finishes moving some data). Performance is affected by a very wide range of design choices — for example, pipelining a processor usually makes latency worse, but makes throughput better. Computers that control machinery usually need low interrupt latencies. These computers operate in

6952-535: the term is unclear, but it can be found in British, Dutch, and German sources of the 1960s and 1970s, and throughout the documentation of Philips mainframe computers. The unit symbol for the byte is specified in IEC 80000-13, IEEE 1541 and the Metric Interchange Format as the upper-case character B. In the International System of Quantities (ISQ), B is also the symbol of the bel,

7040-724: The twenty-first century. In this era, bit groupings in the instruction stream were often referred to as syllables or slab , before the term byte became common. The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993, is a convenient power of two permitting the binary-encoded values 0 through 255 for one byte, as 2 to the power of 8 is 256. The international standard IEC 80000-13 codified this common meaning. Many types of applications use information representable in eight or fewer bits and processor designers commonly optimize for this usage. The popularity of major commercial computing architectures has aided in

7128-532: The ubiquitous acceptance of the 8-bit byte. Modern architectures typically use 32- or 64-bit words, built of four or eight bytes, respectively. The unit symbol for the byte was designated as the upper-case letter B by the International Electrotechnical Commission (IEC) and Institute of Electrical and Electronics Engineers (IEEE). Internationally, the unit octet explicitly defines a sequence of eight bits, eliminating

7216-425: The usable capacity may differ from the advertised capacity. Seagate was sued on similar grounds and also settled. Many programming languages define the data type byte . The C and C++ programming languages define byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment" (clause 3.6 of the C standard). The C standard requires that

7304-559: the world with near real-time access to the LHC data, and the power to process it. According to the WLCG website as of 2024: "WLCG combines about 1.4 million computer cores and 1.5 exabytes of storage from over 170 sites in 42 countries [...] It runs over 2 million tasks per day and [...] global transfer rates exceeded 260 GB/s." This indicates substantial upgrades to the WLCG over time, well beyond its initial release. The Large Hadron Collider at CERN

7392-404: was advertised as "110 Kbyte", using the 1000 convention. Likewise, the 8-inch DEC RX01 floppy (1975) held 256 256 bytes formatted, and was advertised as "256k". Some devices were advertised using a mixture of the two definitions: most notably, floppy disks advertised as "1.44 MB" have an actual capacity of 1440 KiB, the equivalent of 1.47 MB or 1.41 MiB. In 1995,
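
The mixed-convention "1.44 MB" figure quoted above can be reproduced directly (a quick Python sketch):

    capacity = 1440 * 1024           # 1440 KiB as formatted
    print(capacity)                  # 1474560 bytes
    print(capacity / 1_000_000)      # 1.47456 -> ~1.47 MB (decimal megabytes)
    print(capacity / 1_048_576)      # 1.40625 -> ~1.41 MiB (binary mebibytes)
    print(capacity / (1000 * 1024))  # 1.44    -> the advertised "1.44 MB" mixes 1000 × 1024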

7480-566: was designed to test the existence of the Higgs boson, an important but elusive piece of knowledge that had been sought by particle physicists for over 40 years. A very powerful particle accelerator was needed, because Higgs bosons might not be seen in lower energy experiments, and because vast numbers of collisions would need to be studied. Such a collider would also produce unprecedented quantities of collision data requiring analysis. Therefore, advanced computing facilities were needed to process

7568-523: Was in the context of the input-output equipment of the 1950s, which handled six bits at a time. The possibility of going to 8-bit bytes was considered in August 1956 and incorporated in the design of Stretch shortly thereafter .     The first published reference to the term occurred in 1959 in a paper ' Processing Data in Bits and Pieces ' by G A Blaauw , F P Brooks Jr and W Buchholz in

7656-530: Was jointly developed by Rand , MIT, and IBM. Later on, Schwartz's language JOVIAL actually used the term, but the author recalled vaguely that it was derived from AN/FSQ-31 . Early computers used a variety of four-bit binary-coded decimal (BCD) representations and the six-bit codes for printable graphic patterns common in the U.S. Army ( FIELDATA ) and Navy . These representations included alphanumeric characters and special graphical symbols. These sets were expanded in 1963 to seven bits of coding, called

7744-473: Was working on IBM's Project Stretch in the mid 1950s. His letter tells the story.     Not being a regular reader of your magazine, I heard about the question in the November 1976 issue regarding the origin of the term "byte" from a colleague who knew that I had perpetrated this piece of jargon [see page 77 of November 1976 BYTE, "Olde Englishe"] . I searched my files and could not locate
