Misplaced Pages

64-bit computing

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In computer architecture , 64-bit integers , memory addresses , or other data units are those that are 64 bits wide. Also, 64-bit central processing units (CPU) and arithmetic logic units (ALU) are those that are based on processor registers , address buses , or data buses of that size. A computer that uses such a processor is a 64-bit computer.

#432567

121-515: From the software perspective, 64-bit computing means the use of machine code with 64-bit virtual memory addresses. However, not all 64-bit instruction sets support full 64-bit virtual memory addresses; x86-64 and AArch64 for example, support only 48 bits of virtual address, with the remaining 16 bits of the virtual address required to be all zeros (000...) or all ones (111...), and several 64-bit instruction sets support fewer than 64 bits of physical memory address. The term 64-bit also describes

242-408: A 32-bit to a 64-bit architecture is a fundamental alteration, as most operating systems must be extensively modified to take advantage of the new architecture, because that software has to manage the actual memory addressing hardware. Other software must also be ported to use the new abilities; older 32-bit software may be supported either by virtue of the 64-bit instruction set being a superset of

363-492: A code obfuscation technique as a measure against disassembly and tampering. The principle is also used in shared code sequences of fat binaries which must run on multiple instruction-set-incompatible processor platforms. This property is also used to find unintended instructions called gadgets in existing code repositories and is used in return-oriented programming as alternative to code injection for exploits such as return-to-libc attacks . In some computers,

484-612: A virtual machine of a 16- or 32-bit operating system to run 16-bit applications or use one of the alternatives for NTVDM . Mac OS X 10.4 "Tiger" and Mac OS X 10.5 "Leopard" had only a 32-bit kernel, but they can run 64-bit user-mode code on 64-bit processors. Mac OS X 10.6 "Snow Leopard" had both 32- and 64-bit kernels, and, on most Macs, used the 32-bit kernel even on 64-bit processors. This allowed those Macs to support 64-bit processes while still supporting 32-bit device drivers; although not 64-bit drivers and performance advantages that can come with them. Mac OS X 10.7 "Lion" ran with

605-723: A 16  MiB ( 16 × 1024 bytes ) address space. 32-bit superminicomputers , such as the DEC VAX , became common in the 1970s, and 32-bit microprocessors, such as the Motorola 68000 family and the 32-bit members of the x86 family starting with the Intel 80386 , appeared in the mid-1980s, making 32 bits something of a de facto consensus as a convenient register size. A 32-bit address register meant that 2 addresses, or 4  GB of random-access memory (RAM), could be referenced. When these architectures were devised, 4 GB of memory

726-422: A 32- or 64-bit Java virtual machine with no modification. The lengths and precision of all the built-in types, such as char , short , int , long , float , and double , and the types that can be used as array indices, are specified by the standard and are not dependent on the underlying architecture. Java programs that run on a 64-bit Java virtual machine have access to a larger address space. Speed

847-400: A 60-bit word without having to split a byte between one word and the next. If longer bytes were needed, 60 bits would, of course, no longer be ideal. With present applications, 1, 4, and 6 bits are the really important cases.     With 64-bit words, it would often be necessary to make some compromises, such as leaving 4 bits unused in a word when dealing with 6-bit bytes at

968-609: A 64-bit kernel on more Macs, and OS X 10.8 "Mountain Lion" and later macOS releases only have a 64-bit kernel. On systems with 64-bit processors, both the 32- and 64-bit macOS kernels can run 32-bit user-mode code, and all versions of macOS up to macOS Mojave (10.14) include 32-bit versions of libraries that 32-bit applications would use, so 32-bit user-mode software for macOS will run on those systems. The 32-bit versions of libraries have been removed by Apple in macOS Catalina (10.15). Linux and most other Unix-like operating systems, and

1089-467: A 64-bit word length for Stretch. It also supports NSA 's requirement for 8-bit bytes. Werner's term "Byte" first popularized in this memo.     NB. This timeline erroneously specifies the birth date of the term "byte" as July 1956 , while Buchholz actually used the term as early as June 1956 .     [...] 60 is a multiple of 1, 2, 3, 4, 5, and 6. Hence bytes of length from 1 to 6 bits can be packed efficiently into

1210-465: A birth certificate. But I am sure that "byte" is coming of age in 1977 with its 21st birthday.     Many have assumed that byte, meaning 8 bits, originated with the IBM System/360, which spread such bytes far and wide in the mid-1960s. The editor is correct in pointing out that the term goes back to the earlier Stretch computer (but incorrect in that Stretch was the first, not

1331-476: A convenience, because 1024 is approximately 1000 . This definition was popular in early decades of personal computing , with products like the Tandon 5 1 ⁄ 4 -inch DD floppy format (holding 368 640 bytes) being advertised as "360 KB", following the 1024 -byte convention. It was not universal, however. The Shugart SA-400 5 1 ⁄ 4 -inch floppy disk held 109,375 bytes unformatted, and

SECTION 10

#1732771858433

1452-416: A direct map between the numerical machine code and a human-readable mnemonic. In assembly, numerical opcodes and operands are replaced with mnemonics and labels. For example, the x86 architecture has available the 0x90 opcode; it is represented as NOP in the assembly source code . While it is possible to write programs directly in machine code, managing individual bits and calculating numerical addresses

1573-779: A driver for a 32-bit PCI device asking the device to DMA data into upper areas of a 64-bit machine's memory could not satisfy requests from the operating system to load data from the device to memory above the 4 gigabyte barrier, because the pointers for those addresses would not fit into the DMA registers of the device. This problem is solved by having the OS take the memory restrictions of the device into account when generating requests to drivers for DMA, or by using an input–output memory management unit (IOMMU). As of August 2023, 64-bit architectures for which processors are being manufactured include: Most architectures of 64 bits that are derived from

1694-484: A full transmission unit usually additionally includes a start bit, 1 or 2 stop bits, and possibly a parity bit , and thus its size may vary from seven to twelve bits for five to eight bits of actual data. For synchronous communication the error checking usually uses bytes at the end of a frame .     Terms used here to describe the structure imposed by the machine design, in addition to bit , are listed below.      Byte denotes

1815-449: A generation of computers in which 64-bit processors are the norm. 64 bits is a word size that defines certain classes of computer architecture, buses, memory, and CPUs and, by extension, the software that runs on them. 64-bit CPUs have been used in supercomputers since the 1970s ( Cray-1 , 1975) and in reduced instruction set computers (RISC) based workstations and servers since the early 1990s. In 2003, 64-bit CPUs were introduced to

1936-484: A given process and can have implications for efficient processor cache use. Maintaining a partial 32-bit model is one way to handle this, and is in general reasonably effective. For example, the z/OS operating system takes this approach, requiring program code to reside in 31-bit address spaces (the high order bit is not used in address calculation on the underlying hardware platform) while data objects can optionally reside in 64-bit regions. Not all such applications require

2057-475: A group of bits used to encode a character, or the number of bits transmitted in parallel to and from input-output units. A term other than character is used here because a given character may be represented in different applications by more than one code, and different codes may use different numbers of bits (i.e., different byte sizes). In input-output transmission the grouping of bits may be completely arbitrary and have no relation to actual characters. (The term

2178-750: A large address space or manipulate 64-bit data items, so these applications do not benefit from these features. x86-based 64-bit systems sometimes lack equivalents of software that is written for 32-bit architectures. The most severe problem in Microsoft Windows is incompatible device drivers for obsolete hardware. Most 32-bit application software can run on a 64-bit operating system in a compatibility mode , also termed an emulation mode, e.g., Microsoft WoW64 Technology for IA-64 and AMD64. The 64-bit Windows Native Mode driver environment runs atop 64-bit NTDLL.DLL , which cannot call 32-bit Win32 subsystem code (often devices whose actual hardware function

2299-486: A machine with a single accumulator , the accumulator is implicitly both the left operand and result of most arithmetic instructions. Some other architectures, such as the x86 architecture, have accumulator versions of common instructions, with the accumulator regarded as one of the general registers by longer instructions. A stack machine has most or all of its operands on an implicit stack. Special purpose instructions also often lack explicit operands; for example, CPUID in

2420-406: A number of bits, treated as a unit, and usually representing a character or a part of a character.     NOTES:     1 The number of bits in a byte is fixed for a given data processing system.     2 The number of bits in a byte is usually 8.      We received the following from W Buchholz, one of the individuals who

2541-433: A one-to-one mapping to machine code. The assembly language decoding method is called disassembly . Machine code may be decoded back to its corresponding high-level language under two conditions: The first condition is to accept an obfuscated reading of the source code. An obfuscated version of source code is displayed if the machine code is sent to a decompiler of the source language. The second condition requires

SECTION 20

#1732771858433

2662-400: A paging based system, if the current page actually holds machine code by an execute bit — pages have multiple such permission bits (readable, writable, etc.) for various housekeeping functionality. E.g. on Unix-like systems memory pages can be toggled to be executable with the mprotect() system call, and on Windows, VirtualProtect() can be used to achieve a similar result. If an attempt

2783-520: A problem. 64-bit drivers were not provided for many older devices, which could consequently not be used in 64-bit systems. Driver compatibility was less of a problem with open-source drivers, as 32-bit ones could be modified for 64-bit use. Support for hardware made before early 2007, was problematic for open-source platforms, due to the relatively small number of users. 64-bit versions of Windows cannot run 16-bit software . However, most 32-bit applications will work well. 64-bit users are forced to install

2904-412: A processor with 64-bit memory addresses can directly access 2 bytes (16 exabytes or EB) of byte-addressable memory. With no further qualification, a 64-bit computer architecture generally has integer and addressing registers that are 64 bits wide, allowing direct support for 64-bit data types and addresses. However, a CPU might have external data buses or address buses with different sizes from

3025-405: A single integer register can store the memory address to any location in the computer's physical or virtual memory . Therefore, the total number of addresses to memory is often determined by the width of these registers. The IBM System/360 of the 1960s was an early 32-bit computer; it had 32-bit integer registers, although it only used the low order 24 bits of a word for addresses, resulting in

3146-444: A tag and a Y field. In addition to transfer (branch) instructions, these machines have skip instruction that conditionally skip one or two words, e.g., Compare Accumulator with Storage (CAS) does a three way compare and conditionally skips to NSI, NSI+1 or NSI+2, depending on the result. The MIPS architecture provides a specific example for a machine code whose instructions are always 32 bits long. The general type of instruction

3267-548: A unit of logarithmic power ratio named after Alexander Graham Bell , creating a conflict with the IEC specification. However, little danger of confusion exists, because the bel is a rarely used unit. It is used primarily in its decadic fraction, the decibel (dB), for signal strength and sound pressure level measurements, while a unit for one-tenth of a byte, the decibyte, and other fractions, are only used in derived units, such as transmission rates. The lowercase letter o for octet

3388-405: A unit which "contains an unspecified amount of information ... capable of holding at least 64 distinct values ... at most 100 distinct values. On a binary computer a byte must therefore be composed of six bits". He notes that "Since 1975 or so, the word byte has come to mean a sequence of precisely eight binary digits...When we speak of bytes in connection with MIX we shall confine ourselves to

3509-668: Is 1024 bytes = 1024 bytes, one mebibyte (1 MiB) is 1024 bytes = 1 048 576 bytes, and so on. In 1999, Donald Knuth suggested calling the kibibyte a "large kilobyte" ( KKB ). The IEC adopted the IUPAC proposal and published the standard in January 1999. The IEC prefixes are part of the International System of Quantities . The IEC further specified that the kilobyte should only be used to refer to 1000 bytes. Lawsuits arising from alleged consumer confusion over

3630-516: Is an abbreviation of "Long, Pointer, 64". Other models are the ILP64 data model in which all three data types are 64 bits wide, and even the SILP64 model where short integers are also 64 bits wide. However, in most cases the modifications required are relatively minor and straightforward, and many well-written programs can simply be recompiled for the new environment with no changes. Another alternative

3751-510: Is coined from bite , but respelled to avoid accidental mutation to bit .)     A word consists of the number of data bits transmitted in parallel from or to memory in one memory cycle. Word size is thus defined as a structural property of the memory. (The term catena was coined for this purpose by the designers of the Bull GAMMA 60  [ fr ] computer.)      Block refers to

64-bit computing - Misplaced Pages Continue

3872-460: Is defined as eight bits. It is a signed data type, holding values from −128 to 127. .NET programming languages, such as C# , define byte as an unsigned type, and the sbyte as a signed data type, holding values from 0 to 255, and −128 to 127 , respectively. In data transmission systems, the byte is used as a contiguous sequence of bits in a serial data stream, representing the smallest distinguished unit of data. For asynchronous communication

3993-455: Is defined as the symbol for octet in IEC ;80000-13 and is commonly used in languages such as French and Romanian , and is also combined with metric prefixes for multiples, for example ko and Mo. More than one system exists to define unit multiples based on the byte. Some systems are based on powers of 10 , following the International System of Units (SI), which defines for example

4114-672: Is defined to equal 1,000 bytes—is recommended by the International Electrotechnical Commission (IEC). The IEC standard defines eight such multiples, up to 1 yottabyte (YB), equal to 1000 bytes. The additional prefixes ronna- for 1000 and quetta- for 1000 were adopted by the International Bureau of Weights and Measures (BIPM) in 2022. This definition is most commonly used for data-rate units in computer networks , internal bus, hard drive and flash media transfer speeds, and for

4235-483: Is emulated in user mode software, like Winprinters). Because 64-bit drivers for most devices were unavailable until early 2007 (Vista x64), using a 64-bit version of Windows was considered a challenge. However, the trend has since moved toward 64-bit computing, more so as memory prices dropped and the use of more than 4 GB of RAM increased. Most manufacturers started to provide both 32-bit and 64-bit drivers for new devices, so unavailability of 64-bit drivers ceased to be

4356-618: Is equal to 1,024 (i.e., 2 ) bytes is defined by international standard IEC 80000-13 and is supported by national and international standards bodies ( BIPM , IEC , NIST ). The IEC standard defines eight such multiples, up to 1 yobibyte (YiB), equal to 1024 bytes. The natural binary counterparts to ronna- and quetta- were given in a consultation paper of the International Committee for Weights and Measures' Consultative Committee for Units (CCU) as robi- (Ri, 1024 ) and quebi- (Qi, 1024 ), but have not yet been adopted by

4477-477: Is generally different from bytecode (also known as p-code), which is either executed by an interpreter or itself compiled into machine code for faster (direct) execution. An exception is when a processor is designed to use a particular bytecode directly as its machine code, such as is the case with Java processors . Machine code and assembly code are sometimes called native code when referring to platform-dependent parts of language features or libraries. From

4598-449: Is given by the op (operation) field, the highest 6 bits. J-type (jump) and I-type (immediate) instructions are fully specified by op . R-type (register) instructions include an additional field funct to determine the exact operation. The fields used in these types are: rs , rt , and rd indicate register operands; shamt gives a shift amount; and the address or immediate fields contain an operand directly. For example, adding

4719-433: Is just as easy to use all six bits in alphanumeric work, or to handle bytes of only one bit for logical analysis, or to offset the bytes by any number of bits. All this can be done by pulling the appropriate shift diagonals. An analogous matrix arrangement is used to change from serial to parallel operation at the output of the adder. [...]     byte:     A string that consists of

4840-428: Is made to execute machine code on a non-executable page, an architecture specific fault will typically occur. Treating data as machine code , or finding new ways to use existing machine code, by various techniques, is the basis of some security vulnerabilities. Similarly, in a segment based system, segment descriptors can indicate whether a segment can contain executable code and in what rings that code can run. From

4961-413: Is not the only factor to consider in comparing 32-bit and 64-bit processors. Applications such as multi-tasking, stress testing, and clustering – for high-performance computing (HPC) – may be more suited to a 64-bit architecture when deployed appropriately. For this reason, 64-bit clusters have been widely deployed in large organizations, such as IBM, HP, and Microsoft. Summary: A common misconception

64-bit computing - Misplaced Pages Continue

5082-466: Is often called a nibble , also nybble , which is conveniently represented by a single hexadecimal digit. The term octet unambiguously specifies a size of eight bits. It is used extensively in protocol definitions. Historically, the term octad or octade was used to denote eight bits as well at least in Western Europe; however, this usage is no longer common. The exact origin of

5203-441: Is often written with implicit assumptions about the widths of data types. C code should prefer ( u ) intptr_t instead of long when casting pointers into integer objects. A programming model is a choice made to suit a given compiler, and several can coexist on the same OS. However, the programming model chosen as the primary model for the OS application programming interface (API) typically dominates. Another consideration

5324-504: Is often, but not always, based on 64-bit units of data. For example, although the x86 / x87 architecture has instructions able to load and store 64-bit (and 32-bit) floating-point values in memory, the internal floating-point data and register format is 80 bits wide, while the general-purpose registers are 32 bits wide. In contrast, the 64-bit Alpha family uses a 64-bit floating-point data and register format, and 64-bit integer registers. Many computer instruction sets are designed so that

5445-425: Is rarely a problem. Systems may also differ in other details, such as memory arrangement, operating systems, or peripheral devices . Because a program normally relies on such factors, different systems will typically not run the same machine code, even when the same type of processor is used. A processor's instruction set may have fixed-length or variable-length instructions. How the patterns are organized varies with

5566-512: Is tedious and error-prone. Therefore, programs are rarely written directly in machine code. However, an existing machine code program may be edited if the assembly source code is not available. The majority of programs today are written in a high-level language . A high-level program may be translated into machine code by a compiler . Every processor or processor family has its own instruction set . Instructions are patterns of bits , digits, or characters that correspond to machine commands. Thus,

5687-417: Is that 64-bit architectures are no better than 32-bit architectures unless the computer has more than 4 GB of random-access memory . This is not entirely true: The main disadvantage of 64-bit architectures is that, relative to 32-bit architectures, the same data occupies more space in memory (due to longer pointers and possibly other types, and alignment padding). This increases the memory requirements of

5808-495: Is the IBM AS/400 , software for which is compiled into a virtual instruction set architecture (ISA) called Technology Independent Machine Interface (TIMI); TIMI code is then translated to native machine code by low-level software before being executed. The translation software is all that must be rewritten to move the full OS and all software to a new platform, as when IBM transitioned the native instruction set for AS/400 from

5929-505: Is the LLP64 model, which maintains compatibility with 32-bit code by leaving both int and long as 32-bit. LL refers to the long long integer type, which is at least 64 bits on all platforms, including 32-bit environments. There are also systems with 64-bit processors using an ILP32 data model, with the addition of 64-bit long long integers; this is also used on many platforms with 32-bit processors. This model reduces code size and

6050-561: Is the binary representation of a computer program which is actually read and interpreted by the computer. A program in machine code consists of a sequence of machine instructions (possibly interspersed with data). Each machine code instruction causes the CPU to perform a specific task. Examples of such tasks include: In general, each architecture family (e.g., x86 , ARM ) has its own instruction set architecture (ISA), and hence its own specific machine code language. There are exceptions, such as

6171-399: Is the data model used for device drivers . Drivers make up the majority of the operating system code in most modern operating systems (although many may not be loaded when the operating system is running). Many drivers use pointers heavily to manipulate data, and in some cases have to load pointers of a certain size into the hardware they support for direct memory access (DMA). As an example,

SECTION 50

#1732771858433

6292-419: Is typically set to a hard coded value when the CPU is first powered on, and will hence execute whatever machine code happens to be at this address. Similarly, the program counter can be set to execute whatever machine code is at some arbitrary address, even if this is not valid machine code. This will typically trigger an architecture specific protection fault. The CPU is oftentimes told, by page permissions in

6413-457: Is used here because a given character may be represented in different applications by more than one code, and different codes may use different numbers of bits (ie, different byte sizes). In input-output transmission the grouping of bits may be completely arbitrary and have no relation to actual characters. (The term is coined from bite , but respelled to avoid accidental mutation to bit. )      System/360 took over many of

6534-597: The IRE Transactions on Electronic Computers , June 1959, page 121. The notions of that paper were elaborated in Chapter 4 of Planning a Computer System (Project Stretch) , edited by W Buchholz, McGraw-Hill Book Company (1962). The rationale for coining the term was explained there on page 40 as follows: Byte denotes a group of bits used to encode a character, or the number of bits transmitted in parallel to and from input-output units. A term other than character

6655-649: The American Standard Code for Information Interchange (ASCII) as the Federal Information Processing Standard , which replaced the incompatible teleprinter codes in use by different branches of the U.S. government and universities during the 1960s. ASCII included the distinction of upper- and lowercase alphabets and a set of control characters to facilitate the transmission of written language as well as printing device functions, such as page advance and line feed, and

6776-647: The Apple Watch Series 4 and 5. Many 64-bit platforms today use an LP64 model (including Solaris, AIX , HP-UX , Linux, macOS, BSD, and IBM z/OS). Microsoft Windows uses an LLP64 model. The disadvantage of the LP64 model is that storing a long into an int truncates. On the other hand, converting a pointer to a long will "work" in LP64. In the LLP64 model, the reverse is true. These are not problems which affect fully standard-compliant code, but code

6897-513: The C and C++ toolchains for them, have supported 64-bit processors for many years. Many applications and libraries for those platforms are open-source software , written in C and C++, so that if they are 64-bit-safe, they can be compiled into 64-bit versions. This source-based distribution model, with an emphasis on frequent releases, makes availability of application software for those operating systems less of an issue. In 32-bit programs, pointers and data types such as integers generally have

7018-461: The Cray-1 , used registers up to 64 bits wide, and supported 64-bit integer arithmetic, although they did not support 64-bit addressing. In the mid-1980s, Intel i860 development began culminating in a 1989 release; the i860 had 32-bit integer registers and 32-bit addressing, so it was not a fully 64-bit processor, although its graphics unit supported 64-bit integer arithmetic. However, 32 bits remained

7139-626: The International Union of Pure and Applied Chemistry 's (IUPAC) Interdivisional Committee on Nomenclature and Symbols attempted to resolve this ambiguity by proposing a set of binary prefixes for the powers of 1024, including kibi (kilobinary), mebi (megabinary), and gibi (gigabinary). In December 1998, the IEC addressed such multiple usages and definitions by adopting the IUPAC's proposed prefixes (kibi, mebi, gibi, etc.) to unambiguously denote powers of 1024. Thus one kibibyte (1 KiB)

7260-539: The Kruskal count , sometimes possible through opcode-level programming to deliberately arrange the resulting code so that two code paths share a common fragment of opcode sequences. These are called overlapping instructions , overlapping opcodes , overlapping code , overlapped code , instruction scission , or jump into the middle of an instruction . In the 1970s and 1980s, overlapping instructions were sometimes used to preserve memory space. One example were in

7381-961: The Nintendo 64 and the PlayStation 2 had 64-bit microprocessors before their introduction in personal computers. High-end printers, network equipment, and industrial computers also used 64-bit microprocessors, such as the Quantum Effect Devices R5000 . 64-bit computing started to trickle down to the personal computer desktop from 2003 onward, when some models in Apple 's Macintosh lines switched to PowerPC 970 processors (termed G5 by Apple), and Advanced Micro Devices (AMD) released its first 64-bit x86-64 processor. Physical memory eventually caught up with 32 bit limits. In 2023, laptop computers were commonly equipped with 16GB and servers up to 64 GB of memory, greatly exceeding

SECTION 60

#1732771858433

7502-698: The VAX architecture, which includes optional support of the PDP-11 instruction set; the IA-64 architecture, which includes optional support of the IA-32 instruction set; and the PowerPC 615 microprocessor, which can natively process both PowerPC and x86 instruction sets. Machine code is a strictly numerical language, and it is the lowest-level interface to the CPU intended for a programmer. Assembly language  provides

7623-599: The Zilog Z80 processor, the machine code 00000101 , which causes the CPU to decrement the B general-purpose register , would be represented in assembly language as DEC B . The IBM 704, 709, 704x and 709x store one instruction in each instruction word; IBM numbers the bit from the left as S, 1, ..., 35. Most instructions have one of two formats: For all but the IBM 7094 and 7094 II, there are three index registers designated A, B and C; indexing with multiple 1 bits in

7744-526: The bit endianness . The size of the byte has historically been hardware -dependent and no definitive standards existed that mandated the size. Sizes from 1 to 48 bits have been used. The six-bit character code was an often-used implementation in early encoding systems, and computers using six-bit and nine-bit bytes were common in the 1960s. These systems often had memory words of 12, 18, 24, 30, 36, 48, or 60 bits, corresponding to 2, 3, 4, 5, 6, 8, or 10 six-bit bytes, and persisted, in legacy systems, into

7865-520: The 32-bit instruction set, so that processors that support the 64-bit instruction set can also run code for the 32-bit instruction set, or through software emulation , or by the actual implementation of a 32-bit processor core within the 64-bit processor, as with some Itanium processors from Intel, which included an IA-32 processor core to run 32-bit x86 applications. The operating systems for those 64-bit architectures generally support both 32-bit and 64-bit applications. One significant exception to this

7986-829: The 32-bit limit of 4 GB ( 4 × 1024 bytes ), allowing room for later expansion and incurring no overhead of translating full 64-bit addresses. The Power ISA v3.0 allows 64 bits for an effective address, mapped to a segmented address with between 65 and 78 bits allowed, for virtual memory, and, for any given processor, up to 60 bits for physical memory. The Oracle SPARC Architecture 2015 allows 64 bits for virtual memory and, for any given processor, between 40 and 56 bits for physical memory. The ARM AArch64 Virtual Memory System Architecture allows 48 bits for virtual memory and, for any given processor, from 32 to 48 bits for physical memory. The DEC Alpha specification requires minimum of 43 bits of virtual memory address space (8 TB) to be supported, and hardware need to check and trap if

8107-648: The 4 GB address capacity of 32 bits. In principle, a 64-bit microprocessor can address 16 EB ( 16 × 1024 = 2 = 18,446,744,073,709,551,616 bytes ) of memory. However, not all instruction sets, and not all processors implementing those instruction sets, support a full 64-bit virtual or physical address space. The x86-64 architecture (as of 2016) allows 48 bits for virtual memory and, for any given processor, up to 52 bits for physical memory. These limits allow memory sizes of 256  TB ( 256 × 1024 bytes ) and 4  PB ( 4 × 1024 bytes ), respectively. A PC cannot currently contain 4  petabytes of memory (due to

8228-512: The Adder. The Adder may accept all or only some of the bits.     Assume that it is desired to operate on 4 bit decimal digits , starting at the right. The 0-diagonal is pulsed first, sending out the six bits 0 to 5, of which the Adder accepts only the first four (0-3). Bits 4 and 5 are ignored. Next, the 4 diagonal is pulsed. This sends out bits 4 to 9, of which the last two are again ignored, and so on.     It

8349-515: The IEC and ISO. An alternative system of nomenclature for the same units (referred to here as the customary convention ), in which 1 kilobyte (KB) is equal to 1,024 bytes, 1 megabyte (MB) is equal to 1024 bytes and 1 gigabyte (GB) is equal to 1024 bytes is mentioned by a 1990s JEDEC standard. Only the first three multiples (up to GB) are mentioned by the JEDEC standard, which makes no mention of TB and larger. While confusing and incorrect,

8470-512: The Shift Matrix to be used to convert a 60-bit word , coming from Memory in parallel, into characters , or 'bytes' as we have called them, to be sent to the Adder serially. The 60 bits are dumped into magnetic cores on six different levels. Thus, if a 1 comes out of position 9, it appears in all six cores underneath. Pulsing any diagonal line will send the six bits stored along that line to

8591-478: The Stretch concepts, including the basic byte and word sizes, which are powers of 2. For economy, however, the byte size was fixed at the 8 bit maximum, and addressing at the bit level was replaced by byte addressing.     Since then the term byte has generally meant 8 bits, and it has thus passed into the general vocabulary.     Are there any other terms coined especially for

8712-631: The System/360 led to the ubiquitous adoption of the eight-bit storage size, while in detail the EBCDIC and ASCII encoding schemes are different. In the early 1960s, AT&T introduced digital telephony on long-distance trunk lines . These used the eight-bit μ-law encoding . This large investment promised to reduce transmission costs for eight-bit data. In Volume 1 of The Art of Computer Programming (first published in 1968), Donald Knuth uses byte in his hypothetical MIX computer to denote

8833-591: The binary and decimal definitions of multiples of the byte have generally ended in favor of the manufacturers, with courts holding that the legal definition of gigabyte or GB is 1 GB = 1 000 000 000 (10 ) bytes (the decimal definition), rather than the binary definition (2 , i.e., 1 073 741 824 ). Specifically, the United States District Court for the Northern District of California held that "the U.S. Congress has deemed

8954-551: The byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures . To disambiguate arbitrarily sized bytes from the common 8-bit definition, network protocol documents such as the Internet Protocol ( RFC   791 ) refer to an 8-bit byte as an octet . Those bits in an octet are usually counted with numbering from 0 to 7 or 7 to 0 depending on

9075-452: The capacities of most storage media , particularly hard drives , flash -based storage, and DVDs . Operating systems that use this definition include macOS , iOS , Ubuntu , and Debian . It is also consistent with the other uses of the SI prefixes in computing, such as CPU clock speeds or measures of performance . A system of units based on powers of 2 in which 1 kibibyte (KiB)

9196-453: The computer field which have found their way into general dictionaries of English language?     1956 Summer: Gerrit Blaauw , Fred Brooks , Werner Buchholz , John Cocke and Jim Pomerene join the Stretch team. Lloyd Hunter provides transistor leadership.     1956 July [ sic ]: In a report Werner Buchholz lists the advantages of

9317-635: The customary convention is used by the Microsoft Windows operating system and random-access memory capacity, such as main memory and CPU cache size, and in marketing and billing by telecommunication companies, such as Vodafone , AT&T , Orange and Telstra . For storage capacity, the customary convention was used by macOS and iOS through Mac OS X 10.6 Snow Leopard and iOS 10, after which they switched to units based on powers of 10. Various computer vendors have coined terms for data of various sizes, sometimes with different sizes for

9438-454: The decimal definition of gigabyte to be the 'preferred' one for the purposes of 'U.S. trade and commerce' [...] The California Legislature has likewise adopted the decimal system for all 'transactions in this state. ' " Earlier lawsuits had ended in settlement with no court ruling on the question, such as a lawsuit against drive manufacturer Western Digital . Western Digital settled the challenge and added explicit disclaimers to products that

9559-422: The former sense of the word, harking back to the days when bytes were not yet standardized." The development of eight-bit microprocessors in the 1970s popularized this storage size. Microprocessors such as the Intel 8080 , the direct predecessor of the 8086 , could also perform a small number of operations on the four-bit pairs in a byte, such as the decimal-add-adjust (DAA) instruction. A four-bit quantity

9680-469: The implementation of error tables in Microsoft 's Altair BASIC , where interleaved instructions mutually shared their instruction bytes. The technique is rarely used today, but might still be necessary to resort to in areas where extreme optimization for size is necessary on byte-level such as in the implementation of boot loaders which have to fit into boot sectors . It is also sometimes used as

9801-566: The input and output. However, the LINK Computer can be equipped to edit out these gaps and to permit handling of bytes which are split between words. [...]     [...] The maximum input-output byte size for serial operation will now be 8 bits, not counting any error detection and correction bits. Thus, the Exchange will operate on an 8-bit byte basis, and any input-output units with less than 8 bits per byte will leave

9922-511: The instruction set is specific to a class of processors using (mostly) the same architecture . Successor or derivative processor designs often include instructions of a predecessor and may add new additional instructions. Occasionally, a successor design will discontinue or alter the meaning of some instruction code (typically because it is needed for new purposes), affecting code compatibility to some extent; even compatible processors may show slightly different behavior for some instructions, but this

10043-428: The instruction. It is a deliberate respelling of bite to avoid accidental mutation to bit . Another origin of byte for bit groups smaller than a computer's word size, and in particular groups of four bits , is on record by Louis G. Dooley, who claimed he coined the term while working with Jules Schwartz and Dick Beeler on an air defense system called SAGE at MIT Lincoln Laboratory in 1956 or 1957, which

10164-418: The integral data type unsigned char must hold at least 256 different values, and is represented by at least eight bits (clause 5.2.4.2.1). Various implementations of C and C++ reserve 8, 9, 16, 32, or 36 bits for the storage of a byte. In addition, the C and C++ standards require that there are no gaps between two bytes. This means every bit in memory is part of a byte. Java's primitive data type byte

10285-465: The last, of IBM's second-generation transistorized computers to be developed).     The first reference found in the files was contained in an internal memo written in June 1956 during the early days of developing Stretch . A byte was described as consisting of any number of parallel bits from one to six. Thus a byte was assumed to have a length appropriate for the occasion. Its first use

10406-526: The machine code of the architecture is implemented by an even more fundamental underlying layer called microcode , providing a common machine language interface across a line or family of different models of computer with widely different underlying dataflows . This is done to facilitate porting of machine language programs between different models. An example of this use is the IBM System/360 family of computers and their successors. Machine code

10527-487: The machine code to have information about the source code encoded within. The information includes a symbol table that contains debug symbols . The symbol table may be stored within the executable, or it may exist in separate files. A debugger can then read the symbol table to help the programmer interactively debug the machine code in execution . Mebibyte The byte is a unit of digital information that most commonly consists of eight bits . Historically,

10648-626: The mainstream PC market in the form of x86-64 processors and the PowerPC G5 . A 64-bit register can hold any of 2 (over 18 quintillion or 1.8×10) different values. The range of integer values that can be stored in 64 bits depends on the integer representation used. With the two most common representations, the range is 0 through 18,446,744,073,709,551,615 (equal to 2 − 1) for representation as an ( unsigned ) binary number , and −9,223,372,036,854,775,808 (−2) through 9,223,372,036,854,775,807 (2 − 1) for representation as two's complement . Hence,

10769-542: The mid-1990s, HAL Computer Systems , Sun Microsystems , IBM , Silicon Graphics , and Hewlett-Packard had developed 64-bit architectures for their workstation and server systems. A notable exception to this trend were mainframes from IBM, which then used 32-bit data and 31-bit address sizes; the IBM mainframes did not include 64-bit processors until 2000. During the 1990s, several low-cost 64-bit microprocessors were used in consumer electronics and embedded applications. Notably,

10890-409: The norm until the early 1990s, when the continual reductions in the cost of memory led to installations with amounts of RAM approaching 4 GB, and the use of virtual memory spaces exceeding the 4 GB ceiling became desirable for handling certain types of problems. In response, MIPS and DEC developed 64-bit microprocessor architectures, initially for high-end workstation and server machines. By

11011-455: The number of words transmitted to or from an input-output unit in response to a single input-output instruction. Block size is a structural property of an input-output unit; it may have been fixed by the design or left to be varied by the program.     [...] Most important, from the point of view of editing, will be the ability to handle any characters or digits, from 1 to 6 bits long.     Figure 2 shows

11132-763: The older 32/48-bit IMPI to the newer 64-bit PowerPC-AS , codenamed Amazon . The IMPI instruction set was quite different from even 32-bit PowerPC, so this transition was even bigger than moving a given instruction set from 32 to 64 bits. On 64-bit hardware with x86-64 architecture (AMD64), most 32-bit operating systems and applications can run with no compatibility issues. While the larger address space of 64-bit architectures makes working with large data sets in applications such as digital video , scientific computing, and large databases easier, there has been considerable debate on whether they or their 32-bit compatibility modes will be faster than comparably priced 32-bit systems for other tasks. A compiled Java program can run on

11253-412: The other four index registers. The effective address is normally Y-C(T), where C(T) is either 0 for a tag of 0, the logical or of the selected index regisrs in multiple tag mode or the selected index register if not in multiple tag mode. However, the effective address for index register control instructions is just Y. A flag with both bits 1 selects indirect addressing; the indirect address word has both

11374-428: The other types of registers cannot. The size of these registers therefore normally limits the amount of directly addressable memory, even if there are registers, such as floating-point registers, that are wider. Most high performance 32-bit and 64-bit processors (some notable exceptions are older or embedded ARM architecture (ARM) and 32-bit MIPS architecture (MIPS) CPUs) have integrated floating point hardware, which

11495-529: The particular architecture and type of instruction. Most instructions have one or more opcode fields that specify the basic instruction type (such as arithmetic, logical, jump , etc.), the operation (such as add or compare), and other fields that may give the type of the operand (s), the addressing mode (s), the addressing offset(s) or index, or the operand value itself (such constant operands contained in an instruction are called immediate ). Not all machines or individual instructions have explicit operands. On

11616-460: The physical or logical control of data flow over the transmission media. During the early 1960s, while also active in ASCII standardization, IBM simultaneously introduced in its product line of System/360 the eight-bit Extended Binary Coded Decimal Interchange Code (EBCDIC), an expansion of their six-bit binary-coded decimal (BCDIC) representations used in earlier card punches. The prominence of

11737-419: The physical size of the memory chips), but AMD envisioned large servers, shared memory clusters, and other uses of physical address space that might approach this in the foreseeable future. Thus the 52-bit physical address provides ample room for expansion while not incurring the cost of implementing full 64-bit physical addresses. Similarly, the 48-bit virtual address space was designed to provide 65,536 (2) times

11858-657: The point of view of a process , the code space is the part of its address space where the code in execution is stored. In multitasking systems this comprises the program's code segment and usually shared libraries . In multi-threading environment, different threads of one process share code space along with data space, which reduces the overhead of context switching considerably as compared to process switching. Various tools and methods exist to decode machine code back to its corresponding source code . Machine code can easily be decoded back to its corresponding assembly language source code because assembly language forms

11979-504: The point of view of the CPU, machine code is stored in RAM, but is typically also kept in a set of caches for performance reasons. There may be different caches for instructions and data, depending on the architecture. The CPU knows what machine code to execute, based on its internal program counter. The program counter points to a memory address and is changed based on special instructions which may cause programmatic branches. The program counter

12100-458: The potential ambiguity of the term "byte". The symbol for octet, 'o', also conveniently eliminates the ambiguity in the symbol 'B' between byte and bel . The term byte was coined by Werner Buchholz in June 1956, during the early design phase for the IBM Stretch computer, which had addressing to the bit and variable field length (VFL) instructions with a byte size encoded in

12221-453: The prefix kilo as 1000 (10 ); other systems are based on powers of 2 . Nomenclature for these systems has led to confusion. Systems based on powers of 10 use standard SI prefixes ( kilo , mega , giga , ...) and their corresponding symbols (k, M, G, ...). Systems based on powers of 2, however, might use binary prefixes ( kibi , mebi , gibi , ...) and their corresponding symbols (Ki, Mi, Gi, ...) or they might use

12342-525: The prefixes K, M, and G, creating ambiguity when the prefixes M or G are used. While the difference between the decimal and binary interpretations is relatively small for the kilobyte (about 2% smaller than the kibibyte), the systems deviate increasingly as units grow larger (the relative deviation grows by 2.4% for each three orders of magnitude). For example, a power-of-10-based terabyte is about 9% smaller than power-of-2-based tebibyte. Definition of prefixes using powers of 10—in which 1 kilobyte (symbol kB)

12463-412: The registers 1 and 2 and placing the result in register 6 is encoded: Load a value into register 8, taken from the memory cell 68 cells after the location listed in register 3: Jumping to the address 1024: On processor architectures with variable-length instruction sets (such as Intel 's x86 processor family) it is, within the limits of the control-flow resynchronizing phenomenon known as

12584-577: The registers, even larger (the 32-bit Pentium had a 64-bit data bus, for instance). Processor registers are typically divided into several groups: integer , floating-point , single instruction, multiple data (SIMD), control , and often special registers for address arithmetic which may have various uses and names such as address , index , or base registers . However, in modern designs, these functions are often performed by more general purpose integer registers. In most processors, only integer or address-registers can be used to address data in memory;

12705-547: The remaining unsupported bits are zero (to support compatibility on future processors). Alpha 21064 supported 43 bits of virtual memory address space (8 TB) and 34 bits of physical memory address space (16 GB). Alpha 21164 supported 43 bits of virtual memory address space (8 TB) and 40 bits of physical memory address space (1 TB). Alpha 21264 supported user-configurable 43 or 48 bits of virtual memory address space (8 TB or 256 TB) and 44 bits of physical memory address space (16 TB). A change from

12826-416: The result of a constant expression freed up by replacing it by that constant) and other code enhancements. A much more human-friendly rendition of machine language, named assembly language , uses mnemonic codes to refer to machine code instructions, rather than using the instructions' numeric values directly, and uses symbolic names to refer to storage locations and sometimes registers . For example, on

12947-478: The same architecture of 32 bits can execute code written for the 32-bit versions natively, with no performance penalty. This kind of support is commonly called bi-arch support or more generally multi-arch support . Machine code In computer programming , machine code is computer code consisting of machine language instructions , which are used to control a computer's central processing unit (CPU). For conventional binary computers , machine code

13068-483: The same length. This is not necessarily true on 64-bit machines. Mixing data types in programming languages such as C and its descendants such as C++ and Objective-C may thus work on 32-bit implementations but not on 64-bit implementations. In many programming environments for C and C-derived languages on 64-bit machines, int variables are still 32 bits wide, but long integers and pointers are 64 bits wide. These are described as having an LP64 data model , which

13189-450: The same term even within a single vendor. These terms include double word , half word , long word , quad word , slab , superword and syllable . There are also informal terms. e.g., half byte and nybble for 4 bits, octal K for 1000 8 . Contemporary computer memory has a binary architecture making a definition of memory units based on powers of 2 most practical. The use of the metric prefix kilo for binary multiples arose as

13310-447: The size of data structures containing pointers, at the cost of a much smaller address space, a good choice for some embedded systems. For instruction sets such as x86 and ARM in which the 64-bit version of the instruction set has more registers than does the 32-bit version, it provides access to the additional registers without the space penalty. It is common in 64-bit RISC machines, explored in x86 as x32 ABI , and has recently been used in

13431-443: The tag subtracts the logical or of the selected index registers and loading with multiple 1 bits in the tag loads all of the selected index registers. The 7094 and 7094 II have seven index registers, but when they are powered on they are in multiple tag mode , in which they use only the three of the index registers in a fashion compatible with earlier machines, and require a Leave Multiple Tag Mode ( LMTM ) instruction in order to access

13552-535: The term is unclear, but it can be found in British, Dutch, and German sources of the 1960s and 1970s, and throughout the documentation of Philips mainframe computers. The unit symbol for the byte is specified in IEC 80000-13 , IEEE 1541 and the Metric Interchange Format as the upper-case character B. In the International System of Quantities (ISQ), B is also the symbol of the bel ,

13673-724: The twenty-first century. In this era, bit groupings in the instruction stream were often referred to as syllables or slab , before the term byte became common. The modern de facto standard of eight bits, as documented in ISO/IEC 2382-1:1993, is a convenient power of two permitting the binary-encoded values 0 through 255 for one byte, as 2 to the power of 8 is 256. The international standard IEC 80000-13 codified this common meaning. Many types of applications use information representable in eight or fewer bits and processor designers commonly optimize for this usage. The popularity of major commercial computing architectures has aided in

13794-532: The ubiquitous acceptance of the 8-bit byte. Modern architectures typically use 32- or 64-bit words, built of four or eight bytes, respectively. The unit symbol for the byte was designated as the upper-case letter B by the International Electrotechnical Commission (IEC) and Institute of Electrical and Electronics Engineers (IEEE). Internationally, the unit octet explicitly defines a sequence of eight bits, eliminating

13915-425: The usable capacity may differ from the advertised capacity. Seagate was sued on similar grounds and also settled. Many programming languages define the data type byte . The C and C++ programming languages define byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment" (clause 3.6 of the C standard). The C standard requires that

14036-406: The x86 architecture writes values into four implicit destination registers. This distinction between explicit and implicit operands is important in code generators, especially in the register allocation and live range tracking parts. A good code optimizer can track implicit and explicit operands which may allow more frequent constant propagation , constant folding of registers (a register assigned

14157-404: Was advertised as "110 Kbyte", using the 1000 convention. Likewise, the 8-inch DEC RX01 floppy (1975) held 256 256 bytes formatted, and was advertised as "256k". Some devices were advertised using a mixture of the two definitions: most notably, floppy disks advertised as "1.44 MB" have an actual capacity of 1440 KiB , the equivalent of 1.47 MB or 1.41 MiB. In 1995,

14278-523: Was in the context of the input-output equipment of the 1950s, which handled six bits at a time. The possibility of going to 8-bit bytes was considered in August 1956 and incorporated in the design of Stretch shortly thereafter .     The first published reference to the term occurred in 1959 in a paper ' Processing Data in Bits and Pieces ' by G A Blaauw , F P Brooks Jr and W Buchholz in

14399-530: Was jointly developed by Rand , MIT, and IBM. Later on, Schwartz's language JOVIAL actually used the term, but the author recalled vaguely that it was derived from AN/FSQ-31 . Early computers used a variety of four-bit binary-coded decimal (BCD) representations and the six-bit codes for printable graphic patterns common in the U.S. Army ( FIELDATA ) and Navy . These representations included alphanumeric characters and special graphical symbols. These sets were expanded in 1963 to seven bits of coding, called

14520-416: Was so far beyond the typical amounts (4 MiB) in installations, that this was considered to be enough headroom for addressing. 4.29 billion addresses were considered an appropriate size to work with for another important reason: 4.29 billion integers are enough to assign unique references to most entities in applications like databases . Some supercomputer architectures of the 1970s and 1980s, such as

14641-473: Was working on IBM's Project Stretch in the mid 1950s. His letter tells the story.     Not being a regular reader of your magazine, I heard about the question in the November 1976 issue regarding the origin of the term "byte" from a colleague who knew that I had perpetrated this piece of jargon [see page 77 of November 1976 BYTE, "Olde Englishe"] . I searched my files and could not locate

#432567