Misplaced Pages

General-purpose computing on graphics processing units

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

A graphics processing unit (GPU) is a specialized electronic circuit initially designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles. After their initial design, GPUs were found to be useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. Other non-graphical uses include the training of neural networks and cryptocurrency mining.

133-412: General-purpose computing on graphics processing units (GPGPU, or less often GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes

266-436: A Shader Model standard, to help rank the various features of graphic cards into a simple Shader Model version number (1.0, 2.0, 3.0, etc.). Pre-DirectX 9 video cards only supported paletted or integer color types. Sometimes another alpha value is added, to be used for transparency. Common formats are: For early fixed-function or limited programmability graphics (i.e., up to and including DirectX 8.1-compliant GPUs) this

399-565: A central processing unit (CPU) to a graphics processing unit (GPU), then to a display device . As time progressed, however, it became valuable for GPUs to store at first simple, then complex structures of data to be passed back to the CPU that analyzed an image, or a set of scientific-data represented as a 2D or 3D format that a video card can understand. Because the GPU has access to every draw operation, it can analyze data in these forms quickly, whereas

532-420: A fat binary. The use of different toolsets may not be enough to build working executables for different platforms. In this case, programmers must port the source code to the new platform. For example, an application such as Firefox, which already runs on Windows on the x86 family, can be modified and re-built to run on Linux on the x86 (and potentially other architectures) as well. The multiple versions of

665-625: A functionally complete set of logic operators. In 1987, Conway's Game of Life became one of the first examples of general-purpose computing using an early stream processor called a blitter to invoke a special sequence of logical operations on bit vectors. General-purpose computing on GPUs became more practical and popular after about 2001, with the advent of both programmable shaders and floating point support on graphics processors. Notably, problems involving matrices and/or vectors  – especially two-, three-, or four-dimensional vectors – were easy to translate to

798-491: A personal computer graphics display processor as a single large-scale integration (LSI) integrated circuit chip. This enabled the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology . It became the best-known GPU until the mid-1980s. It was the first fully integrated VLSI (very large-scale integration) metal–oxide–semiconductor ( NMOS ) graphics display processor for PCs, supported up to 1024×1024 resolution , and laid

931-500: A rack ), which adds a third layer – many computing units each using many CPUs to correspond to many GPUs. Some Bitcoin "miners" used such setups for high-quantity processing. Historically, CPUs have used hardware-managed caches , but the earlier GPUs only provided software-managed local memories. However, as GPUs are being increasingly used for general-purpose applications, state-of-the-art GPUs are being designed with hardware-managed multi-level caches which have helped

1064-562: A vector processor ), running compute kernels . This turns the massive computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete (see " Dedicated graphics processing unit " above) GPU designers, AMD and Nvidia , are pursuing this approach with an array of applications. Both Nvidia and AMD teamed with Stanford University to create

1197-407: A CPU must poll every pixel or data element much more slowly, as the speed of access between a CPU and its larger pool of random-access memory (or in an even worse case, a hard drive ) is slower than GPUs and video cards, which typically contain smaller amounts of more expensive memory that is much faster to access. Transferring the portion of the data set to be actively analyzed to that GPU memory in

1330-433: A GPU, which acts with native speed and support on those types. A significant milestone for GPGPU was the year 2003 when two research groups independently discovered GPU-based approaches for the solution of general linear algebra problems on GPUs that ran faster than on CPUs. These early efforts to use GPUs as general-purpose processors required reformulating computational problems in terms of graphics primitives, as supported by

1463-534: A GPU-based client for the Folding@home distributed computing project for protein folding calculations. In certain circumstances, the GPU calculates forty times faster than the CPUs traditionally used by such applications. GPGPUs can be used for many types of embarrassingly parallel tasks including ray tracing . They are generally suited to high-throughput computations that exhibit data-parallelism to exploit

1596-583: A JVM. Java software can be executed by a hardware-based Java processor. This is used mostly in embedded systems. Java code running in the JVM has access to OS-related services, like disk input/output (I/O) and network access, if the appropriate privileges are granted. The JVM makes the system calls on behalf of the Java application. This lets users decide the appropriate protection level, depending on an access-control list (ACL). For example, disk and network access

1729-507: A Vérité V2200 core to create a graphics card with a full T&L engine years before Nvidia's GeForce 256; this card, designed to reduce the load placed upon the system's CPU, never made it to market. The NVIDIA RIVA 128 was one of the first consumer-facing GPUs to integrate a 3D processing unit and a 2D processing unit on a single chip. OpenGL was introduced in the early '90s by SGI as a professional graphics API, with proprietary hardware support for 3D rasterization. In 1994 Microsoft acquired Softimage,

1862-442: A client/web-server architecture. The distinction between traditional and web applications is not always clear. Features, installation methods and architectures for web and traditional applications overlap and blur the distinction. Nevertheless, this simplifying distinction is a common and useful generalization. Traditional application software has been distributed as binary files, especially executable files . Executables only support

1995-469: A concern—except to invoke the pixel shader). Nvidia's CUDA platform, first introduced in 2007, was the earliest widely adopted programming model for GPU computing. OpenCL is an open standard defined by the Khronos Group that allows for the development of code for both GPUs and CPUs with an emphasis on portability. OpenCL solutions are supported by Intel, AMD, Nvidia, and ARM, and according to

2128-560: A development machine for Capcom 's CP System arcade board. Fujitsu's FM Towns computer, released in 1989, had support for a 16,777,216 color palette. In 1988, the first dedicated polygonal 3D graphics boards were introduced in arcades with the Namco System 21 and Taito Air System. IBM introduced its proprietary Video Graphics Array (VGA) display standard in 1987, with a maximum resolution of 640×480 pixels. In November 1988, NEC Home Electronics announced its creation of

2261-483: A game with the intention of release on the latest Nintendo and Sony game consoles. Should Disney license the game with Sony first, it may be required to release the game solely on Sony's console for a short time or indefinitely . Several developers have implemented ways to play games online while using different platforms. Psyonix , Epic Games , Microsoft , and Valve all possess technology that allows Xbox 360 and PlayStation 3 gamers to play with PC gamers, leaving

2394-657: A highly customizable function block and did not really "run" a program. Many of these disparities between vertex and pixel shading were not addressed until the Unified Shader Model . In October 2002, with the introduction of the ATI Radeon 9700 (also known as R300), the world's first Direct3D 9.0 accelerator, pixel and vertex shaders could implement looping and lengthy floating point math, and were quickly becoming as flexible as CPUs, yet orders of magnitude faster for image-array operations. Pixel shading

2527-466: A number of brand names. In 2009, Intel , Nvidia , and AMD / ATI were the market share leaders, with 49.4%, 27.8%, and 20.6% market share respectively. In addition, Matrox produces GPUs. Modern smartphones use mostly Adreno GPUs from Qualcomm , PowerVR GPUs from Imagination Technologies , and Mali GPUs from ARM . Modern GPUs have traditionally used most of their transistors to do calculations related to 3D computer graphics . In addition to

2660-645: A particular platform: either the hardware, OS, or virtual machine (VM) it runs on. For example, the Java platform is a common VM platform which runs on many OSs and hardware types. A hardware platform can refer to an instruction set architecture, for example ARM or the x86 architecture. These machines can run different operating systems. Smartphones and tablets generally run on the ARM architecture; these often run Android, iOS, or other mobile operating systems. A software platform can be either an operating system (OS) or programming environment, though more commonly it

2793-446: A processed image representing outlines to a computer vision program controlling, say, a mobile robot. Because the GPU has fast and local hardware access to every pixel or other picture element in an image, it can analyze and average it (for the first example) or apply a Sobel edge filter or other convolution filter (for the second) with much greater speed than a CPU, which typically must access slower random-access memory copies of
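The Sobel edge filter mentioned above can be illustrated with a small CPU-side sketch. This is not GPU code; it only shows the per-pixel arithmetic that a GPU would run for every pixel in parallel. The function names, the choice to leave border pixels at zero, and the test image are assumptions made for illustration.

```python
import math
from typing import List

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(image: List[List[float]], kernel: List[List[int]]) -> List[List[float]]:
    """Apply a 3x3 filter to every interior pixel (border pixels stay 0).

    The kernel is applied in correlation form (no flip); for Sobel this
    only changes the sign of the response, not its magnitude.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out


def sobel_magnitude(image: List[List[float]]) -> List[List[float]]:
    """Combine both gradients into an edge-strength value per pixel."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 10.0, 10.0] for _ in range(4)]
edges = sobel_magnitude(image)
```

Each output pixel depends only on a fixed neighbourhood of the input, which is why this kind of filter maps well onto hardware that processes many pixels at once.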

2926-554: A properly crafted series of arithmetic/bit operations, but looping and conditional branching were not possible. Recent GPUs allow branching, but usually with a performance penalty. Branching should generally be avoided in inner loops, whether in CPU or GPU code, and various methods, such as static branch resolution, pre-computation, predication, loop splitting, and Z-cull can be used to achieve branching when hardware support does not exist. Graphics processing unit Arcade system boards have used specialized graphics circuits since
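Predication, one of the branch-avoidance methods listed above, can be sketched as follows. The idea is to compute both outcomes and select one arithmetically instead of jumping. The two functions and their arithmetic are hypothetical examples; in Python this brings no speed benefit and only illustrates the transformation.

```python
def with_branch(x: int) -> int:
    """Reference version using a conditional branch."""
    if x >= 0:
        return 2 * x
    return -x


def with_predication(x: int) -> int:
    """Branch-free version: evaluate both sides, then select with a 0/1 mask."""
    mask = int(x >= 0)            # 1 when the condition holds, else 0
    taken = 2 * x                 # result if the condition is true
    not_taken = -x                # result if the condition is false
    return mask * taken + (1 - mask) * not_taken
```

The trade-off is that both sides are always evaluated, so this tends to pay off only when each side is cheap.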

3059-615: A report in 2011 by Evans Data, OpenCL had become the second most popular HPC tool. In 2010, Nvidia partnered with Audi to power their cars' dashboards, using the Tegra GPU to provide increased functionality to cars' navigation and entertainment systems. Advances in GPU technology in cars helped advance self-driving technology. AMD's Radeon HD 6000 series cards were released in 2010, and in 2011 AMD released its 6000M Series discrete GPUs for mobile devices. The Kepler line of graphics cards by Nvidia was released in 2012 and was used in

3192-411: A single physical pool of RAM, allowing more efficient transfer of data. Hybrid GPUs compete with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and Nvidia's TurboCache . Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. They share memory with

3325-536: A single-source domain specific embedded language based on pure C++11. The dominant proprietary framework is Nvidia CUDA . Nvidia launched CUDA in 2006, a software development kit (SDK) and application programming interface (API) that allows using the programming language C to code algorithms for execution on GeForce 8 series and later GPUs. ROCm , launched in 2016, is AMD's open-source response to CUDA. It is, as of 2022, on par with CUDA with regards to features, and still lacking in consumer support. OpenVIDIA

3458-522: A specific use, real-time 3D graphics, or other mass calculations: Dedicated graphics processing units use RAM that is dedicated to the GPU rather than relying on the computer's main system memory. This RAM is usually specially selected for the expected serial workload of the graphics card (see GDDR). Sometimes systems with dedicated discrete GPUs were called "DIS" systems as opposed to "UMA" systems (see next section). Dedicated GPUs are not necessarily removable, nor do they necessarily interface with

3591-443: A third. While this is straightforward, compared to developing for only one platform it can cost much more to pay a larger team or release products more slowly. It can also result in more bugs to be tracked and fixed. Another approach is to use software that hides the differences between the platforms. This abstraction layer insulates the application from the platform. Such applications are platform agnostic . Applications that run on

3724-527: A time-consuming task because different OSs have different application programming interfaces (API). Software written for one OS may not automatically work on all architectures that OS supports. Just because software is written in a popular programming language such as C or C++ , it does not mean it will run on all OSs that support that language—or even on different versions of the same OS. Web applications are typically described as cross-platform because, ideally, they are accessible from any web browser :

3857-781: A variety of computational resources available on the GPU: In fact, a program can substitute a write only texture for output instead of the framebuffer. This is done either through Render to Texture (RTT), Render-To-Backbuffer-Copy-To-Texture (RTBCTT), or the more recent stream-out. The most common form for a stream to take in GPGPU is a 2D grid because this fits naturally with the rendering model built into GPUs. Many computations naturally map into grids: matrix algebra, image processing, physically based simulation, and so on. Since textures are used as memory, texture lookups are then used as memory reads. Certain operations can be done automatically by
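The grid-shaped stream and the use of texture lookups as memory reads can be sketched on the CPU as a single pass that reads from one grid and writes to a separate one, much as a render-to-texture pass would. The averaging rule, the clamp-to-edge lookup, and all names here are assumptions chosen for illustration.

```python
from typing import List

Grid = List[List[float]]


def texel(texture: Grid, x: int, y: int) -> float:
    """Read one element, clamping coordinates to the edge of the grid."""
    h, w = len(texture), len(texture[0])
    x = min(max(x, 0), w - 1)
    y = min(max(y, 0), h - 1)
    return texture[y][x]


def diffuse_pass(src: Grid) -> Grid:
    """One pass: each output cell is the mean of its four neighbours in src.

    The input grid is only read and a new output grid is written,
    so no buffer is both readable and writable within a pass.
    """
    h, w = len(src), len(src[0])
    dst = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dst[y][x] = 0.25 * (texel(src, x - 1, y) + texel(src, x + 1, y)
                                + texel(src, x, y - 1) + texel(src, x, y + 1))
    return dst


grid = [[0.0, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 0.0]]
after_one = diffuse_pass(grid)
```

Running several passes means feeding each output grid back in as the next input, which is the usual way to iterate when a pass cannot read and write the same buffer.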

3990-603: A variety of imitators: by 1995, all major PC graphics chip makers had added 2D acceleration support to their chips. Fixed-function Windows accelerators surpassed expensive general-purpose graphics coprocessors in Windows performance, and such coprocessors faded from the PC market. Throughout the 1990s, 2D GUI acceleration evolved. As manufacturing capabilities improved, so did the level of integration of graphics chips. Additional application programming interfaces (APIs) arrived for

4123-538: A variety of tasks, such as Microsoft's WinG graphics library for Windows 3.x , and their later DirectDraw interface for hardware acceleration of 2D games in Windows 95 and later. In the early- and mid-1990s, real-time 3D graphics became increasingly common in arcade, computer, and console games, which led to increasing public demand for hardware-accelerated 3D graphics. Early examples of mass-market 3D graphics hardware can be found in arcade system boards such as

4256-469: A workaround for this problem. Tools such as the Page Object Model allow cross-platform tests to be scripted so that one test case covers multiple versions of an app. If different versions have similar user interfaces, all can be tested with one test case. Web applications are becoming increasingly popular but many computer users still use traditional application software which does not rely on

4389-700: Is IBM PowerVM Lx86 , which allows Linux/x86 applications to run unmodified on the Linux/Power OS. Example of cross-platform binary software: A script can be considered to be cross-platform if its interpreter is available on multiple platforms and the script only uses the facilities built into the language. For example, a script written in Python for a Unix-like system will likely run with little or no modification on Windows, because Python also runs on Windows; indeed there are many implementations (e.g. IronPython for .NET Framework ). The same goes for many of

4522-418: Is a combination of both. An exception is Java , which uses an OS-independent virtual machine (VM) to execute Java bytecode . Some software platforms are: The Java language is typically compiled to run on a VM that is part of the Java platform. The Java virtual machine (Java VM, JVM) is a CPU implemented in software, which runs all Java code. This enables the same code to run on all systems that implement

4655-516: Is a term that can also apply to video games released on a range of video game consoles . Examples of cross-platform games include: Miner 2049er , Tomb Raider: Legend , FIFA series , NHL series and Minecraft . Each has been released across a variety of gaming platforms, such as the Wii , PlayStation 3 , Xbox 360 , personal computers , and mobile devices . Some platforms are harder to write for than others, requiring more time to develop

4788-416: Is also increasing over different GPU generations, e.g., the total register file size on Maxwell (GM200), Pascal and Volta GPUs are 6 MiB, 14 MiB and 20 MiB, respectively. By comparison, the size of a register file on CPUs is small, typically tens or hundreds of kilobytes. The high performance of GPUs comes at the cost of high power consumption, which under full load is in fact as much power as

4921-430: Is available on Windows, macOS (both PowerPC and x86 through what Apple Inc. calls a Universal binary ), Linux, and BSD on multiple computer architectures. The four platforms (in this case, Windows, macOS, Linux, and BSD) are separate executable distributions, although they come largely from the same source code . In rare cases, executable code built for several platforms is combined into a single executable file called

5054-463: Is available on many Android devices, but is not officially supported by Android. Apple introduced the proprietary Metal API for iOS applications, able to execute arbitrary code through Apple's GPU compute shaders. Computer video cards are produced by various vendors, such as Nvidia and AMD. Cards from such vendors differ on implementing data-format support, such as integer and floating-point formats (32-bit and 64-bit). Microsoft introduced

5187-712: Is commonly referred to as "GPU accelerated video decoding", "GPU assisted video decoding", "GPU hardware accelerated video decoding", or "GPU hardware assisted video decoding". Recent graphics cards decode high-definition video on the card, offloading the central processing unit. The most common APIs for GPU accelerated video decoding are DxVA for Microsoft Windows operating systems and VDPAU , VAAPI , XvMC , and XvBA for Linux-based and UNIX-like operating systems. All except XvMC are capable of decoding videos encoded with MPEG-1 , MPEG-2 , MPEG-4 ASP (MPEG-4 Part 2) , MPEG-4 AVC (H.264 / DivX 6), VC-1 , WMV3 / WMV9 , Xvid / OpenDivX (DivX 4), and DivX 5 codecs , while XvMC

5320-736: Is not available. Technologies such as Scan-Line Interleave by 3dfx, SLI and NVLink by Nvidia and CrossFire by AMD allow multiple GPUs to draw images simultaneously for a single screen, increasing the processing power available for graphics. These technologies, however, are increasingly uncommon; most games do not fully use multiple GPUs, as most users cannot afford them. Multiple GPUs are still used on supercomputers (like in Summit ), on workstations to accelerate video (processing multiple videos at once) and 3D rendering, for VFX , GPGPU workloads and for simulations, and in AI to expedite training, as

5453-708: Is often used for bump mapping , which adds texture to make an object look shiny, dull, rough, or even round or extruded. With the introduction of the Nvidia GeForce 8 series and new generic stream processing units, GPUs became more generalized computing devices. Parallel GPUs are making computational inroads against the CPU, and a subfield of research, dubbed GPU computing or GPGPU for general purpose computing on GPU , has found applications in fields as diverse as machine learning , oil exploration , scientific image processing , linear algebra , statistics , 3D reconstruction , and stock options pricing. GPGPU

5586-518: Is only capable of decoding MPEG-1 and MPEG-2. There are several dedicated hardware video decoding and encoding solutions. Video decoding processes that can be accelerated by modern GPU hardware are: These operations also have applications in video editing, encoding, and transcoding. An earlier GPU may support one or more 2D graphics APIs for 2D acceleration, such as GDI and DirectDraw. A GPU can support one or more 3D graphics APIs, such as DirectX, Metal, OpenGL, OpenGL ES, and Vulkan. In

5719-492: Is permissible to have multiple inputs and multiple outputs, but never a piece of memory that is both readable and writable. Arithmetic intensity is defined as the number of operations performed per word of memory transferred. It is important for GPGPU applications to have high arithmetic intensity; otherwise, memory access latency will limit computational speedup. Ideal GPGPU applications have large data sets, high parallelism, and minimal dependency between data elements. There are
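The definition of arithmetic intensity above can be made concrete with a short calculation. The two workloads, SAXPY and a dense matrix multiply, and their operation and word counts are illustrative assumptions not taken from the text; the matrix figure assumes ideal data reuse, which real kernels only approximate.

```python
def arithmetic_intensity(operations: int, words_transferred: int) -> float:
    """Operations performed per word of memory transferred."""
    if words_transferred <= 0:
        raise ValueError("words_transferred must be positive")
    return operations / words_transferred


n = 1024

# SAXPY (y = a*x + y) on n elements: one multiply and one add per element,
# reading x and y and writing y, so 2n operations for 3n words.
saxpy_intensity = arithmetic_intensity(2 * n, 3 * n)

# Dense n-by-n matrix multiply: about 2n^3 operations; with ideal reuse,
# A and B are read once and C written once, so 3n^2 words.
matmul_intensity = arithmetic_intensity(2 * n**3, 3 * n**2)
```

Under these assumptions SAXPY stays below one operation per word regardless of `n`, while the matrix multiply grows with `n`, which suggests why the latter is usually the better fit for a GPU.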

5852-448: Is separation of functionality, which disables functionality not supported by browsers or OSs, while still delivering a complete application to the user. (See also: Separation of concerns .) This technique is used in web development where interpreted code (as in scripting languages) can query the platform it is running on to execute different blocks conditionally. Third-party libraries attempt to simplify cross-platform capability by hiding
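A minimal sketch of interpreted code querying the platform it runs on and choosing a block conditionally is given below. `sys.platform` is part of the Python standard library; the function name and the directory conventions returned are assumptions made for illustration.

```python
import sys


def config_dir_for(platform: str) -> str:
    """Pick a conventional per-user settings location for a platform string."""
    if platform.startswith("win"):
        return "%APPDATA%"
    if platform == "darwin":
        return "~/Library/Application Support"
    # Fall back to the common convention on Linux and other Unix-like systems.
    return "~/.config"


# Query the platform the interpreter is actually running on.
current_config_dir = config_dir_for(sys.platform)
```

Passing the platform string in as a parameter, rather than reading it inside the function, keeps every branch testable from a single machine.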

5985-463: Is simply a set of records that require similar computation. Streams provide data parallelism. Kernels are the functions that are applied to each element in the stream. In the GPUs, vertices and fragments are the elements in streams and vertex and fragment shaders are the kernels to be run on them. For each element we can only read from the input, perform operations on it, and write to the output. It
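The stream and kernel model described above can be sketched on the CPU as a function applied independently to each record. This is only an illustration of the programming model, not GPU code; the helper `run_kernel` and the example kernel are hypothetical.

```python
from typing import Callable, List, Sequence


def run_kernel(kernel: Callable[[float], float], stream: Sequence[float]) -> List[float]:
    """Apply `kernel` to every element of `stream` independently.

    The input is only read and the result goes to a separate output list,
    mirroring the read-input, write-output rule for kernels.
    """
    return [kernel(element) for element in stream]


def scale_and_offset(x: float) -> float:
    # Hypothetical kernel: the same computation for every record.
    return 2.0 * x + 1.0


stream_in = [0.0, 1.0, 2.0, 3.0]
stream_out = run_kernel(scale_and_offset, stream_in)
```

Because no element depends on another, the iterations could run in any order or all at once, which is the data parallelism that streams provide.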

6118-687: Is the Super FX chip, a RISC -based on-cartridge graphics chip used in some SNES games, notably Doom and Star Fox . Some systems used DSPs to accelerate transformations. Fujitsu , which worked on the Sega Model 2 arcade system, began working on integrating T&L into a single LSI solution for use in home computers in 1995; the Fujitsu Pinolite, the first 3D geometry processor for personal computers, released in 1997. The first hardware T&L GPU on home video game consoles

6251-457: Is the case with Nvidia's lineup of DGX workstations and servers, Tesla GPUs, and Intel's Ponte Vecchio GPUs. Integrated graphics processing units (IGPU), integrated graphics , shared graphics solutions , integrated graphics processors (IGP), or unified memory architectures (UMA) use a portion of a computer's system RAM rather than dedicated graphics memory. IGPs can be integrated onto a motherboard as part of its northbridge chipset, or on

6384-408: Is the dominant open general-purpose GPU computing language, and is an open standard defined by the Khronos Group . OpenCL provides a cross-platform GPGPU platform that additionally supports data parallel compute on CPUs. OpenCL is actively supported on Intel, AMD, Nvidia, and ARM platforms. The Khronos Group has also standardised and implemented SYCL , a higher-level programming model for OpenCL as

6517-547: Is the practice of deliberately writing software to work on more than one platform. There are different ways to write a cross-platform application. One approach is to create multiple versions of the same software in different source trees —in other words, the Microsoft Windows version of an application might have one set of source code files and the Macintosh version another, while a FOSS *nix system might have

6650-641: Is used in complex graphics pipelines as well as scientific computing; more so in fields with large data sets like genome mapping, or where two- or three-dimensional analysis is useful – especially at present biomolecule analysis, protein study, and other complex organic chemistry. An example of such applications is NVIDIA's software suite for genome analysis. Such pipelines can also vastly improve efficiency in image processing and computer vision, among other fields; as well as parallel processing generally. Some very heavily optimized pipelines have yielded speed increases of several hundred times

6783-411: Is used with this technique. Cross-platform applications need much more integration testing . Some web browsers prohibit installation of different versions on the same machine. There are several approaches used to target multiple platforms, but all of them result in software that requires substantial manual effort for testing and maintenance. Techniques such as full virtualization are sometimes used as

6916-428: Is useful in graphics because almost every basic data type is a vector (either 2-, 3-, or 4-dimensional). Examples include vertices, colors, normal vectors, and texture coordinates. Many other applications can put this to good use, and because of their higher performance, vector instructions, termed single instruction, multiple data ( SIMD ), have long been available on CPUs. Originally, data was simply passed one-way from

7049-764: Is usually enabled for desktop applications, but not for browser-based applets . The Java Native Interface (JNI) can also be used to access OS-specific functions, with a loss of portability. Currently, Java Standard Edition software can run on Microsoft Windows, macOS, several Unix-like OSs, and several real-time operating systems for embedded devices. For mobile applications, browser plugins are used for Windows and Mac based devices, and Android has built-in support for Java. There are also subsets of Java, such as Java Card or Java Platform, Micro Edition , designed for resource-constrained devices. For software to be considered cross-platform, it must function on more than one computer architecture or OS. Developing such software can be

7182-586: The DirectCompute GPU computing API, released with the DirectX 11 API. Alea GPU , created by QuantAlea, introduces native GPU computing capabilities for the Microsoft .NET languages F# and C# . Alea GPU also provides a simplified GPU programming model based on GPU parallel-for and parallel aggregate using delegates and automatic memory management. MATLAB supports GPGPU acceleration using

7315-609: The GeForce 256 as "the world's first GPU". It was presented as a "single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines". Rival ATI Technologies coined the term "visual processing unit" or VPU with the release of the Radeon 9700 in 2002. The AMD Alveo MA35D features dual VPUs, each using the 5 nm process in 2023. In personal computers, there are two main forms of GPUs. Each has many synonyms: Most GPUs are designed for

7448-517: The Intel Core line and with contemporary Pentiums and Celerons. This resulted in a large nominal market share, as the majority of computers with an Intel CPU also featured this embedded graphics processor. These generally lagged behind discrete processors in performance. Intel re-entered the discrete GPU market in 2022 with its Arc series, which competed with the then-current GeForce 30 series and Radeon 6000 series cards at competitive prices. In

7581-535: The Parallel Computing Toolbox and MATLAB Distributed Computing Server , and third-party packages like Jacket . GPGPU processing is also used to simulate Newtonian physics by physics engines , and commercial implementations include Havok Physics, FX and PhysX , both of which are typically used for computer and video games . C++ Accelerated Massive Parallelism ( C++ AMP ) is a library that accelerates execution of C++ code by exploiting

7714-465: The PowerVR and the 3dfx Voodoo . However, as manufacturing technology continued to progress, video, 2D GUI acceleration, and 3D functionality were all integrated into one chip. Rendition 's Verite chipsets were among the first to do this well. In 1997, Rendition collaborated with Hercules and Fujitsu on a "Thriller Conspiracy" project which combined a Fujitsu FXG-1 Pinolite geometry processor with

7847-522: The Sega Model 1 , Namco System 22 , and Sega Model 2 , and the fifth-generation video game consoles such as the Saturn , PlayStation , and Nintendo 64 . Arcade systems such as the Sega Model 2 and SGI Onyx -based Namco Magic Edge Hornet Simulator in 1993 were capable of hardware T&L ( transform, clipping, and lighting ) years before appearing in consumer graphics cards. Another early example

7980-616: The Video Electronics Standards Association (VESA) to develop and promote a Super VGA (SVGA) computer display standard as a successor to VGA. Super VGA enabled graphics display resolutions up to 800×600 pixels, about 56% more pixels than VGA's 640×480. In 1991, S3 Graphics introduced the S3 86C911, which its designers named after the Porsche 911 as an indication of the performance increase it promised. The 86C911 spawned

8113-479: The interpreters or run-time packages are common or standard components of all supported platforms. For example, a cross-platform application may run on Linux , macOS and Microsoft Windows . Cross-platform software may run on many platforms, or as few as two. Some frameworks for cross-platform development are Codename One , ArkUI-X, Kivy , Qt , GTK , Flutter , NativeScript , Xamarin , Apache Cordova , Ionic , and React Native . Platform can refer to

8246-412: The motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP). They can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot

8379-422: The open-source scripting languages . Unlike binary executable files, the same script can be used on all computers that have software to interpret the script. This is because the script is generally stored in plain text in a text file . There may be some trivial issues, such as the representation of a new line character . Some popular cross-platform scripting languages are: Cross-platform or multi-platform
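The newline issue mentioned above can be handled with a small normalization step. The function name is hypothetical; it converts Windows (`\r\n`) and classic Mac (`\r`) line endings to the Unix form (`\n`).

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so its CR is not converted a second time.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

Python's text-mode file reading already performs a similar translation by default, so a helper like this matters mainly for text that arrives by other routes, such as bytes decoded by hand.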

8512-414: The rotation and translation of vertices into different coordinate systems . Recent developments in GPUs include support for programmable shaders which can manipulate vertices and textures with many of the same operations that are supported by CPUs , oversampling and interpolation techniques to reduce aliasing , and very high-precision color spaces . Several factors of GPU construction affect

8645-483: The 1970s, the term "GPU" originally stood for graphics processor unit and described a programmable processing unit working independently from the CPU that was responsible for graphics manipulation and output. In 1994, Sony used the term (now standing for graphics processing unit ) in reference to the PlayStation console's Toshiba -designed Sony GPU . The term was popularized by Nvidia in 1999, who marketed

8778-594: The 1970s. In early video game hardware, RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor. A specialized barrel shifter circuit helped the CPU animate the framebuffer graphics for various 1970s arcade video games from Midway and Taito , such as Gun Fight (1975), Sea Wolf (1976), and Space Invaders (1978). The Namco Galaxian arcade system in 1979 used specialized graphics hardware that supported RGB color , multi-colored sprites, and tilemap backgrounds. The Galaxian hardware

8911-598: The 2020s, GPUs have been increasingly used for calculations involving embarrassingly parallel problems, such as training of neural networks on enormous datasets that are needed for large language models . Specialized processing cores on some modern workstation's GPUs are dedicated for deep learning since they have significant FLOPS performance increases, using 4×4 matrix multiplication and division, resulting in hardware performance up to 128 TFLOPS in some applications. These tensor cores are expected to appear in consumer cards, as well. Many companies have produced GPUs under

9044-422: The 3D hardware, today's GPUs include basic 2D acceleration and framebuffer capabilities (usually with a VGA compatibility mode). Newer cards such as AMD/ATI HD5000–HD7000 lack dedicated 2D acceleration; it is emulated by 3D hardware. GPUs were initially used to accelerate the memory-intensive work of texture mapping and rendering polygons. Later, units were added to accelerate geometric calculations such as

9177-460: The APIs used. (See e.g.,) GPUs can only process independent vertices and fragments, but can process many of them in parallel. This is especially effective when the programmer wants to process many vertices or fragments in the same way. In this sense, GPUs are stream processors – processors that can operate in parallel by running one kernel on many records in a stream at once. A stream

9310-698: The CPU for relatively slow system RAM, as it has minimal or no dedicated video memory. IGPs use system memory with bandwidth up to a current maximum of 128 GB/s, whereas a discrete graphics card may have a bandwidth of more than 1000 GB/s between its VRAM and GPU core. This memory bus bandwidth can limit the performance of the GPU, though multi-channel memory can mitigate this deficiency. Older integrated graphics chipsets lacked hardware transform and lighting , but newer ones include it. On systems with "Unified Memory Architecture" (UMA), including modern AMD processors with integrated graphics, modern Intel processors with integrated graphics, Apple processors,

9443-466: The DirectX 9 specification. DirectX 9 Shader Model 2.x suggested the support of two precision types: full and partial precision. Full precision support could either be FP32 or FP24 (floating point 32- or 24-bit per component) or greater, while partial precision was FP16. ATI's Radeon R300 series of GPUs supported FP24 precision only in the programmable fragment pipeline (although FP32 was supported in

9576-563: The GPU because of this. Compute kernels can be thought of as the body of loops . For example, a programmer operating on a grid on the CPU might have code that looks like this: On the GPU, the programmer only specifies the body of the loop as the kernel and what data to loop over by invoking geometry processing. In sequential code it is possible to control the flow of the program using if-then-else statements and various forms of loops. Such flow control structures have only recently been added to GPUs. Conditional writes could be performed using

9709-561: The GPU to scan and analyze it can create a large speedup . GPGPU pipelines were developed at the beginning of the 21st century for graphics processing (e.g. for better shaders ). These pipelines were found to fit scientific computing needs well, and have since been developed in this direction. The most known GPGPUs are Nvidia Tesla that are used for Nvidia DGX , alongside AMD Instinct and Intel Gaudi. In principle, any arbitrary Boolean function , including addition, multiplication, and other mathematical functions, can be built up from

9842-686: The GPUs to move towards mainstream computing. For example, GeForce 200 series GT200 architecture GPUs did not feature an L2 cache, the Fermi GPU has 768 KiB last-level cache, the Kepler GPU has 1.5 MiB last-level cache, the Maxwell GPU has 2 MiB last-level cache, and the Pascal GPU has 4 MiB last-level cache. GPUs have very large register files , which allow them to reduce context-switching latency. Register file size

9975-583: The JVM are built this way. Some applications mix various methods of cross-platform programming to create the final application. An example is the Firefox web browser, which uses abstraction to build some of the lower-level components, with separate source subtrees for implementing platform-specific features (like the GUI), and the implementation of more than one scripting language to ease software portability . Firefox implements XUL , CSS and JavaScript for extending

10108-487: The Nvidia's 600 and 700 series cards. A feature in this GPU microarchitecture included GPU boost, a technology that adjusts the clock-speed of a video card to increase or decrease it according to its power draw. The Kepler microarchitecture was manufactured on the 28 nm process . The PS4 and Xbox One were released in 2013; they both use GPUs based on AMD's Radeon HD 7850 and 7790 . Nvidia's Kepler line of GPUs

10241-568: The PC world, notable failed attempts for low-cost 3D graphics chips included the S3 ViRGE , ATI Rage , and Matrox Mystique . These chips were essentially previous-generation 2D accelerators with 3D features bolted on. Many were pin-compatible with the earlier-generation chips for ease of implementation and minimal cost. Initially, 3D graphics were possible only with discrete boards dedicated to accelerating 3D functions (and lacking 2D graphical user interface (GUI) acceleration entirely) such as

10374-573: The PS5 and Xbox Series (among others), the CPU cores and the GPU block share the same pool of RAM and memory address space. This allows the system to dynamically allocate memory between the CPU cores and the GPU block based on memory needs (without needing a large static split of the RAM) and thanks to zero copy transfers, removes the need for either copying data over a bus (computing) between physically separate RAM pools or copying between separate address spaces on

10507-556: The R9 290X or better at the time of their release. Cards based on the Pascal microarchitecture were released in 2016. The GeForce 10 series of cards are of this generation of graphics cards. They are made using the 16 nm manufacturing process which improves upon previous microarchitectures. Nvidia released one non-consumer card under the new Volta architecture, the Titan V. Changes from

10640-535: The RTX 20 series GPUs that added ray-tracing cores to GPUs, improving their performance on lighting effects. Polaris 11 and Polaris 10 GPUs from AMD are fabricated by a 14 nm process. Their release resulted in a substantial increase in the performance per watt of AMD video cards. AMD also released the Vega GPU series for the high end market as a competitor to Nvidia's high end Pascal cards, also featuring HBM2 like

10773-560: The RX 6800, RX 6800 XT, and RX 6900 XT. The RX 6700 XT, which is based on Navi 22, was launched in early 2021. The PlayStation 5 and Xbox Series X and Series S were released in 2020; they both use GPUs based on the RDNA 2 microarchitecture with incremental improvements and different GPU configurations in each system's implementation. Intel first entered the GPU market in the late 1990s, but produced lackluster 3D accelerators compared to

10906-608: The Titan V. In 2019, AMD released the successor to their Graphics Core Next (GCN) microarchitecture/instruction set. Dubbed RDNA, the first product featuring it was the Radeon RX 5000 series of video cards. The company announced that the successor to the RDNA microarchitecture would be incremental (aka a refresh). AMD unveiled the Radeon RX 6000 series , its RDNA 2 graphics cards with support for hardware-accelerated ray tracing. The product series, launched in late 2020, consisted of

11039-493: The Titan XP, Pascal's high-end card, include an increase in the number of CUDA cores, the addition of tensor cores, and HBM2 . Tensor cores are designed for deep learning, while high-bandwidth memory is on-die, stacked, lower-clocked memory that offers an extremely wide memory bus. To emphasize that the Titan V is not a gaming card, Nvidia removed the "GeForce GTX" suffix it adds to consumer gaming cards. In 2018, Nvidia launched

11172-456: The actual display rate. Most GPUs made since 1995 support the YUV color space and hardware overlays , important for digital video playback, and many GPUs made since 2000 also support MPEG primitives such as motion compensation and iDCT . This hardware-accelerated video decoding, in which portions of the video decoding process and video post-processing are offloaded to the GPU hardware,

11305-459: The already parallel nature of graphics processing. Essentially, a GPGPU pipeline is a kind of parallel processing between one or more GPUs and CPUs that analyzes data as if it were in image or other graphic form. While GPUs operate at lower frequencies, they typically have many times the number of cores . Thus, GPUs can process far more pictures and graphical data per second than a traditional CPU. Migrating data into graphical form and then using

11438-609: The basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards. In 1987, the IBM 8514 graphics system was released. It was one of the first video cards for IBM PC compatibles to implement fixed-function 2D primitives in electronic hardware . Sharp 's X68000 , released in 1987, used a custom graphics chipset with a 65,536 color palette and hardware support for sprites, scrolling, and multiple playfields. It served as

11571-612: The books: " Game of X " v.1 and v.2 by Russel Demaria, " Renegades of the Empire " by Mike Drummond, " Opening the Xbox " by Dean Takahashi and " Masters of Doom " by David Kushner. The Nvidia GeForce 256 (also known as NV10) was the first consumer-level card with hardware-accelerated T&L; While the OpenGL API provided software support for texture mapping and lighting the first 3D hardware acceleration for these features arrived with

11704-486: The browser is the platform. Web applications generally employ a client–server model , but vary widely in complexity and functionality. It can be hard to reconcile the desire for features with the need for compatibility. Basic web applications perform all or most processing from a stateless server , and pass the result to the client web browser. All user interaction with the application consists of simple exchanges of data requests and server responses. This type of application

11837-449: The code may be stored as separate codebases, or merged into one codebase. An alternative to porting is cross-platform virtualization , where applications compiled for one platform can run on another without modification of the source code or binaries. As an example, Apple's Rosetta , which is built into Intel -based Macintosh computers, runs applications compiled for the previous generation of Macs that used PowerPC CPUs. Another example

11970-470: The code, but can be worthwhile where the amount of platform-specific code is high. This strategy relies on having one codebase that may be compiled to multiple platform-specific formats. One technique is conditional compilation . With this technique, code that is common to all platforms is not repeated. Blocks of code that are only relevant to certain platforms are made conditional, so that they are only interpreted or compiled when needed. Another technique

12103-579: The competition at the time. Rather than attempting to compete with the high-end manufacturers Nvidia and ATI/AMD, they began integrating Intel Graphics Technology GPUs into motherboard chipsets, beginning with the Intel 810 for the Pentium III, and later into CPUs. They began with the Intel Atom 'Pineview' laptop processor in 2009, continuing in 2010 with desktop processors in the first generation of

12236-447: The complexities of client differentiation behind a single, unified API, at the expense of vendor lock-in . Responsive web design (RWD) is a Web design approach aimed at crafting the visual layout of sites to provide an optimal viewing experience—easy reading and navigation with a minimum of resizing, panning, and scrolling—across a wide range of devices, from mobile phones to desktop computer monitors. Little or no platform-specific code

12369-404: The data-parallel hardware on GPUs. Due to a trend of increasing power of mobile GPUs, general-purpose programming became available also on the mobile devices running major mobile operating systems . Google Android 4.2 enabled running RenderScript code on the mobile device GPU. Renderscript has since been deprecated in favour of first OpenGL compute shaders and later Vulkan Compute. OpenCL

12502-526: The decision of which platform to use to consumers. The first game to allow this level of interactivity between PC and console games (Dreamcast with specially produced keyboard and mouse) was Quake 3 . Games that feature cross-platform online play include Rocket League , Final Fantasy XIV , Street Fighter V , Killer Instinct , Paragon and Fable Fortune , and Minecraft with its Better Together update on Windows 10 , VR editions, Pocket Edition and Xbox One . Cross-platform programming

12635-541: The dominant CGI movie production tool used for early CGI movie hits like Jurassic Park, Terminator 2 and Titanic. With that deal came a strategic relationship with SGI and a commercial license of SGI's OpenGL libraries enabling Microsoft to port the API to the Windows NT OS but not to the upcoming release of Windows '95. Although it was little known at the time, SGI had contracted with Microsoft to transition from Unix to

12768-517: The first Direct3D accelerated consumer GPU's . Nvidia was first to produce a chip capable of programmable shading : the GeForce 3 . Each pixel could now be processed by a short program that could include additional image textures as inputs, and each geometric vertex could likewise be processed by a short program before it was projected onto the screen. Used in the Xbox console, this chip competed with

12901-479: The first Direct3D GPU's. Nvidia, quickly pivoted from a failed deal with Sega in 1996 to aggressively embracing support for Direct3D. In this era Microsoft merged their internal Direct3D and OpenGL teams and worked closely with SGI to unify driver standards for both industrial and consumer 3D graphics hardware accelerators. Microsoft ran annual events for 3D chip makers called "Meltdowns" to test their 3D hardware and drivers to work both with Direct3D and OpenGL. It

13034-481: The first major CMOS graphics processor for personal computers. The ARTC could display up to 4K resolution when in monochrome mode. It was used in a number of graphics cards and terminals during the late 1980s. In 1985, the Amiga was released with a custom graphics chip including a blitter for bitmap manipulation, line drawing, and area fill. It also included a coprocessor with its own simple instruction set, that

13167-480: The form of textures or other easily readable GPU forms results in speed increase. The distinguishing feature of a GPGPU design is the ability to transfer information bidirectionally back from the GPU to the CPU; generally the data throughput in both directions is ideally high, resulting in a multiplier effect on the speed of a specific high-use algorithm . GPGPU pipelines may improve efficiency on especially large data sets and/or data containing 2D or 3D imagery. It

13300-496: The forthcoming Windows '95 consumer OS, in '95 Microsoft announced the acquisition of UK based Rendermorphics Ltd and the Direct3D driver model for the acceleration of consumer 3D graphics. The Direct3D driver model shipped with DirectX 2.0 in 1996. It included standards and specifications for 3D chip makers to compete to support 3D texture, lighting and Z-buffering. ATI, which was later to be acquired by AMD, began development on

13433-441: The forthcoming Windows NT OS , the deal which was signed in 1995 was not announced publicly until 1998. In the intervening period, Microsoft worked closely with SGI to port OpenGL to Windows NT. In that era OpenGL had no standard driver model for competing hardware accelerators to compete on the basis of support for higher level 3D texturing and lighting functionality. In 1994 Microsoft announced DirectX 1.0 and support for gaming in

13566-483: The foundations for the emerging PC graphics market. It was used in a number of graphics cards and was licensed for clones such as the Intel 82720, the first of Intel's graphics processing units . The Williams Electronics arcade games Robotron 2084 , Joust , Sinistar , and Bubbles , all released in 1982, contain custom blitter chips for operating on 16-color bitmaps. In 1984, Hitachi released ARTC HD63484,

13699-518: The graphic in question. GPGPU is fundamentally a software concept, not a hardware concept; it is a type of algorithm , not a piece of equipment. Specialized equipment designs may, however, even further enhance the efficiency of GPGPU pipelines, which traditionally perform relatively few algorithms on very large amounts of data. Massively parallelized, gigantic-data-level tasks thus may be parallelized even further via specialized setups such as rack computing (many similar, highly tailored machines built into

13832-462: The legacy model of GPGPU programming, where graphics APIs ( OpenGL or DirectX ) were used to perform general-purpose computation. With the introduction of the CUDA (Nvidia, 2007) and OpenCL (vendor-independent, 2008) general-purpose computing APIs, in new GPGPU codes it is no longer necessary to map the computation to graphics primitives. The stream processing nature of GPUs remains valid regardless of

13965-454: The more recent versions of popular web browsers. These features include Ajax , JavaScript , Dynamic HTML , SVG , and other components of rich web applications . Because of the competing interests of compatibility and functionality, numerous design strategies have emerged. Many software systems use a layered architecture where platform-dependent code is restricted to the upper- and lowermost layers. Graceful degradation attempts to provide

14098-581: The motherboard in a standard fashion. The term "dedicated" refers to the fact that graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts. Graphics cards with dedicated GPUs typically interface with

14231-410: The number of core on-silicon processor units within the GPU chip that perform the core calculations, typically working in parallel with other SM/CUs on the GPU. GPU performance is typically measured in floating point operations per second ( FLOPS ); GPUs in the 2010s and 2020s typically deliver performance measured in teraflops (TFLOPS). This is an estimated performance measure, as other factors can affect

14364-519: The one in the PlayStation 2 , which used a custom vector unit for hardware accelerated vertex processing (commonly referred to as VU0/VU1). The earliest incarnations of shader execution engines used in Xbox were not general purpose and could not execute arbitrary pixel code. Vertices and pixels were processed by different units which had their own resources, with pixel shaders having tighter constraints (because they execute at higher frequencies than vertices). Pixel shading engines were actually more akin to

14497-421: The original CPU-based pipeline on one high-use task. A simple example would be a GPU program that collects data about average lighting values as it renders some view from either a camera or a computer graphics program back to the main program on the CPU, so that the CPU can then make adjustments to the overall screen view. A more advanced example might use edge detection to return both numerical information and

14630-410: The performance of the card for real-time rendering, such as the size of the connector pathways in the semiconductor device fabrication , the clock signal frequency, and the number and size of various on-chip memory caches . Performance is also affected by the number of streaming multiprocessors (SM) for NVidia GPUs, or compute units (CU) for AMD GPUs, or Xe cores for Intel discrete GPUs, which describe

14763-542: The platform they were built for—which means that a single cross-platform executable could be very bloated with code that never executes on a particular platform. Instead, generally there is a selection of executables, each built for one platform. For software that is distributed as a binary executable, such as that written in C or C++, there must be a software build for each platform, using a toolset that translates—transcompiles—a single codebase into multiple binary executables. For example, Firefox , an open-source web browser,

14896-532: The rest of the PC system combined. The maximum power consumption of the Pascal series GPU (Tesla P100) was specified to be 250W. GPUs are designed specifically for graphics and thus are very restrictive in operations and programming. Due to their design, GPUs are only effective for problems that can be solved using stream processing and the hardware can only be used in certain ways. The following discussion referring to vertices, fragments and textures concerns mainly

15029-482: The same die (integrated circuit) with the CPU (like AMD APU or Intel HD Graphics ). On certain motherboards, AMD's IGPs can use dedicated sideport memory: a separate fixed block of high performance memory that is dedicated for use by the GPU. As of early 2007 computers with integrated graphics account for about 90% of all PC shipments. They are less costly to implement than dedicated graphics processing, but tend to be less capable. Historically, integrated processing

15162-497: The same or similar functionality to all users and platforms, while diminishing that functionality to a least common denominator for more limited client browsers. For example, a user attempting to use a limited-feature browser to access Gmail may notice that Gmail switches to basic mode, with reduced functionality but still of use. Some software is maintained in distinct codebases for different (hardware and OS) platforms, with equivalent functionality. This requires more effort to maintain

15295-415: The scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU. The NEC μPD7220 was the first implementation of

15428-433: The speed of a GPU without requiring full and explicit conversion of the data to a graphical form. Mark Harris, the founder of GPGPU.org, coined the term GPGPU . Any language that allows the code running on the CPU to poll a GPU shader for return values, can create a GPGPU framework. Programming standards for parallel computing include OpenCL (vendor-independent), OpenACC , OpenMP and OpenHMPP . As of 2016, OpenCL

15561-458: The speed tradeoff negates any benefit to offloading the computing onto the GPU in the first place. Most operations on the GPU operate in a vectorized fashion: one operation can be performed on up to four values at once. For example, if one color ⟨R1, G1, B1⟩ is to be modulated by another color ⟨R2, G2, B2⟩ , the GPU can produce the resulting color ⟨R1*R2, G1*G2, B1*B2⟩ in one operation. This functionality

15694-486: The system and have a small dedicated memory cache, to make up for the high latency of the system RAM. Technologies within PCI Express make this possible. While these solutions are sometimes advertised as having as much as 768 MB of RAM, this refers to how much can be shared with the system memory. It is common to use a general purpose graphics processing unit (GPGPU) as a modified form of stream processor (or

15827-568: The two major APIs for graphics processors, OpenGL and DirectX . This cumbersome translation was obviated by the advent of general-purpose programming languages and APIs such as Sh / RapidMind , Brook and Accelerator. These were followed by Nvidia's CUDA , which allowed programmers to ignore the underlying graphical concepts in favor of more common high-performance computing concepts. Newer, hardware-vendor-independent offerings include Microsoft's DirectCompute and Apple/Khronos Group's OpenCL . This means that modern GPGPU pipelines can leverage

15960-447: The type of processor (CPU) or other hardware on which an operating system (OS) or application runs, the type of OS, or a combination of the two. An example of a common platform is Android which runs on the ARM architecture family . Other well-known platforms are Linux / Unix , macOS and Windows , these are all cross-platform. Applications can be written to depend on the features of

16093-719: The vertex processors) while Nvidia 's NV30 series supported both FP16 and FP32; other vendors such as S3 Graphics and XGI supported a mixture of formats up to FP24. The implementations of floating point on Nvidia GPUs are mostly IEEE compliant; however, this is not true across all vendors. This has implications for correctness which are considered important to some scientific applications. While 64-bit floating point values (double precision float) are commonly available on CPUs, these are not universally supported on GPUs. Some GPU architectures sacrifice IEEE compliance, while others lack double-precision. Efforts have occurred to emulate double-precision floating point values on GPUs; however,

16226-497: The video game to the same standard. To offset this, a video game may be released on a few platforms first, then later on others. Typically, this happens when a new gaming system is released, because video game developers need to acquaint themselves with its hardware and software. Some games may not be cross-platform because of licensing agreements between developers and video game console manufacturers that limit development to one particular console. As an example, Disney could create

16359-546: The wide vector width SIMD architecture of the GPU. Cross-platform In computing , cross-platform software (also called multi-platform software , platform-agnostic software , or platform-independent software ) is computer software that is designed to work in several computing platforms . Some cross-platform software requires a separate build for each platform, but some can be directly run on any platform without special preparation, being written in an interpreted language or compiled to portable bytecode for which

16492-471: Was capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter. In 1986, Texas Instruments released the TMS34010 , the first fully programmable graphics processor. It could run general-purpose code, but it had a graphics-oriented instruction set. During 1990–1992, this chip became

16625-504: Was considered unfit for 3D games or graphically intensive programs but could run less intensive programs such as Adobe Flash. Examples of such IGPs would be offerings from SiS and VIA circa 2004. However, modern integrated graphics processors such as AMD Accelerated Processing Unit and Intel Graphics Technology (HD, UHD, Iris, Iris Pro, Iris Plus, and Xe-LP ) can handle 2D graphics or low-stress 3D graphics. Since GPU computations are memory-intensive, integrated processing may compete with

16758-406: Was developed at University of Toronto between 2003–2005, in collaboration with Nvidia. Altimesh Hybridizer created by Altimesh compiles Common Intermediate Language to CUDA binaries. It supports generics and virtual functions. Debugging and profiling is integrated with Visual Studio and Nsight. It is available as a Visual Studio extension on Visual Studio Marketplace. Microsoft introduced

16891-517: Was during this period of strong Microsoft influence over 3D standards that 3D accelerator cards moved beyond being simple rasterizers to become more powerful general purpose processors as support for hardware accelerated texture mapping, lighting, Z-buffering and compute created the modern GPU. During this period the same Microsoft team responsible for Direct3D and OpenGL driver standardization introduced their own Microsoft 3D chip design called Talisman . Details of this era are documented extensively in

17024-569: Was followed by the Maxwell line, manufactured on the same process. Nvidia's 28 nm chips were manufactured by TSMC in Taiwan using the 28 nm process. Compared to the 40 nm technology from the past, this manufacturing process allowed a 20 percent boost in performance while drawing less power. Virtual reality headsets have high system requirements; manufacturers recommended the GTX 970 and

17157-420: Was sufficient because this is also the representation used in displays. This representation does have certain limitations. Given sufficient graphics processing power even graphics programmers would like to use better formats, such as floating point data formats, to obtain effects such as high-dynamic-range imaging . Many GPGPU applications require floating point accuracy, which came with video cards conforming to

17290-498: Was the Nintendo 64 's Reality Coprocessor , released in 1996. In 1997, Mitsubishi released the 3Dpro/2MP , a GPU capable of transformation and lighting, for workstations and Windows NT desktops; ATi used it for its FireGL 4000 graphics card , released in 1997. The term "GPU" was coined by Sony in reference to the 32-bit Sony GPU (designed by Toshiba ) in the PlayStation video game console, released in 1994. In

17423-565: Was the norm in the early phases of World Wide Web application development. Such applications follow a simple transaction model, identical to that of serving static web pages . Today, they are still relatively common, especially where cross-platform compatibility and simplicity are deemed more critical than advanced functionality. Prominent examples of advanced web applications include the Web interface to Gmail and Google Maps . Such applications routinely depend on additional features found only in

17556-426: Was the precursor to what is now called a compute shader (e.g. CUDA, OpenCL, DirectCompute) and actually abused the hardware to a degree by treating the data passed to algorithms as texture maps and executing algorithms by drawing a triangle or quad with an appropriate pixel shader. This entails some overheads since units like the scan converter are involved where they are not needed (nor are triangle manipulations even

17689-484: Was widely used during the golden age of arcade video games , by game companies such as Namco , Centuri , Gremlin , Irem , Konami , Midway, Nichibutsu , Sega , and Taito. The Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor . Atari 8-bit computers (1979) had ANTIC , a video processor which interpreted instructions describing a " display list "—the way
