Hardware for artificial intelligence
Hardware specially designed and optimized for artificial intelligence
From Wikipedia, the free encyclopedia
Specialized computer hardware, such as Lisp machines, neuromorphic hardware, event cameras, and physical neural networks, is often used to execute artificial intelligence (AI) programs faster and with less energy. Since 2017, several consumer-grade CPUs and SoCs have included on-die NPUs. As of 2023, the market for AI hardware is dominated by GPUs.[1]
As of the 2020s, AI computation is dominated by graphics processing units (GPUs) and newer domain-specific accelerators such as Google’s Tensor Processing Units (TPUs), AMD’s Instinct MI300 series, and various on-device neural-processing units (NPUs) found in consumer hardware.[2][3]
Scope
For the purposes of this article, AI hardware refers to computing components and systems specifically designed or optimized to accelerate artificial-intelligence workloads such as machine-learning training or inference. This includes general-purpose accelerators used for AI (for example, GPUs) and domain-specific accelerators (for example, TPUs, NPUs, and other AI ASICs).[4]
Event-based cameras are sometimes discussed in the context of neuromorphic computing, but they are input sensors rather than AI compute devices. Likewise, components such as memristors, when considered alone, are basic circuit elements rather than specialized AI hardware.[5][6]
Lisp machines

Lisp machines were developed in the late 1970s and early 1980s to make artificial intelligence programs written in the programming language Lisp run faster.
Dataflow architecture
Dataflow-architecture processors used for AI serve various purposes and differ in implementation. Examples include the polymorphic dataflow[7] Convolution Engine[8] by Kinara (formerly Deep Vision), structure-driven dataflow by Hailo,[9] and dataflow scheduling by Cerebras.[10]
Component hardware
AI accelerators

Since the 2010s, advances in computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.[11] By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced central processing units (CPUs) as the dominant means of training large-scale commercial cloud AI.[12] OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017) and found a 300,000-fold increase in the amount of compute used, with a doubling time of 3.4 months.[13][14]
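The two figures in the OpenAI estimate can be checked against each other with a little arithmetic. The following Python sketch is illustrative only; the function names are hypothetical and it assumes a constant doubling time over the whole period.

```python
import math

def doublings(growth_factor: float) -> float:
    """Number of doublings needed to reach a given growth factor."""
    return math.log2(growth_factor)

def elapsed_months(growth_factor: float, doubling_time_months: float) -> float:
    """Months needed to grow by growth_factor, assuming a fixed doubling time."""
    return doublings(growth_factor) * doubling_time_months

# Figures quoted in the text: 300,000-fold growth, 3.4-month doubling time.
months = elapsed_months(300_000, 3.4)
print(f"{doublings(300_000):.1f} doublings, about {months / 12:.1f} years")
# prints "18.2 doublings, about 5.2 years"
```

A 300,000-fold increase is about 18 doublings, which at 3.4 months each takes roughly five years, in line with the 2012 to 2017 span.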
General-purpose GPUs for AI
Since the 2010s, graphics processing units (GPUs) have been widely used to train and deploy deep learning models because of their highly parallel architecture and high memory bandwidth. Modern data-center GPUs include dedicated tensor or matrix-math units that accelerate neural-network operations.
In 2022, NVIDIA introduced the Hopper-generation H100 GPU, adding FP8 precision support and faster interconnects for large-scale model training.[15] AMD and other vendors have also developed GPUs and accelerators aimed at AI and high-performance computing workloads.[16]
Domain-specific accelerators (ASICs / NPUs)
Beyond general-purpose GPUs, several companies have developed application-specific integrated circuits (ASICs) and neural processing units (NPUs) tailored for AI workloads. Google introduced the Tensor Processing Unit (TPU) in 2016 for deep-learning inference, with later generations supporting large-scale training through dense systolic-array designs and optical interconnects.[17] Other vendors have released similar devices—such as Apple’s Neural Engine and various on-device NPUs—that emphasize energy-efficient inference in mobile or edge computing environments.[18]
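The idea behind a systolic array can be illustrated with a toy simulation. The Python sketch below models an output-stationary grid in which each processing element (PE) holds one accumulator and operands arrive skewed by one cycle per row and column. It is a simplified teaching model under those assumptions, not a description of the TPU's actual microarchitecture, and the function name is hypothetical.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) using an output-stationary systolic schedule.

    Returns the product and the number of cycles the schedule takes.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]   # one accumulator per processing element
    total_cycles = n + m + k - 2        # PE (n-1, m-1) does its last step at cycle n+m+k-3
    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                step = t - i - j        # operands reach PE (i, j) delayed by i + j cycles
                if 0 <= step < k:
                    acc[i][j] += a[i][step] * b[step][j]
    return acc, total_cycles

product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, cycles)  # prints "[[19, 22], [43, 50]] 4"
```

In hardware, all PEs perform their multiply-accumulate in the same cycle, so an n × m grid completes the product in n + m + k − 2 cycles rather than the n·m·k sequential steps of a scalar loop.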
Memory and interconnects
AI accelerators rely on fast memory and inter-chip links to handle the large data volumes of training and inference. High-bandwidth memory (HBM) stacks, whose HBM3 generation was standardized by JEDEC in 2022, provide terabytes per second of aggregate throughput on modern GPUs and ASICs.[19] Accelerators are often connected through dedicated fabrics, such as NVIDIA’s NVLink and NVSwitch or the optical interconnects used in TPU systems, to scale performance across thousands of chips.[20]
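A back-of-the-envelope calculation shows why memory bandwidth matters. The Python sketch below estimates the minimum time needed to read a model's weights once from memory. The model size, the 2-byte weight format, and the 3 TB/s bandwidth are hypothetical round numbers chosen for illustration, not figures for any specific product.

```python
def min_stream_time_s(num_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on the time to read num_bytes once at a given memory bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes / bandwidth_bytes_per_s

# Hypothetical example: 70 billion parameters stored in 2 bytes each,
# read from memory with an assumed aggregate bandwidth of 3 TB/s.
weights_bytes = 70e9 * 2
t = min_stream_time_s(weights_bytes, 3e12)
print(f"{t * 1e3:.1f} ms per full pass over the weights")
# prints "46.7 ms per full pass over the weights"
```

When every weight must be read for each generated output, this time is a floor on latency regardless of how fast the arithmetic units are, which is one reason accelerators pair compute with very high memory bandwidth.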