ROCm
Parallel computing platform: GPGPU libraries and application programming interface
From Wikipedia, the free encyclopedia
ROCm[3] is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains, including general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP (GPU-kernel-based programming), OpenMP (directive-based programming), and OpenCL.
ROCm is free, libre and open-source software (except the GPU firmware blobs[4]), distributed under various licenses. ROCm initially stood for Radeon Open Compute platform; however, because Open Compute is a registered trademark, ROCm is no longer an acronym: it is simply AMD's open-source stack for GPU compute.
Background
The first GPGPU software stack from ATI/AMD was Close to Metal, which became Stream.
ROCm was launched around 2016[5] with the Boltzmann Initiative.[6] The ROCm stack builds upon previous AMD GPU stacks; some tools trace back to GPUOpen and others to the Heterogeneous System Architecture (HSA).
Heterogeneous System Architecture Intermediate Language
HSAIL[7] was aimed at producing a middle-level, hardware-agnostic intermediate representation that could be JIT-compiled for the eventual hardware (GPU, FPGA...) using the appropriate finalizer. This approach was dropped for ROCm, which now builds only GPU code, using LLVM and its upstreamed AMDGPU backend,[8] although research continues on such enhanced modularity with LLVM MLIR.[9]
Programming abilities
ROCm as a stack ranges from the kernel driver to the end-user applications. AMD has introductory videos about AMD GCN hardware,[10] and ROCm programming[11] via its learning portal.[12]
One of the best technical introductions to the stack and to ROCm/HIP programming is, to date, still to be found on Reddit.[13]
Hardware support
ROCm is primarily targeted at discrete professional GPUs,[14] but consumer GPUs and APUs of the same architecture as a supported professional GPU are known to work with ROCm. For example, all professional GPUs of the RDNA 2 architecture are officially supported by ROCm 5.x; users report that consumer RDNA 2 units such as the Radeon 6800M APU and the Radeon 6700XT GPU also work.[15]
Professional-grade GPUs
Consumer-grade GPUs
- DRM (Direct Rendering Manager) is a component of the Linux kernel.
Software ecosystem
Learning resources
AMD ROCm product manager Terry Deem gave a tour of the stack.[22]
Third-party integration
The main consumers of the stack are machine learning and high-performance computing/GPGPU applications.
Machine learning
Various deep learning frameworks have a ROCm backend:[23]
Supercomputing
ROCm is gaining significant traction in the TOP500 list.[25] ROCm is used with the exascale supercomputers El Capitan[26][27] and Frontier.
Some related software can be found on AMD Infinity Hub.
Other acceleration & graphics interoperation
Since version 3.0, Blender can use HIP compute kernels for its Cycles renderer.[28]
Other languages
Julia
Julia has the AMDGPU.jl package,[29] which integrates with LLVM and select components of the ROCm stack. Instead of compiling code through HIP, AMDGPU.jl uses Julia's compiler to generate LLVM IR directly, which is later consumed by LLVM to generate native device code. AMDGPU.jl uses ROCr's HSA implementation to upload native code onto the device and execute it, similar to how HIP loads its own generated device code.
AMDGPU.jl also supports integration with ROCm's rocBLAS (for BLAS), rocRAND (for random number generation), and rocFFT (for FFTs). Future integration with rocALUTION, rocSOLVER, MIOpen, and certain other ROCm libraries is planned.
Software distribution
Official
Installation instructions are provided for Linux and Windows in the official AMD ROCm documentation. ROCm software is currently spread across several public GitHub repositories. Within the main public meta-repository, there is an XML manifest for each official release. The recommended way to synchronize with the stack locally is git-repo, a version control tool built on top of Git.[30]
AMD has started distributing containerized applications for ROCm, notably scientific research applications gathered under AMD Infinity Hub.[31]
AMD itself distributes packages tailored to various Linux distributions.
Third-party
There is a growing third-party ecosystem packaging ROCm.
Several Linux distributions officially package ROCm natively, at varying stages of completeness: Arch Linux,[32] Gentoo,[33] Debian, Fedora,[34] GNU Guix, and NixOS.
Components
There is one kernel-space component, ROCk; the rest of the stack, roughly a hundred components, consists of user-space modules.
The unofficial typographic convention is uppercase ROC followed by lowercase letters for low-level libraries (e.g. ROCt), and the contrary, lowercase roc followed by an uppercase name, for user-facing libraries (e.g. rocBLAS).[36]
AMD actively develops with the LLVM community, but upstreaming is not instantaneous and, as of January 2022, was still lagging.[37] AMD still officially packages various LLVM forks[38][39][9] for parts that are not yet upstreamed: compiler optimizations destined to remain proprietary, debug support, OpenMP offloading, etc.
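The capitalization convention can be expressed as a small pattern check. The following is a minimal sketch only: `classify_component` is a hypothetical helper written for illustration, not part of ROCm, and treating the `hip` prefix like `roc` is an assumption based on the dual-form libraries described under High-level.

```python
import re


def classify_component(name: str) -> str:
    """Classify a component name by the unofficial ROCm capitalization convention.

    Hypothetical helper for illustration; it is not part of any ROCm tool.
    """
    # Uppercase "ROC" followed only by lowercase letters, e.g. ROCt, ROCr, ROCk.
    if re.fullmatch(r"ROC[a-z]+", name):
        return "low-level"
    # Lowercase "roc" (or, by assumption, "hip") followed by an uppercase name,
    # e.g. rocBLAS, rocSOLVER, hipBLAS.
    if re.fullmatch(r"(roc|hip)[A-Z][A-Za-z0-9]*", name):
        return "user-facing"
    # The convention is unofficial, so some names follow neither pattern.
    return "unknown"


for component in ("ROCt", "ROCr", "rocBLAS", "hipSOLVER", "MIOpen"):
    print(component, "->", classify_component(component))
```

Because the convention is unofficial, names such as MIOpen fall outside both patterns, which the sketch reports as "unknown" rather than guessing.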
Low-level
ROCk – Kernel driver
ROCm – Device libraries
Support libraries implemented as LLVM bitcode. These provide various utilities and functions for math operations, atomics, queries for launch parameters, on-device kernel launch, etc.
ROCt – Thunk
The thunk is the user-space library through which the rest of the stack talks to the ROCk kernel driver; it is responsible for the queuing that goes into the stack.
ROCr – Runtime
The ROC runtime is a set of APIs/libraries that allows the launch of compute kernels by host applications. It is AMD's implementation of the HSA runtime API.[40] It is different from the ROC Common Language Runtime.
ROCm – CompilerSupport
The ROCm code object manager is in charge of interacting with the LLVM intermediate representation.
Mid-level
ROCclr – Common Language Runtime
The common language runtime is an indirection layer that adapts calls to ROCr on Linux and to PAL on Windows. It used to be able to route between different compilers, such as the HSAIL compiler. It is now being absorbed by the upper indirection layers (HIP and OpenCL).
OpenCL
ROCm ships its installable client driver (ICD) loader and an OpenCL[41] implementation bundled together. As of January 2022, ROCm 4.5.2 ships OpenCL 2.2 and lags behind the competition.[42]
HIP – Heterogeneous Interface for Portability
The AMD implementation for its GPUs is called HIPAMD. There is also a CPU implementation mostly for demonstration purposes.
HIPCC
HIP builds a `HIPCC` compiler that either wraps Clang and compiles with LLVM's open AMDGPU backend, or redirects to the NVIDIA compiler.[43]
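The dispatch described above can be pictured as a small selection step. This is a simplified sketch, not the actual `HIPCC` logic: the function name `pick_backend_compiler` is hypothetical, the returned executable names are illustrative, and defaulting to the AMD path when `HIP_PLATFORM` is unset is an assumption made here for brevity.

```python
import os


def pick_backend_compiler(env=None) -> str:
    """Return the compiler a HIPCC-like wrapper would hand the source to.

    Hypothetical illustration only; the real wrapper does considerably more
    (include paths, target flags, platform detection).
    """
    env = os.environ if env is None else env
    # Assumption: fall back to the AMD path when HIP_PLATFORM is not set.
    platform = env.get("HIP_PLATFORM", "amd")
    if platform == "amd":
        return "clang++"  # Clang with the LLVM AMDGPU backend
    if platform == "nvidia":
        return "nvcc"  # redirect to the NVIDIA compiler
    raise ValueError(f"unsupported HIP_PLATFORM: {platform!r}")


print(pick_backend_compiler({"HIP_PLATFORM": "amd"}))
print(pick_backend_compiler({"HIP_PLATFORM": "nvidia"}))
```

Passing the environment as a plain dictionary keeps the sketch easy to exercise without modifying the real process environment.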
HIPIFY
HIPIFY is a source-to-source compiling tool. It translates CUDA to HIP and the reverse, using either a Clang-based tool or a sed-like Perl script.
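The sed-like approach amounts to renaming identifiers with regular expressions. The following toy sketch illustrates the idea only; it is not the HIPIFY Perl script. The `hipify` function and the `CUDA_TO_HIP` table are hypothetical names, and the table covers just a handful of runtime calls.

```python
import re

# A tiny illustrative subset of CUDA-to-HIP renames; the real tools cover far more.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first, with word boundaries, so that cudaMemcpy does not
# match inside cudaMemcpyHostToDevice or inside unrelated identifiers.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def hipify(source: str) -> str:
    """Rewrite known CUDA identifiers in `source` to their HIP counterparts."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)


cuda_src = (
    "#include <cuda_runtime.h>\n"
    "cudaMalloc(&d, n);\n"
    "cudaMemcpy(d, h, n, cudaMemcpyHostToDevice);\n"
    "cudaFree(d);"
)
print(hipify(cuda_src))
```

A purely textual rewrite like this cannot reason about types or macros, which is presumably why a Clang-based variant exists alongside the script.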
GPUFORT
Like HIPIFY, GPUFORT is a source-to-source tool that translates code into another third-generation language, allowing users to migrate from CUDA Fortran to HIP Fortran. Even more so than HIPIFY, it is a research project.[44]
High-level
ROCm high-level libraries are usually consumed directly by application software, such as machine learning frameworks. Most of the following libraries are in the General Matrix Multiply (GEMM) category, at which GPU architectures excel.
The majority of these user-facing libraries come in dual form: a hip variant, the indirection layer that can route to Nvidia hardware, and a roc variant, the AMD implementation.[45]
rocBLAS / hipBLAS
rocBLAS and hipBLAS are central among the high-level libraries; rocBLAS is the AMD implementation of the Basic Linear Algebra Subprograms (BLAS). It uses the Tensile library internally.
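For readers unfamiliar with GEMM, the operation a BLAS library accelerates is C ← αAB + βC. The pure-Python reference below is a minimal sketch of that arithmetic only; it does not call rocBLAS or hipBLAS, the function name `gemm` is a local choice, and the transpose options of the BLAS interface are left out.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    # a must be m x k, b must be k x n, and c must be m x n.
    if any(len(row) != k for row in a) or any(len(row) != n for row in b):
        raise ValueError("incompatible shapes for a and b")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(gemm(2, a, b, 3, c))  # [[41, 47], [89, 103]]
```

Every output element is an independent dot product, which is the kind of regular, parallel work that maps well onto GPU hardware.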
rocSOLVER / hipSOLVER
This pair of libraries constitutes the LAPACK implementation for ROCm and is strongly coupled to rocBLAS.
Utilities
- ROCm developer tools: Debug, tracer, profiler, System Management Interface, Validation suite, Cluster management.
- GPUOpen tools: GPU analyzer, memory visualizer...
- External tools: radeontop (TUI overview)
Comparison with competitors
ROCm competes with other GPU computing stacks: Nvidia CUDA and Intel oneAPI.
Nvidia CUDA
Nvidia's CUDA is closed-source, whereas AMD ROCm is open source. There is open-source software built on top of the closed-source CUDA, for instance RAPIDS.
CUDA is able to run on consumer GPUs, whereas ROCm support is mostly offered for professional hardware such as AMD Instinct and AMD Radeon Pro.
Nvidia provides a C/C++-centered frontend and its Parallel Thread Execution (PTX) LLVM GPU backend as the Nvidia CUDA Compiler (NVCC).
Intel oneAPI
Like ROCm, oneAPI is open source, and all the corresponding libraries are published on its GitHub page.
Unified Acceleration Foundation (UXL)
The Unified Acceleration Foundation (UXL) is a new technology consortium that is working on the continuation of the oneAPI initiative, with the goal of creating a new open-standard accelerator software ecosystem, along with related open standards and specification projects, through Working Groups and Special Interest Groups (SIGs). The resulting ecosystem is meant to compete with Nvidia's CUDA. The main companies behind it are Intel, Google, Arm, Qualcomm, Samsung, Imagination, and VMware.[46]
See also
- AMD Software – a general overview of AMD's drivers, APIs, and development endeavors.
- GPUOpen – AMD's complementary graphics stack
- AMD Radeon Software – AMD's software distribution channel