METAL-TOOLCHAIN(7) | Metal | METAL-TOOLCHAIN(7) |
metal-toolchain - metal compiler toolchain overview
The Metal toolchain consists of a set of programs targeting Apple GPUs. The goal of this document is to provide an overview of the toolchain behavior. Refer to the documentation of individual programs for more specific information.
Metal supports two compilation modes: split-compilation and traditional.
In the split-compilation mode, the toolchain targets the AIR virtual target. Final translation to the actual GPU binary code is performed at runtime. In the more traditional mode, the toolchain directly emits binary code compatible with the selected GPU target.
The architecture of the AIR virtual target is air64, which has several subarchitectures. Each subarchitecture is associated with a platform version.
The currently supported AIR architectures, together with their native platform versions, are:
air64_v16
air64_v18
air64_v111
air64_v20
air64_v21
air64_v22
air64_v23
air64_v24
air64_v25
air64_v26
air64_v27
Native GPU targets take the <vendor>gpu_<arch> form, where <vendor> can be apple, amd, or intel, and <arch> identifies the actual GPU architecture.
Known Apple GPU architectures are:
applegpu_gx2
applegpu_g4p
applegpu_g4g
applegpu_g5p
applegpu_g9p
applegpu_g9g
applegpu_g10p
applegpu_g11p
applegpu_g11m
applegpu_g11g
applegpu_g11g_8fstp
applegpu_g12p
applegpu_g13p
applegpu_g13g
applegpu_g13s
applegpu_g13c
applegpu_g13d
applegpu_g14p
applegpu_g14g
applegpu_g14s
applegpu_g14d
applegpu_g15p
Known AMD GPU architectures are:
amdgpu_gfx600
amdgpu_gfx600_nwh
amdgpu_gfx701
amdgpu_gfx704
amdgpu_gfx803
amdgpu_gfx802
amdgpu_gfx900
amdgpu_gfx904
amdgpu_gfx906
amdgpu_gfx1010_nsgc
amdgpu_gfx1010
amdgpu_gfx1011
amdgpu_gfx1012
amdgpu_gfx1030
amdgpu_gfx1032
Known Intel GPU architectures are:
intelgpu_skl_gt2r6
intelgpu_skl_gt2r7
intelgpu_skl_gt3r10
intelgpu_kbl_gt2r0
intelgpu_kbl_gt2r2
intelgpu_kbl_gt2r4
intelgpu_kbl_gt3r1
intelgpu_kbl_gt3r6
intelgpu_icl_1x6x8r7
intelgpu_icl_1x8x8r7
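The <vendor>gpu_<arch> naming described above can be illustrated with a small parser. This is only a sketch: the function name and the regular expression are assumptions made for illustration and are not part of the toolchain.

```python
import re

# Vendors named in the text above. The pattern is an illustrative assumption,
# not the toolchain's own parser.
_NATIVE_TARGET = re.compile(r"^(apple|amd|intel)gpu_(.+)$")


def split_native_target(name):
    """Return (vendor, arch) for a <vendor>gpu_<arch> name, or None otherwise."""
    match = _NATIVE_TARGET.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_native_target("applegpu_g13s"))        # ('apple', 'g13s')
print(split_native_target("amdgpu_gfx1010_nsgc"))  # ('amd', 'gfx1010_nsgc')
print(split_native_target("air64_v23"))            # None
```

AIR architectures such as air64_v23 do not follow this form, so the sketch returns None for them.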
Having multiple architectures makes it possible to store multiple binaries inside the same universal binary, each targeting a different version of the same platform.
The AIR toolchain is able to target the following platforms:
iPhoneOS
macOS
tvOS
watchOS
visionOS
Starting with air64_v23, all platforms are compatible with each other. For instance, you can link an air64_v23-apple-iphoneos14 object and an air64_v23-apple-macos11 object together.
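The "starting with air64_v23" rule can be sketched as a lookup against the list of AIR architectures given earlier. The helper name is hypothetical, and the sketch assumes that the order of that list is the order the rule refers to. It deliberately avoids comparing the numeric suffixes, because air64_v111 is listed before air64_v20.

```python
# AIR subarchitectures in the order this document lists them. Position in the
# list, not the number after "v", is used for ordering (an assumption):
# comparing suffixes as integers would place air64_v111 after air64_v27.
AIR_ARCHS = [
    "air64_v16", "air64_v18", "air64_v111", "air64_v20", "air64_v21",
    "air64_v22", "air64_v23", "air64_v24", "air64_v25", "air64_v26",
    "air64_v27",
]


def platforms_are_interchangeable(arch):
    """True if `arch` is air64_v23 or is listed after it (hypothetical helper)."""
    return AIR_ARCHS.index(arch) >= AIR_ARCHS.index("air64_v23")


print(platforms_are_interchangeable("air64_v23"))   # True
print(platforms_are_interchangeable("air64_v111"))  # False
```

An architecture that is not in the list raises ValueError, which seems preferable to guessing.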
The two main inputs of the AIR toolchain are Metal source files and Metal scripts. The canonical extension of Metal source files is .metal. The canonical extension of Metal scripts is .mtlp-json.
Metal scripts are consumed by tools emitting GPU binary code. Depending on the code being emitted, a Metal script might be required or not. For instance, a Metal script is required to emit a pipeline, but it is not required when emitting a dynamic library.
The AIR toolchain emits MetalLibs and MachOs. The former stores AIR binaries. The latter stores GPU binaries.
The AIR toolchain also emits universal binaries, which can contain both MetalLib and MachO slices at the same time.
The AIR toolchain provides two main compiler drivers: metal and metal-tt.
The primary goal of metal is to translate a set of source files into MetalLibs, MachOs, or universal binaries.
What is actually emitted depends on the selected target architectures. If more than one architecture is selected, a universal binary is emitted. Otherwise, if the target architecture is AIR, a MetalLib is emitted; if it is a GPU architecture, a MachO is emitted.
$ metal -arch air64_v23 foo.metal -o foo.metallib
Emits a MetalLib.
$ metal -arch applegpu_g13s foo.metal -N foo.mtlp-json -o foo.metallib
Emits a MachO.
$ metal -arch air64_v23 -arch applegpu_g13s foo.metal -N foo.mtlp-json -o foo.metallib
Emits a universal binary, with one MetalLib slice and one MachO slice.
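The selection rule illustrated by the three commands above can be written down as a short function. This is a sketch of the rule as stated in this document, not the driver's implementation; the function name is hypothetical, and falling back to air64 when no architecture is given relies on the default architecture stated in the target deduction discussion.

```python
def output_kind(archs):
    """Name the kind of file emitted for a list of -arch values (sketch)."""
    # Assumption: with no -arch option, the default architecture air64 applies.
    archs = list(archs) or ["air64"]
    if len(archs) > 1:
        return "universal binary"
    if archs[0].startswith("air64"):
        return "MetalLib"
    return "MachO"


print(output_kind(["air64_v23"]))                   # MetalLib
print(output_kind(["applegpu_g13s"]))               # MachO
print(output_kind(["air64_v23", "applegpu_g13s"]))  # universal binary
```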
The most efficient way to use the metal driver is to independently compile a bunch of source files, followed by a link step:
$ metal -arch air64_v23 -c foo.metal -o foo.air
$ metal -arch air64_v23 -c bar.metal -o bar.air
$ metal -arch air64_v23 foo.air bar.air -o foobar.metallib
Since the emission of GPU binaries starts from MetalLibs, a GPU architecture only needs to be specified at the link step:
$ metal -arch air64_v23 -c foo.metal -o foo.air
$ metal -arch air64_v23 -c bar.metal -o bar.air
$ metal -arch applegpu_g13s foo.air bar.air -N foobar.mtlp-json -o foobar.metallib
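For build scripts, the compile-then-link sequence above can be generated rather than typed by hand. The following sketch only builds the argument lists and does not run anything. The function and parameter names are hypothetical; the options used (-arch, -c, -N, -o) are the ones shown in the commands above.

```python
from pathlib import PurePath


def build_commands(sources, air_arch, link_arch, output, script=None):
    """Return one compile command per source, followed by one link command."""
    commands = []
    objects = []
    for source in sources:
        # foo.metal -> foo.air, as in the commands above.
        obj = str(PurePath(source).with_suffix(".air"))
        commands.append(["metal", "-arch", air_arch, "-c", source, "-o", obj])
        objects.append(obj)
    link = ["metal", "-arch", link_arch, *objects]
    if script is not None:
        # A Metal script is passed with -N, as in the commands above.
        link += ["-N", script]
    link += ["-o", output]
    commands.append(link)
    return commands


for command in build_commands(
    ["foo.metal", "bar.metal"],
    air_arch="air64_v23",
    link_arch="applegpu_g13s",
    output="foobar.metallib",
    script="foobar.mtlp-json",
):
    print("$", " ".join(command))
```

Each returned list could be handed to subprocess.run on a machine where the toolchain is installed. Because the compile commands are independent of each other, they could also be run in parallel.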
The metal driver must be told which architectures to target, which can be challenging when a large number of GPU architectures have to be targeted. The metal-tt driver solves this problem by automatically targeting all the GPU architectures supported by the toolchain:
$ metal -arch air64_v23 foo.metal -o foo.metallib-air64_v23
$ metal-tt foo.metallib-air64_v23 foo.mtlp-json -o foo.metallib
The produced foo.metallib contains one slice for each supported GPU architecture, plus the air64_v23 slice produced by metal.
A target is composed of a target architecture and a target platform.
Generally speaking, the target used by a compiler driver can be explicitly spelled out in the compiler driver command line. If the target is only partially spelled out -- e.g. the command line only specifies the target architecture -- the remaining components of the target are deduced by the compiler driver.
The deduction process is specific to each compiler driver, but it generally splits deduction into two steps: selection of an architecture, followed by selection of a platform.
The default architecture is air64.
The platform is selected starting from the system root. If the system root points to a Darwin SDK, the target platform is set to that of the SDK.
For instance, assuming iPhoneOS16.0.sdk contains a valid iPhoneOS SDK, the target selected by the following command:
$ metal -isysroot iPhoneOS16.0.sdk foo.metal -o foo.metallib
Would be air64-apple-iphoneos16.0.
The system root can also be set using the SDKROOT environment variable. On Darwin, development tools are usually invoked using xcrun, which automatically sets SDKROOT to the selected SDK. Thus this command:
$ xcrun -sdk iphoneos metal foo.metal -o foo.metallib
Will target air64-apple-iphoneosX.Y, where X.Y is the iPhoneOS SDK target platform found by xcrun.
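Target spellings such as air64-apple-iphoneos16.0 can be split into their components with a few lines of code. The sketch below assumes the layout <architecture>-<vendor>-<platform><version>, which is inferred from the examples in this document rather than taken from a formal grammar. The function name is hypothetical.

```python
import re

# Assumed layout, inferred from the examples in this document:
# <architecture>-<vendor>-<platform><version>, with the version optional.
_OS_PART = re.compile(r"^([a-z]+)(\d+(?:\.\d+)*)?$")


def parse_target(target):
    """Split a target spelling into architecture, vendor, platform, version."""
    parts = target.split("-")
    if len(parts) != 3:
        raise ValueError(f"unexpected target spelling: {target!r}")
    arch, vendor, os_part = parts
    match = _OS_PART.match(os_part)
    if match is None:
        raise ValueError(f"unexpected platform spelling: {os_part!r}")
    return {
        "arch": arch,
        "vendor": vendor,
        "platform": match.group(1),
        "version": match.group(2),  # None when no version is spelled out
    }


print(parse_target("air64-apple-iphoneos16.0"))
print(parse_target("air64_v23-apple-macos11"))
```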
The metal-arch tool prints information about the architectures of the GPUs available on the current platform.
The metal-config tool prints information about the GPU architectures that can be targeted by the current toolchain.
To report bugs, please visit <https://developer.apple.com/bug-reporting/>.
metal(1), metal-arch(1), metal-config(1), metal-pipelines-script(5), metal-tt(1), xcrun(1)
Metal Shading Language Specification: <https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf>
2014-2024, The Metal Team
July 10, 2024 | 32023 |