AoS and SoA

Parallel computing data layout methods

In computing, array of structures (AoS), structure of arrays (SoA), and array of structures of arrays (AoSoA) are contrasting ways to arrange a sequence of records in memory with regard to interleaving. They are of interest in SIMD and SIMT programming.

Structure of arrays

Main article: Parallel array
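
In a structure of arrays, each field of the record is stored in its own contiguous array. The following is a minimal sketch in C for points in 3D space, not a definitive implementation. The names PointList3D, sum_x and the fixed count N = 1024 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024 /* hypothetical point count, chosen for illustration */

/* One array per field: all x values are contiguous, then all y, then all z. */
struct PointList3D {
    float x[N];
    float y[N];
    float z[N];
};
struct PointList3D points;

float get_point_x(size_t i) {
    assert(i < N);
    return points.x[i];
}

/* Reads only the x array, so every byte fetched from memory is used. */
float sum_x(void) {
    float total = 0.0f;
    for (size_t i = 0; i < N; i++) {
        total += points.x[i];
    }
    return total;
}
```

A loop such as sum_x walks a single contiguous array, which is the access pattern that SIMD loads and GPU memory coalescing favor.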

Array of structures

Array of structures (AoS) is the opposite (and more conventional) layout to SoA: the fields of each record are stored together, so data for different fields is interleaved in memory. This is often more intuitive and is supported directly by most programming languages.

For example, to store N points in 3D space using an array of structures:

#include <stddef.h> /* size_t */

struct Vector3 {
    float x;
    float y;
    float z;
};
struct Vector3 points[N]; /* N is the number of points, defined elsewhere */

float get_point_x(size_t i) {
    return points[i].x;
}
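
The interleaving described above can be made concrete by computing where each x value sits in memory. This is a rough sketch only. The helper names aos_x_offset and aos_sum_x are hypothetical, and the struct is redeclared so that the block stands on its own.

```c
#include <assert.h>
#include <stddef.h>

struct Vector3 {
    float x;
    float y;
    float z;
};

/* Byte offset of element i's x field from the start of an AoS array.
   Consecutive x values are sizeof(struct Vector3) bytes apart, with the
   y and z fields of each element stored in between. */
size_t aos_x_offset(size_t i) {
    return i * sizeof(struct Vector3) + offsetof(struct Vector3, x);
}

/* Sums only the x fields: the loop strides over y and z, which share
   cache lines with x but are never used by this loop. */
float aos_sum_x(const struct Vector3 *pts, size_t n) {
    assert(pts != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += pts[i].x;
    }
    return total;
}
```

When a loop needs all three fields of each point, the same layout works in its favor, because one element's fields are adjacent in memory.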

Array of structures of arrays

Array of structures of arrays (AoSoA), or tiled array of structs, is a hybrid of the two previous layouts, in which data for different fields is interleaved in tiles or blocks whose size equals the SIMD vector size. This is often less intuitive, but it can achieve the memory throughput of the SoA approach while being friendlier to the cache locality and load port architectures of modern processors.^[2] In particular, memory requests in modern processors have to be fulfilled in fixed-width units (e.g., the size of a cache line^[3]). The tiled storage of AoSoA aligns the memory access pattern to that fixed width, so fewer access operations are needed to complete a memory request, which increases efficiency.^[4]

For example, to store N points in 3D space using an array of structures of arrays with a SIMD register width of 8 floats (or 8×32 = 256 bits):

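#include <stddef.h> /* size_t */

struct Vector3x8 {
    float x[8];
    float y[8];
    float z[8];
};
struct Vector3x8 points[(N + 7) / 8]; /* round up to a whole number of tiles */

float get_point_x(size_t i) {
    return points[i / 8].x[i % 8]; /* tile index, then lane within the tile */
}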

A different width may be needed depending on the actual SIMD register width. The interior arrays may be replaced with SIMD types such as float32x8 for languages with such support.
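
As an illustration of the tiling, the sketch below repacks an AoS array into AoSoA tiles and then sums one field tile by tile. It is written under stated assumptions rather than as a reference implementation. The names TILE, NUM_TILES, aos_to_aosoa and aosoa_sum_x are hypothetical, N is fixed at 20 so that the last tile is only partly filled, and unused lanes are zero-filled.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 8 /* assumed SIMD width in floats */
#define N 20   /* hypothetical count, deliberately not a multiple of TILE */
#define NUM_TILES ((N + TILE - 1) / TILE)

struct Vector3 {
    float x, y, z;
};

struct Vector3x8 {
    float x[TILE];
    float y[TILE];
    float z[TILE];
};

/* Repack N AoS points into NUM_TILES tiles; padding lanes become zero. */
void aos_to_aosoa(const struct Vector3 *src, struct Vector3x8 *dst) {
    assert(src != NULL && dst != NULL);
    for (size_t t = 0; t < NUM_TILES; t++) {
        for (size_t lane = 0; lane < TILE; lane++) {
            size_t i = t * TILE + lane;
            int valid = i < N;
            dst[t].x[lane] = valid ? src[i].x : 0.0f;
            dst[t].y[lane] = valid ? src[i].y : 0.0f;
            dst[t].z[lane] = valid ? src[i].z : 0.0f;
        }
    }
}

/* The inner loop has a fixed trip count over contiguous floats, a shape
   that compilers can often map onto one SIMD register per tile. */
float aosoa_sum_x(const struct Vector3x8 *tiles) {
    float total = 0.0f;
    for (size_t t = 0; t < NUM_TILES; t++) {
        for (size_t lane = 0; lane < TILE; lane++) {
            total += tiles[t].x[lane];
        }
    }
    return total;
}
```

Zero-filling the padding lanes keeps the sum correct without a bounds check in the inner loop. Operations for which zero is not a neutral value would need a mask or an explicit element count instead.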

4D vectors

AoS vs. SoA presents a choice when considering 3D or 4D vector data on machines with four-lane SIMD hardware. SIMD ISAs are usually designed for homogeneous data; however, some provide a dot product instruction^[5] and additional permutes, making the AoS case easier to handle.
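
The difference between the two cases can be sketched in plain scalar C, with each loop iteration standing in for one SIMD lane. This is an illustration under assumptions, not hardware-specific code. The types Vec4 and Vec4x4 and the functions dot_aos and dot_soa are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* AoS: one 4D vector occupies the four lanes of one register. */
struct Vec4 {
    float x, y, z, w;
};

/* The dot product needs a "horizontal" sum across the lanes of a single
   register, which is what a dedicated dot product instruction provides. */
float dot_aos(struct Vec4 a, struct Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* SoA: each register holds the same component of four different vectors. */
struct Vec4x4 {
    float x[4];
    float y[4];
    float z[4];
    float w[4];
};

/* Four dot products at once. Every lane is independent, so only
   "vertical" multiplies and adds are needed, with no cross-lane step. */
void dot_soa(const struct Vec4x4 *a, const struct Vec4x4 *b, float out[4]) {
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t lane = 0; lane < 4; lane++) {
        out[lane] = a->x[lane] * b->x[lane] + a->y[lane] * b->y[lane]
                  + a->z[lane] * b->z[lane] + a->w[lane] * b->w[lane];
    }
}
```

The SoA form uses all four lanes for any number of components, whereas the AoS form leaves one lane idle for 3D data.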

Although most GPU hardware has moved away from 4D instructions to scalar SIMT pipelines,^[6] modern compute kernels using SoA instead of AoS can still give better performance due to memory coalescing.^[7]

Software support

Most languages support the AoS format more naturally by combining records and various array abstract data types.

SoA is mostly found in languages, libraries, or metaprogramming tools used to support a data-oriented design. Examples include:

"Data frames", as implemented in R, Python's Pandas package, and Julia's DataFrames.jl package, provide interfaces for accessing SoA data as if it were AoS.
The Julia package StructArrays.jl allows for accessing SoA as AoS to combine the performance of SoA with the intuitiveness of AoS.
Code generators for the C language, including Datadraw and the X Macro technique.
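
As a rough illustration of the X Macro technique mentioned in the list above, a single field list can generate both the per-record struct and the SoA container, together with accessors that present SoA storage through an AoS-style interface. Every name here (POINT_FIELDS, MAX_POINTS, Point, PointSoA, point_soa_get, point_soa_set) is hypothetical, and the sketch does not reflect how Datadraw works.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_POINTS 1024 /* hypothetical capacity */

/* Single source of truth for the record's fields. */
#define POINT_FIELDS(X) \
    X(float, x)         \
    X(float, y)         \
    X(float, z)

/* Expand the list once as scalar members (one record)... */
#define AS_MEMBER(type, name) type name;
struct Point { POINT_FIELDS(AS_MEMBER) };

/* ...and once as arrays (the SoA container). */
#define AS_ARRAY(type, name) type name[MAX_POINTS];
struct PointSoA { POINT_FIELDS(AS_ARRAY) };

/* Gather record i from the per-field arrays. */
#define LOAD_FIELD(type, name) p.name = soa->name[i];
struct Point point_soa_get(const struct PointSoA *soa, size_t i) {
    assert(soa != NULL && i < MAX_POINTS);
    struct Point p;
    POINT_FIELDS(LOAD_FIELD)
    return p;
}

/* Scatter record p into slot i of the per-field arrays. */
#define STORE_FIELD(type, name) soa->name[i] = p.name;
void point_soa_set(struct PointSoA *soa, size_t i, struct Point p) {
    assert(soa != NULL && i < MAX_POINTS);
    POINT_FIELDS(STORE_FIELD)
}
```

Adding a field to POINT_FIELDS updates both structs and both accessors, which is the maintenance benefit this approach aims for.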

Automated creation of AoSoA is more complex. An example of AoSoA in metaprogramming is found in LANL's Cabana library written in C++; it assumes a vector width of 16 lanes by default.^[8]

References

1. "How to Manipulate Data Structure to Optimize Memory Use". Intel. 2012-02-09. Retrieved 2019-03-17.
2. "Memory Layout Transformations". Intel. 2019-03-26. Retrieved 2019-06-02.
3. "Kernel Profiling Guide" (PDF). NVIDIA. 2022-12-01. Retrieved 2022-01-14.
4. Fei, Yun (Raymond); Huang, Yuhan; Gao, Ming (2021), "Principles towards Real-Time Simulation of Material Point Method on Modern GPUs", pp. 1–16, arXiv:2111.00699 [cs.GR]
5. "Intel SSE4 Floating Point Dot Product Intrinsics". Intel. Archived from the original on 2016-06-24. Retrieved 2019-03-17.
6. "Modern GPU Architecture (See Scalar Unified Pipelines)" (PDF). NVIDIA. Archived from the original (PDF) on 2018-05-17. Retrieved 2019-03-17.
7. Kim, Hyesoon (2010-02-08). "CUDA Optimization Strategies" (PDF). CS4803 Design Game Consoles. Retrieved 2019-03-17.
8. "ECP-copa/Cabana: AoSoA". GitHub.
