 
Customer
Our client is a renowned German automotive manufacturer, specifically its research and development department in the field of autonomous driving. Among other things, this department focuses on the improvement of sensor technology and data processing as well as the development of machine learning algorithms.
In addition, there is a strong focus on the further development of artificial intelligence to improve decision-making and predict the behavior of other road users. Furthermore, simulation and modeling play a central role in the development process to test and optimize the behavior of autonomous vehicles in a variety of traffic situations.
Project Timeframe: 2022
Project Volume: approx. €120,000
Project description
In this project, we focused on a specific subarea of development tasks in the field of autonomous driving. High-performance systems are designed to process and analyze the enormous amounts of data generated by the sensors of autonomous vehicles with the aim of making safe and efficient driving decisions in real time.
The challenge was to create a high-performance server environment capable of supporting the intensive AI computations these tasks require. The solution therefore had to combine high computational power with scalability in order to process the massive amounts of data that AI workloads in autonomous driving involve.
 
Project realization
To realize the project, we assembled and delivered a turnkey GPU cluster. It consisted of three eight-GPU nodes based on the Supermicro A+ Server 4124GS-TNR, each equipped with dual 32-core AMD EPYC 7002 series processors.
The intensive AI calculations were performed by eight NVIDIA® RTX™ A6000 graphics cards in each of these high-performance systems. This graphics card, currently one of the world's leaders in visual computing, features 48 GB of ultra-fast GDDR6 graphics memory. Connecting two cards with an NVLink bridge scales this memory to 96 GB, providing data scientists, engineers, and creative professionals with the memory volume they need to work with large data sets and workloads for analysis and simulation.
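The memory figures above can be put into perspective with a little arithmetic. The following is a minimal sketch that uses only the numbers given in the text (three nodes, eight GPUs per node, 48 GB per GPU). The function names are illustrative, and whether every pair of cards in the delivered systems was actually bridged is not stated here, so the pair figure is simply the per-pair value.

```python
# Figures taken from the text: 3 nodes, 8 GPUs per node, 48 GB per GPU.
NODES = 3
GPUS_PER_NODE = 8
GB_PER_GPU = 48


def node_memory_gb(gpus_per_node=GPUS_PER_NODE, gb_per_gpu=GB_PER_GPU):
    """Total GPU memory installed in one node, in GB."""
    return gpus_per_node * gb_per_gpu


def cluster_memory_gb(nodes=NODES):
    """Total GPU memory installed across all nodes, in GB."""
    return nodes * node_memory_gb()


def nvlink_pair_memory_gb(gb_per_gpu=GB_PER_GPU):
    """Combined memory of two cards joined by one NVLink bridge, in GB."""
    return 2 * gb_per_gpu


print("Per NVLink pair:", nvlink_pair_memory_gb(), "GB")
print("Per node:", node_memory_gb(), "GB")
print("Whole cluster:", cluster_memory_gb(), "GB")
```

These totals describe installed memory only. They are not a single address space: a workload larger than one bridged pair still has to be split across GPUs by the software.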
We also benefited from PCI Express Gen 4 support on both the GPU and the processor platform. Gen 4 doubles the bandwidth of PCIe Gen 3 and thus improves data transfer rates between CPU memory and the GPUs, which is especially critical for data-intensive tasks like AI and data science.
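The doubling can be checked from the published per-lane signalling rates: 8 GT/s for Gen 3 and 16 GT/s for Gen 4, both with 128b/130b encoding. The sketch below computes the theoretical bandwidth per direction. The x16 link width is an assumption for illustration, and real-world throughput is lower because of protocol overhead.

```python
# Per-lane signalling rates in GT/s for each PCIe generation.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}

# Gen 3 and Gen 4 both use 128b/130b encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen, lanes=16):
    """Theoretical bandwidth per direction in GB/s, before protocol overhead."""
    gigabits_per_lane = TRANSFER_RATE_GT_S[gen] * ENCODING_EFFICIENCY
    return gigabits_per_lane / 8 * lanes  # 8 bits per byte


gen3 = pcie_bandwidth_gb_s(3)
gen4 = pcie_bandwidth_gb_s(4)
print(f"Gen 3 x16: {gen3:.2f} GB/s")
print(f"Gen 4 x16: {gen4:.2f} GB/s")
print(f"Ratio: {gen4 / gen3:.1f}x")
```

For an x16 link this gives roughly 15.75 GB/s for Gen 3 and 31.5 GB/s for Gen 4 in each direction.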
 
What exactly is NVLink?
NVLink is a high-speed direct interconnect developed by NVIDIA that accelerates data exchange between multiple GPUs and, on platforms that support it, between the CPU and the GPU. An NVLink bridge is the physical connector used to link two or more graphics cards in a system, depending on the design. Linked cards are no longer limited to their own memory: provided the application supports it, they can draw on the combined memory pool. This is particularly useful for computationally intensive applications such as machine learning or graphics rendering, where it is important to avoid data transfer bottlenecks.
More complex systems with integrated NVLink, known as "HGX platforms", use special versions of NVIDIA GPUs called SXM3/SXM4 modules. These modules are mounted directly on a dedicated GPU baseboard (known as Redstone for 4 GPUs and Delta for 8 GPUs), and the NVLink connections are integrated into that baseboard. HGX platforms can support up to eight SXM3 or SXM4 GPUs.
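As a rough illustration of what the pooled memory means in practice, the sketch below picks the smallest GPU grouping whose memory covers a given requirement. It uses the 48 GB and 96 GB figures from this case study. The function `smallest_memory_domain` is a hypothetical helper written for this example, not part of any NVIDIA tool, and it ignores framework overhead and the fact that the application itself must support spanning two cards.

```python
GB_PER_GPU = 48          # memory of one RTX A6000, from the text
GB_PER_NVLINK_PAIR = 96  # two cards joined by an NVLink bridge, from the text


def smallest_memory_domain(required_gb):
    """Return the smallest GPU grouping whose memory covers required_gb.

    Hypothetical helper for illustration only.
    """
    if required_gb <= 0:
        raise ValueError("required_gb must be positive")
    if required_gb <= GB_PER_GPU:
        return "single GPU"
    if required_gb <= GB_PER_NVLINK_PAIR:
        return "NVLink-bridged pair"
    return "multiple GPUs beyond one bridged pair"


for size_gb in (30, 70, 200):
    print(f"{size_gb} GB -> {smallest_memory_domain(size_gb)}")
```

A workload in the last category has to be distributed by the software, with traffic between bridged pairs travelling over PCIe rather than NVLink.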
The result
The GPU cluster we supplied delivers the computing power needed to handle data-intensive AI tasks. At the core of the cluster is the NVIDIA® RTX™ A6000 with 48 GB of GDDR6 graphics memory, which now enables our customer to run AI applications not only quickly, but also efficiently.
"Fast" in this case refers to the raw power of the NVIDIA® RTX™ A6000 to perform complex calculations in a short period of time. "Efficient," on the other hand, refers to how well the available resources are used. An efficient GPU not only performs tasks quickly, but also makes the best use of available memory, power, and other resources. This is especially important when working with large data sets or when the goal is to reduce energy costs and heat generation in data centers.
The NVLink bridge also allows the graphics memory of two linked cards to be scaled up to 96 GB. Furthermore, moving from PCIe Gen 3 to PCIe Gen 4 doubles the available bandwidth and significantly improves data transfer rates. Finally, the hardware is flexible: the same processor and GPU resources can serve a wide range of applications, so the available capacity is put to good use.