China’s New CPU Boasts Supercomputing Capabilities

China has unveiled its latest supercomputer, which is powered by a new 384-core processor capable of more than 13 TFLOPS (trillion floating-point operations per second). Called the Sunway SW26010 Pro, the CPU is a homegrown chip intended to boost China’s supercomputing capabilities and reduce its reliance on foreign technology.

According to Interesting Engineering, the CPU is an upgraded version of the Sunway SW26010 used in the Sunway TaihuLight supercomputer, which ranked as the world’s fastest supercomputer in 2016 and 2017. The new CPU improves on the previous generation’s clock speed, instruction set, and memory bandwidth, resulting in a four-fold increase in double-precision (FP64) performance. The Sunway SW26010 Pro can reach a peak FP64 performance of 13.8 TFLOPS, more than twice that of AMD’s 96-core EPYC 9654, whose peak is around 5.4 TFLOPS.
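The comparison above can be checked with simple arithmetic. The following is a minimal sketch that uses only the peak figures and core counts quoted in this article; the function and constant names are illustrative, and dividing peak throughput evenly across cores is a simplifying assumption rather than a description of how either chip actually schedules work.

```python
# Peak FP64 figures and core counts as quoted in the article.
SW26010_PRO_TFLOPS = 13.8
SW26010_PRO_CORES = 384
EPYC_9654_TFLOPS = 5.4
EPYC_9654_CORES = 96


def per_core_gflops(peak_tflops: float, cores: int) -> float:
    """Naive per-core peak: total peak divided evenly across cores (assumption)."""
    return peak_tflops * 1000.0 / cores


# How many times faster the Sunway chip is at peak FP64.
chip_ratio = SW26010_PRO_TFLOPS / EPYC_9654_TFLOPS

sunway_per_core = per_core_gflops(SW26010_PRO_TFLOPS, SW26010_PRO_CORES)
epyc_per_core = per_core_gflops(EPYC_9654_TFLOPS, EPYC_9654_CORES)

print(f"Chip-level peak ratio: {chip_ratio:.2f}x")
print(f"Sunway per-core peak:  {sunway_per_core:.1f} GFLOPS")
print(f"EPYC per-core peak:    {epyc_per_core:.1f} GFLOPS")
```

By this rough measure, the Sunway chip’s lead comes from its much higher core count rather than from faster individual cores.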

However, the new Sunway SW26010 Pro CPU does have several drawbacks. Its cache and memory hierarchy is limited, which can hurt performance in some applications. The scratchpad cache of the computing processing elements (CPEs) is too small to hold all the data required by the vector engine, and the lack of a proper L2 cache means that data has to be fetched from main memory frequently.

In addition, the CPU’s memory subsystem is undersized and needs to be expanded to meet the high bandwidth demand of the 384 cores. These bottlenecks can limit the scalability and efficiency of both the CPU and the supercomputer.
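One common way to reason about this kind of bottleneck is the roofline model, in which attainable throughput is the lesser of the compute peak and memory bandwidth multiplied by the number of operations performed per byte fetched. The sketch below uses the 13.8 TFLOPS peak quoted in this article, but the bandwidth value is a purely hypothetical placeholder, not a published figure for this chip, and the function name is illustrative.

```python
# Peak FP64 throughput quoted in the article.
PEAK_TFLOPS = 13.8

# HYPOTHETICAL memory bandwidth in GB/s, chosen only for illustration.
# It is NOT a measured or published figure for the SW26010 Pro.
HYPOTHETICAL_BANDWIDTH_GBS = 300.0


def attainable_tflops(flops_per_byte: float,
                      peak_tflops: float = PEAK_TFLOPS,
                      bandwidth_gbs: float = HYPOTHETICAL_BANDWIDTH_GBS) -> float:
    """Roofline estimate: min(compute peak, bandwidth * arithmetic intensity)."""
    memory_bound_tflops = bandwidth_gbs * flops_per_byte / 1000.0
    return min(peak_tflops, memory_bound_tflops)


# Workloads that do little arithmetic per byte stay far below the peak.
for intensity in (0.25, 1.0, 10.0, 100.0):
    print(f"{intensity:6.2f} FLOP/byte -> {attainable_tflops(intensity):5.2f} TFLOPS")
```

Under these assumed numbers, only code that performs dozens of operations per byte fetched would approach the peak, which is why small caches and limited bandwidth matter so much for a chip with this many cores.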

All in all, the Sunway SW26010 Pro CPU is a remarkable achievement for China’s supercomputing industry, which has been developing its own processors and systems to compete with other global tech leaders. The new CPU demonstrates China’s innovation in high-performance computing, which has applications in domains such as scientific research, artificial intelligence, and national security. However, it also shows that China still has gaps in cache and memory design, both of which are crucial for achieving optimal performance and energy efficiency.