Semiconductors, Networks, Cooling, and Various Other Component Technologies That Make Up Data Centers

The data centers that have become the infrastructure for our increasingly digitized modern society can be viewed as a single, massive computer system that possesses vast data storage and information processing capabilities.

However, despite being a computer system, data centers differ from PCs, smartphones, and other familiar information devices in many ways. Upon closer inspection of their internal specifications, functions, and structures, we can see that there are many areas that require technologies specific to data centers.

Data centers use unique technologies not found in PCs

In this article, which discusses the data centers that typical users rarely have the opportunity to see, we focus on the hardware aspects of the components and technologies used in data centers.

Data centers pursue high versatility and flexibility, but in recent years some have specialized in AI-related processing

Data centers come in a variety of sizes, ranging from micro data centers with roughly 100 servers, used by a single company or remote office, to hyperscale data centers with more than 5,000 installed servers. Some facilities house as many as 1 million servers in one location.

Regardless of size, these data centers are equipped with many servers for processing information and storage equipment for storing data, all interconnected through complex networks. Depending on the situation, servers may individually execute diverse, large-scale processes, or multiple servers may work together to flexibly and efficiently handle the tasks (information processing jobs) of many users. Because of these operating methods, data centers differ from typical PCs in system structure, components, and design philosophy (Figure 1).

Figure 1 Major differences in design philosophy between individual-use PCs and data centers

To begin with, the PCs that we normally use are intended for personal use. Therefore, the purpose of use and processing tasks can be easily identified, and the specifications of the PC to be prepared are also easily customizable. For example, a user who wants to work on the go may choose a mobile PC that is highly portable while a user who wants to play 3D games may purchase a gaming PC with enhanced graphics processing functions.

By contrast, data centers must quickly and efficiently handle a wide range of tasks from countless users. In many cases, numerous companies with significantly different business operations share the same data center servers. For that reason, data centers install a large number of servers equipped with highly versatile CPUs (Central Processing Units) in parallel, creating a system configuration in which multiple servers work seamlessly together as processing tasks change. Combining a large number of these highly versatile servers also provides scalability: servers can simply be added as demand increases.
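The scale-out configuration described above can be sketched in a few lines. The `ServerPool` class and its round-robin policy are illustrative simplifications, not the software of any specific data center:

```python
# Minimal illustration of scale-out: tasks are spread across
# interchangeable general-purpose servers, and capacity grows
# simply by adding servers to the pool. All names are hypothetical.
from itertools import cycle

class ServerPool:
    def __init__(self, names):
        self.names = list(names)
        self._rr = cycle(self.names)

    def add_server(self, name):
        # Scale out: extend the pool and rebuild the rotation.
        self.names.append(name)
        self._rr = cycle(self.names)

    def dispatch(self, task):
        # Round-robin assignment of a task to the next server.
        return (next(self._rr), task)

pool = ServerPool(["srv-1", "srv-2"])
print(pool.dispatch("job-A"))  # ('srv-1', 'job-A')
pool.add_server("srv-3")       # demand grew: add capacity
```

Real data centers use far more sophisticated schedulers, but the essential point is the same: the pool of interchangeable servers can grow without redesigning the system.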

In addition, due to the high volume of data to be processed, it is extremely important to introduce network technologies that enable high-speed, high-capacity data transfers. Typically, a high-speed Internet line of 100 Gbps or more is installed. Furthermore, next-generation data centers are moving toward the introduction of optical communication technologies in pursuit of high speed, high capacity, low latency, energy conservation, and high security.
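The need for such bandwidth can be sanity-checked with back-of-the-envelope arithmetic; the data volume is an illustrative assumption and protocol overhead is ignored:

```python
# Back-of-the-envelope: how long does it take to move 1 TB
# over a 100 Gbps line, ignoring protocol overhead?
data_bits = 1e12 * 8          # 1 TB expressed in bits
link_bps = 100e9              # 100 Gbps line
seconds = data_bits / link_bps
print(f"{seconds:.0f} s")     # 80 s
```

Even at 100 Gbps, a terabyte takes over a minute to move, which is why next-generation facilities keep pushing toward faster optical links.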

In order for a data center to rapidly process a large volume of tasks, it needs a mechanism for reading and writing data at high speed between the CPU and storage. For this reason, data centers in recent years have adopted SCM (Storage Class Memory), a new type of memory that combines high-speed data access with high-capacity storage and sits between the DRAM next to the CPU and the storage. Initiatives that use SCM to improve performance and efficiency are already appearing.
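The tiered role of SCM can be sketched as follows. The latency figures and tier contents are order-of-magnitude assumptions for illustration, not specifications of any product:

```python
# Illustrative memory/storage tiers with rough access latencies.
# The numbers are order-of-magnitude placeholders, not measurements.
TIERS = [
    ("DRAM",    100e-9),   # ~100 ns, closest to the CPU
    ("SCM",     1e-6),     # ~1 us, between DRAM and flash
    ("Storage", 100e-6),   # ~100 us, flash-based storage
]

def read(key, tier_contents):
    # Search the fastest tier first; SCM absorbs accesses that
    # would otherwise fall through to much slower storage.
    for name, latency in TIERS:
        if key in tier_contents.get(name, set()):
            return name, latency
    raise KeyError(key)

print(read("x", {"SCM": {"x"}, "Storage": {"x"}}))  # ('SCM', 1e-06)
```

The point of the sketch is the ordering: data served from the SCM tier avoids a storage access that would be roughly two orders of magnitude slower.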

In recent years, tasks related to artificial intelligence (AI) have rapidly increased, and so has the use of servers equipped with GPUs (Graphics Processing Units), TPUs (Tensor Processing Units), and other chips that execute AI processing efficiently, even at the expense of some versatility. AI-related processing moves far larger volumes of data at high speed between the GPU and memory, and between memory and storage, than previous workloads, so higher-bandwidth networks are being introduced. Servers specializing in AI-related processing are rarely mixed with general-purpose servers; they tend to be established separately as AI data centers.

Technologies for data center reliability and redundancy

Personal computers are not continuously used 24 hours a day, 365 days a year. Additionally, while the failure of a PC may pose a problem for the user, such a failure will not result in a serious situation such as widespread chaos throughout society.

By contrast, data centers need to continue processing tasks that support people's lives and social activities, stably and at all times. For this reason, the CPUs and memory inside the servers that process information, as well as the storage equipment that holds the data, must be highly reliable and capable of continuous operation. For example, ECC (Error-Correcting Code) DRAM protects data integrity with functions for detecting and correcting bit errors, and the flash memory used for storage is selected for its ability to withstand frequent data access.
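The principle behind ECC can be illustrated with a Hamming(7,4) code, which corrects any single flipped bit. Real ECC DIMMs use a wider SECDED code over 64-bit words, so this is a deliberately simplified sketch:

```python
# Sketch of single-bit error correction in the spirit of ECC DRAM,
# using a Hamming(7,4) code: 4 data bits protected by 3 parity bits.
def hamming74_encode(d):
    d3, d5, d6, d7 = d                     # 4 data bits
    p1 = d3 ^ d5 ^ d7                      # parity over positions 3,5,7
    p2 = d3 ^ d6 ^ d7                      # parity over positions 3,6,7
    p4 = d5 ^ d6 ^ d7                      # parity over positions 5,6,7
    return [p1, p2, d3, p4, d5, d6, d7]    # codeword, positions 1..7

def hamming74_correct(c):
    # The syndrome equals the 1-based index of the flipped bit
    # (0 means no error detected).
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c = c[:]
        c[syndrome - 1] ^= 1               # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]        # recovered data bits

code = hamming74_encode([1, 0, 1, 1])
code[4] ^= 1                               # simulate a DRAM bit flip
print(hamming74_correct(code))             # [1, 0, 1, 1]
```

ECC DRAM performs the same detect-and-correct cycle in hardware on every memory access, which is why a stray bit flip does not silently corrupt data.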

Serious defects and failures occurring at data centers could cause disruption throughout society

On the systems side, mechanisms have been introduced to prevent the occurrence of defects and failures. First, the servers used in data centers are equipped with high-performance cooling systems. In particular, AI data centers that continuously operate at high loads under normal conditions use not only typical air-cooling systems but also apply advanced technologies such as liquid cooling and immersion cooling, which submerges the entire server into a liquid with high thermal conductivity.

Furthermore, data centers adopt redundant system configurations that can rapidly switch to a secondary system during a failure. Specifically, virtualization technologies allow an executing task to move quickly to another server when one malfunctions. A failure in the power system that runs the servers can cause irreparable damage, so data centers also build redundancy into their power systems and install UPSes (Uninterruptible Power Supplies) at the point where power is taken in from the grid, allowing operation to continue for a fixed time even during a power outage.
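The failover idea can be reduced to a minimal sketch. All names here are hypothetical; real data centers implement this with virtualization platforms and orchestration software:

```python
# Minimal active/standby failover: a health check decides whether
# a task runs on the primary or is switched to the secondary system.
def run_with_failover(task, primary, secondary, is_healthy):
    server = primary if is_healthy(primary) else secondary
    return server, task(server)

# Usage: the primary is reported unhealthy, so the task lands
# on the standby system instead.
health = {"primary": False, "secondary": True}
server, result = run_with_failover(lambda s: f"done on {s}",
                                   "primary", "secondary",
                                   lambda s: health[s])
print(server)  # secondary
```

Production failover also has to replicate state and avoid split-brain situations, but the core decision, detect the fault and redirect the work, is the one shown here.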

Data centers have implemented various technologies specific to their needs, and technology development aimed at solving increasingly multifaceted problems will be required going forward. Technologies that reduce environmental load, including power consumption, have become especially important in recent years. New approaches, such as servers equipped with energy-saving chips and DC power supplies that reduce the number of power conversions in the power system, will likely be implemented.
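The benefit of eliminating a conversion stage can be estimated with simple arithmetic. The stage efficiencies below are illustrative assumptions, not measurements of real equipment:

```python
# Overall efficiency of a power delivery chain is the product of
# the efficiencies of its conversion stages. The figures here are
# illustrative assumptions, not data from real equipment.
def chain_eff(stages):
    eff = 1.0
    for s in stages:
        eff *= s
    return eff

ac_chain = [0.96, 0.95, 0.92]   # e.g. UPS, distribution, server PSU
dc_chain = [0.96, 0.94]         # e.g. rectify once, then DC-DC in server
print(f"AC chain: {chain_eff(ac_chain):.2f}")  # AC chain: 0.84
print(f"DC chain: {chain_eff(dc_chain):.2f}")  # DC chain: 0.90
```

Because losses multiply at every stage, removing even one conversion step yields a meaningful gain, which is the rationale behind DC power distribution.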
