Data Center & Open Compute
The information processed by data centers in our daily life, business, and social activities is rapidly diversifying. An increasingly diverse range of tasks (information processing work) is being performed in data centers as their applications expand.
Each task processed in a data center has its own characteristics. Information processing in data centers tends to be discussed as if it were all one and the same. In fact, however, tasks differ widely in the amount and quality of data to be processed, the processing procedures, and the required accuracy and speed (Fig. 1).
The information processing systems installed in data centers up to now have been designed with the intent of quickly and efficiently handling a great variety of tasks from an unspecified large number of users. Therefore, system specifications that can achieve high versatility and flexibility have been adopted. For example, central processing units (CPUs), the most highly versatile processors, were frequently used for the processors of servers that serve as the brain of data centers. The same chip was applied to processing diverse tasks.
However, the picture has changed greatly in data centers in recent years. Systems are now being built to process tasks by using different processors depending on the characteristics of those tasks. We explain in this article the changes in the basic design concept of data centers against the background of an expansion in application areas and an increase in the number of users and usage scenarios.
There are broadly five types of semiconductor chips being used in today's data centers as the processors that play the role of the brain when performing calculations. We look here at what kind of tasks are processed by what chips (Fig. 2).
First, let's turn our attention to daily business operation processing (customer management, product order processing, etc.). This has been the most common form of information processing in data centers up to now. It involves executing a huge number of routine tasks. The tasks require flexible responses suited to each customer, including complex conditional branches and sequential processes, but the load of each individual task is not that great. When such processing is performed on a system in the cloud, a multi-core processor equipped with a large number of CPU cores, which excel at flexibly handling sequential processes, performs the tasks in parallel. Such a system can quickly and efficiently process a huge number of independent tasks that have little relation to each other.
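The pattern described above, many independent, branch-heavy tasks fanned out across cores, can be sketched as follows. This is an illustrative model only; the order-processing function, its fields, and the thresholds are hypothetical, not taken from any real system.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routine task: each order is processed independently,
# with simple conditional branching, and the per-task load is small.
def process_order(order):
    if order["qty"] <= 0:
        return {"id": order["id"], "status": "rejected"}
    total = order["qty"] * order["price"]
    # Sequential, branch-heavy logic of this kind suits CPU cores.
    status = "bulk" if order["qty"] >= 100 else "standard"
    return {"id": order["id"], "status": status, "total": total}

orders = [{"id": i, "qty": i % 120, "price": 9.99} for i in range(1, 1001)]

# Independent tasks fan out across workers with no coordination between them.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process_order, orders))
```

Because no task depends on another's result, throughput scales simply by adding workers, which is why multi-core CPUs fit this workload.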
On the other hand, artificial intelligence (AI) and machine learning-related processing, for which demand has risen rapidly in recent years, requires a different kind of information processing. These tasks place a high load on the system because extremely large data sets are fed into pre-prepared neural networks and the same calculations are performed repeatedly. The calculations themselves are relatively simple, but a feature of this processing is that each individual task can be huge. When performing this kind of processing in the cloud, it is common to use a graphics processing unit (GPU) equipped with thousands of cores suited to the matrix calculations and similar operations frequently performed in AI-related processing.
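The structure of that workload, a large batch of data pushed through the same matrix calculation layer after layer, can be shown in a toy form. This is not a real AI framework; the batch size, layer width, and number of layers are arbitrary assumptions chosen only to illustrate the repeated, uniform arithmetic that GPUs parallelize.

```python
import numpy as np

# Toy model of neural-network inference: the same matrix multiplication
# is repeated for every layer, over a large batch of inputs. This uniform,
# massively parallel arithmetic is what GPU cores accelerate.
rng = np.random.default_rng(0)
batch = rng.standard_normal((512, 256))                      # large input batch
weights = [rng.standard_normal((256, 256)) * 0.01 for _ in range(4)]

x = batch
for w in weights:                    # the same calculation, performed repeatedly
    x = np.maximum(x @ w, 0.0)       # matrix multiply + ReLU for each layer
```

Each `x @ w` here is millions of independent multiply-accumulate operations, the kind of work that maps naturally onto thousands of simple cores.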
Processors responsible for AI-related processing have become more segmented in recent years. In particular, for inference with trained models, whose calculation specifications can be clearly defined, dedicated chips optimized for specific tasks, such as tensor processing units (TPUs), are now being applied. This is because they enable the pursuit of even faster, lower-power, and more cost-effective processing. However, at the stage of researching and developing even more advanced AI, it is necessary to try various models, algorithms, and AI frameworks. Therefore, GPUs, which combine high versatility with parallel processing power, are still being used as before in the training process that enhances the performance of the AI model itself.
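One reason fixed calculation specifications help dedicated inference chips is that the arithmetic can be narrowed to low-precision integers. The following toy sketch (the vectors, scale factor, and helper names are invented for illustration) shows the idea: a dot product computed in integer-quantized form closely approximates the floating-point result.

```python
# Toy illustration of low-precision inference: once a model's calculations
# are fixed, they can run as integer multiply-accumulates, the narrowed-down
# arithmetic that dedicated chips such as TPUs are built around.
def quantize(xs, scale=127.0):
    return [round(x * scale) for x in xs]       # float -> int8-range integers

def int_dot(a, b, scale=127.0):
    # Integer multiply-accumulate, rescaled back to a float at the end.
    return sum(x * y for x, y in zip(a, b)) / (scale * scale)

w = [0.5, -0.25, 1.0]                           # assumed "trained" weights
x = [0.2, 0.4, -0.1]                            # assumed input
approx = int_dot(quantize(w), quantize(x))      # quantized result
exact = sum(a * b for a, b in zip(w, x))        # full-precision reference
```

Training, by contrast, must remain flexible in precision and structure, which is one reason GPUs keep that role.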
Moreover, in AI-related processing, data is exchanged between GPUs and memory or storage at higher speeds and in larger volumes than with conventional tasks. Accordingly, the introduction of even higher-bandwidth networks is also required. In general, servers specialized for AI-related processing are rarely mixed with general-purpose servers; AI data centers tend to be installed separately from general-purpose data centers.
Among the tasks processed in data centers, some are performed in common across all applications. Typical examples include networking, packet processing, and other calculations that control the huge volumes of communication data flowing over the Internet, as well as storage reads and writes. In addition, security-related tasks, such as data encryption and compression/decompression, are now also being performed. In particular, social media, video streaming, and similar applications require such tasks to be executed quickly and in huge volumes.
All of this processing was once done by the CPUs that were performing customer management and other processes for companies. There have been an increasing number of cases in recent years, however, in which processors specialized for network processing and other tasks called data processing units (DPUs) are being used. Using DPUs makes it possible to reduce the load on the CPUs. This enables expensive CPUs to be allocated to processing that can only be performed by them. As a result, the performance of the entire data center improves.
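The offloading idea can be sketched conceptually. This is not a real DPU API; a thread pool stands in for the offload engine, and the request handler and its payload are invented for illustration. The point is the division of labor: data-path work (compression, checksums) is handed off so the "CPU" path stays free for logic only it can run.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a DPU: a dedicated executor that absorbs data-path work
# (compression and checksumming) so the main path is not blocked by it.
offload = ThreadPoolExecutor(max_workers=2)

def handle_request(payload: bytes):
    # Hand the offloadable work to the "DPU".
    future = offload.submit(lambda p: (zlib.compress(p), zlib.crc32(p)), payload)
    # Meanwhile, the CPU performs the application logic only it can do
    # (here, a trivial stand-in computation).
    business_result = payload.count(b"order")
    compressed, checksum = future.result()      # collect the offloaded result
    return business_result, compressed, checksum

count, blob, crc = handle_request(b"order data " * 100)
```

In a real deployment the offloaded stage would run on separate silicon rather than a thread, but the scheduling benefit, freeing the CPU for CPU-only work, is the same.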
Among the tasks processed by DPUs, network processing particularly requires real-time performance. At the same time, such processing is often performed on data in standard formats, so the calculations involved are relatively easy to narrow down. Because of these characteristics, field-programmable gate arrays (FPGAs), which can realize dedicated hardware by writing programs, have become widely used. With an FPGA, the data being processed does not need to be returned to external memory each time a processing step completes, which enables extremely efficient processing.
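That streaming style of processing can be modeled with chained generators. This is only a software analogy (the packet fields and priority threshold are invented): each stage consumes values directly from the previous one, so intermediate results flow stage to stage instead of being written back to an external buffer, mirroring how an FPGA pipeline keeps data on-chip.

```python
# Conceptual model of an FPGA-style streaming pipeline (illustrative only):
# values flow directly from stage to stage with no intermediate buffer.
def parse(stream):
    for raw in stream:
        yield raw & 0xFF                 # stage 1: extract a header byte

def classify(headers):
    for h in headers:
        yield (h, "high" if h >= 0x80 else "low")   # stage 2: classify

def count_high(classified):
    total = 0
    for _, prio in classified:
        total += prio == "high"          # stage 3: aggregate in flight
    return total

packets = range(0, 256)                  # a stand-in packet stream
high = count_high(classify(parse(packets)))
```

In hardware, all three stages would operate concurrently on successive packets, one packet entering as another leaves, which is where the efficiency comes from.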
A trend can be seen in recent cloud services in which the selectable hardware is becoming even more segmented, such as by offering diverse CPU architectures in addition to different types of processors. Furthermore, data centers are expected to become increasingly diverse in the future, for example through the use of quantum gate, quantum annealing, and various other forms of quantum computers.
Furthermore, data centers are also beginning to diversify from a slightly different perspective. Attention has been focused on the movement to develop and install small data centers called "micro data centers (MDCs)" as a new form of data center (Fig. 3).
An MDC is a small data center about the size of a refrigerator. A feature of MDCs is that all the functions required of a data center are contained within one standard rack: uninterruptible power supplies (UPSs) and other power equipment and cooling systems, in addition to servers, networking, and security systems. In general, they are rarely used alone; rather, they are operated as satellites of large cloud data centers, remotely monitored and managed in conjunction with the cloud. In addition, MDCs are basically installed close to the site (the edge side) where data is collected and utilized. Therefore, another feature of MDCs is that they require designs adaptable to various environments, including wall-mounted, ruggedized, and soundproofed models.
It is assumed that the main application of MDCs will be in edge computing. It is envisioned that MDCs will be utilized in applications such as to realize data processing with high real-time performance by being installed adjacent to IoT devices and local networks set up in factories, industrial plants, large commercial facilities, and other sites. It is also thought that they will be installed in medical facilities, educational institutions, and other sites to provide safe storage and quick access for highly confidential information. In addition, there is also a movement to combine MDCs with 5G base stations to, for instance, realize low latency services and to improve network efficiency.
Data centers are evolving to take on more and more roles every year. The configurations and specifications of the systems that are introduced will likely become more diverse.