What Is Digital Data? The Basics of Digital Communication
The widespread adoption of digital devices known as personal computers (PCs) alongside the Internet from the mid-1990s has led to the rapid development of the information and digital society. Furthermore, the appearance of various digital devices such as smartphones has filled our lives today with digital data, and it can be said that digital technology has become a natural part of society and of individual thinking.
Given this context, we feel it is important to introduce the elemental digital technologies that support society and our lives, such as digital wireless communication and digital modulation that enable anyone to easily transmit digital data, in order to understand the role of these technologies in society and their importance. Before that, this article will first cover basic matters related to digital technology, such as the types, characteristics, and units of quantity of digital data.
INDEX
1.1 Types of Digital Data and Their Characteristics
1.2 How to Express Data Size (Data Quantity)
2. Representation of Various Data
1. What Is Digital Data?
Digital data consists of "0s" and "1s" and is represented in binary rather than the decimal system we use in everyday counting. It is also characterized by the lack of a physical form. The types, characteristics, and sizes (quantities) of digital data are summarized below.
1.1 Types of Digital Data and Their Characteristics
The first thing to know about digital data (hereinafter "data") is the types of data and their characteristics. Types of data can be broadly classified as shown in Table 1. Note that video can be considered a collection of still images.
| Type | Concrete examples |
|---|---|
| Text (characters) | Email text, articles, documents, programs, etc. |
| Images (still images) | Photographs, illustrations, etc. |
| Video | Movies, television programs, etc. |
| Sound: audio | Narration, telephone call recordings, etc. |
| Sound: music | Songs, techno, background music, etc. |
*1 Another term used alongside data is "information." "Data" and "information" are generally thought of as follows.
• Data: A collection of symbols and codes that represent facts
• Information: Text, images, and other data that can be interpreted by humans and used to make decisions, take action, etc.
In this sense, Table 1 shows types of "information," but for convenience, this article will not distinguish between "information" and "data," and will simply use the term "data" unless otherwise noted.
Table 2 summarizes the characteristics of data. The convenience and efficiency provided by these characteristics could be said to form the foundation of today's digital society.
| Characteristic | Explanation |
|---|---|
| Replicability | Copies can be made without degradation. |
| Integrity | The various data shown in Table 1 can be handled collectively, and accuracy can be maintained across different devices. |
| Transmissibility | Instantaneous transmission over the Internet |
| Compressibility | The quantity of data can be reduced (compressed). |
| Searchability | Easy to search |
| Survivability | Difficult to completely erase |
| Protection | Access can be controlled (encryption). |
| Editability | Easy to add, delete, or modify |
1.2 How to Express Data Size (Data Quantity)
The next thing to know is that although data has no form, it has size (quantity). Data is represented by "0s" and "1s," and its basic unit is the bit. A single "0" is 1 bit and a single "1" is also 1 bit, making the bit the smallest unit of data. Furthermore, it has been standard since the 1960s for 8 bits to be 1 byte (Byte, B), and this unit is also widely used (Table 3).
Table 4 shows examples of how to calculate the data quantity for each data type shown in Table 1. Although it depends on the data usage, the data quantity tends to increase in the following order: text < sound < still image < video.
The assumptions for the various data calculations are explained in the following item, "2. Representation of Various Data," and the prefixes M (mega) and G (giga) are explained in "Column: Prefixes for Data Units."
| Type | Examples of calculating data quantity |
|---|---|
| Text | Quantity of character data for 500 characters |
| Images | Quantity of image data for resolution: horizontal 1280 pixels × vertical 1080 pixels |
| Video | Quantity of video data for 2 minutes × 30 frames/second (with the image above as 1 frame) |
| Sound (audio, music) | Quantity of sound data for 5 minutes with sampling frequency: 44.1 kHz |
*2 We won't go into detail here, but in actual data communications, a technology called compression is used to reduce the quantity of various data in order to shorten communication times (see Table 2). Compression is a breakthrough technology that is difficult to realize with analog but can be achieved through digital means.
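The calculations outlined in Table 4 can be sketched in Python as follows. This is only a rough illustration of uncompressed sizes. Several values are assumptions rather than figures taken from the table: 3 bytes per character (the UTF-8 size of many Japanese characters), 24-bit color, 16-bit quantization, and a single audio channel. The function names are hypothetical.

```python
# Uncompressed data quantities for the data types in Table 4.
# Assumed values (not stated in the table): 3 bytes per character,
# 24-bit color, 16-bit quantization, one audio channel.

def text_bytes(chars, bytes_per_char=3):
    """Bytes needed for a number of characters."""
    return chars * bytes_per_char

def image_bytes(width, height, bits_per_pixel=24):
    """Bytes needed for one still image."""
    return width * height * bits_per_pixel // 8

def video_bytes(frame_bytes, seconds, fps=30):
    """Bytes needed for a video made of identical-size frames."""
    return frame_bytes * seconds * fps

def sound_bytes(seconds, sampling_hz=44_100, bits_per_sample=16, channels=1):
    """Bytes needed for sampled sound."""
    return seconds * sampling_hz * bits_per_sample * channels // 8

text = text_bytes(500)                # 500 characters
image = image_bytes(1280, 1080)       # one 1280 x 1080 frame
video = video_bytes(image, 2 * 60)    # 2 minutes at 30 frames/second
sound = sound_bytes(5 * 60)           # 5 minutes at 44.1 kHz

for name, size in [("text", text), ("image", image), ("sound", sound), ("video", video)]:
    print(f"{name}: {size} bytes")
```

Under these assumptions, 5 minutes of sound turns out larger than a single still image, which shows how much the ordering depends on the duration and settings chosen.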
2. Representation of Various Data
2.1 Text Data
As mentioned in the previous section, characters (including emojis) displayed on smartphones and PCs are represented in binary with "0s" and "1s." The characters handled by these digital devices are described by numerical values known as character codes, and the correspondence table between characters and character codes is referred to as the "character code system." Table 5 shows the main character codes (ASCII, Shift JIS, Unicode).
| Character code | Description |
|---|---|
| ASCII | An early character code created in the U.S. following the invention of computers. Example: ASCII code for "G" → "1000111" |
| Shift JIS | A character code corresponding to Japanese. Example: Shift JIS code for "友" → "1001011101000110" |
| Unicode | An international standard for a universal character code created in 1993. UTF-8, which is based on Unicode and fully compatible with ASCII, is generally used. The UTF-8 character code system represents many Japanese characters by 3 bytes. Example: UTF-8 code for "与" → "111001001011100010001110" |
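The examples in Table 5 can be reproduced with Python's built-in codecs. The sketch below uses a hypothetical helper, `to_bits`, that prints each encoded byte as 8 binary digits. ASCII is a 7-bit code, so the 8-bit output for "G" carries one extra leading 0 compared with the table.

```python
def to_bits(char, encoding):
    """Return the encoded bytes of `char` as a string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in char.encode(encoding))

ascii_g = to_bits("G", "ascii")      # 1 byte; leading 0 pads the 7-bit code
sjis = to_bits("友", "shift_jis")    # 2 bytes
utf8 = to_bits("与", "utf-8")        # 3 bytes

print(ascii_g)
print(sjis)
print(utf8, len(utf8) // 8, "bytes")
```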
2.2 Image and Video Data
The basic characteristics of photos taken by smartphones and images (still images) displayed on websites are determined by the following factors:
(1) Pixels
(2) Resolution
(3) Color information (gradation)
For videos, there is an additional factor called (4) frame rate.
(1) Pixels
The smallest unit of an image is 1 pixel, whose color is produced by combining red, green, and blue (RGB) light.
(2) Resolution
Resolution indicates the detail of an image. For example, the resolution of a 4K display is horizontal 3840 × vertical 2160 pixels, for a total of 8294400 or about 8.3 megapixels. The lower the resolution, the coarser the image, and the higher the resolution, the more detailed the image becomes.
On the other hand, the number of pixels per inch is also sometimes called the resolution, and this unit is pixels per inch: ppi (or dots per inch: dpi for printed matter). For example, the resolution of a 27-inch 4K display can be calculated as 163 ppi.
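Both meanings of resolution can be checked with a short calculation. The sketch below assumes that the quoted display size (27 inches) is the diagonal length of the screen, which is the usual convention. The function names are hypothetical.

```python
import math

def total_pixels(width, height):
    """Total pixel count of a width x height display."""
    return width * height

def pixels_per_inch(width, height, diagonal_inches):
    """Pixel density: pixels along the diagonal divided by its length in inches."""
    return math.hypot(width, height) / diagonal_inches

pixels_4k = total_pixels(3840, 2160)
ppi_27 = pixels_per_inch(3840, 2160, 27)

print(pixels_4k)        # total pixels of a 4K display
print(round(ppi_27))    # pixel density of a 27-inch 4K display
```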
(3) Color information (gradation)
Color information, or gradation, is expressed by discrete values indicating the shade of each of the three primary colors of light: red, green, and blue. For example, if each color is 8 bits (24-bit full color), 2⁸ × 2⁸ × 2⁸ = 256 × 256 × 256 = 16777216 colors can be represented.
For example, the data quantity for a 3 × 2 pixel resolution as shown below with True Color is 6 × 24 bits = 144 bits.
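The color count and the size of the small example image follow directly from the bit depth. A minimal sketch, with hypothetical function names and assuming three color channels of equal depth:

```python
def color_count(bits_per_channel, channels=3):
    """Number of distinct colors for the given bit depth per channel."""
    return 2 ** (bits_per_channel * channels)

def image_bits(width, height, bits_per_channel=8, channels=3):
    """Uncompressed size of an image in bits."""
    return width * height * bits_per_channel * channels

colors = color_count(8)     # 24-bit color
bits = image_bits(3, 2)     # 3 x 2 pixel image

print(colors, bits)
```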
(4) Frame rate
Frame rate refers to the number of still images (frames) shown on a display during 1 second, and is usually expressed in units of frames per second: fps. Table 6 shows typical frame rates and examples of use. Video is used not only for promotion, but also in a variety of fields from research and development, education, and medicine to entertainment. As such it has come to be recognized and used as a common form of content.
| Frame rate | Examples of use |
|---|---|
| 23.976 fps | Movies, video discs, etc. |
| 29.97 fps | Television broadcasts, video discs, etc. |
| 59.94 fps | Television broadcasts, 4K video discs |
| 120 fps | Games, slow-motion filming, etc. |
| 240 fps or higher | Super slow-motion filming, etc. |
2.3 Sound Data
The basic characteristics of sound and music data stored on music players and smartphones are determined by:
• Sampling frequency
• Quantization bit rate
Here, we explain these terms, and with them the digitization of sound*3 (that is, conversion from an analog signal to a digital signal), using the basic specifications of a music compact disc (music CD) as an example (sampling frequency: 44.1 kHz, quantization bit rate: 16 bits), as shown in Figs. 1 and 2 below.
*3 For the differences between analog and digital, see "Column: Analog Data and Digital Data."
(a) To simplify the explanation, one second of an analog sound signal is taken as the unit of time.
(b) The horizontal time axis is divided into segments (sampled) at equal intervals. In this example, there are 10 segments, so the sampling period is 1/10 second.
(c) The vertical level axis is also segmented and a value is assigned to each level (quantization). During the quantization assignment process, differences may occur between the analog signal and the assigned values. Quantization level values start from "0."
(d) A code is assigned (encoding) to the value of each level at each sample point. In this example, there are 10 sample points per second and the quantization bit rate is 3 bits.
Based on the flow of sound digitization in Fig. 2, the basic specifications of a music CD (sampling frequency: 44.1 kHz, quantization bit rate: 16 bits) have the following meanings.
- There are 44100 segments per second, so the time for one segment (sampling period) is about 22.7 microseconds.
- The sound signal level is divided into 2¹⁶ (65536) levels.
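The three steps of sampling, quantization, and encoding can be sketched in a few lines of Python. The example uses the simplified settings from the explanation (10 samples per second, 3 bits). The input signal `tone` is a hypothetical 1 Hz wave scaled into the range 0.0 to 1.0, and the truncating quantizer is just one simple choice among several.

```python
import math

def digitize(signal, duration_s, sampling_hz, bits):
    """Sample, quantize, and encode `signal`, a function of time returning 0.0-1.0."""
    levels = 2 ** bits
    codes = []
    for n in range(int(duration_s * sampling_hz)):
        t = n / sampling_hz                            # sampling at equal intervals
        value = signal(t)
        level = min(int(value * levels), levels - 1)   # quantization, levels start at 0
        codes.append(format(level, f"0{bits}b"))       # encoding as a binary code
    return codes

def tone(t):
    # Hypothetical analog input: a 1 Hz wave scaled into 0.0-1.0.
    return 0.5 + 0.5 * math.sin(2 * math.pi * t)

codes = digitize(tone, duration_s=1, sampling_hz=10, bits=3)
bits_per_second = len(codes) * 3

# The same relations applied to the music CD specifications.
cd_period_us = 1 / 44_100 * 1_000_000
cd_levels = 2 ** 16

print(codes, bits_per_second)
print(round(cd_period_us, 1), cd_levels)
```

With 10 samples of 3 bits each, one second of this simplified signal occupies 30 bits.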
Section 2.3, though somewhat repetitive, explained the conversion of analog data to digital data (analog-to-digital conversion: A/D conversion). It involves unfamiliar terminology and the concepts of time and frequency, so it may be more challenging to understand than the still image and video data content in section 2.2.
Column: Prefixes for Data Units
The rapid development of the information society due to factors such as the promotion of digital transformation (DX) is dramatically increasing the amount of data used around the world. In fact, the amount of digital data generated globally in 2020 was about 15 ZB (zettabytes). Moreover, due to the impact of factors such as online meetings in response to the spread of new infectious diseases, it is predicted that the amount of data generated will reach 180 ZB by 2025, or more than 10 times the 2020 level.
To address this social situation, at the 27th General Conference on Weights and Measures (November 2022), new SI prefixes*4 were added for the first time in 31 years since 1991: "quetta" for 10 to the power of 30, and "ronna" for 10 to the power of 27. (At the same time, their reciprocals were also added: "quecto" for 10 to the power of negative 30, and "ronto" for 10 to the power of negative 27.) This brings the total number of SI prefixes to 24, and expands the number of digits to 60 (Table 7).
*4 Prefixes used with SI units (international units of measurement) such as kilogram (kg) and meter (m) to indicate multiples such as 10 times or 100 times, or their reciprocals such as 1/10 or 1/100.
*5 Data is handled in binary, so it has become common to attach the following prefixes in steps of 2¹⁰ = 1024 times. For example:
• 1024 times 1 B → 1 KB (1024¹)
• 1024 times 1 KB → 1 MB (1024²; 1048576 times 1 B)
(Uppercase K and lowercase k are sometimes used distinctly as in 1000 B = 1 kB.)
There are also binary prefixes according to an international standard (IEC 80000-13), such as Ki (kibi) for 1024¹, Mi (mebi) for 1024², Gi (gibi) for 1024³, and Ti (tebi) for 1024⁴, but these do not seem to have achieved widespread acceptance yet.
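The difference between decimal (SI) and binary (IEC) prefixes can be made concrete with a small conversion sketch. The dictionary names and the `convert` helper are hypothetical. The unit symbols follow the SI and IEC 80000-13 conventions.

```python
# Decimal (SI) prefixes step by 1000; binary (IEC) prefixes step by 1024.
SI = {"kB": 1000**1, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
IEC = {"KiB": 1024**1, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

def convert(n_bytes, unit):
    """Express a byte count in the given SI or IEC unit."""
    return n_bytes / {**SI, **IEC}[unit]

one_mib = IEC["MiB"]                     # bytes in 1 MiB
tb_in_tib = convert(SI["TB"], "TiB")     # 1 TB expressed in TiB

print(one_mib)
print(round(tb_in_tib, 3))
```

The gap between the two systems widens with each prefix step: 1 TB is only about 0.909 TiB.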
Column: Analog Data and Digital Data
This article has explained data as digital data, but it also briefly touched on analog data such as sound. This column will again present the general differences between analog and digital data.
- Analog data: A quantity that changes continuously without breaks - a continuous quantity (e.g., the values of a clock with hour and minute hands, the values of a liquid thermometer)
- Digital data: A quantity that changes in a discrete and non-continuous manner - a discrete quantity (e.g., the values of a clock or thermometer with numerical display)
Recently, electronic devices that enhance functionality by converting analog data other than sound (for example images) to digital data are also gaining widespread acceptance due to the convenience of digital data as described in section 1.1. In an information society where processing and editing of audio and image data using various electronic devices such as smartphones have become commonplace, the advantages of digital over analog are promoting the rapid replacement of traditional analog devices with digital ones, and this trend is certain to continue in the future.