Feature Articles: NTT R&D Forum 2019 Special Sessions

A Digital World of Humans and Society—Digital Twin Computing

Ryutaro Kawamura
Senior Vice President, Head of NTT Service Innovation Laboratory Group


This article introduces a lecture presented by Ryutaro Kawamura, senior vice president and head of NTT Service Innovation Laboratory Group, at a special session of NTT R&D Forum 2019 held on November 14 and 15, 2019. The lecture described the latest research activities at the laboratory group, focusing on NTT’s vision of the Innovative Optical and Wireless Network (IOWN).

Keywords: IOWN, Digital Twin Computing, digital representation of humans


1. How will humans use the power of digital technology in the future?

Looking 10 to 20 years into the future, to what end will we, humanity, use the power of ever-advancing digital technology? As you know, predictions are being made about the coming of a major turning point in social systems, and Japan’s Society 5.0 initiative reflects such ideas. Discussions are also raising the possibility that the long-lived capitalistic society is coming to an end.

Let’s look back at the history of digitalization in relation to humans and things over the past 30–40 years. To begin with, email appeared around 1985. This development can be seen as a step forward in digitalization centered around humans. Then, around 1995, the Internet appeared, and at the same time, the digitalization of information accelerated in relation to things, which improved daily life and services such as those providing products, timetables, and maps. Next, a new era in human communication appeared around 2005 in the form of social networking services (SNSs), and since 2015, we have been in an era marked by the digitalization of things driven by a combination of the Internet of Things (IoT) and artificial intelligence (AI).

This recent history shows that the digitalization of humans and the digitalization of things have progressed in an alternating cycle. Taking this cycle into account and viewing the recent development of IoT, we consider that it is time for human digitalization to take another turn. Of importance here is that value in this new era will increase not in a linear and proportional manner but rather in an explosive and discontinuous manner, and I predict that the time for this to occur is imminent (Fig. 1).

In this regard, the expression digital twin has recently come into use in the sense that the digital representation of humans and things can enable replication, fusion, and exchange as well as saving and recording, all of which are strong points of digitalization. However, in much the same way that so-called silos have appeared in industry, the fact is that the digital twin concept has progressed in separate industries with no mutual compatibility.

Fig. 1. Progress in digitalization in relation to humans and things.

2. Digital Twin Computing initiative

The digital twin framework up to now has been to map individual real-world targets, such as automobiles and robots, into cyberspace and to analyze those targets and make predictions. The results of such analysis and predictions are then mapped back to the real world and put to use.

As an extension of this conventional digital twin framework, we proposed Digital Twin Computing and began an initiative to make it a reality (Fig. 2) [1]. Digital Twin Computing takes digital twins of things and humans in diverse industries and performs computations on them in any desired combination. This makes it possible to accurately reproduce combinations that could not be comprehensively handled up to now, such as humans and automobiles in a city, and to make predictions about the future. In addition, Digital Twin Computing represents a new computing paradigm that goes beyond physical reproduction of the real world by achieving interactive effects among digital twins including the inner state (e.g., consciousness and thought) of humans in cyberspace. This initiative will endeavor to configure a virtual society composed of a variety of digital twins, replicate in cyberspace digital twins of single entities in the real world, and exchange or fuse some of the elements constituting different digital twins to generate new digital twins that do not exist in the real world.

Fig. 2. Digital Twin Computing initiative.

This initiative will take up the challenge of achieving a digital representation of the inner state of individuals. By representing not just the outward appearance of humans but their inner state as well, it should be possible to achieve advanced interactivity even from a social perspective such as human mobility and communication. Moreover, representing the personality of every person should make it possible to achieve interactivity based on diversity and individual features as opposed to interactivity between individual digital twins without personality that are statistically rounded out as average values.

We argue that these features will enable the creation of a virtual society in which a variety of things and humans interact with each other in advanced and sophisticated ways beyond the limitations of the real world.
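The replicate, exchange, and fuse operations described in this section can be illustrated with a minimal Python sketch. It is not NTT's implementation: the `DigitalTwin` class, its `elements` dictionary, and the three functions are hypothetical names introduced here only to show how new twins that do not exist in the real world might be derived from existing ones.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class DigitalTwin:
    """Hypothetical digital twin: a name plus named constituent elements."""
    name: str
    elements: dict = field(default_factory=dict)


def replicate(twin, new_name):
    """Return an independent copy of a twin under a new name."""
    return DigitalTwin(new_name, deepcopy(twin.elements))


def exchange(a, b, key):
    """Return copies of a and b with the element `key` swapped between them."""
    a2, b2 = replicate(a, a.name), replicate(b, b.name)
    a2.elements[key] = deepcopy(b.elements[key])
    b2.elements[key] = deepcopy(a.elements[key])
    return a2, b2


def fuse(a, b, new_name, take_from_b):
    """Build a new twin from a's elements, overridden by selected elements of b."""
    fused = replicate(a, new_name)
    for key in take_from_b:
        fused.elements[key] = deepcopy(b.elements[key])
    return fused


# Illustrative data only; the attribute names are assumptions.
driver = DigitalTwin("driver", {"skill": "expert", "language": "ja"})
guide = DigitalTwin("guide", {"skill": "novice", "language": "en"})
hybrid = fuse(driver, guide, "hybrid", ["language"])
swapped_driver, swapped_guide = exchange(driver, guide, "skill")
```

Each operation returns new objects and leaves its inputs unchanged, which mirrors the idea that derived twins coexist with the twins mapped from the real world.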

3. Digital representation of humans

It is important to note that a human digital twin in Digital Twin Computing can provide not only a digital representation of anatomical and physiological features but also a digital representation of a person’s inner state. There are two main approaches to achieving such a difficult objective. The first approach is to emulate human abilities using computers and to repeat that process to “get continuously closer to human qualities.” Technologies for recognizing sounds and voices and for communicating via conversation are good examples of how progress can be made with this approach. The second approach, which might be called the ultimate approach, is to physiologically clarify the human brain and body and transcribe the results to a computer. This field, which is representative of brain and neuroscience, has been making much progress and has already been producing research results that can be used for engineering purposes. Our plan is to work toward this digital representation of humans by using the best elements of these two approaches (Fig. 3).

Fig. 3. Two main approaches to digital representation of humans.

The following introduces several key technologies that NTT laboratories have so far taken up with respect to the first approach (Fig. 4).

Fig. 4. Human digitalization—progress in approach using speech technology.

3.1 Speech recognition

NTT laboratories have been researching for half a century how to accurately recognize the human voice as a technology for listening [2]. This research began with the recognition of words and clearly spoken sentences, but from around 2010, the technology was able to accurately recognize natural human utterances and be used at customer-contact centers. With the introduction of the latest neural networks, the technology is finally approaching the abilities of human speech recognition.

3.2 Speech synthesis

The question here is how and to what extent textual information can be converted to natural, humanlike speech. This technology includes text analysis processing to determine the reading of kanji (Chinese characters) in the Japanese language according to context and processing for synthesizing speech signals with appropriate pitch and speed. Since around 1990, the technology has been used for making automatic replies to calls as well as synthesizing speech of animated characters, robots, etc. Deep learning based on speaker voice data is driving the synthesis of natural and diverse voices that have the feel of a real voice.

3.3 Understanding emotions and intent

Various approaches are being undertaken to develop technology for understanding the other party in a conversation to the point of identifying a person’s gender, emotions, and degree of urgency. This can be accomplished, for example, by detecting whether a customer is dissatisfied or angry from a call between that customer and a contact-center operator. This technology is capable of detecting cold anger (calm and cool expression of anger), which is usually difficult to infer, from not only the loudness and pitch of voices but also from conversation rhythm, choice of words, etc. It is also able to achieve high-accuracy recognition of satisfaction, which is a feature that does not appear as easily as dissatisfaction.

4. Layered structure and hourglass structure

Of importance when studying Digital Twin Computing technology and architecture is whether an hourglass structure can be added to a layered structure (Fig. 5). This is because the creation of a common layer in the middle of a layered structure can drive innovation. The Internet is a good example of this concept since the positioning of the IP (Internet protocol) layer as a common layer makes for smooth interaction between the lower network layers and the upper application layers. In Digital Twin Computing, as well, we consider that a narrow section, namely this common layer, is necessary and that the digital twin layer in Digital Twin Computing architecture will serve as this section. The digital twin layer maintains digital twins generated from various types of sensor data in real space and derivative digital twins generated from computations on digital twins. These digital twins maintained in the digital twin layer serve as basic constituent elements for constructing diverse virtual societies.

Fig. 5. Key to expansion: layered structure × hourglass structure.
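The role of the digital twin layer as the narrow waist of the hourglass can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Digital Twin Computing architecture itself: the `DigitalTwinLayer` class and its `put`, `get`, and `derive` methods are hypothetical names chosen for this example.

```python
class DigitalTwinLayer:
    """Hypothetical common layer: the narrow waist of the hourglass."""

    def __init__(self):
        self._twins = {}

    def put(self, twin_id, state):
        """Store a twin, for example one generated from sensor data."""
        self._twins[twin_id] = dict(state)

    def get(self, twin_id):
        """Return a copy of a stored twin's state."""
        return dict(self._twins[twin_id])

    def derive(self, new_id, source_ids, compute):
        """Store a derivative twin computed from existing twins."""
        inputs = [self.get(i) for i in source_ids]
        self.put(new_id, compute(inputs))
        return self.get(new_id)


# Lower side: different sensor feeds write through the same interface.
layer = DigitalTwinLayer()
layer.put("car-1", {"speed_kmh": 40})
layer.put("car-2", {"speed_kmh": 60})


# Upper side: an application reads twins and registers a derivative twin.
def average_speed(states):
    return {"speed_kmh": sum(s["speed_kmh"] for s in states) / len(states)}


traffic = layer.derive("traffic", ["car-1", "car-2"], average_speed)
```

Because data sources and applications touch only the small `put`/`get`/`derive` interface, either side can change without affecting the other, which is the property the IP layer gives the Internet.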

5. Technology supporting large-scale computation

It is preferred that the results of discussions among many human digital twins in cyberspace be fed back to real space and reflected in decision-making. This will require computing technology that far exceeds current computer performance. One example of technology supporting such large-scale computation is the LASOLV™ coherent Ising machine now being researched and developed at NTT laboratories. LASOLV is used to find solutions to combinatorial optimization problems by using special optical pulses to reach a physically stable state through mutual interaction. We can expect this technology to enable high-speed processing on an order of magnitude different from that of current computers. We are also developing middleware that simplifies the use of LASOLV in the Python language [3].
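To make the idea of a combinatorial optimization problem posed as an Ising model concrete, the following Python sketch formulates a small max-cut problem as spin couplings and finds the lowest-energy state. It does not use LASOLV or its middleware, whose interface is not described here. The exhaustive search is a classical stand-in that works only for tiny problems, and the function names are assumptions made for this example.

```python
from itertools import product


def ising_energy(spins, couplings):
    """Energy E = -sum of J_ij * s_i * s_j over coupled pairs (i, j)."""
    return -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())


def solve_brute_force(n, couplings):
    """Try all 2^n spin assignments; a classical stand-in, not LASOLV."""
    return min(product((-1, 1), repeat=n),
               key=lambda s: ising_energy(s, couplings))


# Max-cut on a 4-node ring with unit weights: J = -1 on every edge, so
# neighboring spins lower the energy when they point in opposite directions.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
couplings = {e: -1 for e in edges}

best = solve_brute_force(4, couplings)
cut_size = sum(1 for a, b in edges if best[a] != best[b])
print(best, cut_size)
```

The lowest-energy spin assignment splits the ring into two alternating groups, cutting all four edges. An Ising machine searches for such a low-energy state physically rather than by enumeration, which matters because the number of assignments doubles with every added spin.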

6. Use cases of Digital Twin Computing

Digital Twin Computing can be used on a variety of scales, as shown in Fig. 6. Specifically, the following uses can be expected.

  • High-speed and parallel debate and decision-making by human digital twins
  • Development of solutions to difficult national problems based on the actions of past leaders who have experience in overcoming crises
  • Extensive and detailed urban digitalization through the integration of digital twins and social infrastructures such as the transportation system

Fig. 6. Use cases.

7. Toward explosion in value through computations on digital twins

Our goal is to make Digital Twin Computing a truly useful concept together with a wide range of interdisciplinary partners including those in the social sciences, humanities, etc. We also believe collaboration with a variety of industries is essential to making Digital Twin Computing a reality. Going forward, we plan to cultivate productive partnerships and promote research and development in this unexplored field while collecting much knowledge to forge a path to a digital society of the future.


[1] Press release issued by NTT, “NTT proposes the ‘Digital Twin Computing Initiative’ – a platform to combine high-precision digital information reflecting the real world to synthesize diverse virtual worlds, generate novel services and bring about society of the future,” June 10, 2019.
[2] T. Oba, T. Tanaka, and R. Masumura, “Evolution of Speech Recognition System—VoiceRex,” NTT Technical Review, Vol. 17, No. 9, pp. 5–8, Sept. 2019.
[3] J. Arai, S. Yagi, H. Uchiyama, K. Tomita, K. Miyahara, T. Tomoe, and K. Horikawa, “LASOLV™ Computing System: Hybrid Platform for Efficient Combinatorial Optimization,” NTT Technical Review, Vol. 18, No. 1, pp. 35–40, Jan. 2020.
Ryutaro Kawamura
Senior Vice President, Head of NTT Service Innovation Laboratory Group.
He received a B.S. and M.S. in precision engineering and a Ph.D. in electronics and information engineering from Hokkaido University in 1987, 1989, and 1996, respectively. He joined NTT Transport Systems Laboratories in 1989. From 1998 to 1999, he was a visiting researcher at Columbia University, USA. From 2003 to 2015, he was a member of the Board of Directors of OSGi Alliance. He is engaged in research on network reliability techniques, network control and management, high-speed computer networks, active networks, network middleware, and mobile cloud/edge computing.