Feature Articles: Network Technology for Digital Society of the FutureĦ½Toward Advanced, Smart, and Environmentally Friendly Operations
Deep Learning Based Anomaly Detection Technology for ICT Services—DeAnoS: Deep Anomaly Surveillance
In this article, we outline a deep learning based anomaly detection technology called Deep Anomaly Surveillance (DeAnoS). The NTT laboratories have been developing this technology with the aim of enabling proactive maintenance operations for ICT (information and communication technology) services. The present situation regarding verification of DeAnoS at NTT Group companies is also explained.
Keywords: Network-AI, network and service operations, anomaly detection
NTT Network Technology Laboratories is developing a deep learning based anomaly detection technology called Deep Anomaly Surveillance (DeAnoS), which utilizes an autoencoder (AE) to promptly detect changes in the state of an ICT (information and communication technology) service [1–3]. In this article, we explain the technical issues concerning DeAnoS, which was exhibited at NTT R&D Forum 2018 Autumn held in November 2018.
2. Overview of DeAnoS
An AE, as used in DeAnoS, is a type of deep learning based neural network that enables the learning of complicated structures inherent in data. Attention is now being focused on the use of AEs in technology for detecting anomalies. When an AE is used, the dimensionality of data in the middle layer of the AE is reduced by setting the dimensionality of the middle layer to be less than that of the input and output layers and by learning the parameters for reproducing the data of the input layer in the output layer. Anomaly detection using an AE is based on the premise that normal data are distributed in the input-data space around certain manifolds that can express low dimensions. Specifically, as shown in Fig. 1, at the time of learning, a normal state is learned by using various kinds of data observed during normal operation of the system, and at the time of the test (i.e., anomaly detection), the current data are input to the AE that has learned the normal state as described above, and the distance between vectors of the input and output layers is output as the anomaly degree. When the degree of anomaly exceeds the threshold, the state is detected as an anomaly.
In addition to numerical data such as resource and traffic information based on SNMP (Simple Network Management Protocol)/MIB (Management Information Base) and flow data based on Netflow, the network data to be entered also include the syslog of routers and servers (text information). With text logs such as syslog, identifiers (IDs) are created by using syslog-analysis technology  for each syslog line, and the text log is converted to numerical data by using the appearance frequency of each ID. This process enables learning of information including syslog.
We are also working on not only detecting anomalies but also estimating their cause at the time they are detected [5, 6]. Specifically, we are investigating a method using sparse optimization for estimating which input dimension causes the anomaly degree to become higher when an anomaly is detected by the AE. With this technology, the degree of contribution of each input dimension to the degree of anomaly is calculated; accordingly, it is expected to improve the efficiency of investigation after detection of anomalies.
3. Status of verification of DeAnoS at NTT Group companies
In cooperation with NTT Group companies, we are presently verifying DeAnoS on the basis of operational data acquired from actual services, and we are assessing the effectiveness of this technology and extracting the issues for practical use. Our initiatives with the Strategic Network Management Department of NTT EAST and the Network Services Department of NTT Communications are shown in Figs. 2 and 3, respectively. First, in cooperation with the Strategic Network Management Department of NTT EAST, we confirmed the effectiveness of DeAnoS in detecting anomalies in application servers and estimating the parameters that caused them. Second, in an initiative with the Network Services Department of NTT Communications, we confirmed a case in which it was possible to estimate the causal parameter in addition to detecting the anomaly by analyzing changes in specific events and long-term trends.
4. Future prospects
In this article, the deep learning based technology DeAnoS, which NTT Network Technology Laboratories has been developing, was outlined, and the situation regarding its verification at NTT Group companies concerning network-anomaly detection technology was described. We will continue to fine tune DeAnoS by proceeding with its technology verification with the aforementioned companies and improve the environment for utilizing DeAnoS in the real world. Future tasks regarding technology for detecting network anomalies are to (i) improve the interpretability of main causes when an anomaly is detected and (ii) adapt such technology to various environments. Accordingly, we will continue our research and development to address those issues.