Feature Articles: Flexible Networking Technologies for Future Networks
Monitoring Technology for Programmable Highly Functional Networks
This article introduces technology for monitoring network quality, which is one of the most basic parameters for network management. NTT's high-precision network monitoring system, PRESTA 10G, which is compatible with perfSONAR (performance service oriented network monitoring architecture), can monitor network quality with high time resolution. It will enable the provision of higher service quality over a highly functional future network.
In future networks, network resources will be virtualized and assigned flexibly in order for users to enjoy high-quality services. To maximize the user’s quality of experience (QoE) and optimize resource assignment, we need to provide monitoring technology that can grasp every external factor directly. Future network services are expected to be used as lifelines for daily living by providing remote collaboration through broadband video applications, electronic government, and telemedicine services. Therefore, it is important to program the network flexibly while being aware of the QoE for users.
2.1 QoE factors
The QoE is determined by the combination of complex user needs and the surrounding environment (Fig. 1). For example, the network condition for user needs, such as many file transfers just before a deadline or live streaming of high-definition video ((a) in Fig. 1), is affected by sudden events (b), application specifications (c), and network status (d). NTT Network Innovation Laboratories is conducting a study of QoE enhancement by monitoring user needs and all of the parameters changing in space and time and by feeding the results back to network management and control.
2.2 Network quality and QoE
To meet user needs, it is necessary to monitor various parameters correctly. Network quality is considered to be one of the leading indicators. For example, if we can capture network traffic behavior more accurately, we can detect the cause of an otherwise unexplained reduction in QoE and improve the QoE by eliminating that cause. In other words, advances in network monitoring technology are closely tied to a better understanding of QoE. Moreover, understanding application behavior and environmental changes in the network requires a large amount of packet processing and log analysis. Network monitoring, which analyzes network delay and congestion at the packet level, is therefore a critical technology.
2.3 Need for high-precision network monitoring
Several network monitoring technologies have already been deployed in order to achieve a stable network. For example, routers and switches record the number of input and output packets on each interface in the management information base (MIB), and we can understand network utilization if the monitoring system gathers this data at intervals of a few minutes and computes the traffic volume. Routers and switches also exchange monitoring packets with other items of equipment to check that the equipment is alive. This data is used to operate the network, for example to eliminate bandwidth shortages or avoid trouble. However, recent network services are sensitive to delay or consume bandwidth in a bursty manner, so we need more detailed monitoring data than the conventional network monitoring system can provide.
The need for high-resolution network monitoring is demonstrated by the example shown in Fig. 2. Bitrate changes computed on a monitoring scale of 5 min are shown in Fig. 2(a) and those computed on a scale of 10 ms are shown in Fig. 2(b). When the scale used for monitoring the traffic volume is coarse, the network bandwidth seems to be sufficient because bursty traffic is averaged out. On the other hand, when the monitoring scale is fine, it is clear that traffic is bursty and sometimes briefly exceeds the available bandwidth, which enables us to detect potential packet loss or speed deterioration.
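The averaging effect described above can be reproduced with a small sketch. It is only an illustration under stated assumptions: the trace is synthetic (one 10-ms burst per second), the function names are hypothetical, and a 10-s window stands in for the 5-min scale of Fig. 2(a) to keep the example small. It is not the processing used in PRESTA 10G.

```python
from collections import defaultdict

def bitrate_per_bin(packets, bin_us):
    """Return {bin index: bit/s} for packets given as (timestamp_us, size_bytes)."""
    bits = defaultdict(int)
    for ts_us, size_bytes in packets:
        # Integer microsecond timestamps avoid floating-point binning errors.
        bits[ts_us // bin_us] += size_bytes * 8
    return {b: total * 1_000_000 / bin_us for b, total in bits.items()}

def peak_bitrate(packets, bin_us):
    """Highest bitrate observed in any single bin."""
    return max(bitrate_per_bin(packets, bin_us).values())

# Synthetic trace (assumption): every second for 10 s, a burst of
# 1000 packets of 1250 bytes, spaced 10 us apart (10 ms per burst).
packets = [
    (second * 1_000_000 + i * 10, 1250)
    for second in range(10)
    for i in range(1000)
]

coarse_peak = peak_bitrate(packets, 10_000_000)  # one 10-s bin
fine_peak = peak_bitrate(packets, 10_000)        # 10-ms bins

print(f"coarse scale peak: {coarse_peak / 1e6:.0f} Mbit/s")
print(f"fine scale peak:   {fine_peak / 1e6:.0f} Mbit/s")
```

On this trace the coarse scale reports 10 Mbit/s while the 10-ms scale reports 1000 Mbit/s for the same packets, which is the gap between Fig. 2(a) and Fig. 2(b) in miniature.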
In practice, network bandwidth is designed with a safety margin, and momentary bandwidth shortages occur only infrequently. However, considering the management of several network qualities, e.g., not only bandwidth but also delay and jitter, it is important to grasp the network situation accurately.
3. New monitoring system: PRESTA 10G
We have developed a high-precision network monitoring system called PRESTA 10G. It consists of a general-purpose personal computer equipped with a network interface card (NIC) with network measurement extensions, device drivers, dedicated API (application programming interface) libraries, and general-purpose packet capture libraries.
The NIC supports three protocols—10GbE-LAN PHY, WAN PHY, and OC-192c POS—for a 10-Gbit/s network and has two main functions, which achieve high-accuracy, high-resolution network monitoring (10GbE: 10-Gbit/s Ethernet, LAN: local area network, PHY: physical layer, WAN: wide area network, POS: packet over SONET, SONET: synchronous optical network).
(1) 10-Gbit/s wire-rate packet capture and generation
(2) Externally synchronized precise timestamping based on timing signals using GPS (global positioning system)
These hardware functions make it easy to develop a network monitoring system that can monitor several network quality parameters precisely. For example, we can detect bursty traffic by computing the traffic volume with microsecond-order time resolution, or we can measure one-way delay and jitter with microsecond time resolution by using synchronized timestamps.
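As a rough illustration of the second example, the sketch below computes one-way delay and jitter from timestamps taken at two monitoring points. It assumes that both clocks are externally synchronized (as with the GPS-based timestamping above), that each packet carries an identifier usable for matching, and that jitter is defined as the mean absolute difference between consecutive delays, which is only one of several common definitions. All names are hypothetical.

```python
from statistics import mean

def one_way_delays(sent_us, received_us):
    """Delays in microseconds for packets seen at both points, in send order.

    sent_us / received_us map a packet ID to a synchronized timestamp (us).
    Packets missing at the receiver (lost) are skipped.
    """
    matched = sorted(
        (pid for pid in sent_us if pid in received_us),
        key=lambda pid: sent_us[pid],
    )
    return [received_us[pid] - sent_us[pid] for pid in matched]

def jitter(delays):
    """Mean absolute difference between consecutive one-way delays."""
    if len(delays) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(delays, delays[1:]))

# Hypothetical timestamps in microseconds; packet 4 never arrives.
sent = {1: 0, 2: 1000, 3: 2000, 4: 3000}
received = {1: 250, 2: 1270, 3: 2240}

delays = one_way_delays(sent, received)
print("delays (us):", delays)
print("jitter (us):", jitter(delays))
```

Without synchronized clocks the subtraction would include the unknown clock offset between the two points, which is why the external timing signal matters for one-way measurements.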
3.2 Multilayer and multipoint monitoring
By using PRESTA 10G for monitoring, we can determine the network quality in detail. It is important to detect any network quality deterioration or failure occurrence promptly and to identify the location and cause of the failure in order to provide stable network service. In particular, once we know the location of the failure, we can quickly take the measures necessary for service restoration.
3.2.1 Analysis using multilayer monitoring
As an example, we introduce the troubleshooting workflow for multilayer monitoring of a video service. To detect video disturbances, which are easy for the user to notice, we check for packet loss in each video frame by analyzing the header of the video transfer stream, such as the RTP (Real-time Transport Protocol) header, in the application layer. When packet loss occurs, we perform a flow analysis in layers 3 and 4 to determine the traffic volume of each flow and identify the cause of the packet loss. For example, we can identify the flow causing the trouble when there is a momentary shortage of network bandwidth caused by another bursty flow on the same line. We also characterize jitter and delay in the lower layers and analyze the cause of the packet loss in detail, because serious jitter can trigger a buffer overflow in a switch or router, leading to packet loss.
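The first step of this workflow, detecting loss from the RTP header, can be sketched as follows. This is a simplified illustration rather than the authors' software: it counts gaps in the 16-bit RTP sequence number for a whole stream instead of per video frame, it ignores late or duplicated packets, and the helper `make_rtp_packet` exists only to build test data.

```python
import struct

def rtp_sequence_number(packet: bytes) -> int:
    """Extract the 16-bit sequence number from a fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    first_byte, _marker_and_type, seq = struct.unpack("!BBH", packet[:4])
    if first_byte >> 6 != 2:
        raise ValueError("not RTP version 2")
    return seq

def count_lost(sequence_numbers) -> int:
    """Count missing packets from sequence gaps, allowing 16-bit wraparound."""
    lost = 0
    prev = None
    for seq in sequence_numbers:
        if prev is not None:
            gap = (seq - prev) & 0xFFFF
            if gap == 0 or gap >= 0x8000:
                # Duplicate or late packet: ignored in this simplified sketch.
                continue
            lost += gap - 1
        prev = seq
    return lost

def make_rtp_packet(seq: int) -> bytes:
    """Hypothetical test helper: version 2, payload type 96, empty payload."""
    return struct.pack("!BBHII", 0x80, 96, seq, 0, 0x1234)

# Sequence 65535 and sequences 2 and 3 are missing from this stream.
stream = [make_rtp_packet(s) for s in (65533, 65534, 0, 1, 4)]
lost = count_lost(rtp_sequence_number(p) for p in stream)
print("lost packets:", lost)
```

A non-zero count here is what would trigger the layer 3 and 4 flow analysis described above.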
We have developed software that monitors individual layers and designed all of it to run on PRESTA 10G (Fig. 3). In particular, PRESTA 10G can monitor multiple layers at the same time under a low load on a 10-Gbit/s broadband network by means of hardware functions.
3.2.2 Analysis by multipoint monitoring
To identify the location of a failure, it is important to determine where the packet loss occurs and where the jitter worsens by monitoring multiple layers at multiple points. We have applied perfSONAR (performance service oriented network monitoring architecture), an infrastructure for network performance monitoring whose standardization is being promoted by the Open Grid Forum. By utilizing the SOAP/XML-based data transmission protocol defined by perfSONAR and the middleware in which it is implemented, it is possible to share remote data in a common format while conforming to various published policies (SOAP: simple object access protocol, XML: extensible markup language). Moreover, perfSONAR provides a lookup service that reports where and what kind of data or monitoring function is available; this makes network quality easier to manage across domains.
However, the current perfSONAR handles monitoring data at a resolution of 1 s or coarser, so we have enhanced it by adding three new functions that enable perfSONAR to process data with microsecond resolution (Fig. 4).
(1) High-resolution monitoring routine
We have implemented a monitoring routine that stores monitoring data with microsecond resolution and processes it at a resolution finer than 1 s. To limit the increase in data volume and processing time that high-resolution monitoring entails, the existing routine is used when the requested resolution is 1 s or coarser, and our routine is used only when the requested resolution is finer than 1 s.
(2) High-resolution file format
Our file format has an index part that stores the received packet count in units of seconds and a data part that stores monitoring data together with its arrival time. This format enables fast data extraction because the number of packets to read can be determined from the requested time alone.
(3) High-resolution XML format
We have extended the XML format so that an XML message can express start and end times with microsecond resolution. As a result, it is now possible to request data with microsecond granularity through the perfSONAR protocol.
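The idea behind the high-resolution file format in item (2) can be sketched with an in-memory model. The article does not describe the actual on-disk layout, so everything below is an assumption: the class name `IndexedTrace`, the record shape `(arrival_us, value)`, and the use of running totals over the per-second counts to locate records. The sketch only shows how per-second packet counts let a reader jump to the requested time range without scanning all of the data.

```python
from itertools import accumulate

class IndexedTrace:
    """Per-second packet counts (index part) plus time-ordered records (data part)."""

    def __init__(self, start_second, records):
        # records: list of (arrival_us, value), sorted by arrival time.
        self.start_second = start_second
        self.records = records
        n_seconds = records[-1][0] // 1_000_000 - start_second + 1 if records else 0
        self.counts = [0] * n_seconds
        for arrival_us, _ in records:
            self.counts[arrival_us // 1_000_000 - start_second] += 1
        # Running totals: offsets[i] = number of records before second i.
        self.offsets = [0] + list(accumulate(self.counts))

    def extract(self, start_us, end_us):
        """Return records with start_us <= arrival_us < end_us."""
        first = max(start_us // 1_000_000 - self.start_second, 0)
        # Ceiling division gives the first whole second at or after end_us.
        last = min(-(-end_us // 1_000_000) - self.start_second, len(self.counts))
        if first >= last:
            return []
        # Only the records of the overlapping seconds are read and filtered.
        window = self.records[self.offsets[first]:self.offsets[last]]
        return [r for r in window if start_us <= r[0] < end_us]

# Hypothetical records starting at second 100 (timestamps in microseconds).
trace = IndexedTrace(100, [
    (100_000_010, "a"),
    (100_500_000, "b"),
    (101_200_000, "c"),
    (103_000_001, "d"),
])
print(trace.counts)                                # per-second packet counts
print(trace.extract(100_400_000, 101_300_000))     # microsecond-level request
```

In a real file the running totals would translate into byte offsets into the data part, and the microsecond start and end times would come from a request such as the extended XML message in item (3).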
Our high-precision network monitoring system, PRESTA 10G, can monitor network quality, which is the key factor for network management, with high precision. We are planning to develop technology for monitoring the factors that affect QoE, such as environmental changes and application behavior, and a technology for stabilizing the network through the use of monitoring data.