Feature Articles: QoE Estimation Technologies

Playback State Estimation of Progressive Download-based Video Services

Hirotada Honda, Hiroshi Yamamoto, Sorami Nakamura, and Akira Takahashi


The recent proliferation of smartphones and tablet terminals has led to the rapid development of video applications. In wireless access lines such as those for LTE (Long Term Evolution) and 3G (third-generation) services, however, packet loss rates and delay variations are larger than in fixed access lines. As a result, video services delivered over TCP (Transmission Control Protocol), known as progressive download (PDL)-based video services, are especially pervasive on wireless access lines.

In this article, we introduce a method of playback state estimation of PDL-based video services. This method makes it possible to accurately estimate the playback state, which will improve the efficiency of monitoring the quality of video services in network and service operation processes.


1. Introduction

The use of smartphones and tablet terminals has increased substantially in recent years, and with this increase, more and more applications are being provided via the Internet. Video applications are particularly pervasive; they fall into three technical categories according to their delivery mechanism: progressive download (PDL)-based video, download-type video, and real-time streaming. The first two run over TCP (Transmission Control Protocol) and the third over UDP (User Datagram Protocol). PDL-based video services account for about 40% of the volume of domestic Internet traffic. Representative PDL-based video services such as YouTube and Hulu are targeted at users of personal computers, smartphones, and tablet terminals who access the services over the Internet. In addition, some broadcasters provide PDL-based video services that give users access to their news and archived content. NTT Plala started Hikari TV (television) dokodemo/mobile in 2011, NTT WEST Corporation provides the network platform for SKY PerfecTV! on Demand, and NTT DOCOMO offers BeeTV and the d-market video store. The NTT Group believes that, as a network and service provider, one of its most important tasks is improving the quality of these services.

The reasons for this are as follows. First, for network providers, PDL-based video services are considered to be the benchmark of the network: low-quality PDL-based video services lead directly to a decrease in customer satisfaction. Second, for a service provider, attractive content and reasonable prices are of course important, but so is the quality of the service itself. Nevertheless, there are no methods for either continuously monitoring the service quality or comparing it with that of competing services. We have therefore developed a method that addresses these issues for the network and service providers in the NTT Group.

2. Objectives of proposed method

The following two issues are considered to be the main objectives of the proposed method:

(1) Continuously monitoring the service quality;

(2) Comparing the service quality with that of competing services.

The first objective is to make service quality monitoring in network operation processes more efficient, which will enable earlier detection and resolution of quality degradation incidents. The second is to provide information on the actual service quality, which is useful for making decisions on service strategies (Fig. 1).

Fig. 1. Possible applications of proposed method.

3. Principal factor affecting the quality of PDL-based video services

In this section, we discuss the main factor affecting the quality of PDL-based video services (Fig. 2). In real-time streaming applications running over UDP, packet losses directly result in block noise in the video images or halt the playback. When forward error correction (FEC)*1 is applied, packet loss that exceeds the redundancy of the FEC configuration leads to block noise or a playback halt.

Fig. 2. Factors affecting service quality.

In PDL-based video, on the other hand, a lost packet is retransmitted by the server thanks to the functionality of TCP, so packet loss does not corrupt the video images but delays the arrival of data. If the variation in the packet arrival interval caused by packet losses or jitter exceeds a certain level, the video playback halts due to buffer starvation. Consequently, playback halt is the principal cause of quality degradation in PDL-based video services.

*1 FEC: When data are lost due to bit errors or packet losses, the FEC mechanism enables detection and correction at the client side by using the redundant data included in packets.

4. Proposed estimation method

4.1 Technical details

The explanation in the previous section indicates that accurately estimating the playback state is the most important task for PDL-based video applications. The proposed method makes it possible to estimate the playback state, especially the total halt duration and the number of playback halt events, by using packet capture data. While existing methods such as the YouTube API (application programming interface) [1] are applicable only to specific services, the proposed method is basically applicable to arbitrary PDL-based video services. Furthermore, it does not require any additional implementation in either applications or terminals and is achieved with passive monitoring alone. These are the main advantages of the proposed method.

We discuss here the technical details of the proposed method [2]–[4], which is depicted in Fig. 3. Before the estimation process is carried out, we estimate the parameters of the play-out buffer (Fig. 3(a)): the decoding rate and the thresholds for playback start, halt, and restart. These parameters vary with the operating system of the client terminal, the application, and the video resolution level, so we have to estimate them through careful observation of the playback state and the corresponding packet capture data.

Fig. 3. Input and output of proposed method.

In the estimation process, we first extract the time sequence of the downloaded data volume in each session by using the TCP headers in the packet capture data. Next, we calculate how the amount of data in the play-out buffer changes over time (Fig. 3(b)). The playback state at each measurement time, which is the final output of this method, is determined from the amount of data in the buffer at that time and the thresholds estimated beforehand (Fig. 3(c)). In some cases, additional analysis of the payload information can enhance the accuracy of the playback state estimation.
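The buffer-based estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a constant decoding rate, assumes that the cumulative downloaded volume has already been extracted from the capture, and ignores payload analysis. All names (`BufferParams`, `estimate_playback_states`, `summarize_halts`) and the numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BufferParams:
    """Play-out buffer parameters (hypothetical names; assumed estimated beforehand)."""
    decode_rate: float        # bytes consumed per second while playing (assumed constant)
    start_threshold: float    # buffered bytes needed to start playback
    halt_threshold: float     # playback halts when the buffer falls to this level
    restart_threshold: float  # buffered bytes needed to resume after a halt


def estimate_playback_states(samples, params):
    """samples: list of (time_s, cumulative_downloaded_bytes), sorted by time.
    Returns a list of (time_s, state), state in {"initial", "playing", "halted"}."""
    states = []
    state = "initial"
    buffered = 0.0
    prev_t, prev_total = samples[0][0], 0
    for t, total in samples:
        # Drain what the decoder would have consumed since the previous sample.
        if state == "playing":
            buffered -= params.decode_rate * (t - prev_t)
            if buffered <= params.halt_threshold:
                buffered = max(buffered, 0.0)
                state = "halted"
        # Add the data downloaded since the previous sample.
        buffered += total - prev_total
        # Compare the buffer level with the start/restart thresholds.
        if state == "initial" and buffered >= params.start_threshold:
            state = "playing"
        elif state == "halted" and buffered >= params.restart_threshold:
            state = "playing"
        states.append((t, state))
        prev_t, prev_total = t, total
    return states


def summarize_halts(states):
    """Return (number of halt events, total halt duration in seconds)."""
    events, duration = 0, 0.0
    for (t0, s0), (t1, s1) in zip(states, states[1:]):
        if s0 == "halted":
            duration += t1 - t0
        if s0 != "halted" and s1 == "halted":
            events += 1
    return events, duration


# Illustrative input: the download stalls between t = 3 s and t = 6 s.
params = BufferParams(decode_rate=100, start_threshold=200,
                      halt_threshold=0, restart_threshold=200)
samples = [(0, 0), (1, 100), (2, 200), (3, 300), (4, 300),
           (5, 300), (6, 300), (7, 400), (8, 500), (9, 600)]
states = estimate_playback_states(samples, params)
print(summarize_halts(states))  # (1, 3.0)
```

In this sketch the state is evaluated only at the measurement times, so a halt that begins and ends within a single sampling interval would go unnoticed; a finer sampling interval reduces that effect.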

4.2 Verification

We verified the effectiveness of the proposed method by applying it to the PDL-based video services listed in Table 1. An example of the playback state estimation is shown in Fig. 4. We evaluated the accuracy of the estimation by calculating the matching rate*2, which was about 92% in this case.

Table 1. List of verified services.

Fig. 4. Example of estimation result.

*2 Matching rate: In this article, we define the matching rate as the total time during which the estimated and actual playback states are consistent, divided by the total time of the video playback.
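As a small illustration of this definition, the following sketch computes the matching rate from two state sequences sampled at the same measurement times. The function name `matching_rate`, the state labels, and the sample values are assumptions made for this example, not taken from the verification itself.

```python
def matching_rate(times, estimated, actual):
    """Fraction of the playback time during which the estimated and actual
    states agree. times: measurement times in seconds, sorted ascending;
    estimated / actual: the state held from times[i] until times[i + 1]."""
    if not (len(times) == len(estimated) == len(actual)):
        raise ValueError("inputs must have the same length")
    total = times[-1] - times[0]
    if total <= 0:
        raise ValueError("at least two distinct measurement times are required")
    matched = sum(
        times[i + 1] - times[i]
        for i in range(len(times) - 1)
        if estimated[i] == actual[i]
    )
    return matched / total


# Illustrative values only: the states disagree during the interval from 2 s to 3 s.
times = [0, 1, 2, 3, 4]
estimated = ["playing", "playing", "halted", "halted", "playing"]
actual = ["playing", "playing", "playing", "halted", "playing"]
print(matching_rate(times, estimated, actual))  # 0.75
```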

5. Future work

We plan to further enhance the accuracy of estimation by taking into account the variable decoding rate when adaptive video quality transitions take place. We will also investigate the detection of pause states initiated by users and the estimation of their duration [4]. We will continue our research and development of methods to estimate the quality of various application services.


[1] YouTube API.
[2] D. Ikegami, H. Honda, H. Yamamoto, H. Nojiri, and A. Takahashi, “Playing State Estimation Method for Progressive Download Services,” IEICE-CQ2011-59, pp. 91–96, 2011 (in Japanese).
[3] H. Honda, D. Ikegami, and H. Yamamoto, “Estimating Video Playback Quality by Observing TCP ACK Packets,” Proc. of the IEICE General Conf., B-11-25, Okayama, Japan, 2012 (in Japanese).
[4] H. Honda, D. Ikegami, H. Yamamoto, H. Nojiri, and A. Takahashi, “Playback and Pause State Estimation of Progressive Download-based Video Services,” IEICE-CQ2012-30, pp. 77–81, 2012 (in Japanese).
Hirotada Honda
Research Engineer, IP Service Network Engineering Group, NTT Network Technology Laboratories.
He received the B.E., M.E., and Ph.D. degrees in science from Keio University, Kanagawa, in 2000, 2002, and 2011, respectively. He joined NTT in 2002. He is currently investigating the playback quality estimation of progressive download-based video services. He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE).
Hiroshi Yamamoto
Senior Research Engineer, IP Service Network Engineering Group, NTT Network Technology Laboratories.
He received the B.S. and M.S. degrees in information and computer science from Waseda University, Tokyo, in 1999 and 2001, respectively. He joined NTT Service Integration Laboratories (now NTT Network Technology Laboratories) in 2001. He has been working on the architecture and performance evaluation of IP networks and web applications. He is a member of IEICE.
Sorami Nakamura
Research Engineer, IP Service Network Engineering Group, NTT Network Technology Laboratories.
She received the B.S. degree in architecture and building engineering and the M.S. degree in mathematical and computing sciences from Tokyo Institute of Technology in 2008 and 2010, respectively. Since joining NTT in 2011, she has been working on quality design and management in networks. She is a member of IEICE and the Operations Research Society of Japan.
Akira Takahashi
Manager of the IP Service Network Engineering Group, Communication Traffic & Service Quality Project, NTT Network Technology Laboratories.
He received the B.S. degree in mathematics from Hokkaido University in 1988, the M.S. degree in electrical engineering from California Institute of Technology, USA, in 1993, and the Ph.D. degree in engineering from the University of Tsukuba, Ibaraki, in 2007. He joined NTT in 1988 and has been engaged in the quality assessment of audio and visual communications. He was a co-Rapporteur of ITU-T Question 13/12 on Multimedia QoE and its assessment during the 2004–2008 Study Period. He is a Vice-Chairman of ITU-T Study Group 12 (SG12) for the 2009–2012 and 2013–2016 Study Periods. He is a Vice-Chairman of the Technical Committee of Communication Quality in IEICE. He received the Telecommunication Technology Committee Award in Japan in 2004 and the ITU-AJ Award in Japan in 2005. He also received the Best Tutorial Paper Award from IEICE in Japan in 2006 and the Telecommunications Advancement Foundation Award in Japan in 2007 and 2008.