Objective Quality Evaluation Model for Videophone Services
This article describes an objective quality evaluation model that can estimate videophone quality using quality parameters instead of captured media signals and IP (Internet protocol) packets. It enables effective design and management of videophone services. Its video quality estimation and multimedia quality integration functions were standardized as ITU-T Rec. G.1070 in 2007.
1. Introduction
Videophone services over IP (Internet protocol) will become key services in the next generation network (NGN). To provide a high-quality service for users, it is extremely important to design and manage the quality of experience (QoE) appropriately. To do this, it is desirable to develop an objective quality evaluation model that can estimate subjective quality from physical quality parameters of videophone services.
2. Objective quality measurement
Objective quality assessment can be categorized, from the viewpoint of the input information, into media-layer objective models, packet-layer objective models, parametric models, and hybrid models. To estimate the quality perceived by users, media-layer objective models use media signals, packet-layer objective models use information about IP packets, parametric models use quality parameters, and hybrid models use a combination of media signals, IP packets, and quality parameters. Media-layer objective models are highly correlated with subjective quality and are used for benchmarking and management. However, they are inconvenient for QoE planning because the relationships between media quality and quality parameters are not directly considered. Packet-layer objective models are mainly used for in-service quality management. Parametric models are convenient for QoE planning because they formulate the relationships between subjective quality and quality parameters and can estimate quality from quality parameters alone, without captured media signals or IP packets. They help QoE planners ensure that users will be satisfied with end-to-end transmission performance while avoiding over-engineering. These models incorporate the network, application, and terminal equipment parameters of high importance to QoE planners. In this article, we describe a parametric model for estimating videophone quality that can be used for application and/or network planning and monitoring.
3. Framework of the model
The framework of the parametric model is shown in Fig. 1. Its input parameters are video and speech quality parameters that are considered important in QoE planning and monitoring. The model consists of three functions for speech quality estimation, video quality estimation, and multimedia quality integration. The degradation caused by pure delay is considered only in the multimedia quality integration function.
1) Speech quality estimation function
This function estimates the listening speech quality using speech quality parameters. The E-model, which has been standardized as ITU-T Recommendation G.107, is widely used for speech services including IP telephony. It can estimate the overall communication quality using a combination of quality factors. However, NTT has proposed a new parametric model that can achieve better performance than the E-model and is also applicable to wideband IP telephony services. We use this NTT model as the speech quality estimation function.
2) Video quality estimation function
This function estimates the viewing video quality using video quality parameters. It has three features. First, it estimates the video quality affected by coding distortion, taking into account the optimal frame rate that maximizes the video quality at each bit rate. Second, it uses a packet loss robustness factor that indicates how strongly the video quality degrades as the packet loss rate increases. Third, its coefficient tables can be changed to estimate the video quality of various implementations of videophone applications, because the video quality cannot be estimated from codec information alone. For the E-model, methodologies for deriving speech quality factors affected by the codec are provided in ITU-T Recommendation G.113, Appendix I.
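The three features above can be sketched in a few lines of Python. This is a minimal illustration, not the standardized implementation: the functional form is an assumption modeled on the structure of ITU-T Rec. G.1070, the function names are hypothetical, and every number in the coefficient table `COEFFS` is an invented placeholder rather than a value fitted to subjective test data.

```python
import math

# Placeholder coefficient table for ONE hypothetical videophone implementation.
# All values are invented for illustration; real tables are fitted to
# subjective test results for each codec, video format, and display size.
COEFFS = {
    "v1": 5.0, "v2": 0.08,                # optimal frame rate vs. bit rate
    "v3": 3.8, "v4": 90.0, "v5": 1.5,     # best achievable coding quality
    "v6": 1.0, "v7": 0.004,               # tolerance to non-optimal frame rate
    "v8": 5.0, "v9": 300.0,               # packet loss robustness (shape)
    "v10": 1.0, "v11": 2.0, "v12": 1.5,   # packet loss robustness (scale)
}


def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def optimal_frame_rate(br, c=COEFFS):
    """Frame rate (fps) assumed to maximize quality at bit rate br (kbit/s)."""
    return clamp(c["v1"] + c["v2"] * br, 1.0, 30.0)


def coding_quality(br, fr, c=COEFFS):
    """Quality contribution (0..4) when only coding distortion is present."""
    best = clamp(c["v3"] - c["v3"] / (1.0 + (br / c["v4"]) ** c["v5"]), 0.0, 4.0)
    width = max(c["v6"] + c["v7"] * br, 1e-6)
    # Quality falls off as the frame rate moves away from the optimum.
    deviation = math.log(fr) - math.log(optimal_frame_rate(br, c))
    return best * math.exp(-deviation ** 2 / (2.0 * width ** 2))


def loss_robustness(br, fr, c=COEFFS):
    """Packet loss robustness factor: larger means slower degradation."""
    return (c["v10"] + c["v11"] * math.exp(-fr / c["v8"])
            + c["v12"] * math.exp(-br / c["v9"]))


def video_quality(br, fr, ppl, c=COEFFS):
    """Estimated video MOS (1..5) for bit rate (kbit/s), frame rate (fps),
    and packet loss rate ppl (%)."""
    decay = math.exp(-ppl / loss_robustness(br, fr, c))
    return 1.0 + coding_quality(br, fr, c) * decay
```

Calling `video_quality(160, optimal_frame_rate(160), 0.0)` gives the loss-free estimate at the assumed optimal frame rate. Passing a different table as `c` corresponds to switching coefficient tables for another implementation.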
3) Multimedia quality integration function
This function can be used for estimating the overall quality from the listening speech quality, viewing video quality, and end-to-end delay. It considers the individual media qualities, delay (i.e., absolute audiovisual delay and audiovisual media synchronization), and their interactions. The output of this model is multimedia quality, as shown in Fig. 1.
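One way to picture this integration is as two intermediate scores, one for the combined media quality and one for delay, that are then merged with an interaction term. The Python sketch below assumes a bilinear form of the kind used in ITU-T Rec. G.1070. The dictionary `M`, its key names, the function names, and all numeric values are hypothetical placeholders chosen only so that the example runs.

```python
# Placeholder coefficients, invented for illustration only.
M = {
    "sv": (0.1, 0.1, 0.16, 0.5),         # speech, video, interaction, offset
    "delay": (-0.002, 5.0),              # slope per ms of summed delay, offset
    "sync_speech_late": (-0.004, 0.4),   # used when speech delay >= video delay
    "sync_video_late": (-0.006, 0.3),    # used when video delay > speech delay
    "mm": (0.2, 0.2, 0.12, 0.0),         # media, delay, interaction, offset
}


def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def media_quality(sq, vq, m=M):
    """Audiovisual quality from speech quality sq and video quality vq."""
    a, b, ab, k = m["sv"]
    return clamp(a * sq + b * vq + ab * sq * vq + k, 1.0, 5.0)


def delay_quality(ts, tv, m=M):
    """Delay score from speech delay ts and video delay tv (both in ms)."""
    slope, offset = m["delay"]
    absolute = slope * (ts + tv) + offset          # absolute audiovisual delay
    if ts >= tv:
        s, k = m["sync_speech_late"]
        sync = min(s * (ts - tv) + k, 0.0)         # synchronization penalty
    else:
        s, k = m["sync_video_late"]
        sync = min(s * (tv - ts) + k, 0.0)
    return max(absolute + sync, 1.0)


def multimedia_quality(sq, vq, ts, tv, m=M):
    """Overall multimedia quality on a 1..5 scale."""
    mmsv = media_quality(sq, vq, m)
    mmt = delay_quality(ts, tv, m)
    a, b, ab, k = m["mm"]
    return clamp(a * mmsv + b * mmt + ab * mmsv * mmt + k, 1.0, 5.0)
```

With these placeholder values, two conditions with the same summed delay can still differ in overall quality if one of them has speech and video out of synchronization, which is the interaction the function is meant to capture.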
4. Accuracy of parametric model for videophone services
The accuracy of the speech quality estimation function has been described elsewhere, so this section describes the accuracy of the video quality estimation and multimedia quality integration functions. Using the video quality estimation function optimized for a codec, we estimated the subjective video quality, as shown in Fig. 2. The mean opinion score (MOS) was obtained by a five-grade absolute category rating (ACR) method (excellent, good, fair, poor, or bad). As shown in Fig. 2, the estimates correlate very well with subjective quality. By changing the table of coefficients calculated in advance for each codec, we found that our model could estimate video quality for each codec. Next, we estimated the overall quality using the estimated individual media qualities and delays. The results for the multimedia quality integration function are shown in Fig. 3. The accuracy of the function was sufficient for practical use. These functions provide reliable information about what users actually require for videophone services, so this method can be used for effective design, implementation, and management of both interactive audiovisual applications and communication networks.
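For readers who want to reproduce this kind of accuracy check on their own data, the sketch below shows how a MOS can be computed from five-grade ACR votes and how model estimates can be compared with subjective scores using the Pearson correlation coefficient. The mapping of categories to the scores 5 to 1 is the usual ACR convention; the function names are hypothetical, and no data from the experiments reported here is included.

```python
import math
from statistics import mean

# Conventional five-grade ACR scale.
ACR_SCALE = {"excellent": 5, "good": 4, "fair": 3, "poor": 2, "bad": 1}


def mos(votes):
    """Mean opinion score for one test condition from a list of ACR votes."""
    return mean(ACR_SCALE[v] for v in votes)


def pearson(estimated, subjective):
    """Pearson correlation between estimated and subjective MOS values."""
    mx, my = mean(estimated), mean(subjective)
    cov = sum((x - mx) * (y - my) for x, y in zip(estimated, subjective))
    sxx = sum((x - mx) ** 2 for x in estimated)
    syy = sum((y - my) ** 2 for y in subjective)
    return cov / math.sqrt(sxx * syy)
```

A typical use is to compute `mos(...)` once per test condition from the collected votes and then call `pearson(model_estimates, subjective_mos)` over all conditions.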
5. Quality planning using our model
Application and/or network planning aimed at improving video quality is extremely important for avoiding over-engineering while still providing users with an appropriate level of service, as shown in Fig. 4. In this section, we show an application and network planning example. It considers the following questions.
– What is the optimal frame rate Ofr for each bit rate condition?
– What is the minimum bit rate BrMin needed to achieve the QoE requirement (e.g., a MOS of 3.5)?
– What is the maximum packet loss rate PplMax at which that QoE is still maintained?
When the bit rate was set to 40, 82, 160, and 384 kbit/s, our model gave Ofr values of 15, 20, 30, and 30 frames per second (fps), respectively, as shown in Fig. 5. When a QoE planner requires a MOS of 3.5, our model gives BrMin = 82 kbit/s, as shown in Fig. 5. To achieve this QoE requirement when the bit rate is set to 160 or 384 kbit/s, our model indicates that the maximum packet loss rate PplMax is about 1.2%, as shown in Fig. 6. When the QoE planner requires a MOS of 2.5 (MOS = 2.5 is the threshold that is acceptable to 50% of users), the PplMax values are about 6% and 4% for 160 and 384 kbit/s, respectively. This shows that a higher bit rate does not necessarily lead to a higher-quality service when packets are lost. Our model is a powerful tool for reflecting such characteristics in network planning.
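The three planning questions can be answered by a simple search over any parametric model that maps bit rate, frame rate, and packet loss rate to an estimated MOS. The Python sketch below illustrates the procedure only. `toy_video_quality` is a hypothetical stand-in with invented constants, not the model used to produce Figs. 5 and 6, so its outputs will not match the values reported above.

```python
import math


def toy_video_quality(br, fr, ppl):
    """Stand-in model (NOT the standardized one): quality rises with bit rate
    br (kbit/s), peaks at a bit-rate-dependent frame rate fr (fps), and
    decays with packet loss rate ppl (%)."""
    ofr = min(30.0, max(1.0, 5.0 + 0.08 * br))
    best = 4.0 * (1.0 - 1.0 / (1.0 + (br / 90.0) ** 1.5))
    frame_penalty = math.exp(-(math.log(fr / ofr) ** 2) / 2.0)
    robustness = 1.0 + 2.0 * math.exp(-br / 300.0)
    return 1.0 + best * frame_penalty * math.exp(-ppl / robustness)


def optimal_frame_rate(model, br, candidates=range(1, 31)):
    """Ofr: the candidate frame rate with the highest loss-free quality."""
    return max(candidates, key=lambda fr: model(br, fr, 0.0))


def min_bit_rate(model, target_mos, bit_rates):
    """BrMin: the lowest candidate bit rate that meets target_mos, or None."""
    for br in sorted(bit_rates):
        if model(br, optimal_frame_rate(model, br), 0.0) >= target_mos:
            return br
    return None


def max_packet_loss(model, br, target_mos, step=0.1, limit=20.0):
    """PplMax: the largest loss rate (%), in multiples of step, that still
    meets target_mos at bit rate br, or None if even 0% loss falls short."""
    fr = optimal_frame_rate(model, br)
    if model(br, fr, 0.0) < target_mos:
        return None
    n = 0
    while (n + 1) * step <= limit and model(br, fr, (n + 1) * step) >= target_mos:
        n += 1
    return n * step
```

For example, `min_bit_rate(toy_video_quality, 3.5, [40, 82, 160, 384])` scans the same candidate bit rates as in the example above, and `max_packet_loss(toy_video_quality, 160, 3.5)` returns the corresponding loss budget under the toy model. Replacing `toy_video_quality` with a fitted model leaves the search logic unchanged.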
6. Conclusion
Our model based on quality parameters is a promising way of designing and monitoring the quality of videophone services. Experimental results showed that its estimates correlate very well with subjective quality, which represents the user’s perception of a service. This model will make it easy to design and manage the QoE appropriately, which is important when providing a high-quality service for users in the NGN.