Practical Field Information about Telecommunication Technologies
Recent Case Study of Fault in IP Phone User System
This article describes a problem a customer was having with disconnects in inter-office calling using IP-VPN (Internet protocol virtual private network) services and how we rectified it. This is the thirty-sixth article in a series on telecommunication technologies. This month’s contribution is from the Network Interface Engineering Group, Technical Assistance and Support Center, Maintenance and Service Operations Department, Network Business Headquarters, NTT EAST.
Keywords: IP phone, IP-VPN, VoIP-GW
1. Introduction
With the spread of FLET’S HIKARI NEXT and other Internet protocol (IP) access services, individual customers of voice calling services have been increasingly shifting to IP phone services from conventional calling services such as subscriber telephone (analog line) and INS-Net 64 (ISDN: integrated services digital network) services. Corporate customers have also been shifting to user systems that perform outside calling using IP phone services and inter-office extension calling using IP-VPN (virtual private network) services with the aim of reducing communication costs.
In this article, we introduce a case study involving occasional disconnects in inter-office calling over IP-VPN services, that is, extension calling carried over the IP network.
2. Background of problem
The customer who experienced the problem had been using an IP-VPN service to make voice calls between the head office and branch offices by private branch exchange (PBX)-based IP extension calling. After both the voice-over-IP gateway (VoIP-GW) and PBX installed in the head office and branch offices were upgraded, a problem would occasionally occur in which an incoming IP extension call was disconnected as soon as it was answered, thereby terminating the call (Fig. 1). This event was not restricted to specific offices; it would occur at any office that used IP extension calling. Replacing the entire VoIP-GW or the PBX package boards did not solve the problem, so the Technical Assistance and Support Center was asked to troubleshoot it.
3. On-site troubleshooting
Packet capture equipment was installed in the IP section on the head-office side, which had been experiencing a high frequency of disconnections, in order to examine the conditions under which the event occurred. This equipment was used to collect IP packet traffic, which was analyzed together with PBX logs from the time of the event occurrence (Fig. 2).
The following results were obtained from this analysis.
(1) Packet capture data revealed that the head-office VoIP-GW on the call-terminating side transmitted a BYE disconnect signal to the call-originating side (Fig. 3).
(2) A sputtering ("putt-putt") noise was output from the head-office VoIP-GW at the time of the disconnect, and playback of the voice data in the packet capture confirmed the same noise.
(3) A survey of PBX logs over a period of about one month revealed 65 disconnections (incomplete calls) within two seconds of answering an incoming call, spread over multiple phone sets.
Because the disconnections were spread over multiple phone sets, it was concluded that the event was not caused by a defect in a particular phone set or by the intentional termination of calls.
The above results suggested that the cause of this problem lay in either the VoIP-GW or the PBX in the head office, where the IP extension call terminated. With this in mind, the Technical Assistance and Support Center constructed a test environment emulating the customer's facilities to carry out more detailed testing.
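The PBX log survey in item (3) can be sketched as a simple filter. The following is a minimal illustration only: the record layout (extension, answer time, release time) and the function names are assumptions made for this example and do not reflect the actual log format of the PBX used by the customer.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical record layout: (extension, answered_at, released_at).
# Real PBX logs will differ; adapt the parsing step accordingly.
SHORT_CALL = timedelta(seconds=2)


def short_calls(records, threshold=SHORT_CALL):
    """Return the records whose call was released within `threshold` of answer."""
    return [r for r in records if r[2] - r[1] <= threshold]


def per_extension(records):
    """Count records per extension to see whether one phone set dominates."""
    return Counter(ext for ext, _, _ in records)


if __name__ == "__main__":
    t0 = datetime(2000, 1, 1, 9, 0, 0)  # arbitrary sample timestamp
    sample = [
        ("201", t0, t0 + timedelta(seconds=1)),
        ("202", t0, t0 + timedelta(seconds=95)),
        ("203", t0, t0 + timedelta(milliseconds=700)),
    ]
    hits = short_calls(sample)
    print(len(hits), dict(per_extension(hits)))
```

If the short calls are spread over many extensions rather than concentrated on one, a defect in a single phone set becomes an unlikely explanation, which is the reasoning applied above.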
4. Reproducibility testing
To find the cause of this event, the Technical Assistance and Support Center conducted a reproducibility test by preparing the same VoIP-GW and PBX equipment and IP-VPN services as used by the customer and constructing an environment that emulated the system in which the event was occurring. Furthermore, to isolate whether the VoIP-GW or the PBX was at fault, a standalone phone set was prepared so that the call-terminating operation with the standalone phone connected to the VoIP-GW could be compared with the call-terminating operation with the PBX connected to the VoIP-GW. It was also decided to use a waveform recorder to check the electrical operation in the analog section below the VoIP-GW, that is, between the VoIP-GW and the terminals connected below it, when terminating a call (Fig. 4).
5. Results of reproducibility testing
(1) A reproducibility test was conducted several times when directly connecting a standalone phone set to the VoIP-GW and terminating a call, and not a single disconnection event occurred.
(2) A reproducibility test was also done several times when directly connecting the PBX to the VoIP-GW and terminating a call, and in this case, a call-disconnection event occurred on multiple occasions immediately after answering the incoming call.
Accordingly, in reproducibility testing using the PBX, the technicians compared the voltage waveform between the L1 and L2 lines (the two wires of the analog line) during normal answering of an incoming call with the L1–L2 voltage waveform when a call was disconnected immediately after answering. First, for a normal connection, the VoIP-GW transmits a calling signal to the PBX on terminating the incoming call, and the PBX-connected phone goes off-hook. At this time, the L1–L2 polarity is depolarized, and the call progresses (Fig. 5).
However, at the time of the event occurrence, the L1–L2 loop state cannot be maintained after depolarization of L1–L2 polarity when the PBX-connected phone goes off-hook, and as a result, the circuit is released (disconnected), and the call is terminated (Fig. 6).
Additionally, on examining these waveforms in detail, it was apparent that the time from off-hook to depolarization was 50–575 ms when connecting normally, but 606–640 ms whenever the event occurred.
The above reproducibility test therefore showed that this event in which an incoming call failed to connect would occur whenever the time from off-hook to depolarization was 606–640 ms. In view of this fact, the settings related to the analog connection between the VoIP-GW and PBX were examined. It was found that the period for detecting the closing of the VoIP-GW TEL port (analog telephone port) circuit due to a terminal off-hook operation was set to 640 ms as a default value in the VoIP-GW. Meanwhile, on the PBX side, the guard time setting turned out to be 600 ms as a default value. This setting serves to prevent an unstable waveform generated at depolarization after the off-hook operation from being falsely recognized as an on-hook operation.
From these results, it was confirmed that changing any of the following settings in the VoIP-GW or PBX could avoid the problem of an incoming call being disconnected.
(1) In the VoIP-GW, change the default value of the period for detecting closing of the TEL port circuit from 640 ms to 560 ms.
(2) In the PBX, change the default value of the guard timer for preventing an unstable waveform from being falsely recognized as on-hook from 600 ms to 650 ms.
(3) Change the PBX on-hook signal detection method from its default value of "polarity detection or BT detection" to "BT detection" only, so that depolarization after the phone goes off-hook is not treated as an on-hook signal.
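The effect of the three settings above can be checked with a small model. This is a sketch under stated assumptions, not a description of either device's firmware: it assumes that the off-hook-to-depolarization delay never exceeds the VoIP-GW detection period and that the PBX can misread depolarization as on-hook only when the delay exceeds the guard timer. The `Settings` class and `exposure_window` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Settings:
    gw_detect_ms: int = 640            # VoIP-GW TEL port closure detection period
    pbx_guard_ms: int = 600            # PBX guard timer after off-hook
    pbx_polarity_on_hook: bool = True  # PBX treats a polarity change as on-hook


def exposure_window(s: Settings) -> Optional[Tuple[int, int]]:
    """Return the (exclusive lower, inclusive upper) range of delays in ms
    that the PBX could misread as on-hook, or None if no such delay exists.

    Assumptions of this model: delay <= gw_detect_ms, and a misread
    requires delay > pbx_guard_ms with polarity-based detection enabled.
    """
    if not s.pbx_polarity_on_hook:
        return None
    if s.gw_detect_ms <= s.pbx_guard_ms:
        return None
    return (s.pbx_guard_ms, s.gw_detect_ms)


if __name__ == "__main__":
    default = Settings()
    candidates = {
        "default": default,
        "(1) GW detection period 560 ms": replace(default, gw_detect_ms=560),
        "(2) PBX guard timer 650 ms": replace(default, pbx_guard_ms=650),
        "(3) BT detection only": replace(default, pbx_polarity_on_hook=False),
    }
    for name, settings in candidates.items():
        print(name, exposure_window(settings))
```

Under these assumptions, only the default combination leaves a range of delays exposed, and each of the three changes closes it on its own.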
On the basis of the results of this reproducibility test examining disconnections of incoming calls, the Technical Assistance and Support Center consulted with the customer and recommended that the setting for the PBX on-hook signal detection method be changed (item (3) above). This effectively eliminated the event in which an incoming IP extension call was disconnected upon answering.
The VoIP-GW used by the customer in this case study had a period of 640 ms for detecting the closing of the TEL port circuit due to a terminal off-hook operation. Thus, depending on the timing of this off-hook operation, a time lag of up to 640 ms might occur between the off-hook operation and L1–L2 depolarization. In the PBX, meanwhile, a time of more than 600 ms (up to 640 ms) from off-hook to L1–L2 depolarization would exceed the guard timer (600 ms), which is used to prevent an unstable waveform at the time of depolarization from being falsely recognized as an on-hook operation. In such a case, it appears that the PBX would erroneously judge that an on-hook operation had occurred and would disconnect the circuit as a result (Fig. 7).
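As a consistency check on this explanation, the measured delay ranges reported above can be compared against the guard timer. The sketch below uses only the boundary values quoted in this article (50–575 ms for normal calls, 606–640 ms for disconnected calls); the function names are hypothetical, and treating a delay exactly equal to the guard timer as safe is an assumption of the example.

```python
GUARD_MS = 600      # PBX guard timer default (from this article)
GW_DETECT_MS = 640  # VoIP-GW detection period default (from this article)


def misread_as_on_hook(delay_ms, guard_ms=GUARD_MS):
    """True if depolarization arrives after the guard timer has expired.

    Assumption: the boundary case delay == guard_ms is treated as safe.
    """
    return delay_ms > guard_ms


def threshold_separates(normal_ms, failed_ms, guard_ms=GUARD_MS):
    """True if the guard timer cleanly separates normal from failed calls."""
    no_false_alarm = not any(misread_as_on_hook(d, guard_ms) for d in normal_ms)
    all_explained = all(misread_as_on_hook(d, guard_ms) for d in failed_ms)
    return no_false_alarm and all_explained


if __name__ == "__main__":
    normal = [50, 575]   # boundary values of the normal range
    failed = [606, 640]  # boundary values of the failure range
    print(threshold_separates(normal, failed))
    # Every failed delay should also fit within the VoIP-GW detection period.
    print(max(failed) <= GW_DETECT_MS)
```

Both checks hold for the reported values, which agrees with the interpretation that the failure window lies between the PBX guard timer and the VoIP-GW detection period.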
In this case study, the combination of specific VoIP-GW and PBX models resulted in the disconnection of incoming calls immediately after answering.
When equipment is selected for constructing a new user system or upgrading an existing one, it is important that system operation be checked beforehand as thoroughly as possible against the way the customer actually uses the system. However, if a problem occurs that cannot be rectified by a straightforward measure such as replacing equipment, measuring waveforms and signals to isolate the source of the problem can provide a shortcut to a solution.