Practical Field Information about Telecommunication Technologies
Case Studies of Using the Gigabit-compatible Protocol Checker as a Troubleshooting Tool for Home IP Systems
This article introduces case studies of using the gigabit-compatible protocol checker as a troubleshooting tool for home Internet protocol (IP) systems. It is the thirteenth in a bimonthly series on the theme of practical field information about telecommunication technologies. This month's contribution is from the Network Interface Engineering Group, Technical Assistance and Support Center, Maintenance and Service Operations Department, Network Business Headquarters, NTT EAST.
Customer needs are changing as access lines become faster and a wide range of application services come to be provided. There is demand for high-quality Internet protocol (IP) services with even higher levels of reliability than in the past. However, the range of configuration patterns for IP equipment within customers’ residences has been growing. As a result, the Technical Assistance and Support Center has been receiving many inquiries concerning faults having low reproducibility, such as ones that occur once a day or once a week, as well as enquiries about unusual faults for which a solution cannot be found without analyzing the packets flowing in the customer’s home network.
This article presents case studies of using the gigabit-compatible protocol checker, which can easily capture the packet data needed for fault analysis.
2. Overview of gigabit-compatible protocol checker
The gigabit-compatible protocol checker (Fig. 1)  has two local area network (LAN) ports for mirroring and one LAN port for management. It can be installed within the Ethernet segment in a residence as a packet-capture tool that obtains packet data and stores it in a built-in memory. Although it is compact, it provides long-term file acquisition and storage so it is effective for low-reproducibility faults because it can capture data when the fault reappears; the captured data can be analyzed using a LAN analyzer such as Wireshark. Its main functions are: support for packet capture at about 200 Mbit/s over gigabit Ethernet, an operation panel for starting and stopping packet capture without the need to attach a personal computer (PC), and support for remote control from a maintenance center and automatic transfer of data.
3. Case studies
3.1 Incoming calls disabled on 050 IP phone service
3.1.1 Description of fault
A customer using NTT EAST’s B FLET’S service and an ISP’s 050 IP phone service upgraded the VoIP-gateway from model A to model B, but then found that incoming calls were disabled even though outgoing calls could be made as usual (VoIP: voice over IP). At the time of the fault, the party making the call simply heard a busy signal although the receiving party (the NTT customer) was not using the line. Even when the new model-B gateway equipment was replaced, the problem persisted.
3.1.2 Inspection method
To troubleshoot this problem, we duplicated the customer’s system configuration in the office of a field maintenance group and conducted a test to inspect calls made from the Technical Assistance and Support Center to that office. In this test, we inserted the gigabit-compatible protocol checker between the optical network unit (ONU) and the VoIP gateway in the customer’s premises to analyze call conditions at the time of the fault (incoming calls not OK) and we investigated any differences in the session initiation protocol (SIP) sequence for two cases: (1) when the VoIP gateway was model A and (2) when it was model B. The test setup is shown in Fig. 2.
3.1.3 Inspection results
We found that calls could be received normally when the VoIP gateway was model A (Fig. 3). However, when it was model B, after returning a 100Trying message in response to an INVITE message from the receive-side SIP server, the gateway returned an error “404 Not Found” (Fig. 4). This error is returned whenever the called party cannot be detected, which means that the model-B VoIP gateway had judged that the terminal targeted by the incoming call did not exist.
3.1.4 Cause of fault
The results of this inspection led us to consider that although the model-B VoIP gateway received the dial-in number contained in the INVITE message, it could not recognize the terminal targeted by it and returned “404 Not Found”. When the model-B VoIP gateway was designed, no assumptions were made about the provider of the 050 IP telephone services; consequently, the maintenance manual did not describe how to set the additional dial-in numbers for a particular provider. This lack of settings caused this problem.
3.1.5 Countermeasure and results
We solved this problem by creating settings for additional dial-in numbers in the model-B VoIP gateway. We also concluded that no problems occurred with incoming calls in the model-A VoIP gateway because it does not refer to settings of additional dial-in numbers in the INVITE message of the ISP’s 050 IP phone service.
3.2 Occasionally slow Internet access
3.2.1 Description of fault
A customer using a FLET’S optical line reported that Internet access slowed down sometimes. Upon checking the customer’s router log, we found frequent occurrences of the message “PPPoE*1 session released” (Fig. 5). Even when the router and ONU were replaced and the accommodation line was changed, the problem remained.
3.2.2 Inspection method
We inserted the gigabit-compatible protocol checker between the ONU and router installed in the customer’s residence and proceeded to capture and analyze packets. The system configuration at that time is shown in Fig. 6.
3.2.3 Inspection results
Upon analyzing the router log and data captured at the time of this phenomenon, we found that the reception of a PPP (point-to-point protocol) Termination Request from the network side was followed by a PPPoE Active Discovery Termination (PADT) request. Furthermore, upon checking the packet data captured at times other than when this phenomenon occurred, we found that in all cases the reception of a PPP Termination Request was preceded by an idle state with the Internet service provider (ISP) lasting about 10 minutes (Fig. 7).
3.2.4 Cause of fault
The PPPoE session was terminated when the idle state with the ISP lasted longer than 10 minutes, and it took about 6 seconds for the session to be reestablished. Therefore, we concluded that the customer would have experienced a slow response when attempting to access the Internet during this 6-s period.
When we contacted the customer’s ISP, we were advised that a PPP session is terminated if the idle time exceeds 10 minutes to prevent unnecessary distribution of global IP addresses. We therefore told the customer that the problem was not caused by NTT equipment and we asked the customer to consult with the ISP.
3.3 Failure of LAN communications between fax and PC
3.3.1 Description of fault
A customer with a home fax machine receives images via a FLET’S optical line and uses a function for automatically transferring those images from the fax to a PC via a LAN. The customer reported that this transfer occasionally failed. Replacing devices within the customer’s home did not solve the problem.
3.3.2 Inspection method
To examine the image-data transmission conditions between the fax and PC, we inserted a gigabit-compatible protocol checker between the fax and switch in the customer’s home and analyzed the captured data. The system configuration at that time is shown in Fig. 8.
3.3.3 Inspection results
Upon analyzing the captured data, we found that communications between the fax and PC were conducted in the form of a TCP (transmission control protocol) session. During normal operation, the session was established on the basis of a TCP 3-way handshake*2 (Fig. 9) and communications proceeded without any problems. However, when the fault occurred, no Syn/Ack packets were returned from the PC in response to a Syn packet sent from the fax. As a consequence, no session was established and normal communications were not performed. Data captured during normal operations is shown in Fig. 10 and data captured when the fault occurred is shown in Fig. 11.
3.3.4 Cause of fault
Upon inspecting the customer’s PC, we found that the firewall settings of the security software installed on it had been modified from their standard values: the settings closed the TCP port, which disrupted communications.
3.3.5 Countermeasure and results
The firewall settings modified by the customer were restored to their standard values. This action eliminated the fault.
4. Concluding remarks
This article presented case studies of using a gigabit-compatible protocol checker to troubleshoot faults. Packet analysis is becoming an essential tool for investigating IP-related faults, and it is hoped that the reader has found the case studies described here to be informative and useful.