Global Standardization Activities

International Standards Adopted by ITU-T to Address Soft Errors Affecting Telecommunication Equipment

Hidenori Iwashita

Abstract

The International Telecommunication Union - Telecommunication Standardization Sector (ITU-T), a sector of ITU, the specialized agency of the United Nations for telecommunications, approved standards relating to soft errors that affect telecommunication equipment on November 13, 2018. These standards stipulate design, testing, and quality estimation methods, as well as reliability requirements, for measures that mitigate malfunctions (soft errors) in ground-level telecommunication equipment, which are caused chiefly by cosmic rays. The adopted standards will help in establishing more reliable networks.

Keywords: soft error, irradiation test, neutron


1. Introduction

In recent years, the number of soft errors*1 caused by cosmic-ray neutrons has been increasing gradually, even in telecommunication equipment located on the ground (Fig. 1). A soft error disappears as soon as the affected semiconductor device is restarted or the affected data are overwritten. A soft error in data can cause a malfunction or system outage, but such a transient error is difficult to reproduce, and its cause is difficult to identify. Because soft errors can have a serious impact on users, they are a major problem for system operators. Telecommunication equipment is designed so that such malfunctions do not affect network services. However, because soft errors are difficult to reproduce, the behavior of equipment under soft errors has not been sufficiently verified at the development stage.


Fig. 1. Mechanism of soft error occurrence.

Recently, however, it has become possible to measure the effects of soft errors on telecommunication equipment by using a compact accelerator-driven neutron source*2. This makes it possible to determine the effects of soft errors and take preventive measures before vendors sell products and telecommunication carriers introduce the equipment into operating networks [1]. Carriers can therefore improve network quality dramatically by mitigating soft errors at the stages of equipment development and introduction. What has been missing, however, is a set of requirements that serve as the benchmark for countermeasure methods and their evaluation.

*1 Soft error: Unlike a hard error, which is a fault that causes permanent malfunctioning of a semiconductor device, a soft error is a temporary error that disappears as soon as the semiconductor device concerned is restarted or the data concerned are overwritten.
*2 Accelerator-driven neutron source: A facility for producing neutrons through a nuclear reaction caused by irradiating the target with protons or electrons that are sped up by an accelerator.

2. ITU-T Recommendations concerning soft errors

Against this background, at the October 2015 meeting of the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T) Study Group 5 (SG5)*3, commencement of a study on soft errors in telecommunication equipment was approved with the intention of defining requirements on measures to mitigate soft errors, ranging from design techniques to evaluation methods. The Ad Hoc Committee on Soft Error Testing (SOET Adhoc) member companies worked together and developed draft Recommendations [2]. The ITU-T has now approved these Recommendations.

The Recommendations stipulate the design, testing, and quality estimation methods and reliability requirements concerning soft errors. They include benchmarks that vendors and carriers can use to select measures against soft errors that are appropriate for the required reliability level.

The soft-error-related standards approved by ITU-T consist of five Recommendations and a supplement. An overview of the Recommendations is shown in Fig. 2, and the list of Recommendations is given in Table 1. A timeline of the standardization of Recommendations and measures to mitigate soft errors is given in Table 2.


Fig. 2. Overview of soft error Recommendations.


Table 1. List of soft error Recommendations.


Table 2. Timeline of standardization of measures against soft errors.

The Recommendations and supplement define the following items.

2.1 K.124 (Overview): Overview of particle radiation effects*4 on telecommunication systems [3]

This Recommendation describes the mechanism by which particle radiation causes soft errors, the impact of soft errors generated in telecommunication equipment, mitigation methods, and the need for further Recommendations to address soft errors. Soft errors are mainly caused by particle radiation of neutrons and alpha particles. Neutrons are generated by cosmic rays, and alpha particles are generated by minute quantities of radioisotopes contained in materials used in semiconductor devices.

The occurrence of soft errors caused by alpha particles can be reduced by using high-purity materials such as low-alpha-particle plastics.

Soft errors caused by cosmic rays arise as follows. In space, high-energy particles, mainly protons, are emitted by the sun and by supernova explosions. When these high-energy particles enter the Earth’s atmosphere, they collide with nitrogen and oxygen nuclei, causing nuclear reactions that eject neutrons from the nuclei. Most of the neutrons generated in the atmosphere pass through semiconductor devices without any effect, but on rare occasions they undergo nuclear reactions with the silicon nuclei that make up the devices, and these reactions generate various charged particles. The charged particles create electrical noise and cause a temporary soft error.

2.2 K.130 (Test): Neutron irradiation test methods for telecommunication equipment [4]

This Recommendation describes methods and test procedures for generating soft errors in telecommunication equipment using accelerator-driven neutron sources. When protons or electrons sped up by an accelerator strike a target (lead, tungsten, beryllium, lithium, etc.), a nuclear reaction occurs and neutrons are generated. Irradiating telecommunication equipment with these neutrons exposes it to a neutron flux 1 million to several hundred million times higher than that in the natural environment, so soft errors can be reproduced in a short time.
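The practical meaning of this acceleration can be illustrated with a little arithmetic. The following is a minimal sketch, not a procedure taken from K.130: the function name, the one-hour beam time, and the use of a 365.25-day year are assumptions made here for illustration, and only the range of acceleration factors comes from the text above.

```python
# Illustrative arithmetic only; names and sample values are assumptions,
# not part of ITU-T K.130.

HOURS_PER_YEAR = 24 * 365.25  # assumed year length for the conversion


def equivalent_natural_years(irradiation_hours: float,
                             acceleration_factor: float) -> float:
    """Convert beam time into equivalent years of exposure at natural flux.

    acceleration_factor is the ratio of the beam's neutron flux to the
    natural neutron flux (1e6 to several 1e8 according to the text).
    """
    if irradiation_hours < 0 or acceleration_factor <= 0:
        raise ValueError("irradiation time must be >= 0 and factor > 0")
    return irradiation_hours * acceleration_factor / HOURS_PER_YEAR


if __name__ == "__main__":
    # One hour of beam time at the low and high ends of the quoted range.
    for factor in (1e6, 1e8):
        years = equivalent_natural_years(1.0, factor)
        print(f"factor {factor:.0e}: about {years:,.0f} equivalent years")
```

Under these assumed values, one hour of irradiation corresponds to roughly 114 years of natural exposure at a factor of 1 million and roughly 11,400 years at a factor of 100 million, which is why events that are rare in the field can be observed within a single test session.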

2.3 K.131 (Design): Design methodologies for telecommunication systems applying soft error measures [5]

This Recommendation describes a method of designing the telecommunication devices that make up a carrier network so that soft errors are prevented or reduced. First, the basic configuration of the telecommunication devices covered is described as it relates to mitigating soft errors. Next, the definition of equipment reliability in the event of soft errors and a method of specifying it are explained, and a procedure for developing equipment that complies with the specified reliability is described. Countermeasures against soft errors are particularly important for field-programmable gate arrays (FPGAs), so the soft error occurrence rate in FPGAs is also described in detail in this Recommendation, and K.Sup11, a supplement to K.131, presents measures to alleviate the effects of soft errors in FPGAs [6].

2.4 K.139 (Requirements): Reliability requirements for telecommunication systems affected by particle radiation [7]

This Recommendation describes the reliability requirements for equipment needed to ensure reliable networks in the event of soft errors.

As semiconductor devices become more highly integrated, the number of soft errors relative to hard errors is increasing rapidly (Fig. 3). However, unlike hard errors, soft errors can be greatly reduced by introducing appropriate countermeasures. Therefore, using the rate of conventional hard errors as a guide, we set a range within which the number of failures caused by soft errors and the occurrence rate of main signal interruptions fall within a statistical error range (Fig. 4). In rare cases, however, a soft error may cause a silent failure. All silent failures must be prevented in the operation of network services. Therefore, we established a reliability standard such that no silent failures would occur even after the equivalent of approximately 10,000 years of natural neutron irradiation.


Fig. 3. Failure rates of hard errors and soft errors of LSIs (large-scale integrated circuits).


Fig. 4. Approach to setting reliability requirements.

As a result, we defined three reliability standards designed to reduce the failure exchange rate, reduce the main signal interruption rate, and prevent silent failures. The reliability of large networks can be ensured by meeting these criteria.

2.5 K.138 (Quality estimation): Quality estimation methods and application guidelines for mitigation measures based on particle radiation tests [8]

This Recommendation describes how to evaluate whether telecommunication devices satisfy the soft error reliability requirements defined in K.139 (requirements), based on the results of the neutron irradiation tests described in K.130 (test). As described in K.130, soft errors can be reproduced in a short time by irradiating equipment with neutrons at an intensity 1 million to several hundred million times that of the natural environment. An example of the evaluation is shown in Fig. 5. First, the equipment is irradiated with a neutron beam to generate a soft error. The state of the main signal is then checked with the measuring instrument, and the alarm status is checked at the alarm monitoring terminal.


Fig. 5. Example of quality estimation methods.

Each event generated is classified according to the three reliability criteria. For example, in the first soft error in the figure, a device alarm occurred, and the main signal was cut off. This event is counted under MR (maintenance reliability) because maintenance is assumed to be necessary. Because the main signal was cut off, it is also counted under SR (service reliability). The second soft error was automatically corrected; thus, there was no device alarm and no effect on the main signal. This event is not counted under any reliability criterion.

The effect on the main signal and the equipment alarm status are checked continuously during the test. For example, with the eighth soft error, the main signal is disconnected, but there is no device alarm. This corresponds to a silent failure and is counted as one event under AR (alert function reliability). In this way, events corresponding to each reliability criterion are counted. The irradiation time is then converted into the equivalent operating time in the natural environment, and the resulting event frequencies are used to determine whether the standard values are satisfied.
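The counting procedure in this example can be sketched in a few lines of code. The sketch below is an illustration under stated assumptions, not the method specified in K.138: the names Observation, classify, and tally are hypothetical, an alarm without a signal interruption is assumed to count under MR only, and a silent failure is counted under AR only, as in the example above (the text does not say whether it also counts under SR).

```python
# Hypothetical sketch of the event counting described in the text.
# Class and function names are assumptions, not taken from ITU-T K.138.
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """What the test operator sees for one soft error."""
    alarm_raised: bool        # seen at the alarm monitoring terminal
    signal_interrupted: bool  # seen on the main-signal measuring instrument


def classify(obs: Observation) -> set:
    """Return the reliability criteria that one observed event counts under."""
    criteria = set()
    if obs.alarm_raised:
        # An alarm is assumed to imply that maintenance is necessary.
        criteria.add("MR")
        if obs.signal_interrupted:
            criteria.add("SR")
    elif obs.signal_interrupted:
        # Signal lost with no alarm: a silent failure.
        criteria.add("AR")
    # No alarm and no interruption (e.g. auto-corrected): nothing is counted.
    return criteria


def tally(observations) -> Counter:
    """Count events per criterion over a whole irradiation run."""
    counts = Counter({"MR": 0, "SR": 0, "AR": 0})
    for obs in observations:
        counts.update(classify(obs))
    return counts


if __name__ == "__main__":
    run = [
        Observation(alarm_raised=True, signal_interrupted=True),    # like the first error
        Observation(alarm_raised=False, signal_interrupted=False),  # like the second error
        Observation(alarm_raised=False, signal_interrupted=True),   # like the eighth error
    ]
    print(dict(tally(run)))
```

Dividing each count by the equivalent natural operating time of the run then gives a frequency that can be compared with the corresponding standard value.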

*3 ITU-T SG5: ITU-T is an ITU organization that issues Recommendations with a view to standardizing telecommunications. SG5 investigates issues related to the environment and climate change.
*4 Particle radiation effects: The impact of particle radiation (emitted energy in the form of neutrons, alpha particles, etc.) on semiconductors. In recent years, the number of soft errors caused by neutrons generated in the atmosphere by cosmic rays has been increasing in semiconductors used in ground-level equipment.

3. Future outlook

It is expected that widespread deployment of telecommunication equipment that satisfies the requirements defined in these Recommendations will improve the reliability of telecommunication services.

References

[1] Xilinx, “Device Reliability Report,” UG116 (v10.9), Sept. 2018.
https://www.xilinx.com/support/documentation/user_guides/ug116.pdf
[2] Website of The Telecommunication Technology Committee, SOET Adhoc (in Japanese),
http://www.ttc.or.jp/j/info/bosyu/20150804/
[3] Recommendation ITU-T K.124,
https://www.itu.int/rec/T-REC-K.124-201612-I
[4] Recommendation ITU-T K.130,
https://www.itu.int/rec/T-REC-K.130-201801-I/en
[5] Recommendation ITU-T K.131,
https://www.itu.int/rec/T-REC-K.131-201801-I/en
[6] Supplement 11 to ITU-T K-series Recommendations,
https://www.itu.int/rec/T-REC-K.Sup11-201809-I/en
[7] Recommendation ITU-T K.139,
https://www.itu.int/rec/T-REC-K.139-201811-I/en
[8] Recommendation ITU-T K.138,
https://www.itu.int/rec/T-REC-K.138-201811-I/en
Hidenori Iwashita
Research Engineer, Transport Network Innovation Project, NTT Network Service Systems Laboratories.
He received a B.S. and M.S. in nuclear engineering from Hokkaido University in 2006 and 2008. He joined NTT Network Service Systems Laboratories as a researcher in 2008. He is involved in researching and developing a packet transport multiplexer (PTM), PTM cross-connect (PTM-XC), PTM adapter for dedicated services, and an Ethernet private line system. He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE).
