Regular Articles

Hardware Acceleration Technique for Radio Resource Scheduling in 5G Mobile Systems

Yuki Arikawa, Takeshi Sakamoto, and Shunji Kimura

Abstract

This article presents a hardware acceleration technique for the scheduling process in ultra-high-density distributed antenna systems for fifth-generation (5G) mobile communications systems. In 5G systems, the scheduling process has to calculate the overall system throughput for a huge number of combinations of antennas and user equipment (UE). To speed up the calculation, the acceleration technique calculates the throughputs of all UEs simultaneously. Experimental results show that the technique calculates the system throughput approximately 60 times faster than processing without acceleration. As a result, the acceleration technique improved the system throughput by about 73% for a system with 32 antennas and 256 UEs. The hardware acceleration technique therefore enables a future practical 5G system.

Keywords: 5G mobile communications systems, resource scheduling, hardware acceleration

1. Introduction

In mobile communications systems, resource scheduling assigns user equipment (UE) to each antenna for downlink transmission. Scheduling that decides the optimal combination of antennas and UEs is needed for efficient communications and improved overall system throughput [1]. In systems preceding fifth-generation (5G) mobile communications systems, this assignment has been executed by software-based processing due to the small number of antennas, as shown in Fig. 1(a), and the small number of possible combinations of antennas and UEs.

For 5G systems, researchers have been studying flexible antenna deployment such as localized massive multiple-input multiple-output (MIMO) and distributed massive MIMO [2]. In this article, we focus on distributed massive MIMO (i.e., distributed antenna systems). In distributed antenna systems, as shown in Fig. 1(b), a huge number of antennas are deployed at ultra-high density in order to increase the overall system throughput [3, 4]. The number of possible combinations reaches approximately 10^76 in a system with 32 antennas and 256 UEs, which is based on a 5G-system model in the Mobile and wireless communications Enablers for the Twenty-twenty Information Society (METIS) project [5].


Fig. 1. Illustrations of antenna deployment (a) before the 5G system and (b) in the 5G system (ultra-high-density distributed antenna systems).

Because the number of possible combinations grows explosively, the scheduler uses an approximate search to find an appropriate combination [6]. In general, an approximate search gets closer to the appropriate combination as the number of searched combinations increases. However, it is difficult to increase the number of searched combinations with software-based processing within the limited scheduling period of 1 ms [7] because the number of CPU (central processing unit) cores is limited.

To overcome this issue, we devised a search process for quickly obtaining an appropriate combination and a hardware acceleration technique that speeds up the scheduling process in ultra-high-density distributed antenna systems. The search process and the hardware acceleration technique are described in sections 2 and 3, respectively.

2. Search process

The search process decides the combination approximately by iterating a step that improves the combination so that the system throughput increases. In general, when a scheduler searches for a combination, the UEs having the highest throughput are chosen for all antennas simultaneously. In our search process, by contrast, the UE that gives the highest system throughput is chosen for only one antenna at a time, and the same choice is then made for each of the other antennas in turn. In this way, UEs are assigned to antennas so that the system throughput always increases. Iterating this assignment moves the combination toward a better one and improves the system throughput.

The procedure for deciding the combination in the search technique is shown in Fig. 2. First, all antennas are set to blank, which means the radio transmission is stopped. Then, one of the antennas is selected. In the case shown in Fig. 2(a), antenna A is selected. In order to select the UE to which antenna A should transmit data, the system throughputs under this condition are calculated. In this case, the system throughputs of (UE#1, Blank, Blank), (UE#2, Blank, Blank), and (UE#3, Blank, Blank) are calculated, and these three system throughputs are compared. In this example, (UE#1, Blank, Blank) has the highest system throughput. Consequently, UE#1 is provisionally selected for antenna A, and the combination is updated.


Fig. 2. Procedure for deciding the combination in the search process.

Next, antenna B is selected. In the case shown in Fig. 2(b), in order to select the UE for antenna B, the system throughputs are calculated taking the interference power from antennas A and B into account. The throughput of UE#1 may change because the interference power from antenna B changes. When UE#4 is chosen, the system throughput is calculated by summing the throughputs of UE#1 and UE#4. Likewise, when UE#5 is chosen, the system throughput is calculated by summing the throughputs of UE#1 and UE#5. These two system throughputs are compared, and in this example the higher one is obtained when antenna B transmits data to UE#4. Thus, UE#4 is provisionally selected for antenna B, and the combination is updated. This technique enables the scheduler to take inter-cell interference power into account when it decides the combination for the antennas.

In this technique, all antennas are selected one by one, and the UE selection is carried out for each antenna in sequence. In the case shown in Fig. 2(c), antenna A is selected again, and the throughputs of UE#1, UE#2, and UE#3 are calculated again because the interference powers from antenna B and antenna C have changed.

In this way, the combination is always updated so that the system throughput increases. This increases the system throughput monotonically as the number of iterations increases.
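The procedure in Fig. 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implemented scheduler: the gain matrix `gain`, the noise level, the candidate lists, and the function names are hypothetical, the Shannon formula log2(1 + SINR) stands in for the unspecified SINR-to-throughput conversion, and an antenna keeps its current setting when no candidate UE raises the system throughput.

```python
import math

BLANK = None  # an antenna set to "blank" transmits nothing


def system_throughput(assignment, gain, noise=1e-3):
    """Sum of per-UE throughputs for one combination.

    assignment[a] is the UE served by antenna a, or BLANK.
    gain[a][u] is an assumed linear power gain from antenna a to UE u.
    """
    total = 0.0
    for a, ue in enumerate(assignment):
        if ue is BLANK:
            continue
        signal = gain[a][ue]
        # Every other transmitting antenna interferes with this UE.
        interference = sum(
            gain[b][ue]
            for b, other in enumerate(assignment)
            if b != a and other is not BLANK
        )
        total += math.log2(1.0 + signal / (interference + noise))
    return total


def search(candidates, gain, sweeps=2):
    """Start from all-blank antennas, then update one antenna at a time."""
    assignment = [BLANK] * len(candidates)
    best = system_throughput(assignment, gain)
    history = [best]
    for _ in range(sweeps):
        for a, ues in enumerate(candidates):
            for ue in ues:
                trial = assignment[:a] + [ue] + assignment[a + 1:]
                value = system_throughput(trial, gain)
                if value > best:  # keep a change only if the total improves
                    assignment, best = trial, value
            history.append(best)
    return assignment, best, history


# Hypothetical example: 3 antennas, 7 UEs, disjoint candidate lists as in Fig. 2.
candidates = [[0, 1, 2], [3, 4], [5, 6]]
gain = [
    [0.90, 0.50, 0.40, 0.20, 0.10, 0.05, 0.05],
    [0.10, 0.20, 0.30, 0.80, 0.60, 0.20, 0.10],
    [0.05, 0.10, 0.10, 0.10, 0.30, 0.70, 0.90],
]
assignment, best, history = search(candidates, gain)
print(assignment, round(best, 2))
```

The list `history` records the system throughput after each antenna update; because a trial combination is accepted only when it is better than the current one, the recorded values never decrease, which mirrors the monotonic behavior described above.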

3. Hardware acceleration technique

The hardware acceleration technique accelerates the calculation of the system throughput for each combination during the search procedure. The search procedure finds the optimal combination to achieve higher system throughput by iterating the processing for improving the combination. As shown in Fig. 3, the conventional scheme (without acceleration) requires a longer processing time to obtain the optimal combination. In contrast, the system throughput is quickly improved in the acceleration technique. In this way, the combination that achieves higher system throughput is obtained within the required period.


Fig. 3. Concept for achieving higher system throughput.

The hardware acceleration technique for the scheduling process is shown in Fig. 4. In this technique, the search process is executed by a dedicated circuit in order to increase the number of searched combinations, which improves the scheduling performance. The software sets possible UEs for each antenna in the hardware accelerator before starting the search process. The details of the hardware acceleration technique are described below.


Fig. 4. Hardware acceleration for the scheduling process.

The flow of the search process performed by the hardware accelerator is shown in Fig. 5. First, the combinations of antennas and UEs are generated. Next, the system throughput is calculated by summing the throughputs of all the UEs in the generated combinations. Then, the combination for which the scheduling achieves higher system throughput is decided by comparing the system throughput for each combination. These three steps are iterated until the scheduling period expires so that the optimal combination can be obtained.


Fig. 5. Flow chart of scheduling process.

We investigated the processing time for each step in order to clarify the steps that should be accelerated in the above search process. The investigation by software-based processing revealed that the system-throughput calculation accounts for more than 90% of the processing time to execute the search process. On the basis of the results, we devised a parallel and pipeline processing technique to accelerate the system-throughput calculation.

A block diagram of the circuit is depicted in Fig. 6. The circuit comprises three parts: a combination-generation part that outputs the combination of antennas and UEs, a system-throughput-calculation part, and a combination-decision part that decides the combination by comparing the system throughput for each combination. Our proposed technique consists of two kinds of processing: parallel processing to calculate the throughput for all UEs simultaneously and pipeline processing to obtain the system throughput for generated combinations at every clock cycle.


Fig. 6. Circuit block diagram.

The system-throughput-calculation part consists of multiple throughput-calculation blocks that output the throughputs of the UEs in parallel. The number of throughput-calculation blocks equals the number of antennas. The throughput-summation block outputs the system throughput by summing the throughputs of the UEs. Because the throughputs of the UEs are calculated simultaneously in the throughput-calculation blocks, the circuit executes the search process at high speed. Furthermore, the circuit scale can be minimized by matching the degree of parallelism to the number of antennas.

The timing chart of the system-throughput-calculation block is shown in Fig. 7. In this block, the signal-to-interference-plus-noise ratio (SINR) is calculated first. Next, the calculated SINR is converted to the throughput. Then the throughputs of the UEs are summed. These steps do not depend on the preceding or following combinations, so they are executed as a pipeline. This enables the scheduler to obtain the system throughput for a generated combination at every clock cycle.


Fig. 7. Timing chart.
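The pipelined calculation can be illustrated with a small clock-by-clock model in Python. This is a software sketch for intuition only, not a description of the actual circuit: it assumes exactly one clock cycle per stage (the real pipeline depth is not stated here), uses the Shannon formula log2(1 + SINR) as a stand-in for the SINR-to-throughput conversion, and all function names, the gain matrix `gain`, and the noise level are hypothetical.

```python
import math


def sinr_stage(combo, gain, noise):
    """Stage 1: SINR of every served UE (computed in parallel in hardware)."""
    active = [(a, ue) for a, ue in enumerate(combo) if ue is not None]
    return [
        gain[a][ue] / (sum(gain[b][ue] for b, _ in active if b != a) + noise)
        for a, ue in active
    ]


def rate_stage(sinrs):
    """Stage 2: convert each SINR to a throughput."""
    return [math.log2(1.0 + s) for s in sinrs]


def sum_stage(rates):
    """Stage 3: add the per-UE throughputs into the system throughput."""
    return sum(rates)


def run_pipeline(combos, gain, noise=1e-3):
    """One new combination enters the pipeline on every clock cycle."""
    reg1 = reg2 = None   # pipeline registers between the stages
    results = []         # (cycle, system throughput) pairs
    feed = iter(combos)
    pending = len(combos)
    cycle = 0
    while pending:
        cycle += 1
        # Evaluate back to front so each stage uses last cycle's register value.
        if reg2 is not None:
            results.append((cycle, sum_stage(reg2)))
            pending -= 1
        reg2 = rate_stage(reg1) if reg1 is not None else None
        nxt = next(feed, None)
        reg1 = sinr_stage(nxt, gain, noise) if nxt is not None else None
    return results


# Hypothetical example: 2 antennas, 3 UEs, three candidate combinations.
gain = [[0.9, 0.4, 0.1], [0.2, 0.3, 0.8]]
combos = [[0, None], [0, 2], [1, 2]]
results = run_pipeline(combos, gain)
print([cycle for cycle, _ in results])  # → [3, 4, 5]
```

After the three-cycle fill latency, one system throughput appears per cycle, so N combinations take N + 2 cycles in this model instead of 3N cycles when each combination must finish before the next one starts.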

4. Performance evaluation

We carried out experimental measurements and system-level simulations in order to evaluate the performance of the acceleration technique and the search process.

To verify the number of searched combinations within the scheduling period of 1 ms, we measured the processing time spent for the search process. The proposed technique was implemented on a field-programmable gate array (FPGA) (Xilinx Zynq-7045) at the clock frequency of 100 MHz. The processing time was measured with the FPGA. The processing time without acceleration was measured with a general-purpose processor (Intel Core i5) at the clock frequency of 2.67 GHz.

We carried out system-level simulations to evaluate the performance of the proposed scheme. The performance was evaluated in practical conditions based on the small-cell scenario in LTE (Long-Term Evolution) specifications [8]. The simulation conditions are listed in Table 1. The simulation conditions were based on an assumption of ultra-high-density distributed antenna systems, so 32 antennas were uniformly distributed in a circle with a radius of 155 m. The minimum distance between antennas was 20 m.


Table 1. Simulation conditions.

The results of the performance evaluation are given in Table 2. The processing time per searched combination was 10 ns with acceleration and 596 ns without acceleration, so the circuit executed the search process about 60 times faster than processing without acceleration. Accordingly, the number of combinations searched within the scheduling period of 1 ms was 10^5 with the proposed technique and 1679 without acceleration, that is, about 60 times larger with the proposed technique.


Table 2. Results of performance evaluation.
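The reported figures can be cross-checked with simple arithmetic. The short Python calculation below uses only numbers given in the text (the 1-ms scheduling period, the 100-MHz FPGA clock, and the 596-ns software time per combination); the variable names are illustrative.

```python
PERIOD_NS = 1_000_000        # 1 ms scheduling period, in nanoseconds
CLOCK_HZ = 100_000_000       # FPGA clock frequency (100 MHz)
SOFTWARE_NS = 596            # measured time per combination without acceleration

cycle_ns = 1_000_000_000 // CLOCK_HZ       # duration of one clock cycle
with_accel = PERIOD_NS // cycle_ns         # combinations per period, accelerated
without_accel = PERIOD_NS // SOFTWARE_NS   # combinations per period, software
speedup = SOFTWARE_NS / cycle_ns

print(cycle_ns, with_accel, without_accel, round(speedup, 1))
# → 10 100000 1677 59.6
```

One clock cycle at 100 MHz lasts 10 ns, which matches the measured 10 ns per combination and is consistent with the pipeline delivering one system throughput per cycle. Dividing 1 ms by 596 ns gives 1677 rather than the reported 1679; the small difference is presumably due to rounding of the measured per-combination time.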

Furthermore, we carried out system-level simulations to evaluate the system throughput. The acceleration technique improved the system throughput by about 73% when there were 32 antennas and 256 UEs. Consequently, the proposed technique enables the scheduler to obtain the appropriate combination in ultra-high-density distributed antenna systems.

5. Summary

In this article, we proposed a hardware acceleration technique that accelerates the search process in the scheduling of ultra-high-density distributed antenna systems. Our technique combines parallel processing, which calculates the throughputs of all UEs simultaneously, with pipeline processing, which obtains the system throughput for a combination at every clock cycle. As a result, it performs the search process about 60 times faster than processing without acceleration, which substantially increases the number of combinations the scheduler can search and enables it to obtain an appropriate combination in ultra-high-density distributed antenna systems. With the acceleration technique, the scheduler improved the system throughput by about 73% when there were 32 antennas and 256 UEs. Scheduling with the proposed technique therefore enables a practical 5G system.

Acknowledgments

This article includes part of the results of “The research and development project for realization of the fifth-generation mobile communications system” commissioned by The Ministry of Internal Affairs and Communications, Japan.

References

[1] 3GPP TS36.300: “Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN): Overall Description,” V13.3.0, Apr. 2016.
[2] S. Suyama, J. Mashino, Y. Kishiyama, and Y. Okumura, “5G Multi-antenna Technology and Experimental Trials,” Proc. of 2016 IEEE International Conference on Communication Systems (ICCS), pp. 1–6, Shenzhen, China, Dec. 2016.
[3] NTT DOCOMO, “DOCOMO 5G White Paper, 5G Radio Access: Requirements, Concept and Technologies,” July 2014.
[4] T. Seyama, M. Tsutsui, T. Oyama, T. Kobayashi, T. Dateki, H. Seki, M. Minowa, T. Okuyama, S. Suyama, and Y. Okumura, “Study of Coordinated Radio Resource Scheduling Algorithm for 5G Ultra High-density Distributed Antenna Systems,” Proc. of 13th IEEE VTS Asia Pacific Wireless Communications Symposium (APWCS 2016), S3-5, Tokyo, Japan, Aug. 2016.
[5] ICT-317669 METIS project, “Simulation Guidelines,” Del. D6.1, Oct. 2013.
https://www.metis2020.com/wp-content/uploads/deliverables/METIS_D6.1_v1.pdf
[6] Y. Arikawa, K. Kawai, H. Uzawa, and S. Shigematsu, “Practical Resource Scheduling in Massive-cell Deployment for 5G Mobile Communications Systems,” Proc. of the 2015 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 456–461, Bali, Indonesia, Nov. 2015.
[7] 3GPP TS36.321: “Evolved Universal Terrestrial Radio Access (E-UTRA); Medium Access Control (MAC) protocol specification,” V12.5.0, Mar. 2015.
[8] 3GPP TR 36.872: “Small Cell Enhancements for E-UTRA and E-UTRAN - Physical Layer Aspects,” V12.1.0, Dec. 2013.

Trademark notes

Intel Core is a trademark of Intel Corporation or its subsidiaries in the United States and/or other countries.

Yuki Arikawa
Research Engineer, Fixed Mobile Convergence Device Development Project, NTT Device Innovation Center.
He received a B.E. and M.E. from Waseda University, Tokyo, in 2008 and 2010. Since joining NTT in 2010, he has been studying device architecture for 10G-EPON and 5G mobile communications systems. He is currently in charge of the research and development project for realization of the 5G mobile communications system commissioned by The Ministry of Internal Affairs and Communications, Japan. He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE).
Takeshi Sakamoto
Senior Research Engineer, Project Leader of Fixed Mobile Convergence Device Development Project, Metro-Access Network Device Project, NTT Device Innovation Center.
He received a B.E. and M.E. in electronic engineering from Kyoto University in 1994 and 1996. He joined NTT Opto-electrical Laboratories in 1996. He is currently engaged in developing digital devices for optical access networks. He is a member of IEICE.
Shunji Kimura
Senior Research Engineer, Supervisor, Project Manager of Metro-Access Network Device Project, NTT Device Innovation Center.
He received a B.E. and M.E. in electrical engineering in 1989 and 1991, and a Ph.D. in electronics, information, and communication engineering in 1997, all from Waseda University, Tokyo. In 1991, he joined NTT LSI Laboratories, where he worked on technology for designing and evaluating analog front-end integrated circuits for 40-to-100-Gbit/s-class optical transmission systems. In 2003, he was assigned to NTT Access Network Service Systems Laboratories, where he was engaged in the development of the next-generation optical access networks, 10G-EPON and NG-PON2. He is currently a project manager at NTT Device Innovation Center and is leading the development of metro-access network devices. He received the 1994 Japan Microwave Prize at APMC94, the 1996 Young Engineer Award from IEICE, and Best Paper Awards at OECC2010 and COIN2010. He is a senior member of IEEE (Institute of Electrical and Electronics Engineers) and IEICE.
