Document Type: Original Research Paper

Authors

Electrical and Computer Engineering Department, University of Tehran, Tehran, Iran

10.22061/jecei.2020.6969.350

Abstract

Background and Objectives: Digital signal processors are widely used in energy-constrained applications in which battery lifetime is a critical concern, so designing ultra-low-energy processors is essential. As a first step in this work, we propose a sub-threshold DSP processor.
Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power general-purpose processor. We then add new instructions to its instruction set to better suit signal processing applications. In the second step, we use the proposed processor as the simple basic core of a many-core architecture built from sub-threshold cores.
Results: Compared with the baseline architecture, these modifications reduce program memory size by about 42% on average. In addition, data memory accesses are reduced by about 60% on average, and a speed-up of more than 90% is achieved. Given the improvements in total execution time (93%) and power consumption (27%), total energy consumption is reduced by about 95% on average, with at most 2.6% area overhead and without increasing the effects of process variation on the processor specifications.
Conclusion: The results show that for parallel applications, such as the FFT in the LTE standard, exploiting sub-threshold processors in a many-core architecture can not only satisfy the required performance but also reduce power consumption by about 50% or more.
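
The energy figure in the Results follows from the reported time and power improvements if energy is modeled as average power multiplied by execution time. The short sketch below checks that arithmetic. The function name and the simple power-times-time model are assumptions made for illustration; they are not taken from the paper's evaluation flow.

```python
def energy_reduction(time_reduction: float, power_reduction: float) -> float:
    """Fractional energy reduction, assuming energy = average power x execution time.

    Both arguments are fractional reductions relative to the baseline
    (for example, 0.93 means execution time drops by 93%).
    """
    remaining_time = 1.0 - time_reduction    # fraction of baseline execution time left
    remaining_power = 1.0 - power_reduction  # fraction of baseline power left
    return 1.0 - remaining_time * remaining_power


# Values reported in the abstract: 93% less execution time, 27% less power.
reduction = energy_reduction(time_reduction=0.93, power_reduction=0.27)
print(f"Energy reduction: {reduction:.1%}")
```

Under this model the remaining energy is 0.07 x 0.73, or about 5.1% of the baseline, which is consistent with the reported reduction of about 95%.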
