Abstract
An accelerator is a specialized hardware unit that performs defined tasks with higher performance or greater energy efficiency than a general-purpose CPU. Recently, advances in the Internet of Things (IoT), video imaging analytics, autonomous vehicles, and Artificial Intelligence (AI) robotics have created significant demand for developing and improving hardware accelerators. Video Processing Units (VPUs) and GPUs held the largest market share, more than 60%, in 2020. VPUs and GPUs are instruction-level accelerators designed for single primitives, such as arithmetic operations. As a result, both academia and industry have proposed numerous methods to optimize arithmetic operations and improve accelerators’ efficiency. This thesis proposes an optimized Posit Arithmetic Unit (PAU) to be placed in the core of hardware accelerators, especially those for image classification and object detection neural networks (NNs). Our main contributions can be summarized as follows. First, we model a Register Transfer Level (RTL) implementation of the PAU using the two’s complement algorithm. Second, we model an RTL implementation of a compact PAU using the two’s complement algorithm. Finally, we model the NNs used in different image classification and object detection tasks using the Qtorch+ framework with different data types, to show the superior results of low-bit posits compared with 16-bit or 32-bit floating/fixed-point representations for both activation and weight data.
The thesis is divided into seven chapters, as listed below:
Chapter 1: presents the history of hardware accelerators and their performance compared with general-purpose processors. It also covers their applications, research topics, and types, classified by location inside the system and level of abstraction.
Chapter 2: introduces NNs, their definition, and the different types of NNs used at the core of various deep learning tasks.
Chapter 3: explains the motivation behind our optimization methodology and the different image classification and object detection models used in our modeling. It discusses the differences between the models and the specifications of each one. It also introduces the posit numbering system used to build the arithmetic units at the core of the hardware accelerators, along with the posit decoding algorithms.
Chapter 4: summarizes related work on using posits in the arithmetic units at the core of hardware accelerators and evaluates their performance and limitations.
Chapter 5: highlights the proposed arithmetic unit used at the heart of hardware accelerators for various NNs, together with its results. The compact PAU is also presented.
Chapter 6: states our contribution in modeling the different image classification and object detection models using the Qtorch+ framework, together with their accuracy results.
Chapter 7: summarizes the results and proposes future work.