Design of Energy-efficient Processing Elements for Near-threshold Parallel Computing

Author: Michael Andreas Gautschi
Publisher:
ISBN: 9783866285958
Category :
Languages : en
Pages : 177

Book Description

Near Threshold Computing

Author: Michael Hübner
Publisher: Springer
ISBN: 3319233890
Category : Technology & Engineering
Languages : en
Pages : 105

Book Description
This book explores near-threshold computing (NTC), a design space that uses techniques to run digital chips (processors) near the lowest possible voltage. Readers will learn specific techniques for designing chips that are extremely robust, tolerating variability and remaining resilient against errors. The book presents variability-aware voltage and frequency allocation schemes that provide performance guarantees when moving toward near-threshold manycore chips.
· Provides an introduction to near-threshold computing, equipping the reader with a variety of tools to face the challenges of the power/utilization wall;
· Demonstrates how to design efficient voltage regulation, so that each region of the chip can operate at the most efficient voltage and frequency point;
· Investigates how performance guarantees can be ensured when moving towards NTC manycores through variability-aware voltage and frequency allocation schemes.

An Event-Driven Parallel-Processing Subsystem for Energy-Efficient Mobile Medical Instrumentation

Author: Florian Stefan Glaser
Publisher: BoD – Books on Demand
ISBN: 3866287771
Category : Technology & Engineering
Languages : en
Pages : 216

Book Description
An aging population and the resulting ever-rising cost of health services call for novel and innovative solutions for providing medical care and services. So far, medical care is primarily provided in the form of time-consuming in-person appointments with trained personnel and expensive, stationary instrumentation equipment. As for many current and past challenges, the advances in microelectronics are a crucial enabler and offer a plethora of opportunities. With key building blocks such as sensing, processing, and communication systems and circuits getting smaller, cheaper, and more energy-efficient, personal and wearable or even implantable point-of-care devices with medical-grade instrumentation capabilities become feasible. Device size and battery lifetime are paramount for the realization of such devices. Besides integrating the required functionality into as few individual microelectronic components as possible, the energy efficiency of these components is crucial to reduce battery size, which is usually the dominant contributor to overall device size. In this thesis, we present two major contributions to achieve the discussed goals in the context of miniaturized medical instrumentation: First, we present a synchronization solution for embedded, parallel near-threshold computing (NTC), a promising concept for enabling the required processing capabilities with an energy efficiency that is suitable for highly mobile devices with very limited battery capacity. Our proposed solution aims at increasing energy efficiency and performance for parallel NTC clusters by maximizing the effective utilization of the available cores under parallel workloads. We describe a hardware unit that enables fine-grain parallelization by greatly optimizing and accelerating core-to-core synchronization and communication and analyze the impact of those mechanisms on the overall performance and energy efficiency of an eight-core cluster.
With a range of digital signal processing (DSP) applications typical for the targeted systems, the proposed hardware unit improves performance by up to 92% and by 23% on average, and energy efficiency by up to 98% and by 39% on average. In the second part, we present an MCU processing and control subsystem (MPCS) for integration into VivoSoC, a highly versatile single-chip solution for mobile medical instrumentation. In addition to the MPCS, VivoSoC includes a multitude of analog front-ends (AFEs) and a multi-channel power management IC (PMIC) for voltage conversion. ...

Energy-Efficient VLSI Architectures for Real-Time and 3D Video Processing

Author: Michael Stefano Fritz Schaffner
Publisher: BoD – Books on Demand
ISBN: 3866286244
Category : Science
Languages : en
Pages : 294

Book Description
Multiview autostereoscopic displays (MADs) make it possible to view video content in 3D without wearing special glasses, and such displays have recently become available. The main problem of MADs is that they require several (typically 8 or 9) views, while most of the 3D video content is in stereoscopic 3D today. To bridge this content-display gap, the research community started to devise automatic multiview synthesis (MVS) methods. Common MVS methods are based on depth-image-based rendering, where a dense depth map of the scene is used to reproject the image to new viewpoints. Although physically correct, this approach requires accurate depth maps and additional inpainting steps. Our work uses an alternative conversion concept based on image domain warping (IDW) which has been successfully applied to related problems such as aspect ratio retargeting for streaming video, and disparity remapping for depth adjustments in stereoscopic 3D content. IDW shows promising performance in this context as it only requires robust, sparse point correspondences and no inpainting steps. However, MVS, using IDW as well as alternative approaches, is computationally demanding and requires real-time processing, yet such methods should be portable to end-user and even mobile devices to develop their full potential. To this end, this thesis investigates efficient algorithms and hardware architectures for a variety of subproblems arising in the MVS pipeline.

Fighting Back the Von Neumann Bottleneck with Small- and Large-Scale Vector Microprocessors

Author: Matheus Cavalcante
Publisher: BoD – Books on Demand
ISBN: 3866288018
Category :
Languages : en
Pages : 224

Book Description
In his seminal Turing Award Lecture, Backus discussed the issues stemming from the word-at-a-time style of programming inherited from the von Neumann computer. More than forty years later, computer architects must be creative to amortize the von Neumann Bottleneck (VNB) associated with fetching and decoding instructions which only keep the datapath busy for a very short period of time. In particular, vector processors promise to be one of the most efficient architectures to tackle the VNB, by amortizing the energy overhead of instruction fetching and decoding over several chunks of data. This work explores vector processing as an option to build small and efficient processing elements for large-scale clusters of cores sharing access to tightly-coupled L1 memory.

An Open-Source Research Platform for Heterogeneous Systems on Chip

Author: Andreas Dominic Kurth
Publisher: BoD – Books on Demand
ISBN: 3866287747
Category : Science
Languages : en
Pages : 282

Book Description
Heterogeneous systems on chip (HeSoCs) combine general-purpose, feature-rich multi-core host processors with domain-specific programmable many-core accelerators (PMCAs) to unite versatility with energy efficiency and peak performance. By virtue of their heterogeneity, HeSoCs hold the promise of increasing performance and energy efficiency compared to homogeneous multiprocessors, because applications can be executed on hardware that is designed for them. However, this heterogeneity also increases system complexity substantially. This thesis presents the first research platform for HeSoCs where all components, from accelerator cores to application programming interface, are available under permissive open-source licenses. We begin by identifying the hardware and software components that are required in HeSoCs and by designing a representative hardware and software architecture. We then design, implement, and evaluate four critical HeSoC components that have not been discussed in research at the level required for an open-source implementation: First, we present a modular, topology-agnostic, high-performance on-chip communication platform, which adheres to a state-of-the-art industry-standard protocol. We show that the platform can be used to build high-bandwidth (e.g., 2.5 GHz and 1024 bit data width) end-to-end communication fabrics with high degrees of concurrency (e.g., up to 256 independent concurrent transactions). Second, we present a modular and efficient solution for implementing atomic memory operations in highly-scalable many-core processors, which demonstrates near-optimal linear throughput scaling for various synthetic and real-world workloads and requires only 0.5 kGE per core. 
Third, we present a hardware-software solution for shared virtual memory that avoids the majority of translation lookaside buffer misses with prefetching, supports parallel burst transfers without additional buffers, and can be scaled with the workload and number of parallel processors. Our work improves accelerator performance for memory-intensive kernels by up to 4×. Fourth, we present a software toolchain for mixed-data-model heterogeneous compilation and OpenMP offloading. Our work enables transparent memory sharing between a 64-bit host processor and a 32-bit accelerator at overheads below 0.7 % compared to 32-bit-only execution. Finally, we combine our contributions into a research platform for state-of-the-art HeSoCs and demonstrate its performance and flexibility.

Energy Efficient High Performance Processors

Author: Jawad Haj-Yahya
Publisher: Springer
ISBN: 9811085544
Category : Technology & Engineering
Languages : en
Pages : 176

Book Description
This book explores energy efficiency techniques for high-performance computing (HPC) systems using power-management methods. Adopting a step-by-step approach, it describes power-management flows, algorithms and mechanisms that are employed in modern processors such as Intel Sandy Bridge, Haswell, Skylake and other architectures (e.g. ARM). Further, it includes practical examples and recent studies demonstrating how modern processors dynamically manage wide power ranges, from a few milliwatts in the lowest idle power state, to tens of watts in turbo state. Moreover, the book explains how thermal and power deliveries are managed in the context of this huge power range. The book also discusses the different metrics for energy efficiency, presents several methods and applications of power and energy estimation, and shows how, by using innovative power estimation methods and new algorithms, modern processors are able to optimize metrics such as power, energy, and performance. Different power estimation tools are presented, including tools that break down the power consumption of modern processors at sub-processor core/thread granularity. The book also investigates software, firmware and hardware coordination methods of reducing power consumption, for example a compiler-assisted power management method to overcome power excursions. Lastly, it examines firmware algorithms for dynamic cache resizing and dynamic voltage and frequency scaling (DVFS) for memory sub-systems.

Circuits and Systems Advances in Near Threshold Computing

Author: Sanghamitra Roy
Publisher: MDPI
ISBN: 3036507205
Category : Technology & Engineering
Languages : en
Pages : 120

Book Description
Modern society is witnessing a sea change in ubiquitous computing, in which people have embraced computing systems as an indispensable part of day-to-day existence. Computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. However, global emphasis on creating and sustaining green environments is leading to a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shift to the edge of the network, near-threshold computing (NTC) is emerging as one of the promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing the energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC is yet to see widescale commercial adoption. This is because circuits and systems operating with NTC suffer from several problems, including increased sensitivity to process variation, reliability problems, performance degradation, and security vulnerabilities, to name a few. To realize its potential, we need designs, techniques, and solutions to overcome these challenges associated with NTC circuits and systems. The readers of this book will be able to familiarize themselves with recent advances in electronics systems, focusing on near-threshold computing.

Energy Efficient Microarchitectures for On-chip Voltage Regulation and Low Noise Computing

Author: Yuxin Bai
Publisher:
ISBN:
Category :
Languages : en
Pages : 128

Book Description
"Power- and energy-efficiency are significant requirements in virtually all computer systems, from mobile devices to large-scale data centers. Power delivery is a process that distributes stable supply voltages to gates within an integrated circuit (IC). The design of such a delivery network is a critical task to guarantee functionality, timing, and operation reliability, and significantly affects the power- and energy-efficiency of a high performance IC. Therefore, microarchitectural solutions that are aware of the power delivery system, should be capable of exploring a larger optimization space for energy efficient computer systems. This thesis proposes two microarchitectural techniques that leverage the design tradeoffs of the underlying power delivery networks to achieve energy-efficient computing. First, the use of MOS current-mode logic (MCML) is explored as a fast and low-noise alternative to static CMOS logic in microprocessors, thereby improving the performance, energy-efficiency, and signal integrity of future computer systems. The power and ground noise generated by an MCML circuit is typically 10 × -100× smaller than the noise generated by a static CMOS circuit, and therefore can significantly relax the typical design constraints imposed on the power delivery network. Unlike a static CMOS circuit, in which dynamic power is proportional to the clock frequency, an MCML circuit dissipates a constant power independent of the clock frequency. Although these traits make MCML highly energy-efficient when operating at high speeds, the constant static power of MCML poses a challenge for a microarchitecture that operates at a modest clock rate and with a low activity factor. To address this challenge, this thesis explores a single-core microarchitecture for MCML that takes advantage of the C-slow retiming technique, and runs at a high frequency with low complexity to save energy. 
This design principle differs fundamentally from the contemporary multicore design paradigm for static CMOS, which relies on a large number of gates running in parallel at modest speeds. The proposed architecture generates 10-40× lower power and ground noise, and operates at a level of performance within 13% of a conventional, eight-core static CMOS system, while exhibiting 1.6× lower energy and 9% less area. Moreover, the operation of the MCML processor is robust under both systematic and random variations in transistor threshold voltage and effective channel length. Dynamic voltage and frequency scaling (DVFS) is an effective technique used in power management. Voltage regulators are key components for power generation during the power delivery process. Emerging on-chip voltage regulators have the potential to increase the energy efficiency of computer systems by enabling the control of DVFS at a fine granularity in both space and time. A low dropout voltage regulator (LDO) is suitable for on-chip integration due to its speed, regulation quality, and area advantages. The energy conversion efficiency of an LDO, however, is dependent on the ratio of the input and output voltages, which results in energy waste when DVFS is applied over a wide voltage range. A DVFS framework that relies on a hierarchy of off-chip switching regulators and per-core on-chip LDOs is proposed. It ensures fast DVFS in nanoseconds and a regulator efficiency of more than 90% over a wide voltage range. A control policy using a reinforcement learning (RL) approach is proposed to exploit the fine-granularity control of power and the high regulator efficiency enabled by the framework. Per-core RL agents learn and improve their DVFS policies independently, while retaining the ability to coordinate their actions to accomplish system level power management objectives.
The proposed framework achieves 18% greater energy efficiency than a typical per-core DVFS framework using on-chip switching regulators when evaluated on a mix of 14 parallel and 13 multiprogrammed workloads. Moreover, the proposed RL policy is 21% more energy efficient as compared to an oracle policy with coarse-grained DVFS"--Pages vi-vii.

Parallel Program Design (并行程序设计)

Author: Foster
Publisher:
ISBN: 9787115103475
Category : Computer programming
Languages : zh-CN
Pages : 381

Book Description
Outstanding textbooks in information science and technology from renowned foreign universities