GPU Power Modeling and Architectural Enhancements for GPU Energy Efficiency

GPU Power Modeling and Architectural Enhancements for GPU Energy Efficiency PDF Author: Jan Lucas
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description

Power Modeling and Architectural Techniques for Energy-efficient GPUs

Power Modeling and Architectural Techniques for Energy-efficient GPUs PDF Author: Sohan Lal
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description


GPU Energy Modeling and Analysis

GPU Energy Modeling and Analysis PDF Author: Zain Asgar
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Over the past couple of decades GPUs have enjoyed tremendous scaling in both functionality and performance by focusing on area efficient processing. However, the slowdown in supply voltage scaling has created a new hurdle to continued scaling of GPU performance. This slowdown in voltage scaling has caused power consumption to limit the achievable GPU performance. Since GPUs currently use many of the well-known hardware techniques for reduced power consumption, GPU designers need to start looking at architectural techniques to improve energy efficiency. To enable this exploration, we create an accurate power model of GPU architectures and apply this model to explore a couple of methods to save power. As part of these studies we will look at overdraw (which occurs when a given pixel's value is computed more than once) and thread level redundancy in the shader processor of the GPU. Through the use of our model and GPU performance data, we will show that significant opportunities exist for improving energy efficiency. These studies demonstrate both the utility of our power model, and the potential of architectural changes to make GPUs more energy efficient.
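
As a rough illustration of the overdraw definition above (a pixel whose value is computed more than once), the following sketch counts shaded fragments per covered pixel. The function name `overdraw_ratio` and the fragment stream are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def overdraw_ratio(fragments):
    """Return shaded fragments per covered pixel (1.0 means no overdraw)."""
    # Count how many times each (x, y) pixel position was shaded.
    counts = Counter(fragments)
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)


# Hypothetical fragment stream: pixel (0, 0) is shaded three times, (1, 0) once.
frags = [(0, 0), (1, 0), (0, 0), (0, 0)]
print(overdraw_ratio(frags))  # 4 shaded fragments spread over 2 pixels
```

A ratio above 1.0 means some shading work was later overwritten, which is the kind of redundant work the description identifies as an energy-saving opportunity.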

Modeling Performance and Power for Energy-efficient GPGPU Computing

Modeling Performance and Power for Energy-efficient GPGPU Computing PDF Author: Sunpyo Hong
Publisher:
ISBN:
Category : Computer architecture
Languages : en
Pages :

Book Description
The objective of the proposed research is to develop an analytical model that predicts performance and power for many-core architectures and to propose a mechanism that leverages the analytical model to enable energy-efficient execution of an application. The key insight of the model is to investigate and quantify the complex relationship that exists between thread-level parallelism and memory-level parallelism for an application on a given many-core architecture. Two metrics are proposed: memory warp parallelism (MWP), which refers to the number of overlapping memory accesses per core, and computation warp parallelism (CWP), which characterizes an application type. By using these metrics in addition to the architectural and application parameters, the overall application performance is predicted. The model uses statically available parameters such as instruction-mixture information and input-data size, and the prediction error is 13.3% for the GPU-computing benchmarks. Another important aspect of using many-core architectures is reducing peak power and achieving energy savings. By using the proposed integrated power and performance (IPP) framework, the results showed that different optimization points exist for GPU architectures depending on the application type. The work shows that by activating fewer cores, 10.99% of run-time energy consumption can be saved for the bandwidth-limited benchmarks, and a projection of 25.8% energy savings is predicted when power-gating at core level is employed. Finally, the model is shifted to throughput using OpenCL to target a wider variety of processors. First, multiple outputs relating to performance are predicted, including upper-bound and lower-bound values. Second, by using the model parameters, an application can be assigned to a category, each with its own suggestions for improving performance and energy efficiency. Third, the bandwidth saturation point accuracy is significantly improved by considering independent memory accesses and updating the performance model. Furthermore, a trade-off analysis using architectural and application parameters is straightforward, which provides more insight into improving energy efficiency. In the future, a computer system will contain hundreds of heterogeneous cores. Hence, a workload must be scheduled to an efficient core or distributed across both types of cores. Preliminary work that uses the analytical model to schedule between CPU and GPU is demonstrated in the appendix. Since a profiling phase is not required, the kernel code can be transformed to run more efficiently on the specific architecture. As another extension of the work, the relationship between speed-up and energy efficiency is mathematically derived. Finally, future research ideas are presented regarding the use of the model by the programmer, compiler, and runtime in future heterogeneous systems.
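
The claim that activating fewer cores saves energy on bandwidth-limited benchmarks can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the IPP framework itself: run time is assumed to be the larger of a compute term that shrinks with core count and a fixed memory-bandwidth floor, and power is assumed to grow linearly with active cores. All function names and numbers are hypothetical.

```python
def runtime_s(active_cores, compute_core_seconds, memory_bound_seconds):
    # Assumption: compute time divides evenly across cores, while the
    # memory-bandwidth floor does not shrink when more cores are added.
    return max(compute_core_seconds / active_cores, memory_bound_seconds)


def energy_j(active_cores, compute_core_seconds, memory_bound_seconds,
             static_w, per_core_w):
    # Assumption: power is a static part plus a fixed cost per active core.
    power_w = static_w + per_core_w * active_cores
    return power_w * runtime_s(active_cores, compute_core_seconds,
                               memory_bound_seconds)


def best_core_count(max_cores, **params):
    # Pick the core count with the lowest modeled energy.
    return min(range(1, max_cores + 1), key=lambda n: energy_j(n, **params))


# Hypothetical bandwidth-limited kernel on a 16-core device.
params = dict(compute_core_seconds=40.0, memory_bound_seconds=4.0,
              static_w=50.0, per_core_w=5.0)
best = best_core_count(16, **params)
print(best, energy_j(best, **params), energy_j(16, **params))
```

In this toy setting, cores beyond the bandwidth saturation point add power without reducing run time, so the energy minimum sits at the saturation point rather than at the full core count. The specific savings here come from the made-up parameters and are unrelated to the percentages reported in the description.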

General-Purpose Graphics Processor Architectures

General-Purpose Graphics Processor Architectures PDF Author: Tor M. Aamodt
Publisher: Morgan & Claypool Publishers
ISBN: 1627056181
Category : Computers
Languages : en
Pages : 142

Book Description
Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction to those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently only found among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures. The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system. This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs.

High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation

High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation PDF Author: Stephen A. Jarvis
Publisher: Springer
ISBN: 3319102141
Category : Computers
Languages : en
Pages : 303

Book Description
This book constitutes the refereed proceedings of the 4th International Workshop, PMBS 2013 in Denver, CO, USA in November 2013. The 14 papers presented in this volume were carefully reviewed and selected from 37 submissions. The selected articles broadly cover topics on massively parallel and high-performance simulations, modeling and simulation, model development and analysis, performance optimization, power estimation and optimization, high performance computing, reliability, performance analysis, and network simulations.

Performance and Power Modeling of GPU Systems with Dynamic Voltage and Frequency Scaling

Performance and Power Modeling of GPU Systems with Dynamic Voltage and Frequency Scaling PDF Author: Qiang Wang
Publisher:
ISBN:
Category : Computer systems
Languages : en
Pages : 141

Book Description
To address the ever-increasing demand for computing capacity, more and more heterogeneous systems have been designed to use both general-purpose and special-purpose processors. Their huge energy consumption raises new environmental concerns and challenges. Besides performance, energy efficiency is another key factor to be considered by system designers and consumers. In particular, contemporary graphics processing units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, accurate and straightforward performance and power estimation for a given GPU kernel under different frequency settings is still lacking for real hardware, even though it is essential for determining the best frequency configuration for energy saving. In this thesis, we investigate how to improve the energy efficiency of GPU systems by accurately modeling the effects of GPU DVFS on the target GPU kernel. We also propose efficient algorithms to solve the communication contention problem in scheduling multiple distributed deep learning (DDL) jobs on GPU clusters. We introduce our studies as follows. First, we present a benchmark suite, EPPMiner, for evaluating the performance, power, and energy of different heterogeneous systems. EPPMiner consists of 16 benchmark programs that cover a broad range of application domains, and it shows great variety in the intensity of processor utilization. We have implemented a prototype of EPPMiner that supports OpenMP, CUDA, and OpenCL, and demonstrated its usage with three showcases. The showcases confirm that GPUs provide much better energy efficiency than other types of computing systems, and in particular illustrate the effectiveness of GPU DVFS in improving the energy efficiency of GPU applications. Second, we present a fine-grained analytical model to estimate the execution time of GPU kernels with both core and memory frequency scaling. Compared to cycle-level simulators, which are too slow to apply on real hardware, our model only needs one-off micro-benchmarks to extract a set of hardware parameters and kernel performance counters without any source code analysis. Our experimental results show that the proposed performance model can capture the kernel performance scaling behaviors under different frequency settings and achieve decent accuracy. Third, we design a cross-benchmarking suite, which simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. We then build machine learning models to predict the execution time and runtime power of a GPU kernel under different voltage and frequency settings. Validated on three modern GPUs with a wide frequency scaling range, by using a collection of 24 real application kernels, the model trained only with our cross-benchmarking suite is able to achieve considerably accurate results. Finally, we establish a new DDL job scheduling framework which organizes DDL jobs as Directed Acyclic Graphs (DAGs) and considers communication contention between nodes. We then propose an efficient job placement algorithm, Least-Workload-First- (LWF-), to balance the GPU utilization and consolidate the allocated GPUs for each job. When scheduling the communication tasks, we propose Ada-SRSF for the DDL job scheduling problem to address the communication contention issue. Our simulation results show that LWF- achieves up to 1.59x improvement over the classical first-fit algorithms. More importantly, Ada-SRSF reduces the average job completion time by up to 36.7%, as compared to the solutions of either avoiding all the communication contention or accepting all of it.
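
To show why a performance and power estimate per frequency setting helps pick an energy-saving configuration, here is a minimal sketch under simple assumptions. It is not the thesis's analytical or machine learning model: kernel time is assumed to be core cycles divided by core frequency plus a fixed memory time, and power is assumed to be a static term plus the standard dynamic-power form C·V²·f. The operating points and all parameter values are hypothetical.

```python
def kernel_time_s(core_cycles, mem_seconds, core_hz):
    # Assumption: only the compute part scales with core frequency.
    return core_cycles / core_hz + mem_seconds


def power_w(volts, core_hz, c_eff, static_w):
    # Static power plus dynamic power (effective capacitance * V^2 * f).
    return static_w + c_eff * volts ** 2 * core_hz


def best_operating_point(points, core_cycles, mem_seconds, c_eff, static_w):
    # Return the (frequency, voltage) pair with the lowest modeled energy.
    def energy_j(point):
        core_hz, volts = point
        return (power_w(volts, core_hz, c_eff, static_w)
                * kernel_time_s(core_cycles, mem_seconds, core_hz))
    return min(points, key=energy_j)


# Hypothetical (core frequency in Hz, voltage in V) operating points.
points = [(0.6e9, 0.8), (1.0e9, 0.9), (1.4e9, 1.1)]
choice = best_operating_point(points, core_cycles=1e9, mem_seconds=0.5,
                              c_eff=1e-7, static_w=100.0)
print(choice)
```

With these made-up numbers the middle setting wins: the lowest frequency runs long enough for static power to dominate, while the highest pays a steep voltage-squared penalty. Real hardware would need measured parameters in place of these placeholders.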

Power and Hotspot Modeling for Modern GPUs

Power and Hotspot Modeling for Modern GPUs PDF Author: Md Mainul- Hassan
Publisher:
ISBN:
Category :
Languages : en
Pages : 57

Book Description
As general-purpose GPUs (GPGPUs) become an increasingly prominent component of high-performance computing platforms, power and thermal dissipation are getting more attention. The trade-offs among performance, power, and heat must be well modeled and evaluated from the early stages of GPU design. This necessitates a tool that allows GPU architects to quickly and accurately evaluate their designs. A few GPU power models exist, but most of them estimate power at a level above the architecture and therefore lack hardware reconfigurability. In this thesis, we propose a framework that models power and heat dissipation at the hardware architecture level, which allows for configuring and investigating individual hardware components. Our framework is also capable of visualizing the heat map of the processor over different clock cycles. To the best of our knowledge, this is the first comprehensive framework that integrates and visualizes power consumption and heat dissipation of GPUs.

General-Purpose Graphics Processor Architecture

General-Purpose Graphics Processor Architecture PDF Author: Tor M. Aamodt
Publisher: Synthesis Lectures on Computer
ISBN: 9781627059237
Category : Computers
Languages : en
Pages : 140

Book Description
Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction to those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently only found among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures. The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system. This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs.

Computer Architecture Techniques for Power-efficiency

Computer Architecture Techniques for Power-efficiency PDF Author: Stefanos Kaxiras
Publisher: Morgan & Claypool Publishers
ISBN: 1598292080
Category : Computers
Languages : en
Pages : 220

Book Description
In the last few years, power dissipation has become an important design constraint, on par with performance, in the design of new computer systems. Whereas in the past, the primary job of the computer architect was to translate improvements in operating frequency and transistor count into performance, now power efficiency must be taken into account at every step of the design process. While for some time, architects have been successful in delivering 40% to 50% annual improvement in processor performance, costs that were previously brushed aside eventually caught up. The most critical of these costs is the inexorable increase in power dissipation and power density in processors. Power dissipation issues have catalyzed new topic areas in computer architecture, resulting in a substantial body of work on more power-efficient architectures. Power dissipation, coupled with diminishing performance gains, was also the main cause of the switch from single-core to multi-core architectures and of a slowdown in frequency increases. This book aims to document some of the most important architectural techniques that were invented, proposed, and applied to reduce both dynamic power and static power dissipation in processors and memory hierarchies. A significant number of techniques have been proposed for a wide range of situations, and this book synthesizes those techniques by focusing on their common characteristics.