GPU Energy Modeling and Analysis

Author: Zain Asgar
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Over the past couple of decades, GPUs have enjoyed tremendous scaling in both functionality and performance by focusing on area-efficient processing. However, the slowdown in supply voltage scaling has created a new hurdle to continued scaling of GPU performance: power consumption now limits the achievable GPU performance. Since GPUs already use many of the well-known hardware techniques for reducing power consumption, GPU designers need to start looking at architectural techniques to improve energy efficiency. To enable this exploration, we create an accurate power model of GPU architectures and apply this model to explore a couple of methods to save power. As part of these studies we look at overdraw (which occurs when a given pixel's value is computed more than once) and thread-level redundancy in the shader processor of the GPU. Through the use of our model and GPU performance data, we show that significant opportunities exist for improving energy efficiency. These studies demonstrate both the utility of our power model and the potential of architectural changes to make GPUs more energy efficient.
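
To make the overdraw definition above concrete, here is a minimal sketch that counts how often each pixel is shaded and reports the average number of shading operations per covered pixel. The function name `overdraw_factor` and the toy fragment list are hypothetical illustrations, not taken from the book; a factor of 1.0 would mean every covered pixel was computed exactly once.

```python
from collections import Counter
from typing import Iterable, Tuple

Pixel = Tuple[int, int]  # (x, y) screen coordinate

def overdraw_factor(fragments: Iterable[Pixel]) -> float:
    """Average number of times each covered pixel was shaded.

    `fragments` lists one (x, y) entry per shaded fragment, so a pixel
    that appears more than once was computed more than once (overdraw).
    """
    counts = Counter(fragments)
    if not counts:
        return 0.0  # nothing was drawn
    return sum(counts.values()) / len(counts)

# Hypothetical frame: pixel (0, 0) is shaded twice, the other two once.
fragments = [(0, 0), (0, 0), (1, 0), (1, 1)]
factor = overdraw_factor(fragments)
print(f"overdraw factor: {factor:.2f}")
```

Any value above 1.0 represents shading work, and therefore energy, spent on results that were later overwritten.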

Performance and Power Modeling of GPU Systems with Dynamic Voltage and Frequency Scaling

Author: Qiang Wang
Publisher:
ISBN:
Category : Computer systems
Languages : en
Pages : 141

Book Description
To address the ever-increasing demand for computing capacity, more and more heterogeneous systems have been designed to use both general-purpose and special-purpose processors. Their huge energy consumption raises new environmental concerns and challenges. Besides performance, energy efficiency is another key factor to be considered by system designers and consumers. In particular, contemporary graphics processing units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, accurate and straightforward performance and power estimation for a given GPU kernel under different frequency settings is still lacking for real hardware, although it is essential for determining the best frequency configuration for energy saving. In this thesis, we investigate how to improve the energy efficiency of GPU systems by accurately modeling the effects of GPU DVFS on the target GPU kernel. We also propose efficient algorithms to solve the communication contention problem in scheduling multiple distributed deep learning (DDL) jobs on GPU clusters. We introduce our studies as follows. First, we present a benchmark suite, EPPMiner, for evaluating the performance, power, and energy of different heterogeneous systems. EPPMiner consists of 16 benchmark programs that cover a broad range of application domains, and it shows a great variety in the intensity of utilizing the processors. We have implemented a prototype of EPPMiner that supports OpenMP, CUDA, and OpenCL, and demonstrated its usage with three showcases. The showcases show that GPUs provide much better energy efficiency than other types of computing systems, and especially illustrate the effectiveness of GPU DVFS on the energy efficiency of GPU applications. Second, we present a fine-grained analytical model to estimate the execution time of GPU kernels with both core and memory frequency scaling.
Compared to the cycle-level simulators, which are too slow to apply on real hardware, our model only needs one-off micro-benchmarks to extract a set of hardware parameters and kernel performance counters without any source code analysis. Our experimental results show that the proposed performance model can capture the kernel performance scaling behaviors under different frequency settings and achieve decent accuracy. Third, we design a cross-benchmarking suite, which simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. We then build machine learning models to predict the execution time and runtime power of a GPU kernel under different voltage and frequency settings. Validated on three modern GPUs with a wide frequency scaling range, by using a collection of 24 real application kernels, the model trained only with our cross-benchmarking suite is able to achieve considerably accurate results. Finally, we establish a new DDL job scheduling framework which organizes DDL jobs as Directed Acyclic Graphs (DAGs) and considers communication contention between nodes. We then propose an efficient job placement algorithm, Least-Workload-First- (LWF-), to balance the GPU utilization and consolidate the allocated GPUs for each job. When scheduling the communication tasks, we propose Ada-SRSF for the DDL job scheduling problem to address the communication contention issue. Our simulation results show that LWF- achieves up to 1.59x improvement over the classical first-fit algorithms. More importantly, Ada-SRSF reduces the average job completion time by up to 36.7%, as compared to the solutions of either avoiding all the communication contention or accepting all of it.
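
As a minimal sketch of why per-frequency time and power estimates matter for DVFS: energy is power multiplied by execution time, so once both are predicted for each candidate core/memory frequency pair, the most energy-efficient setting is the one with the smallest product. The function names and all numbers below are hypothetical placeholders, not results or code from the thesis.

```python
from typing import Dict, Tuple

FreqSetting = Tuple[int, int]       # (core MHz, memory MHz), hypothetical values
Prediction = Tuple[float, float]    # (predicted time in s, predicted power in W)

def energy_joules(time_s: float, power_w: float) -> float:
    """Energy consumed by one kernel run: E = P * t."""
    return time_s * power_w

def best_setting(predictions: Dict[FreqSetting, Prediction]) -> FreqSetting:
    """Return the frequency setting with the lowest predicted energy."""
    return min(predictions, key=lambda s: energy_joules(*predictions[s]))

# Made-up predictions for a single kernel at three core frequencies.
predictions = {
    (1000, 3500): (2.0, 150.0),   # fastest, but 300 J
    (800, 3500): (2.25, 120.0),   # slightly slower, 270 J
    (600, 3500): (3.0, 100.0),    # lowest power, but 300 J again
}

choice = best_setting(predictions)
print(choice, energy_joules(*predictions[choice]))
```

In this made-up case neither the highest nor the lowest frequency wins, which is why both quantities need to be estimated for each setting rather than assuming that lower frequency always saves energy.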

Modeling Performance and Power for Energy-efficient GPGPU Computing

Author: Sunpyo Hong
Publisher:
ISBN:
Category : Computer architecture
Languages : en
Pages :

Book Description
The objective of the proposed research is to develop an analytical model that predicts performance and power for many-core architectures and to propose a mechanism, which leverages the analytical model, to enable energy-efficient execution of an application. The key insight of the model is to investigate and quantify the complex relationship that exists between thread-level parallelism and memory-level parallelism for an application on a given many-core architecture. Two metrics are proposed: memory warp parallelism (MWP), which refers to the number of overlapping memory accesses per core, and computation warp parallelism (CWP), which characterizes an application type. By using these metrics in addition to the architectural and application parameters, the overall application performance is predicted. The model uses statically available parameters such as instruction-mixture information and input-data size, and the prediction error is 13.3% for the GPU-computing benchmarks. Another important aspect of using many-core architectures is reducing peak power and achieving energy savings. By using the proposed integrated power and performance (IPP) framework, the results show that different optimization points exist for GPU architectures depending on the application type. The work shows that by activating fewer cores, 10.99% of run-time energy consumption can be saved for the bandwidth-limited benchmarks, and a projection of 25.8% energy savings is predicted when power-gating at core level is employed. Finally, the model is shifted to throughput using OpenCL to target a wider variety of processors. First, multiple outputs relating to performance are predicted, including upper-bound and lower-bound values. Second, by using the model parameters, an application can be assigned to a category, each with its own suggestions for improving performance and energy efficiency.
Third, the bandwidth saturation point accuracy is significantly improved by considering independent memory accesses and updating the performance model. Furthermore, a trade-off analysis using architectural and application parameters is straightforward, which provides more insights to improve energy efficiency. In the future, a computer system will contain hundreds of heterogeneous cores. Hence, it is mandatory that a workload gets scheduled to an efficient core or distributed on both types of cores. Preliminary work that uses the analytical model to schedule between CPU and GPU is demonstrated in the appendix. Since a profiling phase is not required, the kernel code can be transformed to run more efficiently on the specific architecture. As another extension of the work, the relationship between speed-up and energy efficiency is derived mathematically. Finally, future research ideas are presented regarding the usage of the model by programmers, compilers, and runtimes for future heterogeneous systems.

Architecture of Computing Systems – ARCS 2019

Author: Martin Schoeberl
Publisher: Springer
ISBN: 3030186563
Category : Computers
Languages : en
Pages : 335

Book Description
This book constitutes the proceedings of the 32nd International Conference on Architecture of Computing Systems, ARCS 2019, held in Copenhagen, Denmark, in May 2019. The 24 full papers presented in this volume were carefully reviewed and selected from 40 submissions. ARCS has always been a conference attracting leading-edge research outcomes in Computer Architecture and Operating Systems, including a wide spectrum of topics ranging from embedded and real-time systems all the way to large-scale and parallel systems. The selected papers are organized in the following topical sections: Dependable systems; real-time systems; special applications; architecture; memory hierarchy; FPGA; energy awareness; NoC/SoC. The chapter 'MEMPower: Data-Aware GPU Memory Power Model' is open access under a CC BY 4.0 license at link.springer.com.

GPU Power Modeling and Architectural Enhancements for GPU Energy Efficiency

Author: Jan Lucas
Publisher:
ISBN:
Category : Energy consumption
Languages : en
Pages :

Book Description


Algorithms and Architectures for Parallel Processing

Author: Joanna Kolodziej
Publisher: Springer
ISBN: 3319038591
Category : Computers
Languages : en
Pages : 502

Book Description
This two-volume set, LNCS 8285 and 8286, constitutes the proceedings of the 13th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2013, held in Vietri sul Mare, Italy, in December 2013. The first volume contains 10 distinguished and 31 regular papers selected from 90 submissions and covering topics such as big data, multi-core programming and software tools, distributed scheduling and load balancing, high-performance scientific computing, parallel algorithms, parallel architectures, scalable and distributed databases, dependability in distributed and parallel systems, and wireless and mobile computing. The second volume consists of four sections including 35 papers from one symposium and three workshops held in conjunction with the ICA3PP 2013 main conference. These are 13 papers from the 2013 International Symposium on Advances of Distributed and Parallel Computing (ADPC 2013), 5 papers from the International Workshop on Big Data Computing (BDC 2013), 10 papers from the International Workshop on Trusted Information in Big Data (TIBiDa 2013), and 7 papers from the Workshop on Cloud-assisted Smart Cyber-Physical Systems (C-Smart CPS 2013).

Energy Abstracts for Policy Analysis

Author:
Publisher:
ISBN:
Category : Power resources
Languages : en
Pages : 1438

Book Description


Green and Sustainable Computing: Part II

Author:
Publisher: Academic Press
ISBN: 012407829X
Category : Computers
Languages : en
Pages : 283

Book Description
Since its first volume in 1960, Advances in Computers has presented detailed coverage of innovations in computer hardware, software, theory, design, and applications. It has also provided contributors with a medium in which they can explore their subjects in greater depth and breadth than journal articles usually allow. As a result, many articles have become standard references that continue to be of significant, lasting value in this rapidly expanding field.
In-depth surveys and tutorials on new computer technology
Well-known authors and researchers in the field
Extensive bibliographies with most chapters
Many of the volumes are devoted to single themes or subfields of computer science

Modeling, Analysis, Design, and Tests for Electronics Packaging beyond Moore

Author: Hengyun Zhang
Publisher: Woodhead Publishing
ISBN: 0081025335
Category : Technology & Engineering
Languages : en
Pages : 436

Book Description
Modeling, Analysis, Design and Testing for Electronics Packaging Beyond Moore provides an overview of electrical, thermal and thermomechanical modeling, analysis, design and testing for 2.5D/3D. The book addresses important topics, including electrically and thermally induced issues, such as EMI and thermal issues, which are crucial to package signal and thermal integrity. It also covers modeling methods to address thermomechanical stress related to the package structural integrity. In addition, practical design and test techniques for packages and systems are included.
Includes advanced modeling and analysis methods and techniques for state-of-the-art electronics packaging
Features experimental characterization and qualifications for the analysis and verification of electronic packaging design
Provides multiphysics modeling and analysis techniques of electronic packaging

Power System Simulation, Control and Optimization

Author: José Antonio Domínguez-Navarro
Publisher: MDPI
ISBN: 3036507485
Category : Technology & Engineering
Languages : en
Pages : 242

Book Description
This Special Issue “Power System Simulation, Control and Optimization” offers valuable insights into the most recent research developments in these topics. The analysis, operation, and control of power systems are increasingly complex tasks that require advanced simulation models to analyze and control the effects of the transformations affecting electricity grids today: massive integration of renewable energies, progressive implementation of electric vehicles, development of intelligent networks, and progressive evolution of the applications of artificial intelligence.