Kernel-Based Energy Optimization In GPUs

Kernel-Based Energy Optimization In GPUs PDF Author: Amin Jadidi
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Emerging GPU architectures offer a cost-effective computing platform by providing thousands of energy-efficient compute cores and high bandwidth memory that facilitate the execution of highly parallel applications. In this paper, we show that different applications, and in fact different kernels from the same application, might exhibit significantly varying utilizations of compute and memory resources. However, we observe that the same kernel displays similar behavior in its different invocations; moreover, most of the kernels are invoked many times during the course of execution. By exploiting these properties of kernels, in order to improve the energy efficiency of the GPU system, we propose a dynamic resource configuration strategy that classifies kernels as compute-intensive or memory-intensive based on their resource utilizations and dynamically employs memory voltage/frequency scaling or core shut-down techniques for compute- and memory-intensive kernels, respectively. This strategy uses performance and memory bandwidth utilization information from the first few invocations of a kernel to determine the optimal hardware configuration for future invocations. Experimental evaluations show that our strategy saves about 20% of total chip energy and 70% of total memory leakage power for memory- and compute-intensive kernels, respectively, which are within 8% of the optimal savings that can be obtained from an oracle scheme.
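The strategy described above (profile the first few invocations of a kernel, classify it, then reuse the decision) could be sketched roughly as follows. This is only an illustration and not the author's implementation: the class name `KernelConfigurator`, the number of profiling runs, the bandwidth threshold, and the action labels are all hypothetical, and a real system would read utilization from hardware counters.

```python
from collections import defaultdict
from statistics import mean

PROFILE_RUNS = 3     # hypothetical: invocations observed before deciding
BW_THRESHOLD = 0.6   # hypothetical: memory-bandwidth utilization cutoff


class KernelConfigurator:
    """Classifies kernels from their first few invocations (illustrative only)."""

    def __init__(self):
        self.samples = defaultdict(list)  # kernel name -> observed utilizations
        self.decision = {}                # kernel name -> class label

    def record(self, kernel, mem_bw_util):
        """Store one observed memory-bandwidth utilization in [0, 1]."""
        if kernel in self.decision:
            return  # already classified; later invocations reuse the decision
        self.samples[kernel].append(mem_bw_util)
        if len(self.samples[kernel]) >= PROFILE_RUNS:
            avg = mean(self.samples[kernel])
            self.decision[kernel] = (
                "memory-intensive" if avg >= BW_THRESHOLD else "compute-intensive"
            )

    def action(self, kernel):
        """Return the technique to apply on the next invocation."""
        label = self.decision.get(kernel)
        if label is None:
            return "default"        # still profiling
        if label == "compute-intensive":
            return "memory_dvfs"    # memory is underused: scale its voltage/frequency
        return "core_shutdown"      # memory-bound: shut down some cores


cfg = KernelConfigurator()
for util in (0.9, 0.85, 0.95):
    cfg.record("stencil", util)
for util in (0.1, 0.2, 0.15):
    cfg.record("matmul", util)
print(cfg.action("stencil"), cfg.action("matmul"), cfg.action("unseen"))
```

The mapping of classes to techniques follows the description: memory voltage/frequency scaling for compute-intensive kernels and core shut-down for memory-intensive ones.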

System-Level Design of GPU-Based Embedded Systems

System-Level Design of GPU-Based Embedded Systems PDF Author: Arian Maghazeh
Publisher: Linköping University Electronic Press
ISBN: 9176851753
Category :
Languages : en
Pages : 62

Book Description
Modern embedded systems deploy several hardware accelerators, in a heterogeneous manner, to deliver high-performance computing. Among such devices, graphics processing units (GPUs) have earned a prominent position by virtue of their immense computing power. However, a system design that relies on sheer throughput of GPUs is often incapable of satisfying the strict power- and time-related constraints faced by the embedded systems. This thesis presents several system-level software techniques to optimize the design of GPU-based embedded systems under various graphics and non-graphics applications. As compared to the conventional application-level optimizations, the system-wide view of our proposed techniques brings about several advantages: First, it allows for fully incorporating the limitations and requirements of the various system parts in the design process. Second, it can unveil optimization opportunities through exposing the information flow between the processing components. Third, the techniques are generally applicable to a wide range of applications with similar characteristics. In addition, multiple system-level techniques can be combined together or with application-level techniques to further improve the performance. We begin by studying some of the unique attributes of GPU-based embedded systems and discussing several factors that distinguish the design of these systems from that of the conventional high-end GPU-based systems. We then proceed to develop two techniques that address an important challenge in the design of GPU-based embedded systems from different perspectives. The challenge arises from the fact that GPUs require a large amount of workload to be present at runtime in order to deliver a high throughput. However, for some embedded applications, collecting large batches of input data requires an unacceptable waiting time, prompting a trade-off between throughput and latency. 
We also develop an optimization technique for GPU-based applications to address the memory bottleneck issue by utilizing the GPU L2 cache to shorten data access time. Moreover, in the area of graphics applications, and in particular with a focus on mobile games, we propose a power management scheme to reduce the GPU power consumption by dynamically adjusting the display resolution, while considering the user's visual perception at various resolutions. We also discuss the collective impact of the proposed techniques in tackling the design challenges of emerging complex systems. The proposed techniques are assessed through real-life experiments on GPU-based hardware platforms, which demonstrate the superior performance of our approaches as compared to the state-of-the-art techniques.

A Tool for Automatic Suggestions for Irregular GPU Kernel Optimization

A Tool for Automatic Suggestions for Irregular GPU Kernel Optimization PDF Author: Saeed Taheri
Publisher:
ISBN:
Category : Computer science
Languages : en
Pages : 106

Book Description
Future computing systems, from handhelds all the way to supercomputers, will be more parallel and more heterogeneous than today's systems to provide more performance without an increase in power consumption. Therefore, GPUs are increasingly being used to accelerate general-purpose applications, including applications with data-dependent, irregular memory access patterns and control flow. The growing complexity, non-uniformity, heterogeneity, and parallelism will make these systems, i.e., GPGPU-accelerated systems, progressively more difficult to program. In the foreseeable future, the vast majority of programmers will no longer be able to extract additional performance or energy-savings from next-generation systems because their programming will be too difficult, i.e., the programmer will no longer possess the necessary expertise to understand and exploit the systems effectively. In this project, the characteristics of GPU codes will be quantified and, based on these metrics, different optimization suggestions will be made.

Reactor Physics: Methods and Applications

Reactor Physics: Methods and Applications PDF Author: Tengfei Zhang
Publisher: Frontiers Media SA
ISBN: 2889764575
Category : Technology & Engineering
Languages : en
Pages : 272

Book Description


Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science

Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science PDF Author: Ford Lumban Gaol
Publisher: Springer Science & Business Media
ISBN: 364228308X
Category : Technology & Engineering
Languages : en
Pages : 497

Book Description
The latest inventions in computer technology influence most human daily activities. In the near future, there is a tendency for all aspects of human life to become dependent on computer applications. In manufacturing, robotics and automation have become vital for high quality products. In education, the model of teaching and learning is focusing more on electronic media than on traditional ones. Issues related to energy savings and the environment are becoming critical. Computational Science should enhance the quality of human life, not only solve problems. Computational Science should help humans to make wise decisions by presenting choices and their possible consequences. Computational Science should help us make sense of observations, understand natural language, and plan and reason with extensive background knowledge. Intelligence with wisdom is perhaps an ultimate goal for human-oriented science. This book is a compilation of some recent research findings in computer application and computational science. This book provides state-of-the-art accounts in Computer Control and Robotics, Computers in Education and Learning Technologies, Computer Networks and Data Communications, Data Mining and Data Engineering, Energy and Power Systems, Intelligent Systems and Autonomous Agents, Internet and Web Systems, Scientific Computing and Modeling, Signal, Image and Multimedia Processing, and Software Engineering.

Performance and Power Modeling of GPU Systems with Dynamic Voltage and Frequency Scaling

Performance and Power Modeling of GPU Systems with Dynamic Voltage and Frequency Scaling PDF Author: Qiang Wang
Publisher:
ISBN:
Category : Computer systems
Languages : en
Pages : 141

Book Description
To address the ever-increasing demand for computing capacities, more and more heterogeneous systems have been designed to use both general-purpose and special-purpose processors. Their huge energy consumption raises new environmental concerns and challenges. Besides performance, energy efficiency is another key factor to be considered by system designers and consumers. In particular, contemporary graphics processing units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, accurate and straightforward performance and power estimation for a given GPU kernel under different frequency settings is still lacking for real hardware, which is essential to determine the best frequency configuration for energy saving. In this thesis, we investigate how to improve the energy efficiency of GPU systems by accurately modeling the effects of GPU DVFS on the target GPU kernel. We also propose efficient algorithms to solve the communication contention problem in scheduling multiple distributed deep learning (DDL) jobs on GPU clusters. We introduce our studies as follows. First, we present a benchmark suite EPPMiner for evaluating the performance, power, and energy of different heterogeneous systems. EPPMiner consists of 16 benchmark programs that cover a broad range of application domains, and it shows a great variety in the intensity of utilizing the processors. We have implemented a prototype of EPPMiner that supports OpenMP, CUDA, and OpenCL, and demonstrated its usage through three showcases. The showcases justify that GPUs provide much better energy efficiency than other types of computing systems, and especially illustrate the effectiveness of GPU DVFS on the energy efficiency of GPU applications. Second, we present a fine-grained analytical model to estimate the execution time of GPU kernels with both core and memory frequency scaling.
Compared to the cycle-level simulators, which are too slow to apply on real hardware, our model only needs one-off micro-benchmarks to extract a set of hardware parameters and kernel performance counters without any source code analysis. Our experimental results show that the proposed performance model can capture the kernel performance scaling behaviors under different frequency settings and achieve decent accuracy. Third, we design a cross-benchmarking suite, which simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. We then build machine learning models to predict the execution time and runtime power of a GPU kernel under different voltage and frequency settings. Validated on three modern GPUs with a wide frequency scaling range, by using a collection of 24 real application kernels, the model trained only with our cross-benchmarking suite is able to achieve considerably accurate results. Finally, we establish a new DDL job scheduling framework which organizes DDL jobs as Directed Acyclic Graphs (DAGs) and considers communication contention between nodes. We then propose an efficient job placement algorithm, Least-Workload-First- (LWF-), to balance the GPU utilization and consolidate the allocated GPUs for each job. When scheduling the communication tasks, we propose Ada-SRSF for the DDL job scheduling problem to address the communication contention issue. Our simulation results show that LWF- achieves up to 1.59x improvement over the classical first-fit algorithms. More importantly, Ada-SRSF reduces the average job completion time by up to 36.7%, as compared to the solutions of either avoiding all the communication contention or accepting all of it.
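The description above motivates predicting execution time and power under each frequency setting in order to pick the best configuration for energy saving. Assuming such predictions are available, the selection step reduces to minimizing energy = power × time. The sketch below is a hypothetical illustration only: the function name `best_frequency` and the numbers in the table are made up for the example and are not taken from the thesis.

```python
def best_frequency(predictions):
    """Pick the frequency setting with the lowest predicted energy.

    predictions maps (core_mhz, mem_mhz) -> (time_s, power_w);
    energy in joules is time_s * power_w.
    """
    return min(
        predictions,
        key=lambda cfg: predictions[cfg][0] * predictions[cfg][1],
    )


# Hypothetical predictions for one kernel (illustrative values only).
predicted = {
    (1000, 3500): (2.0, 150.0),  # 300 J
    (800, 3500): (2.2, 120.0),   # 264 J
    (600, 3500): (3.5, 100.0),   # 350 J
}
print(best_frequency(predicted))
```

In this made-up table the middle setting wins: lowering the core clock further saves power but lengthens the run enough that total energy rises again.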

Computational Science – ICCS 2020

Computational Science – ICCS 2020 PDF Author: Valeria V. Krzhizhanovskaya
Publisher: Springer Nature
ISBN: 3030503712
Category : Computers
Languages : en
Pages : 726

Book Description
The seven-volume set LNCS 12137, 12138, 12139, 12140, 12141, 12142, and 12143 constitutes the proceedings of the 20th International Conference on Computational Science, ICCS 2020, held in Amsterdam, The Netherlands, in June 2020.* The total of 101 papers and 248 workshop papers presented in this book set were carefully reviewed and selected from 719 submissions (230 submissions to the main track and 489 submissions to the workshops). The papers were organized in topical sections named:
Part I: ICCS Main Track
Part II: ICCS Main Track
Part III: Advances in High-Performance Computational Earth Sciences: Applications and Frameworks; Agent-Based Simulations, Adaptive Algorithms and Solvers; Applications of Computational Methods in Artificial Intelligence and Machine Learning; Biomedical and Bioinformatics Challenges for Computer Science
Part IV: Classifier Learning from Difficult Data; Complex Social Systems through the Lens of Computational Science; Computational Health; Computational Methods for Emerging Problems in (Dis-)Information Analysis
Part V: Computational Optimization, Modelling and Simulation; Computational Science in IoT and Smart Systems; Computer Graphics, Image Processing and Artificial Intelligence
Part VI: Data Driven Computational Sciences; Machine Learning and Data Assimilation for Dynamical Systems; Meshfree Methods in Computational Sciences; Multiscale Modelling and Simulation; Quantum Computing Workshop
Part VII: Simulations of Flow and Transport: Modeling, Algorithms and Computation; Smart Systems: Bringing Together Computer Vision, Sensor Networks and Machine Learning; Software Engineering for Computational Science; Solving Problems with Uncertainties; Teaching Computational Science; UNcErtainty QUantIficatiOn for ComputationAl modeLs
*The conference was canceled due to the COVID-19 pandemic.

Advances in GPU Research and Practice

Advances in GPU Research and Practice PDF Author: Hamid Sarbazi-Azad
Publisher: Morgan Kaufmann
ISBN: 0128037881
Category : Computers
Languages : en
Pages : 776

Book Description
Advances in GPU Research and Practice focuses on research and practices in GPU based systems. The topics treated cover a range of issues, ranging from hardware and architectural issues, to high level issues, such as application systems, parallel programming, middleware, and power and energy issues. Divided into six parts, this edited volume provides the latest research on GPU computing. Part I: Architectural Solutions focuses on the architectural topics that improve on performance of GPUs. Part II: System Software discusses OS, compilers, libraries, programming environment, languages, and paradigms that are proposed and analyzed to help and support GPU programmers. Part III: Power and Reliability Issues covers different aspects of energy, power, and reliability concerns in GPUs. Part IV: Performance Analysis illustrates mathematical and analytical techniques to predict different performance metrics in GPUs. Part V: Algorithms presents how to design efficient algorithms and analyze their complexity for GPUs. Part VI: Applications and Related Topics provides use cases and examples of how GPUs are used across many sectors.
Discusses how to maximize power and obtain peak reliability when designing, building, and using GPUs
Covers system software (OS, compilers), programming environments, languages, and paradigms proposed to help and support GPU programmers
Explains how to use mathematical and analytical techniques to predict different performance metrics in GPUs
Illustrates the design of efficient GPU algorithms in areas such as bioinformatics, complex systems, social networks, and cryptography
Provides applications and use case scenarios in several different verticals, including medicine, social sciences, image processing, and telecommunications

Graphics Processing Unit-Based High Performance Computing in Radiation Therapy

Graphics Processing Unit-Based High Performance Computing in Radiation Therapy PDF Author: Xun Jia
Publisher: CRC Press
ISBN: 1351231669
Category : Medical
Languages : en
Pages : 286

Book Description
Use the GPU Successfully in Your Radiotherapy Practice
With its high processing power, cost-effectiveness, and easy deployment, access, and maintenance, the graphics processing unit (GPU) has increasingly been used to tackle problems in the medical physics field, ranging from computed tomography reconstruction to Monte Carlo radiation transport simulation. Graphics Processing Unit-Based High Performance Computing in Radiation Therapy collects state-of-the-art research on GPU computing and its applications to medical physics problems in radiation therapy.
Tackle Problems in Medical Imaging and Radiotherapy
The book first offers an introduction to the GPU technology and its current applications in radiotherapy. Most of the remaining chapters discuss a specific application of a GPU in a key radiotherapy problem. These chapters summarize advances and present technical details and insightful discussions on the use of GPU in addressing the problems. The book also examines two real systems developed with GPU as a core component to accomplish important clinical tasks in modern radiotherapy.
Translate Research Developments to Clinical Practice
Written by a team of international experts in radiation oncology, biomedical imaging, computing, and physics, this book gets clinical and research physicists, graduate students, and other scientists up to date on the latest in GPU computing for radiotherapy. It encourages you to bring this novel technology to routine clinical radiotherapy practice.

GPU Gems 2

GPU Gems 2 PDF Author: Matt Pharr
Publisher: Addison-Wesley Professional
ISBN: 9780321335593
Category : Computers
Languages : en
Pages : 814

Book Description
More useful techniques, tips, and tricks for harnessing the power of the new generation of powerful GPUs.