Goal-directed Performance Tuning for Scientific Applications

Author: Tien-Pao Shih
Publisher:
ISBN:
Category : Cache memory
Languages : en
Pages : 346

Book Description
Abstract: "Performance tuning, as carried out by compiler designers and application programmers to close the performance gap between the achievable peak and delivered performance, becomes increasingly important and challenging as the microprocessor speeds and system sizes increase. However, although performance tuning on scientific codes usually deals with relatively small program regions, it is not generally known how to establish a reasonable performance objective and how to efficiently achieve this objective. We suggest a goal-directed approach and develop such an approach for each of three major system performance components: central processor unit (CPU) computation, memory accessing, and communication. For the CPU, we suggest using a machine-application performance model that characterizes workloads on four key function units (memory, floating-point, issue, and a virtual 'dependence unit') to produce an upper bound performance objective, and derive a mechanism to approach this objective. A case study shows an average 1.79x speedup achieved by using this approach for the Livermore Fortran Kernels 1-12 running on the IBM RS/6000. For memory, as compulsory and capacity misses are relatively easy to characterize, we derive a method for building application-specific cache behavior models that report the number of misses for all three types of conflict misses: self, cross, and ping-pong. The method uses averaging concepts to determine the expected number of cache misses instead of attempting to count them exactly in each instance, which provides a more rapid, yet realistic assessment of expected cache behavior. For each type of conflict miss, we propose a reduction method that uses one or a combination of three techniques based on modifying or exploiting data layout: array padding, initial address adjustment, and access resequencing. 
A case study using a blocked matrix multiply program as an example shows that the model is within 11% of the simulation results, and that each type of conflict miss can be effectively reduced or completely eliminated. For communication in shared memory parallel systems, we derive an array grouping mechanism and related loop transformations to reduce communication caused by the problematic case of nonconsecutive references to shared arrays and prove several theorems that determine when and where to apply this technique. The experimental results show a 15% reduction in communication, a 40% reduction in data subcache misses, and an 18% reduction in maximum user time for a finite element application on a 56 processor KSR1 parallel computer."
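The array padding idea mentioned in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the averaging-based model from the thesis: it assumes a direct-mapped cache, and the function name `count_misses`, the cache geometry (128 lines of 8 elements), and the array sizes are all hypothetical values chosen for illustration. It walks one column of a row-major array twice and counts misses, first with a power-of-two leading dimension and then with the leading dimension padded by one cache line.

```python
def count_misses(rows, leading_dim, passes=2, line_elems=8, num_sets=128):
    """Count misses in a hypothetical direct-mapped cache while walking
    column 0 of a row-major array `passes` times.

    All parameters are illustrative assumptions, not values from the thesis.
    """
    cache = [None] * num_sets          # one resident line number per set
    misses = 0
    for _ in range(passes):
        for i in range(rows):
            element = i * leading_dim  # offset of A[i][0] in elements
            line = element // line_elems
            set_index = line % num_sets
            if cache[set_index] != line:
                cache[set_index] = line   # evict whatever was there
                misses += 1
    return misses


# Leading dimension 1024: every row start maps to the same cache set,
# so the column evicts itself (self-conflict) and no reuse survives.
unpadded = count_misses(rows=64, leading_dim=1024)

# Padding each row by one cache line (8 elements) spreads the row starts
# over distinct sets, so the second pass hits in the cache.
padded = count_misses(rows=64, leading_dim=1024 + 8)

print(unpadded, padded)
```

In this toy setup the padded layout leaves only the compulsory misses of the first pass, while the unpadded layout misses on every access.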

Performance-oriented Application Development for Distributed Architectures

Author: M. Gerndt
Publisher: IOS Press
ISBN: 9781586032678
Category : Computers
Languages : en
Pages : 112

Book Description
Annotation: This publication is devoted to programming models, languages, and tools for performance-oriented program development in commercial and scientific environments. The included papers have been written based on presentations given at the workshop PADDA 2001. The goal of the workshop was to identify common interests and techniques for performance-oriented program development in commercial and scientific environments. Distributed architectures currently dominate the field of highly parallel computing. Distributed architectures, based on Internet and mobile computing technologies, are important target architectures in the domain of commercial computing too. The papers in this publication come from the two areas: scientific computing and commercial computing.

Performance Evaluation and Benchmarking with Realistic Applications

Author: Rudolf Eigenmann
Publisher: MIT Press
ISBN: 9780262050661
Category : Business & Economics
Languages : en
Pages : 316

Book Description
Performance evaluation and benchmarking are of concern to all computer-related disciplines. A benchmark is a standard program or set of programs that can be run on different computers to give an accurate measure of their performance. This book covers a variety of aspects of computer performance evaluation, with a focus on Standard Performance Evaluation Corporation (SPEC) benchmarks. SPEC is a nonprofit organization whose members represent industry, academia, and other organizations. The book discusses rationales for creating and updating benchmarks, the use of benchmarks in academic research, benchmarking methodologies, the relation of SPEC benchmarks to other benchmarking activities, shortcomings of current benchmarks, and the need for further benchmarking efforts. Contributors: Brian Armstrong, Frederica Darema, Edward S. Davidson, Sylvia Dieckmann, Jozo J. Dujmovic, Rudolf Eigenmann, J. Kelly Flanagan, Greg Gaertner, Jonathan Geisler, John Gustafson, Urs Hölzle, Shih-Hao Hung, Kathryn S. McKinley, Reinhard Riedl, Faisal Saied, Frank Sorenson, Mark Straka, Valerie Taylor, Olivier Temam, Rajat Todi, Reinhold Weicker.

Improving Cache Performance Via Active Management

Author: Edward S. Tam
Publisher:
ISBN:
Category :
Languages : en
Pages : 270

Book Description


Research and Technology Objectives and Plans Summary (RTOPS)

Author:
Publisher:
ISBN:
Category : Aeronautics
Languages : en
Pages : 208

Book Description


Advanced Topics in Database Research, Volume 5

Author: Siau, Keng
Publisher: IGI Global
ISBN: 1591409373
Category : Computers
Languages : en
Pages : 472

Book Description
Advanced Topics in Database Research is a series of books on the fields of database, software engineering, and systems analysis and design. They feature the latest research ideas and topics on how to enhance current database systems, improve information storage, refine existing database models, and develop advanced applications. Advanced Topics in Database Research, Volume 5 is a part of this series. Advanced Topics in Database Research, Volume 5 presents the latest research ideas and topics on database systems and applications, and provides insights into important developments in the field of database and database management. This book describes the capabilities and features of new technologies and methodologies, and presents state-of-the-art research ideas, with an emphasis on theoretical issues regarding databases and database management.

NASA Technical Memorandum

Author:
Publisher:
ISBN:
Category : Aeronautics
Languages : en
Pages : 208

Book Description


Database and Expert Systems Applications

Author: Mohamed Ibrahim
Publisher: Springer
ISBN: 3540444696
Category : Computers
Languages : en
Pages : 1023

Book Description
The Database and Expert Systems Applications (DEXA) conferences have established themselves as a platform for bringing together researchers and practitioners from various backgrounds and all regions of the world to exchange ideas, experiences and opinions in a friendly and stimulating environment. The papers presented at the conference represent recent developments in the field and important steps towards shaping the future of applied computer science and information systems. DEXA covers a broad field: all aspects of databases, knowledge based systems, knowledge management, web-based systems, information systems, related technologies and their applications. Once again there were a good number of submissions: out of 183 papers that were submitted, the program committee selected 92 to be presented. In the first year of this new millennium DEXA has come back to the United Kingdom, following events in Vienna, Berlin, Valencia, Prague, Athens, London, Zurich, Toulouse, Vienna and Florence. The past decade has seen several revolutionary developments, one of which was the explosion of Internet-related applications in the areas covered by DEXA, developments in which DEXA has played a role and in which DEXA will continue to play a role in its second decade, starting with this conference.

American Doctoral Dissertations

Author:
Publisher:
ISBN:
Category : Dissertation abstracts
Languages : en
Pages : 872

Book Description


Computational Methods in Science and Technology

Author: Sukhpreet Kaur
Publisher: CRC Press
ISBN: 1040260640
Category : Computers
Languages : en
Pages : 580

Book Description
This book contains the proceedings of the 4th International Conference on Computational Methods in Science and Technology (ICCMST 2024). The proceedings explore research and innovation in the Internet of Things, cloud computing, machine learning, networks, system design and methodologies, big data analytics and applications, ICT for sustainable environment, and artificial intelligence. They also address real-time assistance and security for advanced-stage learners, researchers, and academicians. This will be a valuable read for researchers, academicians, undergraduate students, postgraduate students, and professionals within the fields of computer science, sustainability, and artificial intelligence.