Load Latency Tolerance in Dynamically Scheduled Processors

Load Latency Tolerance in Dynamically Scheduled Processors PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 13

Book Description
This paper provides quantitative measurements of load latency tolerance in a dynamically scheduled processor. To determine the latency tolerance of each memory load operation, our simulations use flexible load completion policies instead of a fixed memory hierarchy that dictates the latency. Although our policies delay load completion as long as possible, they produce performance, measured in instructions committed per cycle (IPC), comparable to an ideal memory system in which all loads complete in one cycle. Our measurements reveal that to produce IPC values within 8% of the ideal memory system, between 1% and 62% of loads need to be satisfied within a single cycle, and up to 84% can be satisfied in as many as 32 cycles, depending on the benchmark and processor configuration. Load latency tolerance is largely determined by whether an unpredictable branch is in the load's data dependence graph and by the depth of that graph. Our results also show that up to 36% of all loads miss in the level-one cache yet have latency demands lower than second-level cache access times. We also show that up to 37% of loads hit in the level-one cache even though they possess enough latency tolerance to be satisfied by lower levels of the memory hierarchy.
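The finding that tolerance tracks the depth of a load's data dependence graph can be illustrated with a toy slack computation: a load's slack is how many extra cycles its latency can grow before it lengthens the critical path to the dependent result. The graph, latencies, and node names below are invented for illustration and are not taken from the paper.

```python
# Toy dataflow graph: node -> (latency in cycles, successor nodes).
# A load feeding a short dependence chain has slack; one feeding the
# critical path has none.
import functools

graph = {
    'load_a': (1, ['add']),   # shallow chain: load_a -> add
    'load_b': (1, ['mul']),   # deep chain:    load_b -> mul -> add
    'mul':    (3, ['add']),
    'add':    (1, []),
}

@functools.lru_cache(maxsize=None)
def path_after(node):
    """Longest latency path starting at (and including) node."""
    lat, succs = graph[node]
    return lat + max((path_after(s) for s in succs), default=0)

critical = max(path_after(n) for n in graph)   # length of critical path

def slack(load):
    """Extra cycles the load could take without delaying completion."""
    return critical - path_after(load)
```

Here `load_b` sits on the critical path (`slack('load_b') == 0`), while `load_a` could be delayed 3 cycles for free, which is the kind of tolerance the paper's flexible completion policies exploit.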

Latency Tolerance for Dynamic Processors

Latency Tolerance for Dynamic Processors PDF Author: Stanford University. Computer Systems Laboratory
Publisher:
ISBN:
Category : Cache memory
Languages : en
Pages : 23

Book Description
While a number of dynamically scheduled processors have recently been brought to market, work on hardware techniques for tolerating memory latency has mostly targeted statically scheduled processors. This paper attempts to remedy this situation by examining the applicability of hardware latency tolerance techniques to dynamically scheduled processors. The results so far indicate that the dynamically scheduled processor's inherent ability to tolerate memory latency reduces the need for additional hardware such as stream buffers or stride prediction tables. However, victim caching, though not usually considered a latency-tolerating technique, proves quite effective in helping the dynamically scheduled processor tolerate memory latency. For a fixed investment in microprocessor chip area, the victim cache outperforms both stream buffers and stride prediction.
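As a rough illustration of the victim-cache idea the abstract refers to (a small fully associative buffer that catches lines evicted from a direct-mapped L1, so repeated conflict misses become fast victim hits), here is a toy model; the sizes, trace, and class name are invented for the example, not drawn from the paper.

```python
# Minimal victim-cache sketch: on an L1 eviction the line moves into a
# tiny fully associative LRU buffer; a hit there swaps it back into L1.
from collections import OrderedDict

class VictimCache:
    def __init__(self, l1_sets=4, victim_entries=2):
        self.l1_sets = l1_sets
        self.l1 = {}                   # set index -> tag (direct-mapped)
        self.victim = OrderedDict()    # tag -> True, in LRU order
        self.victim_entries = victim_entries

    def access(self, addr):
        """Return 'l1', 'victim', or 'miss' for a block address."""
        idx, tag = addr % self.l1_sets, addr
        if self.l1.get(idx) == tag:
            return 'l1'
        if tag in self.victim:
            # Swap: promote the line to L1, demote the displaced line.
            self.victim.pop(tag)
            if idx in self.l1:
                self._insert_victim(self.l1[idx])
            self.l1[idx] = tag
            return 'victim'
        # Miss: fill L1, demote any evicted line to the victim buffer.
        if idx in self.l1:
            self._insert_victim(self.l1[idx])
        self.l1[idx] = tag
        return 'miss'

    def _insert_victim(self, tag):
        self.victim[tag] = True
        if len(self.victim) > self.victim_entries:
            self.victim.popitem(last=False)   # evict the LRU entry

# Addresses 0 and 4 conflict in the same L1 set; after the two cold
# misses, the victim buffer absorbs the ping-ponging conflict misses.
vc = VictimCache()
results = [vc.access(a) for a in [0, 4, 0, 4, 0]]
```

Without the victim buffer, every access after the first two in this trace would be a full conflict miss, which is why the buffer pays off per unit of chip area.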

Power-Aware Computer Systems

Power-Aware Computer Systems PDF Author: Babak Falsafi
Publisher: Springer
ISBN: 3540286411
Category : Technology & Engineering
Languages : en
Pages : 224

Book Description
Welcome to the proceedings of the 3rd Power-Aware Computer Systems (PACS 2003) Workshop held in conjunction with the 36th Annual International Symposium on Microarchitecture (MICRO-36). The increase in power and energy dissipation in computer systems has begun to limit performance and has also resulted in higher cost and lower reliability. The increase also implies reduced battery life in portable systems. Because of the magnitude of the problem, all levels of computer systems, including circuits, architectures, and software, are being employed to address power and energy issues. PACS 2003 was the third workshop in its series to explore power- and energy-awareness at all levels of computer systems and brought together experts from academia and industry. These proceedings include 14 research papers, selected from 43 submissions, spanning a wide spectrum of areas in power-aware systems. We have grouped the papers into the following categories: (1) compilers, (2) embedded systems, (3) microarchitectures, and (4) cache and memory systems. The first paper on compiler techniques proposes pointer reuse analysis that is biased by runtime information (i.e., the targets of pointers are determined based on the likelihood of their occurrence at runtime) to map accesses to energy-efficient memory access paths (e.g., avoiding tag match). Another paper proposes compiling multiple programs together so that disk accesses across the programs can be synchronized to achieve longer sleep times in disks than if the programs were optimized separately.

Hiding Memory Latency Using Dynamic Scheduling in Shared-memory Multiprocessors

Hiding Memory Latency Using Dynamic Scheduling in Shared-memory Multiprocessors PDF Author: Stanford University. Computer Systems Laboratory
Publisher:
ISBN:
Category : Multiprocessors
Languages : en
Pages : 14

Book Description
This paper explores the use of dynamically scheduled processors to exploit the overlap allowed by relaxed models for hiding the latency of reads. Our results are based on detailed simulation studies of several parallel applications. The results show that a substantial fraction of the read latency can be hidden using this technique. However, the major improvements in performance are achieved only at large instruction window sizes.

Latency Tolerant Architectures

Latency Tolerant Architectures PDF Author: James Edward Bennett
Publisher:
ISBN:
Category :
Languages : en
Pages : 304

Book Description


Two Case Studies in Latency Tolerant Architectures

Two Case Studies in Latency Tolerant Architectures PDF Author: Stanford University. Computer Systems Laboratory
Publisher:
ISBN:
Category : Computer architecture
Languages : en
Pages : 22

Book Description
Researchers have proposed a variety of techniques for dealing with memory latency, such as dynamic scheduling, hardware prefetching, software prefetching, and multiple contexts. This paper presents the results of two case studies on the usefulness of some simple techniques for latency tolerance. These techniques are nonblocking caches, reordering of loads and stores, and basic block scheduling for the expected latency of loads. The effectiveness of these techniques was found to vary according to the type of application. While nonblocking caches and load/store reordering consistently improved performance, scheduling based on expected latency was found to decrease performance in most cases. This result shows that the assumption of a uniform miss rate used by the scheduler is incorrect, and suggests that techniques for estimating the miss rates of individual loads are needed. These results were obtained using a new simulation environment, MXS, currently under development.
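Why a nonblocking cache consistently helps, as the abstract reports, can be seen with a back-of-the-envelope model: a blocking cache serializes miss latencies, while a nonblocking cache overlaps independent misses up to the number of outstanding-miss slots (MSHRs). The miss counts and latencies below are invented for illustration.

```python
# Sketch of miss-overlap arithmetic: with `mshrs` outstanding-miss
# slots, independent misses are serviced in batches of that size.
import math

def miss_cycles(num_misses, miss_latency, mshrs):
    """Cycles spent on independent misses when up to `mshrs` overlap."""
    batches = math.ceil(num_misses / mshrs)
    return batches * miss_latency

blocking    = miss_cycles(8, miss_latency=100, mshrs=1)   # fully serialized
nonblocking = miss_cycles(8, miss_latency=100, mshrs=4)   # 4-way overlap
```

In this toy model, four MSHRs cut the miss stall time by 4x (800 vs. 200 cycles); real gains are smaller because misses are not fully independent, but the direction matches the case study's result.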

Advances in Computers

Advances in Computers PDF Author: Marvin Zelkowitz
Publisher: Academic Press
ISBN: 9780120121632
Category : Computers
Languages : en
Pages : 312

Book Description
The term computation gap has been defined as the difference between the computational power demanded by the application domain and the computational power of the underlying computer platform. Traditionally, closing the computation gap has been one of the major and fundamental tasks of computer architects. However, as technology advances and computers become more pervasive in society, the domain of computer architecture has been extended. The scope of research in computer architecture is no longer restricted to computer hardware and organization issues: a wide spectrum of topics, ranging from algorithm design to power management, is becoming part of computer architecture. To reflect this trend and recent research efforts, a collection of articles was selected that covers different aspects of contemporary computer architecture design. This volume of the Advances in Computers contains six chapters on different aspects of computer architecture. Key features: a wide range of research topics; coverage of new topics such as power management, Network on Chip, load balancing in distributed systems, and pervasive computing; a simple writing style.

Load Latency Tolerance by Data Prefetching in Multi-gigahertz Processors

Load Latency Tolerance by Data Prefetching in Multi-gigahertz Processors PDF Author: Yogesh Patil
Publisher:
ISBN:
Category : Computer architecture
Languages : en
Pages : 314

Book Description


Critical-path Aware Processor Architectures

Critical-path Aware Processor Architectures PDF Author: Eric Tune
Publisher:
ISBN:
Category :
Languages : en
Pages : 386

Book Description


Advances in Computers

Advances in Computers PDF Author: Marvin Zelkowitz
Publisher: Elsevier
ISBN: 0080459145
Category : Computers
Languages : en
Pages : 313

Book Description
The term computation gap has been defined as the difference between the computational power demanded by the application domain and the computational power of the underlying computer platform. Traditionally, closing the computation gap has been one of the major and fundamental tasks of computer architects. However, as technology advances and computers become more pervasive in society, the domain of computer architecture has been extended. The scope of research in computer architecture is no longer restricted to computer hardware and organization issues: a wide spectrum of topics, ranging from algorithm design to power management, is becoming part of computer architecture. To reflect this trend and recent research efforts, a collection of articles was selected that covers different aspects of contemporary computer architecture design. This volume of the Advances in Computers contains six chapters on different aspects of computer architecture. Key features: a wide range of research topics; coverage of new topics such as power management, Network on Chip, load balancing in distributed systems, and pervasive computing; a simple writing style.