Automatic Parallelization of Recursive Procedures

Author: International Business Machines Corporation. Research Division
Publisher:
ISBN:
Category : Compilers (Computer programs)
Languages : en
Pages : 14

Book Description
Abstract: "Parallelizing compilers have traditionally focussed mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures that typically appear in divide-and-conquer style algorithms. We present compile-time analysis to detect the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, where each parallel task can further spawn off parallel work in subsequent recursive calls. We describe a run-time system which efficiently supports this kind of nested parallelism without unnecessarily blocking tasks, and facilitates load-balancing. We have implemented this framework in a parallelizing compiler for C and Fortran 90. We believe it is the first compiler which is able to automatically parallelize programs like quicksort and mergesort. For cases where even the advanced symbolic analysis and array section analysis we describe are not able to prove the independence of procedure calls, we propose novel techniques for speculative run-time parallelization, which are significantly more efficient and powerful than analogous techniques proposed previously for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach."
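As a rough illustration of the kind of divide-and-conquer code the abstract has in mind, the sketch below runs the two recursive calls of a mergesort concurrently, which is safe because they work on disjoint halves of the input. This is a hand-written sketch, not the paper's compiler output or run-time system; the names `merge`, `parallel_mergesort` and the `spawn_depth` cutoff are assumptions made for the example. In standard CPython the global interpreter lock means it shows the nested task structure rather than a real speedup.

```python
import threading


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def parallel_mergesort(items, spawn_depth=2):
    """Return a sorted copy of items.

    The two recursive calls read disjoint halves and write only their own
    results, so they are independent and may run at the same time.
    spawn_depth (hypothetical parameter) limits how deep new tasks are spawned.
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    if spawn_depth <= 0:
        # Below the cutoff, recurse sequentially to limit task overhead.
        return merge(parallel_mergesort(items[:mid], 0),
                     parallel_mergesort(items[mid:], 0))

    result = {}

    def sort_left():
        result["left"] = parallel_mergesort(items[:mid], spawn_depth - 1)

    worker = threading.Thread(target=sort_left)
    worker.start()  # left half runs as a spawned task
    right = parallel_mergesort(items[mid:], spawn_depth - 1)  # right half runs here
    worker.join()   # wait for the independent call before merging
    return merge(result["left"], right)
```

Each spawned task can itself spawn further tasks until the cutoff is reached, which mirrors the nested parallelism the abstract describes.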

Automatic Parallelization

Author: Samuel Midkiff
Publisher: Springer Nature
ISBN: 3031017366
Category : Technology & Engineering
Languages : en
Pages : 157

Book Description
Compiling for parallelism is a longstanding topic of compiler research. This book describes the fundamental principles of compiling "regular" numerical programs for parallelism. We begin with an explanation of analyses that allow a compiler to understand the interaction of data reads and writes in different statements and loop iterations during program execution. These analyses include dependence analysis, use-def analysis and pointer analysis. Next, we describe how the results of these analyses are used to enable transformations that make loops more amenable to parallelization, and discuss transformations that expose parallelism to target shared memory multicore and vector processors. We then discuss some problems that arise when parallelizing programs for execution on distributed memory machines. Finally, we conclude with an overview of solving Diophantine equations and suggestions for further readings in the topics of this book to enable the interested reader to delve deeper into the field. Table of Contents: Introduction and overview / Dependence analysis, dependence graphs and alias analysis / Program parallelization / Transformations to modify and eliminate dependences / Transformation of iterative and recursive constructs / Compiling for distributed memory machines / Solving Diophantine equations / A guide to further reading
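To show how Diophantine equations connect to dependence analysis, here is a minimal sketch of the classical GCD test. It is not taken from the book, and the function name and argument order are assumptions for the example. For a write to `A[a*i + c1]` and a read of `A[b*j + c2]`, the two accesses can touch the same element only if `a*i - b*j = c2 - c1` has an integer solution, which requires `gcd(a, b)` to divide `c2 - c1`.

```python
from math import gcd


def gcd_test_may_depend(a, c1, b, c2):
    """Return False when A[a*i + c1] and A[b*j + c2] can never overlap.

    A True result is conservative: the test ignores loop bounds, so an
    integer solution may exist only outside the iteration space.
    """
    g = gcd(a, b)
    if g == 0:
        # Both subscripts are constants; they collide only if equal.
        return c1 == c2
    return (c2 - c1) % g == 0
```

For example, `gcd_test_may_depend(2, 0, 2, 1)` is `False`, because one access touches only even indices and the other only odd ones, so a loop containing just those two accesses carries no dependence between them.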

Automatic Parallelization

Author: Samuel P. Midkiff
Publisher: Morgan & Claypool Publishers
ISBN: 1608458415
Category : Computers
Languages : en
Pages : 172

Book Description
Compiling for parallelism is a longstanding topic of compiler research. This book describes the fundamental principles of compiling regular numerical programs for parallelism. We begin with an explanation of analyses that allow a compiler to understand the interaction of data reads and writes in different statements and loop iterations during program execution. These analyses include dependence analysis, use-def analysis and pointer analysis. Next, we describe how the results of these analyses are used to enable transformations that make loops more amenable to parallelization, and discuss transformations that expose parallelism to target shared memory multicore and vector processors. We then discuss some problems that arise when parallelizing programs for execution on distributed memory machines. Finally, we conclude with an overview of solving Diophantine equations and suggestions for further readings in the topics of this book to enable the interested reader to delve deeper into the field. Table of Contents: Introduction and overview / Dependence analysis, dependence graphs and alias analysis / Program parallelization / Transformations to modify and eliminate dependences / Transformation of iterative and recursive constructs / Compiling for distributed memory machines / Solving Diophantine equations / A guide to further reading

Automatic Program Parallelization Using Traces

Author: Borys Bradel
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description


Parallel Processing and Applied Mathematics, Part II

Author: Roman Wyrzykowski
Publisher: Springer
ISBN: 3642315003
Category : Computers
Languages : en
Pages : 687

Book Description
This two-volume-set (LNCS 7203 and 7204) constitutes the refereed proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011, held in Torun, Poland, in September 2011. The 130 revised full papers presented in both volumes were carefully reviewed and selected from numerous submissions. The papers address issues such as parallel/distributed architectures and mobile computing; numerical algorithms and parallel numerics; parallel non-numerical algorithms; tools and environments for parallel/distributed/grid computing; applications of parallel/distributed computing; applied mathematics, neural networks and evolutionary computing; history of computing.

Automatic Parallelization For A Class Of Regular Computations

Author: G M Megson
Publisher: World Scientific
ISBN: 9814498416
Category : Computers
Languages : en
Pages : 272

Book Description
The automatic generation of parallel code from high-level sequential descriptions is of key importance to the widespread use of high-performance machine architectures. This text considers in detail the theory and practical realization of the automatic mapping of algorithms generated from systems of uniform recurrence equations (do-loops) onto fixed-size architectures with defined communication primitives. Experimental results of the mapping scheme and its implementation are given.

Parallelization of Recursive Procedures

Author: Wayne J. Staats
Publisher:
ISBN:
Category :
Languages : en
Pages : 124

Book Description


Compiler Construction

Author: Reinhard Wilhelm
Publisher: Springer
ISBN: 3540453067
Category : Computers
Languages : en
Pages : 383

Book Description
ETAPS 2001 was the fourth instance of the European Joint Conferences on Theory and Practice of Software. ETAPS is an annual federated conference that was established in 1998 by combining a number of existing and new conferences. This year it comprised five conferences (FOSSACS, FASE, ESOP, CC, TACAS), ten satellite workshops (CMCS, ETI Day, JOSES, LDTA, MMAABS, PFM, RelMiS, UNIGRA, WADT, WTUML), seven invited lectures, a debate, and ten tutorials. The events that comprise ETAPS address various aspects of the system development process, including specification, design, implementation, analysis, and improvement. The languages, methodologies, and tools which support these activities are all well within its scope. Different blends of theory and practice are represented, with an inclination towards theory with a practical motivation on one hand and soundly-based practice on the other. Many of the issues involved in software design apply to systems in general, including hardware systems, and the emphasis on software is not intended to be exclusive.

Compiler-assisted Workload Consolidation to Efficiently Exploit Dynamic Parallelism for Recursive Applications

Author: Hancheng Wu (Researcher on electrical engineering)
Publisher:
ISBN:
Category :
Languages : en
Pages : 40

Book Description
GPUs have been widely used to parallelize and accelerate applications because of their high throughput. Traditionally, a GPU function can only be launched from the CPU side. As a result, GPUs are best suited to applications that express flat data parallelism: simple data parallelism that is known at compile time and can be easily distributed across GPU blocks and threads. However, for applications that contain nested data parallelism, which is not known a priori and can only be discovered at run time, it is difficult to write a GPU function that achieves high performance. One can easily end up with a GPU function that is either too coarse-grained or too fine-grained. With the Kepler architecture, Nvidia introduced a new feature, Dynamic Parallelism (DP), which enables GPU functions to be launched from inside a GPU function. This makes nested parallelism easy to exploit on the GPU, since a new GPU function can be launched whenever local nested parallelism is encountered during execution. Moreover, DP makes it possible to implement recursion on the GPU without CPU intervention. Many computations exhibit a pattern of nested data parallelism, and among them is parallel recursion. However, preliminary data show that simple DP-based implementations of recursion perform poorly. This work focuses on how to efficiently exploit DP for parallel recursive applications on GPUs. Specifically, the goal is to free users from the complexity of GPU hardware and software and to automatically generate high-performance DP-based GPU recursive functions, given simple parallel CPU recursive functions as input. To this end, I first propose several DP-based parallel recursive templates that can be generated from a serial CPU recursive function. I compare these templates with their non-DP counterparts (flat kernels) to see whether using DP in parallel recursive applications is beneficial. Second, to reduce the overhead of DP, I propose compiler techniques that improve the efficiency of simple DP-based parallel recursive functions by performing workload consolidation. My evaluation shows that GPU kernels consolidated with the proposed code transformations achieve an average speedup on the order of 1500x over basic implementations using DP and an average speedup of 3.9x over optimized flat GPU kernels for both tree traversal and graph-based applications.

Euro-Par 2014: Parallel Processing Workshops

Author: Luís Lopes
Publisher: Springer
ISBN: 3319143131
Category : Computers
Languages : en
Pages : 667

Book Description
The two volumes LNCS 8805 and 8806 constitute the thoroughly refereed post-conference proceedings of 18 workshops held at the 20th International Conference on Parallel Computing, Euro-Par 2014, in Porto, Portugal, in August 2014. The 100 revised full papers presented were carefully reviewed and selected from 173 submissions. The volumes include papers from the following workshops: APCI&E (First Workshop on Applications of Parallel Computation in Industry and Engineering) - BigDataCloud (Third Workshop on Big Data Management in Clouds) - DIHC (Second Workshop on Dependability and Interoperability in Heterogeneous Clouds) - FedICI (Second Workshop on Federative and Interoperable Cloud Infrastructures) - HeteroPar (12th International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms) - HiBB (5th Workshop on High Performance Bioinformatics and Biomedicine) - LSDVE (Second Workshop on Large Scale Distributed Virtual Environments on Clouds and P2P) - MuCoCoS (7th International Workshop on Multi-/Many-core Computing Systems) - OMHI (Third Workshop on On-chip Memory Hierarchies and Interconnects) - PADAPS (Second Workshop on Parallel and Distributed Agent-Based Simulations) - PROPER (7th Workshop on Productivity and Performance) - Resilience (7th Workshop on Resiliency in High Performance Computing with Clusters, Clouds, and Grids) - REPPAR (First International Workshop on Reproducibility in Parallel Computing) - ROME (Second Workshop on Runtime and Operating Systems for the Many Core Era) - SPPEXA (Workshop on Software for Exascale Computing) - TASUS (First Workshop on Techniques and Applications for Sustainable Ultrascale Computing Systems) - UCHPC (7th Workshop on UnConventional High Performance Computing) and VHPC (9th Workshop on Virtualization in High-Performance Cloud Computing).