Topology-aware MPI Communication and Scheduling for High Performance Computing Systems

Author: Hari Subramoni
Publisher:
ISBN:
Category :
Languages : en
Pages : 132

Book Description
Abstract: The designs proposed in this thesis have been successfully tested at up to 4,096 processes on the Stampede supercomputing system at TACC. We observe up to 14% improvement in the latency of the broadcast operation using our proposed topology-aware scheme over the default scheme at the micro-benchmark level for 1,024 processes. The topology-aware point-to-point communication and process placement scheme improves the total execution time of the MILC application by up to 6% on 1,024 cores of Hyperion and by up to 15% on 2,048 cores of Ranger. We also observe that our network topology-aware communication schedules for Alltoall significantly reduce the network contention observed during Alltoall / FFT operations. They also deliver up to 12% improvement in the communication time of P3DFFT at 4,096 processes on Stampede. The proposed network topology-aware plugin for SLURM improves the throughput of a 512-core cluster by up to 8%.

Topology-aware Job Scheduling and Placement in High Performance Computing and Edge Computing Systems

Author: Kangkang Li
Publisher:
ISBN:
Category :
Languages : en
Pages : 124

Book Description


Kernel-assisted and Topology-aware MPI Collective Communication Among Multicore Or Many-core Clusters

Author: Teng Ma
Publisher:
ISBN:
Category :
Languages : en
Pages : 136

Book Description
Multicore or many-core clusters have become the most prominent form of High Performance Computing (HPC) systems. Hardware complexity and hierarchies exist not only in the inter-node layer, i.e., hierarchical networks, but also inside multicore compute nodes, e.g., Non-Uniform Memory Access (NUMA), network-style interconnects, and memory and shared cache hierarchies. The Message Passing Interface (MPI), the most widely adopted standard in the HPC community, suffers from decreased performance and portability due to the increased hardware complexity at multiple levels. We identified three critical issues specific to collective communication. First, there is a gap between logical collective topologies and the underlying hardware topologies. Second, current MPI communications lack efficient shared memory message delivery approaches. Last, on distributed memory machines, like multicore clusters, a single approach cannot encompass the extreme variations not only in bandwidth and latency capabilities, but also in features such as the ability to perform multiple copies concurrently. To bridge the gap between logical collective topologies and hardware topologies, we developed a distance-aware framework that integrates knowledge of hardware distance into collective algorithms in order to dynamically reshape the communication patterns to suit the hardware capabilities. Based on process distance information, we used graph partitioning techniques to organize the MPI processes in a multi-level hierarchy that maps onto the hardware characteristics. Meanwhile, we took advantage of the kernel-assisted one-sided single-copy approach (KNEM) as the default shared memory delivery method. Via kernel-assisted memory copy, the collective algorithms offload copy tasks onto non-leader/non-root processes to evenly distribute copy workloads among available cores.
Finally, on distributed memory machines, we developed a technique to compose multi-layered collective algorithms together to express a multi-level algorithm with tight interoperability between the levels. This tight collaboration results in more overlap between inter- and intra-node communication. Experimental results have confirmed that, by leveraging several technologies together, such as kernel-assisted memory copy, the distance-aware framework, and collective algorithm composition, not only do MPI collectives reach the potential maximum performance on a wide variety of platforms, but they also deliver a level of performance immune to modifications of the underlying process-core binding.
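
The idea of reshaping a collective to follow the hardware hierarchy can be illustrated with a minimal Python sketch. It is not the thesis's framework: it uses no graph partitioning and no KNEM, it assumes a hypothetical `placement` dictionary mapping each rank to a node name, and it models only a two-level broadcast in which the root first sends to one leader per node and each leader then forwards to the ranks that share its node. The function names are illustrative only.

```python
from collections import defaultdict


def build_hierarchy(rank_to_node):
    """Group ranks by the node they run on (ranks kept in ascending order)."""
    groups = defaultdict(list)
    for rank, node in sorted(rank_to_node.items()):
        groups[node].append(rank)
    return dict(groups)


def hierarchical_broadcast(rank_to_node, root):
    """Return two lists of (sender, receiver) pairs.

    Phase 1 (inter-node): the root sends to one leader on every other node.
    Phase 2 (intra-node): each leader forwards to the other ranks on its node.
    """
    groups = build_hierarchy(rank_to_node)
    root_node = rank_to_node[root]
    # The root leads its own node; on other nodes the lowest rank leads.
    leaders = {
        node: (root if node == root_node else ranks[0])
        for node, ranks in groups.items()
    }
    inter = [
        (root, leader)
        for node, leader in sorted(leaders.items())
        if node != root_node
    ]
    intra = [
        (leaders[node], rank)
        for node, ranks in sorted(groups.items())
        for rank in ranks
        if rank != leaders[node]
    ]
    return inter, intra


# Hypothetical placement: six ranks spread over three nodes.
placement = {0: "n0", 1: "n0", 2: "n1", 3: "n1", 4: "n2", 5: "n2"}
inter, intra = hierarchical_broadcast(placement, root=1)
print("inter-node sends:", inter)
print("intra-node sends:", intra)
```

With this placement only two messages cross the network, and the remaining three stay inside a node, where a shared-memory copy could serve them.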

High Performance Computing and Communications

Author: Jack Dongarra
Publisher: Springer
ISBN: 3540320792
Category : Computers
Languages : en
Pages : 1140

Book Description


Introduction to HPC with MPI for Data Science

Author: Frank Nielsen
Publisher: Springer
ISBN: 3319219030
Category : Computers
Languages : en
Pages : 304

Book Description
This gentle introduction to High Performance Computing (HPC) for Data Science using the Message Passing Interface (MPI) standard has been designed as a first course for undergraduates on parallel programming on distributed memory models, and requires only basic programming notions. The book is divided into two parts: the first covers high performance computing using C++ with the Message Passing Interface (MPI) standard, and the second covers high-performance data analytics on computer clusters. In the first part, the fundamental notions of blocking versus non-blocking point-to-point communications, global communications (like broadcast or scatter) and collaborative computations (reduce), together with Amdahl's and Gustafson's speed-up laws, are described before addressing parallel sorting and parallel linear algebra on computer clusters. The common ring, torus and hypercube topologies of clusters are then explained and global communication procedures on these topologies are studied. This first part closes with the MapReduce (MR) model of computation, which is well suited to processing big data using the MPI framework. In the second part, the book focuses on high-performance data analytics. Flat and hierarchical clustering algorithms are introduced for data exploration along with how to program these algorithms on computer clusters, followed by machine learning classification, and an introduction to graph analytics. This part closes with a concise introduction to data core-sets that make big data problems amenable to tiny data problems. Exercises are included at the end of each chapter so that students can practice the concepts learned, and a final section contains an overall exam which allows them to evaluate how well they have assimilated the material covered in the book.
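
The two speed-up laws mentioned in the description can be stated compactly. The following Python sketch is not taken from the book; it uses the standard textbook forms, with `serial_fraction` denoting the non-parallelizable share of the work (measured on the sequential run for Amdahl's law and on the parallel run for Gustafson's law) and `p` the number of processors. The function names are illustrative.

```python
def amdahl_speedup(serial_fraction, p):
    """Amdahl's law: fixed problem size, speed-up bounded by 1 / serial_fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


def gustafson_speedup(serial_fraction, p):
    """Gustafson's law: problem size grows with p, so the speed-up scales linearly."""
    return serial_fraction + (1.0 - serial_fraction) * p


# Example: 10% serial work on 16 processors.
alpha, procs = 0.1, 16
print(f"Amdahl:    {amdahl_speedup(alpha, procs):.2f}")
print(f"Gustafson: {gustafson_speedup(alpha, procs):.2f}")
```

For 10% serial work on 16 processors the sketch gives a speed-up of about 6.4 under Amdahl's law and 14.5 under Gustafson's law, which shows how much the assumption about problem size changes the outlook.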

High Performance Computing in Biomimetics

Author: Kamarul Arifin Ahmad
Publisher: Springer Nature
ISBN: 9819710170
Category :
Languages : en
Pages : 309

Book Description


High-Performance Computing on Complex Environments

Author: Emmanuel Jeannot
Publisher: John Wiley & Sons
ISBN: 1118712072
Category : Computers
Languages : en
Pages : 512

Book Description
With recent changes in multicore and general-purpose computing on graphics processing units, the way parallel computers are used and programmed has drastically changed. It is important to provide a comprehensive study on how to use such machines written by specialists of the domain. The book provides recent research results in high-performance computing on complex environments, information on how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems, detailed studies on the impact of applying heterogeneous computing practices to real problems, and applications varying from remote sensing to tomography. The content spans topics such as Numerical Analysis for Heterogeneous and Multicore Systems; Optimization of Communication for High Performance Heterogeneous and Hierarchical Platforms; Efficient Exploitation of Heterogeneous Architectures, Hybrid CPU+GPU, and Distributed Systems; Energy Awareness in High-Performance Computing; and Applications of Heterogeneous High-Performance Computing.
• Covers cutting-edge research in HPC on complex environments, following an international collaboration of members of the ComplexHPC
• Explains how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems
• Twenty-three chapters and over 100 illustrations cover domains such as numerical analysis, communication and storage, applications, GPUs and accelerators, and energy efficiency

Production Scheduling

Author: Rodrigo Righi
Publisher: BoD – Books on Demand
ISBN: 9533079355
Category : Technology & Engineering
Languages : en
Pages : 246

Book Description
Generally speaking, scheduling is the procedure of efficiently mapping a set of tasks or jobs (studied objects) to a set of target resources. More specifically, as a part of a larger planning and scheduling process, production scheduling is essential for the proper functioning of a manufacturing enterprise. This book presents ten chapters divided into five sections. Section 1 discusses rescheduling strategies, policies, and methods for production scheduling. Section 2 presents two chapters about flow shop scheduling. Section 3 describes heuristic and metaheuristic methods for treating the scheduling problem in an efficient manner. In addition, two test cases are presented in Section 4. The first uses simulation, while the second shows a real implementation of a production scheduling system. Finally, Section 5 presents some modeling strategies for building production scheduling systems. This book will be of interest to those working in the decision-making branches of production, in various operational research areas, and in computational methods design. People from diverse backgrounds, ranging from academia and research to industry, can take advantage of this volume.
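
The definition of scheduling as a mapping of jobs to resources can be made concrete with a small Python sketch. It is not a method from the book: it applies the generic longest-processing-time-first heuristic, assigning each job to the currently least-loaded machine, and the job names and durations are made up for illustration.

```python
import heapq


def greedy_schedule(durations, n_machines):
    """Assign jobs to machines, longest job first, always on the least-loaded machine.

    Returns (assignment, makespan), where assignment maps machine index to
    the list of jobs it runs and makespan is the largest total load.
    """
    heap = [(0, m) for m in range(n_machines)]  # (current load, machine index)
    assignment = {m: [] for m in range(n_machines)}
    # Sort by decreasing duration; ties are broken by job name for determinism.
    for job, duration in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(job)
        heapq.heappush(heap, (load + duration, machine))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical jobs with processing times in arbitrary units.
jobs = {"A": 7, "B": 5, "C": 4, "D": 3, "E": 1}
assignment, makespan = greedy_schedule(jobs, 2)
print(assignment, makespan)
```

The heuristic is fast and usually close to optimal for this simple model, but it ignores precedence constraints, setup times, and rescheduling, which are the kinds of issues a production scheduling system has to handle.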

High-Performance Computing on Complex Environments

Author: Emmanuel Jeannot
Publisher: John Wiley & Sons
ISBN: 1118712056
Category : Computers
Languages : en
Pages : 512

Book Description
With recent changes in multicore and general-purpose computing on graphics processing units, the way parallel computers are used and programmed has drastically changed. It is important to provide a comprehensive study on how to use such machines written by specialists of the domain. The book provides recent research results in high-performance computing on complex environments, information on how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems, detailed studies on the impact of applying heterogeneous computing practices to real problems, and applications varying from remote sensing to tomography. The content spans topics such as Numerical Analysis for Heterogeneous and Multicore Systems; Optimization of Communication for High Performance Heterogeneous and Hierarchical Platforms; Efficient Exploitation of Heterogeneous Architectures, Hybrid CPU+GPU, and Distributed Systems; Energy Awareness in High-Performance Computing; and Applications of Heterogeneous High-Performance Computing.
• Covers cutting-edge research in HPC on complex environments, following an international collaboration of members of the ComplexHPC
• Explains how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems
• Twenty-three chapters and over 100 illustrations cover domains such as numerical analysis, communication and storage, applications, GPUs and accelerators, and energy efficiency

Algorithms and Architectures for Parallel Processing

Author: Jaideep Vaidya
Publisher: Springer
ISBN: 3030050513
Category : Computers
Languages : en
Pages : 660

Book Description
The four-volume set LNCS 11334-11337 constitutes the proceedings of the 18th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2018, held in Guangzhou, China, in November 2018. The 141 full and 50 short papers presented were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on Distributed and Parallel Computing; High Performance Computing; Big Data and Information Processing; Internet of Things and Cloud Computing; and Security and Privacy in Computing.