User-aware Scheduling for High Performance Computing Clusters

Author: Michael J. North
Publisher:
ISBN:
Category :
Languages : en
Pages : 448

Book Description

Energy-aware Scheduling of Parallel Applications on High Performance Computing Platforms

Author: Ahmed AbdElHai ElRefae Ebaid
Publisher:
ISBN:
Category :
Languages : en
Pages : 196

Book Description


Rack Aware Scheduling in HPC Data Centers

Author: Vikas Ashok Patil
Publisher:
ISBN:
Category :
Languages : en
Pages : 44

Book Description
Energy consumption in high performance computing data centers has become a long-standing issue. With the rising cost of operating a data center, various techniques need to be employed to reduce overall energy consumption. Current techniques reduce energy consumption by powering idle nodes on and off. However, most of them do not consider the energy consumed by other components in a rack. Our study addresses this aspect of the data center. We show that considerable energy savings can be gained by reducing the energy consumed by these rack components, and we propose a scheduling technique that schedules jobs with this goal. We claim that our scheduling method reduces energy consumption considerably without affecting a job's other performance metrics. We implement it as an enhancement to the well-known and widely used Maui scheduler and present our results. We compare our method with various currently available Maui scheduler configurations, simulating a wide variety of workloads from real cluster deployments using Maui's simulation mode. Our results consistently show savings of about 7 to 14% over the currently available Maui scheduler configurations. We also show that our approach can be applied in tandem with most existing energy-aware scheduling techniques to achieve further energy savings. Further, we consider a side effect of deploying our technique: power losses due to the network switches. We compare our technique with existing techniques in terms of these switch power losses and account for them in the results. We then provide a best-fit scheme that takes rack considerations into account, and propose an enhanced technique that merges the two extremes of node allocation based on rack information. We also provide a way to configure the scheduler for the kind of workload it schedules and to reduce the effect of job splitting across multiple racks. Finally, we discuss how the enhancement can be used to build a learning model that adaptively adjusts the scheduling parameters based on the workload experienced.
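
As a rough illustration of what a rack-aware best-fit allocation could look like, the sketch below places a job in the single rack that fits it most tightly and, when no single rack can hold it, spreads it over as few racks as possible. This is not the thesis's Maui enhancement: the function name `allocate_rack_aware`, the rack and node labels, and the tie-breaking rules are all assumptions made for the example.

```python
from typing import Dict, List


def allocate_rack_aware(free_nodes: Dict[str, List[str]], requested: int) -> List[str]:
    """Pick `requested` free nodes while touching as few racks as possible.

    `free_nodes` maps a (hypothetical) rack name to its currently free nodes.
    Returns an empty list when the job cannot start right now.
    """
    total_free = sum(len(nodes) for nodes in free_nodes.values())
    if requested <= 0 or requested > total_free:
        return []

    # Best fit: the rack with the fewest free nodes that still holds the whole job.
    fitting = [rack for rack, nodes in free_nodes.items() if len(nodes) >= requested]
    if fitting:
        best = min(fitting, key=lambda rack: (len(free_nodes[rack]), rack))
        return free_nodes[best][:requested]

    # No single rack fits: take racks with the most free nodes first,
    # which limits how many racks the job is split across.
    chosen: List[str] = []
    for rack in sorted(free_nodes, key=lambda r: (-len(free_nodes[r]), r)):
        take = min(requested - len(chosen), len(free_nodes[rack]))
        chosen.extend(free_nodes[rack][:take])
        if len(chosen) == requested:
            break
    return chosen


free = {
    "rack1": ["a1", "a2", "a3", "a4"],
    "rack2": ["b1", "b2"],
    "rack3": ["c1", "c2", "c3"],
}
print(allocate_rack_aware(free, 2))  # fits exactly in rack2
print(allocate_rack_aware(free, 6))  # must be split across two racks
```

Keeping a job inside one rack leaves other racks fully idle, which is the precondition for saving the energy of shared rack components; the split case corresponds to the job-splitting effect the abstract says the scheduler should reduce.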

On the User-scheduler Relationship in High-performance Computing

Author: Cynthia Bailey Lee
Publisher:
ISBN:
Category :
Languages : en
Pages : 111

Book Description
To effectively manage High-Performance Computing (HPC) resources, it is essential to maximize return on the substantial infrastructure investment they entail. One prerequisite to success is the ability of the scheduler and user to productively interact. This work develops criteria for measuring productivity, analyzes several aspects of the user-scheduler relationship via user studies, and develops solutions to some vexing barriers between users and schedulers. The five main contributions of this work are as follows. First, this work quantifies the desires of the user population and represents them as a utility function. This contribution is in four parts: a survey-based study collecting utility data from users of a supercomputer system, augmentation of the Standard Workload Format to enable scheduler research using utility functions, and a model for synthetically generating utility function-augmented workloads. Second, a number of the classic scheduling disciplines are evaluated by their ability to maximize aggregate utility of all users, using the synthetic utility functions. These evaluations show the performance impact of inaccurate runtime estimates, contradicting an oft-quoted prior result [55] that inaccuracy of estimates leads to better scheduling. Third, a scheduler optimizing the aggregate utility of all users, using a genetic algorithm heuristic, is demonstrated. This contribution includes two software artifacts: an implementation of the genetic algorithm (GA) scheduler, and a modular, extensible scheduler simulation framework that simulates several classic scheduling disciplines and is interoperable with the Standard Workload Format. Fourth, the ability of users to productively interact with this scheduler by providing an accurate estimate of their resource (run time) needs is examined. This contribution consists of formalizing a frequent casual assertion from the scheduling literature, that users typically "pad" runtime estimates, into an explicit Padding Hypothesis, and then falsifying the hypothesis via a survey-based study of users of a supercomputer system. Specifically, absent an incentive to pad (and including incentives to be accurate), the inaccuracy of runtime estimates only improved from an average of 61% inaccurate to an average of 57% inaccurate. This contribution has implications not only for the proposed genetic algorithm scheduler, but for any scheduler that asks users for an estimate, which currently includes virtually all parallel job schedulers both in production use and proposed in the literature. Fifth, a survey of users of a supercomputer system and associated simulations explore the feasibility of removing one of the defining constraints of the parallel job scheduling problem: the non-preemptability of running jobs. An investigation of users' current checkpointing habits produced a workload labeled with per-job checkpoint information, enabling simulation of a checkpoint-aware GA scheduler that may preempt running jobs as it optimizes aggregate utility. Lifting the non-preemptability constraint improves performance of the GA scheduler by 16% (and by 23% compared to the classic EASY algorithm), including overhead penalties for job termination and restart.
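
To make the idea of scoring a schedule by aggregate user utility concrete, here is a minimal sketch. It assumes a single resource, all jobs submitted at time zero, and a linearly decaying utility; none of these come from the thesis, whose utility functions were collected by survey. The `Job` fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Job:
    name: str
    runtime: float    # hours of execution
    max_value: float  # utility if the job finished instantly (arbitrary unit)
    decay: float      # utility lost per hour of turnaround (assumed linear)


def utility(job: Job, completion_time: float) -> float:
    """Utility the user receives when the job completes at `completion_time`."""
    return max(0.0, job.max_value - job.decay * completion_time)


def aggregate_utility(order: List[Job]) -> float:
    """Sum of utilities when jobs run back to back in the given order."""
    clock = 0.0
    total = 0.0
    for job in order:
        clock += job.runtime
        total += utility(job, clock)
    return total


jobs = [
    Job("a", runtime=4.0, max_value=10.0, decay=1.0),
    Job("b", runtime=1.0, max_value=10.0, decay=2.0),
    Job("c", runtime=2.0, max_value=6.0, decay=0.5),
]

fcfs = aggregate_utility(jobs)                                     # submission order
sjf = aggregate_utility(sorted(jobs, key=lambda j: j.runtime))     # shortest job first
print(f"FCFS: {fcfs}, SJF: {sjf}")  # FCFS: 8.5, SJF: 15.5
```

A scheduler that optimizes aggregate utility, such as the genetic algorithm scheduler described above, would search over orderings for the one that maximizes this kind of objective rather than comparing two fixed disciplines.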

Optimized Cloud Based Scheduling

Author: Rong Kun Jason Tan
Publisher: Springer
ISBN: 3319732145
Category : Technology & Engineering
Languages : en
Pages : 106

Book Description
This book presents an improved design for service provisioning and allocation models that are validated through running genome sequence assembly tasks in a hybrid cloud environment. It proposes approaches for addressing scheduling and performance issues in big data analytics and showcases new algorithms for hybrid cloud scheduling. Scientific sectors such as bioinformatics, astronomy, high-energy physics, and Earth science are generating a tremendous flow of data, commonly known as big data. In the context of growing demand for big data analytics, cloud computing offers an ideal platform for processing big data tasks due to its flexible scalability and adaptability. However, there are numerous problems associated with the current service provisioning and allocation models, such as inefficient scheduling algorithms, overloaded memory overheads, excessive node delays and improper error handling of tasks, all of which need to be addressed to enhance the performance of big data analytics.

High-Performance Computing

Author: Laurence T. Yang
Publisher: John Wiley & Sons
ISBN: 0471732702
Category : Computers
Languages : en
Pages : 818

Book Description
The state of the art of high-performance computing

Prominent researchers from around the world have gathered to present the state-of-the-art techniques and innovations in high-performance computing (HPC), including:
* Programming models for parallel computing: graph-oriented programming (GOP), OpenMP, the stages and transformation (SAT) approach, the bulk-synchronous parallel (BSP) model, Message Passing Interface (MPI), and Cilk
* Architectural and system support, featuring the code tiling compiler technique, the MigThread application-level migration and checkpointing package, the new prefetching scheme of atomicity, a new "receiver makes right" data conversion method, and lessons learned from applying reconfigurable computing to HPC
* Scheduling and resource management issues with heterogeneous systems, bus saturation effects on SMPs, genetic algorithms for distributed computing, and novel task-scheduling algorithms
* Clusters and grid computing: design requirements, grid middleware, distributed virtual machines, data grid services and performance-boosting techniques, security issues, and open issues
* Peer-to-peer computing (P2P), including the proposed search mechanism of hybrid periodical flooding (HPF) and routing protocols for improved routing performance
* Wireless and mobile computing, featuring discussions of implementing the Gateway Location Register (GLR) concept in 3G cellular networks, maximizing network longevity, and comparisons of QoS-aware scatternet scheduling algorithms
* High-performance applications, including partitioners, running Bag-of-Tasks applications on grids, using low-cost clusters to meet high-demand applications, and advanced convergent architectures and protocols

High-Performance Computing: Paradigm and Infrastructure is an invaluable compendium for engineers, IT professionals, and researchers and students of computer science and applied mathematics.

High Performance Computing

Author: Gonzalo Hernandez
Publisher: Springer
ISBN: 3662454831
Category : Computers
Languages : en
Pages : 267

Book Description
This book constitutes the refereed proceedings of the First HPCLATAM - CLCAR Joint Latin American High Performance Computing Conference, CARLA 2014, held in Valparaiso, Chile, in October 2014. The 17 revised full papers and the one paper presented were carefully reviewed and selected from 42 submissions. The papers are organized in topical sections on grid and cloud computing; HPC architectures and tools; parallel programming; scientific computing.

High-Performance Computing on Complex Environments

Author: Emmanuel Jeannot
Publisher: John Wiley & Sons
ISBN: 1118712072
Category : Computers
Languages : en
Pages : 512

Book Description
With recent changes in multicore and general-purpose computing on graphics processing units, the way parallel computers are used and programmed has drastically changed. It is important to provide a comprehensive study, written by specialists of the domain, on how to use such machines. The book provides recent research results in high-performance computing on complex environments, information on how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems, detailed studies on the impact of applying heterogeneous computing practices to real problems, and applications varying from remote sensing to tomography. The content spans topics such as Numerical Analysis for Heterogeneous and Multicore Systems; Optimization of Communication for High Performance Heterogeneous and Hierarchical Platforms; Efficient Exploitation of Heterogeneous Architectures, Hybrid CPU+GPU, and Distributed Systems; Energy Awareness in High-Performance Computing; and Applications of Heterogeneous High-Performance Computing.
• Covers cutting-edge research in HPC on complex environments, following an international collaboration of members of the ComplexHPC
• Explains how to efficiently exploit heterogeneous and hierarchical architectures and distributed systems
• Twenty-three chapters and over 100 illustrations cover domains such as numerical analysis, communication and storage, applications, GPUs and accelerators, and energy efficiency

High Performance Computing Systems

Author: Calebe Bianchini
Publisher: Springer Nature
ISBN: 3030410501
Category : Computers
Languages : en
Pages : 205

Book Description
This book constitutes the refereed proceedings of the 19th Symposium on High Performance Computing Systems, WSCAD 2018, held in São Paulo, Brazil, in October 2018. The 12 revised full papers presented were carefully reviewed and selected out of 61 submissions. The papers included in this book are organized according to the following topics: cloud computing; performance; processors and memory architectures; power and energy.

Topology-aware MPI Communication and Scheduling for High Performance Computing Systems

Author: Hari Subramoni
Publisher:
ISBN:
Category :
Languages : en
Pages : 132

Book Description
Abstract: The designs proposed in this thesis have been successfully tested at up to 4,096 processes on the Stampede supercomputing system at TACC. We observe up to 14% improvement in the latency of the broadcast operation using our proposed topology-aware scheme over the default scheme at the micro-benchmark level for 1,024 processes. The topology-aware point-to-point communication and process placement scheme improves the total execution time of the MILC application by up to 6% on 1,024 cores of Hyperion and up to 15% on 2,048 cores of Ranger. We also observe that our network topology-aware communication schedules for Alltoall significantly reduce the amount of network contention observed during the Alltoall / FFT operations. They also deliver up to 12% improvement in the communication time of P3DFFT at 4,096 processes on Stampede. The proposed network topology-aware plugin for SLURM improves the throughput of a 512-core cluster by up to 8%.