Author: Matt Fuller
Publisher: "O'Reilly Media, Inc."
ISBN: 1098137205
Category : Computers
Languages : en
Pages : 322
Book Description
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakehouse with Iceberg or Delta Lake, a different system like Cassandra, Kafka, or SingleStore, or a relational database like PostgreSQL or Oracle. Analysts, software engineers, and production engineers learn how to manage, use, and even develop with Trino and make it a critical part of their data platform. Authors Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Trino query can combine data from multiple sources to allow for analytics across your entire organization. Explore Trino's use cases, and learn about tools that help you connect to Trino for querying and processing huge amounts of data. Learn Trino's internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more. Deploy and secure Trino at scale, monitor workloads, tune queries, and connect more applications. Learn how other organizations apply Trino successfully.
Trino: The Definitive Guide
Author: Matt Fuller
Publisher: "O'Reilly Media, Inc."
ISBN: 1098137205
Category : Computers
Languages : en
Pages : 322
Book Description
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakehouse with Iceberg or Delta Lake, a different system like Cassandra, Kafka, or SingleStore, or a relational database like PostgreSQL or Oracle. Analysts, software engineers, and production engineers learn how to manage, use, and even develop with Trino and make it a critical part of their data platform. Authors Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Trino query can combine data from multiple sources to allow for analytics across your entire organization. Explore Trino's use cases, and learn about tools that help you connect to Trino for querying and processing huge amounts of data. Learn Trino's internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more. Deploy and secure Trino at scale, monitor workloads, tune queries, and connect more applications. Learn how other organizations apply Trino successfully.
Trino: The Definitive Guide
Author: Matt Fuller
Publisher: "O'Reilly Media, Inc."
ISBN: 1098137191
Category : Computers
Languages : en
Pages : 333
Book Description
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakehouse with Iceberg or Delta Lake, a different system like Cassandra, Kafka, or SingleStore, or a relational database like PostgreSQL or Oracle. Analysts, software engineers, and production engineers learn how to manage, use, and even develop with Trino and make it a critical part of their data platform. Authors Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Trino query can combine data from multiple sources to allow for analytics across your entire organization. Explore Trino's use cases, and learn about tools that help you connect to Trino for querying and processing huge amounts of data. Learn Trino's internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more. Deploy and secure Trino at scale, monitor workloads, tune queries, and connect more applications. Learn how other organizations apply Trino successfully.
Presto: The Definitive Guide
Author: Matt Fuller
Publisher: O'Reilly Media
ISBN: 1492044245
Category : Computers
Languages : en
Pages : 310
Book Description
Perform fast interactive analytics against different data sources using the Presto high-performance, distributed SQL query engine. With this practical guide, you’ll learn how to conduct analytics on data where it lives, whether it’s Hive, Cassandra, a relational database, or a proprietary data store. Analysts, software engineers, and production engineers will learn how to manage, use, and even develop with Presto. Initially developed by Facebook, open source Presto is now used by Netflix, Airbnb, LinkedIn, Twitter, Uber, and many other companies. Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Presto query can combine data from multiple sources to allow for analytics across your entire organization. Get started: Explore Presto’s use cases and learn about tools that will help you connect to Presto and query data. Go deeper: Learn Presto’s internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more. Put Presto in production: Secure Presto, monitor workloads, tune queries, and connect more applications; learn how other organizations apply Presto.
Cassandra: The Definitive Guide
Author: Jeff Carpenter
Publisher: "O'Reilly Media, Inc."
ISBN: 1491933631
Category : Computers
Languages : en
Pages : 369
Book Description
Imagine what you could do if scalability wasn’t a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition, updated for Cassandra 3.0, provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s non-relational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility. Understand Cassandra’s distributed and decentralized structure. Use the Cassandra Query Language (CQL) and cqlsh, the CQL shell. Create a working data model and compare it with an equivalent relational model. Develop sample applications using client drivers for languages including Java, Python, and Node.js. Explore cluster topology and learn how nodes exchange data. Maintain a high level of performance in your cluster. Deploy Cassandra on site, in the Cloud, or with Docker. Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene.
Streaming Systems
Author: Tyler Akidau
Publisher: "O'Reilly Media, Inc."
ISBN: 1491983825
Category : Computers
Languages : en
Pages : 362
Book Description
Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way. Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You’ll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax. You’ll explore: how streaming and batch data processing patterns compare; the core principles and concepts behind robust out-of-order data processing; how watermarks track progress and completeness in infinite datasets; how exactly-once data processing techniques ensure correctness; how the concepts of streams and tables form the foundations of both batch and streaming data processing; the practical motivations behind a powerful persistent state mechanism, driven by a real-world example; and how time-varying relations provide a link between stream processing and the world of SQL and relational algebra.
An Introduction to Modern Cosmology
Author: Andrew Liddle
Publisher: John Wiley & Sons
ISBN: 1118690273
Category : Science
Languages : en
Pages : 200
Book Description
An Introduction to Modern Cosmology, Third Edition is an accessible account of modern cosmological ideas. The Big Bang cosmology is explored, looking at its observational successes in explaining the expansion of the Universe, the existence and properties of the cosmic microwave background, and the origin of light elements in the Universe. Properties of the very early Universe are also covered, including the motivation for a rapid period of expansion known as cosmological inflation. The third edition brings this established undergraduate textbook up to date with the rapidly evolving observational situation. This fully revised edition of a bestseller takes an approach that is grounded in physics, with a logical flow of chapters leading the reader from basic ideas of the expansion described by the Friedmann equations to some of the more advanced ideas about the early Universe. It also incorporates up-to-date results from the Planck mission, which imaged the anisotropies of the cosmic microwave background radiation over the whole sky. The Advanced Topic sections present subjects with more detailed mathematical approaches to give greater depth to discussions. Student problems with hints for solving them and numerical answers are embedded in the chapters to facilitate the reader’s understanding and learning. Cosmology is now part of the core in many degree programs. This current, clear, and concise introductory text is relevant to a wide range of astronomy programs worldwide and is essential reading for undergraduates and Master’s students, as well as anyone starting research in cosmology. The accompanying website for this text, http://booksupport.wiley.com, provides additional material designed to enhance your learning, as well as errata within the text.
Spark: The Definitive Guide
Author: Bill Chambers
Publisher: "O'Reilly Media, Inc."
ISBN: 1491912294
Category : Computers
Languages : en
Pages : 594
Book Description
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library. Get a gentle overview of big data and Spark. Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples. Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames. Understand how Spark runs on a cluster. Debug, monitor, and tune Spark clusters and applications. Learn the power of Structured Streaming, Spark’s stream-processing engine. Learn how you can apply MLlib to a variety of problems, including classification or recommendation.
Data Pipelines Pocket Reference
Author: James Densmore
Publisher: O'Reilly Media
ISBN: 1492087807
Category : Computers
Languages : en
Pages : 277
Book Description
Data pipelines are the foundation for success in data analytics. Moving data from numerous diverse sources and transforming it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack. You'll learn common considerations and key decision points when implementing pipelines, such as batch versus streaming data ingestion and build versus buy. This book addresses the most common decisions made by data professionals and discusses foundational concepts that apply to open source frameworks, commercial products, and homegrown solutions. You'll learn: what a data pipeline is and how it works; how data is moved and processed on modern data infrastructure, including cloud platforms; common tools and products used by data engineers to build pipelines; how pipelines support analytics and reporting needs; and considerations for pipeline maintenance, testing, and alerting.
Learning Spark
Author: Holden Karau
Publisher: "O'Reilly Media, Inc."
ISBN: 1449359051
Category : Computers
Languages : en
Pages : 289
Book Description
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates. Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning. Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell. Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib. Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm. Learn how to deploy interactive, batch, and streaming applications. Connect to data sources including HDFS, Hive, JSON, and S3. Master advanced topics like data partitioning and shared variables.
Supersymmetry and String Theory
Author: Michael Dine
Publisher: Cambridge University Press
ISBN: 113946244X
Category : Science
Languages : en
Pages : 481
Book Description
The past decade has witnessed dramatic developments in the field of theoretical physics. This book is a comprehensive introduction to these recent developments. It contains a review of the Standard Model, covering non-perturbative topics, and a discussion of grand unified theories and magnetic monopoles. It introduces the basics of supersymmetry and its phenomenology, and includes dynamics, dynamical supersymmetry breaking, and electric-magnetic duality. The book then covers general relativity and the big bang theory, and the basic issues in inflationary cosmologies before discussing the spectra of known string theories and the features of their interactions. The book also includes brief introductions to technicolor, large extra dimensions, and the Randall-Sundrum theory of warped spaces. This will be of great interest to graduates and researchers in the fields of particle theory, string theory, astrophysics and cosmology. The book contains several problems, and password protected solutions will be available to lecturers at www.cambridge.org/9780521858410.