DuckDB in Action

DuckDB in Action PDF Author: Mark Needham
Publisher: Simon and Schuster
ISBN: 1638355592
Category : Computers
Languages : en
Pages : 310

Get Book Here

Book Description
Dive into DuckDB and start processing gigabytes of data with ease—all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save you hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB—all through hands-on examples. Open up DuckDB in Action and learn how to: • Read and process data from CSV, JSON and Parquet sources both locally and remote • Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables • Use DuckDB from Python, both with SQL and its "Relational"-API, interacting with databases but also data frames • Prepare, ingest and query large datasets • Build cloud data pipelines • Extend DuckDB with custom functionality Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation—you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines. About the technology DuckDB makes data analytics fast and fun! You don’t need to set up a Spark or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite and Postgres. About the book DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action. What's inside • Prepare, ingest and query large datasets • Build cloud data pipelines • Extend DuckDB with custom functionality • Fast-paced SQL recap: From simple queries to advanced analytics About the reader For data pros comfortable with Python and CLI tools. About the author Mark Needham is a blogger and video creator at @?LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

DuckDB in Action

DuckDB in Action PDF Author: Mark Needham
Publisher: Simon and Schuster
ISBN: 1638355592
Category : Computers
Languages : en
Pages : 310

Get Book Here

Book Description
Dive into DuckDB and start processing gigabytes of data with ease—all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save you hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB—all through hands-on examples. Open up DuckDB in Action and learn how to: • Read and process data from CSV, JSON and Parquet sources both locally and remote • Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables • Use DuckDB from Python, both with SQL and its "Relational"-API, interacting with databases but also data frames • Prepare, ingest and query large datasets • Build cloud data pipelines • Extend DuckDB with custom functionality Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation—you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines. About the technology DuckDB makes data analytics fast and fun! You don’t need to set up a Spark or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite and Postgres. About the book DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action. What's inside • Prepare, ingest and query large datasets • Build cloud data pipelines • Extend DuckDB with custom functionality • Fast-paced SQL recap: From simple queries to advanced analytics About the reader For data pros comfortable with Python and CLI tools. About the author Mark Needham is a blogger and video creator at @?LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

ScyllaDB in Action

ScyllaDB in Action PDF Author: Bo Ingram
Publisher: Simon and Schuster
ISBN: 1638356122
Category : Computers
Languages : en
Pages : 390

Get Book Here

Book Description
Build, maintain, and run databases that are easy to scale and quick to query—all with ScyllaDB. ScyllaDB in Action is your guide to everything you need to know about ScyllaDB, from your very first queries to running it in a production environment. It starts you with the basics of creating, reading, and deleting data and expands your knowledge from there. You’ll soon have mastered everything you need to build, maintain, and run an effective and efficient database. Inside ScyllaDB in Action you’ll learn how to: • Read, write, and delete data in ScyllaDB • Design database schemas for ScyllaDB • Write performant queries against ScyllaDB • Connect and query a ScyllaDB cluster from an application • Configure, monitor, and operate ScyllaDB in production This book teaches you ScyllaDB the best way—through hands-on examples. Dive into the node-based architecture of ScyllaDB to understand how its distributed systems work, how you can troubleshoot problems, and how you can constantly improve performance. About the technology ScyllaDB is a versatile NoSQL database that can move large volumes of data fast. Very, very, very fast. This drop-in replacement for Cassandra takes full advantage of modern multi-core hardware and scales to handle large real-time data workloads with incredibly low latency. It features built-in monitoring and management tools, and its efficient use of computing resources can save a lot of money on high-volume applications. About the book ScyllaDB in Action demonstrates how to integrate ScyllaDB into data-intensive applications. You’ll work through a hands-on project step by step as you use ScyllaDB to store data and learn to configure, monitor, and safely operate a distributed database. Along the way, you’ll discover how ScyllaDB’s unique “shard per core” approach helps you deliver impressive performance in real-time systems. What's inside • Design schemas for ScyllaDB • Write performant queries • Get an instant speed boost over Cassandra About the reader For backend and infrastructure engineers who know the basics of SQL. About the author Bo Ingram is a staff software engineer at Discord working in database infrastructure. He has extensive experience working with ScyllaDB as an operator and developer. The technical editor on this book was Piotr Wiktor Sarna. Table of Contents Part 1 1 Introducing ScyllaDB 2 Touring ScyllaDB Part 2 3 Data modeling in ScyllaDB 4 Data types in ScyllaDB 5 Tables in ScyllaDB Part 3 6 Writing data to ScyllaDB 7 Reading data from ScyllaDB Part 4 8 ScyllaDB’s architecture 9 Running ScyllaDB in production 10 Application development with ScyllaDB 11 Monitoring ScyllaDB 12 Moving data in bulk with ScyllaDB Appendix Docker

Getting Started with DuckDB

Getting Started with DuckDB PDF Author: Simon Aubury
Publisher: Packt Publishing Ltd
ISBN: 1803232536
Category : Computers
Languages : en
Pages : 382

Get Book Here

Book Description
Analyze and transform data efficiently with DuckDB, a versatile, modern, in-process SQL database Key Features Use DuckDB to rapidly load, transform, and query data across a range of sources and formats Gain practical experience using SQL, Python, and R to effectively analyze data Learn how open source tools and cloud services in the broader data ecosystem complement DuckDB’s versatile capabilities Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionDuckDB is a fast in-process analytical database. Getting Started with DuckDB offers a practical overview of its usage. You'll learn to load, transform, and query various data formats, including CSV, JSON, and Parquet. The book covers DuckDB's optimizations, SQL enhancements, and extensions for specialized applications. Working with examples in SQL, Python, and R, you'll explore analyzing public datasets and discover tools enhancing DuckDB workflows. This guide suits both experienced and new data practitioners, quickly equipping you to apply DuckDB's capabilities in analytical projects. You'll gain proficiency in using DuckDB for diverse tasks, enabling effective integration into your data workflows.What you will learn Understand the properties and applications of a columnar in-process database Use SQL to load, transform, and query a range of data formats Discover DuckDB's rich extensions and learn how to apply them Use nested data types to model semi-structured data and extract and model JSON data Integrate DuckDB into your Python and R analytical workflows Effectively leverage DuckDB's convenient SQL enhancements Explore the wider ecosystem and pathways for building DuckDB-powered data applications Who this book is for If you’re interested in expanding your analytical toolkit, this book is for you. It will be particularly valuable for data analysts wanting to rapidly explore and query complex data, data and software engineers looking for a lean and versatile data processing tool, along with data scientists needing a scalable data manipulation library that integrates seamlessly with Python and R. You will get the most from this book if you have some familiarity with SQL and foundational database concepts, as well as exposure to a programming language such as Python or R.

Quantum Computing in Action

Quantum Computing in Action PDF Author: Johan Vos
Publisher: Simon and Schuster
ISBN: 1617296325
Category : Computers
Languages : en
Pages : 262

Get Book Here

Book Description
Quantum computing is on the horizon, ready to impact everything from scientific research to encryption and security. But you don't need a physics degree to get started in quantum computing. Quantum Computing for Developers shows you how to leverage your existing Java skills into writing your first quantum software so you're ready for the revolution. Rather than a hardware manual or academic theory guide, this book is focused on practical implementations of quantum computing algorithms. Using Strange, a Java-based quantum computer simulator, you'll go hands-on with quantum computing's core components including qubits and quantum gates as you write your very first quantum code. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

Spark: The Definitive Guide

Spark: The Definitive Guide PDF Author: Bill Chambers
Publisher: "O'Reilly Media, Inc."
ISBN: 1491912294
Category : Computers
Languages : en
Pages : 594

Get Book Here

Book Description
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. Youâ??ll explore the basic operations and common functions of Sparkâ??s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Sparkâ??s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasetsâ??Sparkâ??s core APIsâ??through worked examples Dive into Sparkâ??s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Sparkâ??s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation

Spring Data

Spring Data PDF Author: Mark Pollack
Publisher: "O'Reilly Media, Inc."
ISBN: 1449323952
Category : Computers
Languages : en
Pages : 315

Get Book Here

Book Description
You can choose several data access frameworks when building Java enterprise applications that work with relational databases. But what about big data? This hands-on introduction shows you how Spring Data makes it relatively easy to build applications across a wide range of new data access technologies such as NoSQL and Hadoop. Through several sample projects, you’ll learn how Spring Data provides a consistent programming model that retains NoSQL-specific features and capabilities, and helps you develop Hadoop applications across a wide range of use-cases such as data analysis, event stream processing, and workflow. You’ll also discover the features Spring Data adds to Spring’s existing JPA and JDBC support for writing RDBMS-based data access layers. Learn about Spring’s template helper classes to simplify the use of database-specific functionality Explore Spring Data’s repository abstraction and advanced query functionality Use Spring Data with Redis (key/value store), HBase (column-family), MongoDB (document database), and Neo4j (graph database) Discover the GemFire distributed data grid solution Export Spring Data JPA-managed entities to the Web as RESTful web services Simplify the development of HBase applications, using a lightweight object-mapping framework Build example big-data pipelines with Spring Batch and Spring Integration

Learning Spark

Learning Spark PDF Author: Holden Karau
Publisher: "O'Reilly Media, Inc."
ISBN: 1449359051
Category : Computers
Languages : en
Pages : 289

Get Book Here

Book Description
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates. Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning. Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm Learn how to deploy interactive, batch, and streaming applications Connect to data sources including HDFS, Hive, JSON, and S3 Master advanced topics like data partitioning and shared variables

Data Structures and Algorithms in Python

Data Structures and Algorithms in Python PDF Author: Michael T. Goodrich
Publisher: Wiley Global Education
ISBN: 1118476735
Category : Computers
Languages : en
Pages : 770

Get Book Here

Book Description
Based on the authors' market leading data structures books in Java and C++, this book offers a comprehensive, definitive introduction to data structures in Python by authoritative authors. Data Structures and Algorithms in Python is the first authoritative object-oriented book available for Python data structures. Designed to provide a comprehensive introduction to data structures and algorithms, including their design, analysis, and implementation, the text will maintain the same general structure as Data Structures and Algorithms in Java and Data Structures and Algorithms in C++. Begins by discussing Python's conceptually simple syntax, which allows for a greater focus on concepts. Employs a consistent object-oriented viewpoint throughout the text. Presents each data structure using ADTs and their respective implementations and introduces important design patterns as a means to organize those implementations into classes, methods, and objects. Provides a thorough discussion on the analysis and design of fundamental data structures. Includes many helpful Python code examples, with source code provided on the website. Uses illustrations to present data structures and algorithms, as well as their analysis, in a clear, visual manner. Provides hundreds of exercises that promote creativity, help readers learn how to think like programmers, and reinforce important concepts. Contains many Python-code and pseudo-code fragments, and hundreds of exercises, which are divided into roughly 40% reinforcement exercises, 40% creativity exercises, and 20% programming projects.

arc42 by Example

arc42 by Example PDF Author: Dr. Gernot Starke
Publisher: Packt Publishing Ltd
ISBN: 1839219262
Category : Computers
Languages : en
Pages : 236

Get Book Here

Book Description
Document the architecture of your software easily with this highly practical, open-source template. Key FeaturesGet to grips with leveraging the features of arc42 to create insightful documentsLearn the concepts of software architecture documentation through real-world examplesDiscover techniques to create compact, helpful, and easy-to-read documentationBook Description When developers document the architecture of their systems, they often invent their own specific ways of articulating structures, designs, concepts, and decisions. What they need is a template that enables simple and efficient software architecture documentation. arc42 by Example shows how it's done through several real-world examples. Each example in the book, whether it is a chess engine, a huge CRM system, or a cool web system, starts with a brief description of the problem domain and the quality requirements. Then, you'll discover the system context with all the external interfaces. You'll dive into an overview of the solution strategy to implement the building blocks and runtime scenarios. The later chapters also explain various cross-cutting concerns and how they affect other aspects of a program. What you will learnUtilize arc42 to document a system's physical infrastructureLearn how to identify a system's scope and boundariesBreak a system down into building blocks and illustrate the relationships between themDiscover how to describe the runtime behavior of a systemKnow how to document design decisions and their reasonsExplore the risks and technical debt of your systemWho this book is for This book is for software developers and solutions architects who are looking for an easy, open-source tool to document their systems. It is a useful reference for those who are already using arc42. If you are new to arc42, this book is a great learning resource. For those of you who want to write better technical documentation will benefit from the general concepts covered in this book.

Algorithms

Algorithms PDF Author: Sanjoy Dasgupta
Publisher: McGraw-Hill Higher Education
ISBN: 0077388496
Category : Computer algorithms
Languages : en
Pages : 338

Get Book Here

Book Description
This text, extensively class-tested over a decade at UC Berkeley and UC San Diego, explains the fundamentals of algorithms in a story line that makes the material enjoyable and easy to digest. Emphasis is placed on understanding the crisp mathematical idea behind each algorithm, in a manner that is intuitive and rigorous without being unduly formal. Features include:The use of boxes to strengthen the narrative: pieces that provide historical context, descriptions of how the algorithms are used in practice, and excursions for the mathematically sophisticated. Carefully chosen advanced topics that can be skipped in a standard one-semester course but can be covered in an advanced algorithms course or in a more leisurely two-semester sequence.An accessible treatment of linear programming introduces students to one of the greatest achievements in algorithms. An optional chapter on the quantum algorithm for factoring provides a unique peephole into this exciting topic. In addition to the text DasGupta also offers a Solutions Manual which is available on the Online Learning Center."Algorithms is an outstanding undergraduate text equally informed by the historical roots and contemporary applications of its subject. Like a captivating novel it is a joy to read." Tim Roughgarden Stanford University