Author: Hadley Wickham
Publisher: "O'Reilly Media, Inc."
ISBN: 1491910364
Category : Computers
Languages : en
Pages : 521
Book Description
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle—transform your datasets into a form convenient for analysis Program—learn powerful R tools for solving data problems with greater clarity and ease Explore—examine your data, generate hypotheses, and quickly test them Model—provide a low-dimensional summary that captures true "signals" in your dataset Communicate—learn R Markdown for integrating prose, code, and results
R for Data Science
Author: Hadley Wickham
Publisher: "O'Reilly Media, Inc."
ISBN: 1491910364
Category : Computers
Languages : en
Pages : 521
Book Description
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle—transform your datasets into a form convenient for analysis Program—learn powerful R tools for solving data problems with greater clarity and ease Explore—examine your data, generate hypotheses, and quickly test them Model—provide a low-dimensional summary that captures true "signals" in your dataset Communicate—learn R Markdown for integrating prose, code, and results
Publisher: "O'Reilly Media, Inc."
ISBN: 1491910364
Category : Computers
Languages : en
Pages : 521
Book Description
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle—transform your datasets into a form convenient for analysis Program—learn powerful R tools for solving data problems with greater clarity and ease Explore—examine your data, generate hypotheses, and quickly test them Model—provide a low-dimensional summary that captures true "signals" in your dataset Communicate—learn R Markdown for integrating prose, code, and results
Data Science Workflow for Beginners
Author: Alejandro Garcia
Publisher: Alejandro Garcia
ISBN:
Category : Computers
Languages : en
Pages : 62
Book Description
This book brings to you a simple yet effective 40 to 60 mins introduction that will clear all your doubts about Data Sience and will answer some important questions like: What is data Science ? The book explores all the initial concepts a person might want to know about the data science workflow. There’s not coding, math or statistics required to successfully understand the goals and end results of this process. This book takes you on an exclusive tour of datasets and sites to download your first datasets. Then jumps into a comprehensive and easy-to-follow data science process letting you go through 3 data visualization projects. (Python Code Understanding is Recommended for the Data Visualization projects) - 40 to 60 mins reading time. - 3 Data Visualization projects. - 10 Datasets sources. - 26 Quality datasets for your first visualizations. - Get the code and reuse in your own projects. The ebook covers: - Intro to Data Science. - The Workflow of Data Science. - Data Science and Machine Learning. - Datasets to start right away. - Data Visualization Projects. (Python Code Understanding Recommended)
Publisher: Alejandro Garcia
ISBN:
Category : Computers
Languages : en
Pages : 62
Book Description
This book brings to you a simple yet effective 40 to 60 mins introduction that will clear all your doubts about Data Sience and will answer some important questions like: What is data Science ? The book explores all the initial concepts a person might want to know about the data science workflow. There’s not coding, math or statistics required to successfully understand the goals and end results of this process. This book takes you on an exclusive tour of datasets and sites to download your first datasets. Then jumps into a comprehensive and easy-to-follow data science process letting you go through 3 data visualization projects. (Python Code Understanding is Recommended for the Data Visualization projects) - 40 to 60 mins reading time. - 3 Data Visualization projects. - 10 Datasets sources. - 26 Quality datasets for your first visualizations. - Get the code and reuse in your own projects. The ebook covers: - Intro to Data Science. - The Workflow of Data Science. - Data Science and Machine Learning. - Datasets to start right away. - Data Visualization Projects. (Python Code Understanding Recommended)
Data Science from Scratch
Author: Joel Grus
Publisher: "O'Reilly Media, Inc."
ISBN: 1491904399
Category : Computers
Languages : en
Pages : 336
Book Description
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Publisher: "O'Reilly Media, Inc."
ISBN: 1491904399
Category : Computers
Languages : en
Pages : 336
Book Description
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Data Science Solutions
Author: Manav Sehgal
Publisher:
ISBN: 9781520545318
Category :
Languages : en
Pages : 281
Book Description
The field of data science, big data, machine learning, and artificial intelligence is exciting and complex at the same time. Data science is also rapidly growing with new tools, technologies, algorithms, datasets, and use cases. For a beginner in this field, the learning curve can be fairly daunting. This is where this book helps. The data science solutions book provides a repeatable, robust, and reliable framework to apply the right-fit workflows, strategies, tools, APIs, and domain for your data science projects. This book takes a solutions focused approach to data science. Each chapter meets an end-to-end objective of solving for data science workflow or technology requirements. At the end of each chapter you either complete a data science tools pipeline or write a fully functional coding project meeting your data science workflow requirements. SEVEN STAGES OF DATA SCIENCE SOLUTIONS WORKFLOW Every chapter in this book will go through one or more of these seven stages of data science solutions workflow. STAGE 1: Question. Problem. Solution. Before starting a data science project we must ask relevant questions specific to our project domain and datasets. We may answer or solve these during the course of our project. Think of these questions-solutions as the key requirements for our data science project. Here are some templates that can be used to frame questions for our data science projects. Can we classify an entity based on given features if our data science model is trained on certain number of samples with similar features related to specific classes?Do the samples, in a given dataset, cluster in specific classes based on similar or correlated features?Can our machine learning model recognise and classify new inputs based on prior training on a sample of similar inputs?STAGE 2: Acquire. Search. Create. Catalog.This stage involves data acquisition strategies including searching for datasets on popular data sources or internally within your organisation. We may also create a dataset based on external or internal data sources. The acquire stage may feedback to the question stage, refining our problem and solution definition based on the constraints and characteristics of the acquired datasets. STAGE 3: Wrangle. Prepare. Cleanse.The data wrangle phase prepares and cleanses our datasets for our project goals. This workflow stage starts by importing a dataset, exploring the dataset for its features and available samples, preparing the dataset using appropriate data types and data structures, and optionally cleansing the data set for creating model training and solution testing samples. The wrangle stage may circle back to the acquire stage to identify complementary datasets to combine and complete the existing dataset. STAGE 4: Analyse. Patterns. Explore.The analyse phase explores the given datasets to determine patterns, correlations, classification, and nature of the dataset. This helps determine choice of model algorithms and strategies that may work best on the dataset. The analyse stage may also visualize the dataset to determine such patterns. STAGE 5: Model. Predict. Solve.The model stage uses prediction and solution algorithms to train on a given dataset and apply this training to solve for a given problem. STAGE 6: Visualize. Report. Present.The visualization stage can help data wrangling, analysis, and modeling stages. Data can be visualized using charts and plots suiting the characteristics of the dataset and the desired results.Visualization stage may also provide the inputs for the supply stage.STAGE 7: Supply. Products. Services.Once we are ready to monetize our data science solution or derive further return on investment from our projects, we need to think about distribution and data supply chain. This stage circles back to the acquisition stage. In fact we are acquiring data from someone else's data supply chain.
Publisher:
ISBN: 9781520545318
Category :
Languages : en
Pages : 281
Book Description
The field of data science, big data, machine learning, and artificial intelligence is exciting and complex at the same time. Data science is also rapidly growing with new tools, technologies, algorithms, datasets, and use cases. For a beginner in this field, the learning curve can be fairly daunting. This is where this book helps. The data science solutions book provides a repeatable, robust, and reliable framework to apply the right-fit workflows, strategies, tools, APIs, and domain for your data science projects. This book takes a solutions focused approach to data science. Each chapter meets an end-to-end objective of solving for data science workflow or technology requirements. At the end of each chapter you either complete a data science tools pipeline or write a fully functional coding project meeting your data science workflow requirements. SEVEN STAGES OF DATA SCIENCE SOLUTIONS WORKFLOW Every chapter in this book will go through one or more of these seven stages of data science solutions workflow. STAGE 1: Question. Problem. Solution. Before starting a data science project we must ask relevant questions specific to our project domain and datasets. We may answer or solve these during the course of our project. Think of these questions-solutions as the key requirements for our data science project. Here are some templates that can be used to frame questions for our data science projects. Can we classify an entity based on given features if our data science model is trained on certain number of samples with similar features related to specific classes?Do the samples, in a given dataset, cluster in specific classes based on similar or correlated features?Can our machine learning model recognise and classify new inputs based on prior training on a sample of similar inputs?STAGE 2: Acquire. Search. Create. Catalog.This stage involves data acquisition strategies including searching for datasets on popular data sources or internally within your organisation. We may also create a dataset based on external or internal data sources. The acquire stage may feedback to the question stage, refining our problem and solution definition based on the constraints and characteristics of the acquired datasets. STAGE 3: Wrangle. Prepare. Cleanse.The data wrangle phase prepares and cleanses our datasets for our project goals. This workflow stage starts by importing a dataset, exploring the dataset for its features and available samples, preparing the dataset using appropriate data types and data structures, and optionally cleansing the data set for creating model training and solution testing samples. The wrangle stage may circle back to the acquire stage to identify complementary datasets to combine and complete the existing dataset. STAGE 4: Analyse. Patterns. Explore.The analyse phase explores the given datasets to determine patterns, correlations, classification, and nature of the dataset. This helps determine choice of model algorithms and strategies that may work best on the dataset. The analyse stage may also visualize the dataset to determine such patterns. STAGE 5: Model. Predict. Solve.The model stage uses prediction and solution algorithms to train on a given dataset and apply this training to solve for a given problem. STAGE 6: Visualize. Report. Present.The visualization stage can help data wrangling, analysis, and modeling stages. Data can be visualized using charts and plots suiting the characteristics of the dataset and the desired results.Visualization stage may also provide the inputs for the supply stage.STAGE 7: Supply. Products. Services.Once we are ready to monetize our data science solution or derive further return on investment from our projects, we need to think about distribution and data supply chain. This stage circles back to the acquisition stage. In fact we are acquiring data from someone else's data supply chain.
Data Science on AWS
Author: Chris Fregly
Publisher: "O'Reilly Media, Inc."
ISBN: 1492079367
Category : Computers
Languages : en
Pages : 524
Book Description
With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level up your skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance. Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment Tie everything together into a repeatable machine learning operations pipeline Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more
Publisher: "O'Reilly Media, Inc."
ISBN: 1492079367
Category : Computers
Languages : en
Pages : 524
Book Description
With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level up your skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance. Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment Tie everything together into a repeatable machine learning operations pipeline Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more
Confident Data Skills
Author: Kirill Eremenko
Publisher: Kogan Page Publishers
ISBN: 0749481552
Category : Computers
Languages : en
Pages : 272
Book Description
Data has dramatically changed how our world works. From entertainment to politics, from technology to advertising and from science to the business world, understanding and using data is now one of the most transferable and transferable skills out there. Learning how to work with data may seem intimidating or difficult but with Confident Data Skills you will be able to master the fundamentals and supercharge your professional abilities. This essential book covers data mining, preparing data, analysing data, communicating data, financial modelling, visualizing insights and presenting data through film making and dynamic simulations. In-depth international case studies from a wide range of organizations, including Netflix, LinkedIn, Goodreads, Deep Blue, Alpha Go and Mike's Hard Lemonade Co. show successful data techniques in practice and inspire you to turn knowledge into innovation. Confident Data Skills also provides insightful guidance on how you can use data skills to enhance your employability and improve how your industry or company works through your data skills. Expert author and instructor, Kirill Eremenko, is committed to making the complex simple and inspiring you to have the confidence to develop an understanding, adeptness and love of data.
Publisher: Kogan Page Publishers
ISBN: 0749481552
Category : Computers
Languages : en
Pages : 272
Book Description
Data has dramatically changed how our world works. From entertainment to politics, from technology to advertising and from science to the business world, understanding and using data is now one of the most transferable and transferable skills out there. Learning how to work with data may seem intimidating or difficult but with Confident Data Skills you will be able to master the fundamentals and supercharge your professional abilities. This essential book covers data mining, preparing data, analysing data, communicating data, financial modelling, visualizing insights and presenting data through film making and dynamic simulations. In-depth international case studies from a wide range of organizations, including Netflix, LinkedIn, Goodreads, Deep Blue, Alpha Go and Mike's Hard Lemonade Co. show successful data techniques in practice and inspire you to turn knowledge into innovation. Confident Data Skills also provides insightful guidance on how you can use data skills to enhance your employability and improve how your industry or company works through your data skills. Expert author and instructor, Kirill Eremenko, is committed to making the complex simple and inspiring you to have the confidence to develop an understanding, adeptness and love of data.
Data Science at the Command Line
Author: Jeroen Janssens
Publisher: "O'Reilly Media, Inc."
ISBN: 1492087866
Category : Computers
Languages : en
Pages : 270
Book Description
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools--useful whether you work with Windows, macOS, or Linux. You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers. Obtain data from websites, APIs, databases, and spreadsheets Perform scrub operations on text, CSV, HTML, XML, and JSON files Explore data, compute descriptive statistics, and create visualizations Manage your data science workflow Create your own tools from one-liners and existing Python or R code Parallelize and distribute data-intensive pipelines Model data with dimensionality reduction, regression, and classification algorithms Leverage the command line from Python, Jupyter, R, RStudio, and Apache Spark
Publisher: "O'Reilly Media, Inc."
ISBN: 1492087866
Category : Computers
Languages : en
Pages : 270
Book Description
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools--useful whether you work with Windows, macOS, or Linux. You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers. Obtain data from websites, APIs, databases, and spreadsheets Perform scrub operations on text, CSV, HTML, XML, and JSON files Explore data, compute descriptive statistics, and create visualizations Manage your data science workflow Create your own tools from one-liners and existing Python or R code Parallelize and distribute data-intensive pipelines Model data with dimensionality reduction, regression, and classification algorithms Leverage the command line from Python, Jupyter, R, RStudio, and Apache Spark
Data Science at the Command Line
Author: Jeroen Janssens
Publisher: "O'Reilly Media, Inc."
ISBN: 1491947802
Category : Computers
Languages : en
Pages : 207
Book Description
This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started—whether you’re on Windows, OS X, or Linux—author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools. Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line. Obtain data from websites, APIs, databases, and spreadsheets Perform scrub operations on plain text, CSV, HTML/XML, and JSON Explore data, compute descriptive statistics, and create visualizations Manage your data science workflow using Drake Create reusable tools from one-liners and existing Python or R code Parallelize and distribute data-intensive pipelines using GNU Parallel Model data with dimensionality reduction, clustering, regression, and classification algorithms
Publisher: "O'Reilly Media, Inc."
ISBN: 1491947802
Category : Computers
Languages : en
Pages : 207
Book Description
This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started—whether you’re on Windows, OS X, or Linux—author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools. Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line. Obtain data from websites, APIs, databases, and spreadsheets Perform scrub operations on plain text, CSV, HTML/XML, and JSON Explore data, compute descriptive statistics, and create visualizations Manage your data science workflow using Drake Create reusable tools from one-liners and existing Python or R code Parallelize and distribute data-intensive pipelines using GNU Parallel Model data with dimensionality reduction, clustering, regression, and classification algorithms
Model-Based Machine Learning
Author: John Winn
Publisher: CRC Press
ISBN: 1498756824
Category : Business & Economics
Languages : en
Pages : 469
Book Description
Today, machine learning is being applied to a growing variety of problems in a bewildering variety of domains. A fundamental challenge when using machine learning is connecting the abstract mathematics of a machine learning technique to a concrete, real world problem. This book tackles this challenge through model-based machine learning which focuses on understanding the assumptions encoded in a machine learning system and their corresponding impact on the behaviour of the system. The key ideas of model-based machine learning are introduced through a series of case studies involving real-world applications. Case studies play a central role because it is only in the context of applications that it makes sense to discuss modelling assumptions. Each chapter introduces one case study and works through step-by-step to solve it using a model-based approach. The aim is not just to explain machine learning methods, but also showcase how to create, debug, and evolve them to solve a problem. Features: Explores the assumptions being made by machine learning systems and the effect these assumptions have when the system is applied to concrete problems. Explains machine learning concepts as they arise in real-world case studies. Shows how to diagnose, understand and address problems with machine learning systems. Full source code available, allowing models and results to be reproduced and explored. Includes optional deep-dive sections with more mathematical details on inference algorithms for the interested reader.
Publisher: CRC Press
ISBN: 1498756824
Category : Business & Economics
Languages : en
Pages : 469
Book Description
Today, machine learning is being applied to a growing variety of problems in a bewildering variety of domains. A fundamental challenge when using machine learning is connecting the abstract mathematics of a machine learning technique to a concrete, real world problem. This book tackles this challenge through model-based machine learning which focuses on understanding the assumptions encoded in a machine learning system and their corresponding impact on the behaviour of the system. The key ideas of model-based machine learning are introduced through a series of case studies involving real-world applications. Case studies play a central role because it is only in the context of applications that it makes sense to discuss modelling assumptions. Each chapter introduces one case study and works through step-by-step to solve it using a model-based approach. The aim is not just to explain machine learning methods, but also showcase how to create, debug, and evolve them to solve a problem. Features: Explores the assumptions being made by machine learning systems and the effect these assumptions have when the system is applied to concrete problems. Explains machine learning concepts as they arise in real-world case studies. Shows how to diagnose, understand and address problems with machine learning systems. Full source code available, allowing models and results to be reproduced and explored. Includes optional deep-dive sections with more mathematical details on inference algorithms for the interested reader.
Data Science
Author: Tiffany Timbers
Publisher: CRC Press
ISBN: 1000579646
Category : Business & Economics
Languages : en
Pages : 466
Book Description
Data Science: A First Introduction focuses on using the R programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference. The text emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. All source code is available online, demonstrating the use of good reproducible project workflows. Based on educational research and active learning principles, the book uses a modern approach to R and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The book will leave readers well-prepared for data science projects. The book is designed for learners from all disciplines with minimal prior knowledge of mathematics and programming. The authors have honed the material through years of experience teaching thousands of undergraduates in the University of British Columbia’s DSCI100: Introduction to Data Science course.
Publisher: CRC Press
ISBN: 1000579646
Category : Business & Economics
Languages : en
Pages : 466
Book Description
Data Science: A First Introduction focuses on using the R programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference. The text emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. All source code is available online, demonstrating the use of good reproducible project workflows. Based on educational research and active learning principles, the book uses a modern approach to R and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The book will leave readers well-prepared for data science projects. The book is designed for learners from all disciplines with minimal prior knowledge of mathematics and programming. The authors have honed the material through years of experience teaching thousands of undergraduates in the University of British Columbia’s DSCI100: Introduction to Data Science course.