Author: Randall Pruim
Publisher: American Mathematical Soc.
ISBN: 1470428482
Category : Computers
Languages : en
Pages : 842
Book Description
Foundations and Applications of Statistics simultaneously emphasizes both the foundational and the computational aspects of modern statistics. Engaging and accessible, this book is useful to undergraduate students with a wide range of backgrounds and career goals. The exposition immediately begins with statistics, presenting concepts and results from probability along the way. Hypothesis testing is introduced very early, and the motivation for several probability distributions comes from p-value computations. Pruim develops the students' practical statistical reasoning through explicit examples and through numerical and graphical summaries of data that allow intuitive inferences before introducing the formal machinery. The topics have been selected to reflect the current practice in statistics, where computation is an indispensible tool. In this vein, the statistical computing environment R is used throughout the text and is integral to the exposition. Attention is paid to developing students' mathematical and computational skills as well as their statistical reasoning. Linear models, such as regression and ANOVA, are treated with explicit reference to the underlying linear algebra, which is motivated geometrically. Foundations and Applications of Statistics discusses both the mathematical theory underlying statistics and practical applications that make it a powerful tool across disciplines. The book contains ample material for a two-semester course in undergraduate probability and statistics. A one-semester course based on the book will cover hypothesis testing and confidence intervals for the most common situations. In the second edition, the R code has been updated throughout to take advantage of new R packages and to illustrate better coding style. New sections have been added covering bootstrap methods, multinomial and multivariate normal distributions, the delta method, numerical methods for Bayesian inference, and nonlinear least squares. Also, the use of matrix algebra has been expanded, but remains optional, providing instructors with more options regarding the amount of linear algebra required.
Foundations and Applications of Statistics
Author: Randall Pruim
Publisher: American Mathematical Soc.
ISBN: 1470428482
Category : Computers
Languages : en
Pages : 842
Book Description
Foundations and Applications of Statistics simultaneously emphasizes both the foundational and the computational aspects of modern statistics. Engaging and accessible, this book is useful to undergraduate students with a wide range of backgrounds and career goals. The exposition immediately begins with statistics, presenting concepts and results from probability along the way. Hypothesis testing is introduced very early, and the motivation for several probability distributions comes from p-value computations. Pruim develops the students' practical statistical reasoning through explicit examples and through numerical and graphical summaries of data that allow intuitive inferences before introducing the formal machinery. The topics have been selected to reflect the current practice in statistics, where computation is an indispensible tool. In this vein, the statistical computing environment R is used throughout the text and is integral to the exposition. Attention is paid to developing students' mathematical and computational skills as well as their statistical reasoning. Linear models, such as regression and ANOVA, are treated with explicit reference to the underlying linear algebra, which is motivated geometrically. Foundations and Applications of Statistics discusses both the mathematical theory underlying statistics and practical applications that make it a powerful tool across disciplines. The book contains ample material for a two-semester course in undergraduate probability and statistics. A one-semester course based on the book will cover hypothesis testing and confidence intervals for the most common situations. In the second edition, the R code has been updated throughout to take advantage of new R packages and to illustrate better coding style. New sections have been added covering bootstrap methods, multinomial and multivariate normal distributions, the delta method, numerical methods for Bayesian inference, and nonlinear least squares. Also, the use of matrix algebra has been expanded, but remains optional, providing instructors with more options regarding the amount of linear algebra required.
Publisher: American Mathematical Soc.
ISBN: 1470428482
Category : Computers
Languages : en
Pages : 842
Book Description
Foundations and Applications of Statistics simultaneously emphasizes both the foundational and the computational aspects of modern statistics. Engaging and accessible, this book is useful to undergraduate students with a wide range of backgrounds and career goals. The exposition immediately begins with statistics, presenting concepts and results from probability along the way. Hypothesis testing is introduced very early, and the motivation for several probability distributions comes from p-value computations. Pruim develops the students' practical statistical reasoning through explicit examples and through numerical and graphical summaries of data that allow intuitive inferences before introducing the formal machinery. The topics have been selected to reflect the current practice in statistics, where computation is an indispensible tool. In this vein, the statistical computing environment R is used throughout the text and is integral to the exposition. Attention is paid to developing students' mathematical and computational skills as well as their statistical reasoning. Linear models, such as regression and ANOVA, are treated with explicit reference to the underlying linear algebra, which is motivated geometrically. Foundations and Applications of Statistics discusses both the mathematical theory underlying statistics and practical applications that make it a powerful tool across disciplines. The book contains ample material for a two-semester course in undergraduate probability and statistics. A one-semester course based on the book will cover hypothesis testing and confidence intervals for the most common situations. In the second edition, the R code has been updated throughout to take advantage of new R packages and to illustrate better coding style. New sections have been added covering bootstrap methods, multinomial and multivariate normal distributions, the delta method, numerical methods for Bayesian inference, and nonlinear least squares. Also, the use of matrix algebra has been expanded, but remains optional, providing instructors with more options regarding the amount of linear algebra required.
Statistical Foundations of Data Science
Author: Jianqing Fan
Publisher: CRC Press
ISBN: 0429527616
Category : Mathematics
Languages : en
Pages : 974
Book Description
Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies as well as empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account on sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account on high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also introduces thoroughly statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.
Publisher: CRC Press
ISBN: 0429527616
Category : Mathematics
Languages : en
Pages : 974
Book Description
Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies as well as empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account on sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account on high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also introduces thoroughly statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.
Foundations of Statistics for Data Scientists
Author: Alan Agresti
Publisher: CRC Press
ISBN: 1000462919
Category : Business & Economics
Languages : en
Pages : 486
Book Description
Foundations of Statistics for Data Scientists: With R and Python is designed as a textbook for a one- or two-term introduction to mathematical statistics for students training to become data scientists. It is an in-depth presentation of the topics in statistical science with which any data scientist should be familiar, including probability distributions, descriptive and inferential statistical methods, and linear modeling. The book assumes knowledge of basic calculus, so the presentation can focus on "why it works" as well as "how to do it." Compared to traditional "mathematical statistics" textbooks, however, the book has less emphasis on probability theory and more emphasis on using software to implement statistical methods and to conduct simulations to illustrate key concepts. All statistical analyses in the book use R software, with an appendix showing the same analyses with Python. Key Features: Shows the elements of statistical science that are important for students who plan to become data scientists. Includes Bayesian and regularized fitting of models (e.g., showing an example using the lasso), classification and clustering, and implementing methods with modern software (R and Python). Contains nearly 500 exercises. The book also introduces modern topics that do not normally appear in mathematical statistics texts but are highly relevant for data scientists, such as Bayesian inference, generalized linear models for non-normal responses (e.g., logistic regression and Poisson loglinear models), and regularized model fitting. The nearly 500 exercises are grouped into "Data Analysis and Applications" and "Methods and Concepts." Appendices introduce R and Python and contain solutions for odd-numbered exercises. The book's website (http://stat4ds.rwth-aachen.de/) has expanded R, Python, and Matlab appendices and all data sets from the examples and exercises.
Publisher: CRC Press
ISBN: 1000462919
Category : Business & Economics
Languages : en
Pages : 486
Book Description
Foundations of Statistics for Data Scientists: With R and Python is designed as a textbook for a one- or two-term introduction to mathematical statistics for students training to become data scientists. It is an in-depth presentation of the topics in statistical science with which any data scientist should be familiar, including probability distributions, descriptive and inferential statistical methods, and linear modeling. The book assumes knowledge of basic calculus, so the presentation can focus on "why it works" as well as "how to do it." Compared to traditional "mathematical statistics" textbooks, however, the book has less emphasis on probability theory and more emphasis on using software to implement statistical methods and to conduct simulations to illustrate key concepts. All statistical analyses in the book use R software, with an appendix showing the same analyses with Python. Key Features: Shows the elements of statistical science that are important for students who plan to become data scientists. Includes Bayesian and regularized fitting of models (e.g., showing an example using the lasso), classification and clustering, and implementing methods with modern software (R and Python). Contains nearly 500 exercises. The book also introduces modern topics that do not normally appear in mathematical statistics texts but are highly relevant for data scientists, such as Bayesian inference, generalized linear models for non-normal responses (e.g., logistic regression and Poisson loglinear models), and regularized model fitting. The nearly 500 exercises are grouped into "Data Analysis and Applications" and "Methods and Concepts." Appendices introduce R and Python and contain solutions for odd-numbered exercises. The book's website (http://stat4ds.rwth-aachen.de/) has expanded R, Python, and Matlab appendices and all data sets from the examples and exercises.
The Foundations of Statistics: A Simulation-based Approach
Author: Shravan Vasishth
Publisher: Springer Science & Business Media
ISBN: 3642163130
Category : Mathematics
Languages : en
Pages : 187
Book Description
Statistics and hypothesis testing are routinely used in areas (such as linguistics) that are traditionally not mathematically intensive. In such fields, when faced with experimental data, many students and researchers tend to rely on commercial packages to carry out statistical data analysis, often without understanding the logic of the statistical tests they rely on. As a consequence, results are often misinterpreted, and users have difficulty in flexibly applying techniques relevant to their own research — they use whatever they happen to have learned. A simple solution is to teach the fundamental ideas of statistical hypothesis testing without using too much mathematics. This book provides a non-mathematical, simulation-based introduction to basic statistical concepts and encourages readers to try out the simulations themselves using the source code and data provided (the freely available programming language R is used throughout). Since the code presented in the text almost always requires the use of previously introduced programming constructs, diligent students also acquire basic programming abilities in R. The book is intended for advanced undergraduate and graduate students in any discipline, although the focus is on linguistics, psychology, and cognitive science. It is designed for self-instruction, but it can also be used as a textbook for a first course on statistics. Earlier versions of the book have been used in undergraduate and graduate courses in Europe and the US. ”Vasishth and Broe have written an attractive introduction to the foundations of statistics. It is concise, surprisingly comprehensive, self-contained and yet quite accessible. Highly recommended.” Harald Baayen, Professor of Linguistics, University of Alberta, Canada ”By using the text students not only learn to do the specific things outlined in the book, they also gain a skill set that empowers them to explore new areas that lie beyond the book’s coverage.” Colin Phillips, Professor of Linguistics, University of Maryland, USA
Publisher: Springer Science & Business Media
ISBN: 3642163130
Category : Mathematics
Languages : en
Pages : 187
Book Description
Statistics and hypothesis testing are routinely used in areas (such as linguistics) that are traditionally not mathematically intensive. In such fields, when faced with experimental data, many students and researchers tend to rely on commercial packages to carry out statistical data analysis, often without understanding the logic of the statistical tests they rely on. As a consequence, results are often misinterpreted, and users have difficulty in flexibly applying techniques relevant to their own research — they use whatever they happen to have learned. A simple solution is to teach the fundamental ideas of statistical hypothesis testing without using too much mathematics. This book provides a non-mathematical, simulation-based introduction to basic statistical concepts and encourages readers to try out the simulations themselves using the source code and data provided (the freely available programming language R is used throughout). Since the code presented in the text almost always requires the use of previously introduced programming constructs, diligent students also acquire basic programming abilities in R. The book is intended for advanced undergraduate and graduate students in any discipline, although the focus is on linguistics, psychology, and cognitive science. It is designed for self-instruction, but it can also be used as a textbook for a first course on statistics. Earlier versions of the book have been used in undergraduate and graduate courses in Europe and the US. ”Vasishth and Broe have written an attractive introduction to the foundations of statistics. It is concise, surprisingly comprehensive, self-contained and yet quite accessible. Highly recommended.” Harald Baayen, Professor of Linguistics, University of Alberta, Canada ”By using the text students not only learn to do the specific things outlined in the book, they also gain a skill set that empowers them to explore new areas that lie beyond the book’s coverage.” Colin Phillips, Professor of Linguistics, University of Maryland, USA
Foundations and Applications of Statistics
Author: Randall J. Pruim
Publisher: American Mathematical Soc.
ISBN: 0821852337
Category : Computers
Languages : en
Pages : 640
Book Description
Foundations and Applications of Statistics simultaneously emphasizes both the foundational and the computational aspects of modern statistics. Engaging and accessible, this book is useful to undergraduate students with a wide range of backgrounds and career goals. The exposition immediately begins with statistics, presenting concepts and results from probability along the way. Hypothesis testing is introduced very early, and the motivation for several probability distributions comes from p-value computations. Pruim develops the students' practical statistical reasoning through explicit examples and through numerical and graphical summaries of data that allow intuitive inferences before introducing the formal machinery. The topics have been selected to reflect the current practice in statistics, where computation is an indispensible tool. In this vein, the statistical computing environment \mathsf{R} is used throughout the text and is integral to the exposition. Attention is paid to developing students' mathematical and computational skills as well as their statistical reasoning. Linear models, such as regression and ANOVA, are treated with explicit reference to the underlying linear algebra, which is motivated geometrically. Foundations and Applications of Statistics discusses both the mathematical theory underlying statistics and practical applications that make it a powerful tool across disciplines. The book contains ample material for a two-semester course in undergraduate probability and statistics. A one-semester course based on the book will cover hypothesis testing and confidence intervals for the most common situations.
Publisher: American Mathematical Soc.
ISBN: 0821852337
Category : Computers
Languages : en
Pages : 640
Book Description
Foundations and Applications of Statistics simultaneously emphasizes both the foundational and the computational aspects of modern statistics. Engaging and accessible, this book is useful to undergraduate students with a wide range of backgrounds and career goals. The exposition immediately begins with statistics, presenting concepts and results from probability along the way. Hypothesis testing is introduced very early, and the motivation for several probability distributions comes from p-value computations. Pruim develops the students' practical statistical reasoning through explicit examples and through numerical and graphical summaries of data that allow intuitive inferences before introducing the formal machinery. The topics have been selected to reflect the current practice in statistics, where computation is an indispensible tool. In this vein, the statistical computing environment \mathsf{R} is used throughout the text and is integral to the exposition. Attention is paid to developing students' mathematical and computational skills as well as their statistical reasoning. Linear models, such as regression and ANOVA, are treated with explicit reference to the underlying linear algebra, which is motivated geometrically. Foundations and Applications of Statistics discusses both the mathematical theory underlying statistics and practical applications that make it a powerful tool across disciplines. The book contains ample material for a two-semester course in undergraduate probability and statistics. A one-semester course based on the book will cover hypothesis testing and confidence intervals for the most common situations.
Foundations of Statistical Natural Language Processing
Author: Christopher Manning
Publisher: MIT Press
ISBN: 0262303795
Category : Language Arts & Disciplines
Languages : en
Pages : 719
Book Description
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Publisher: MIT Press
ISBN: 0262303795
Category : Language Arts & Disciplines
Languages : en
Pages : 719
Book Description
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Music Data Analysis
Author: Claus Weihs
Publisher: CRC Press
ISBN: 1498719570
Category : Business & Economics
Languages : en
Pages : 694
Book Description
This book provides a comprehensive overview of music data analysis, from introductory material to advanced concepts. It covers various applications including transcription and segmentation as well as chord and harmony, instrument and tempo recognition. It also discusses the implementation aspects of music data analysis such as architecture, user interface and hardware. It is ideal for use in university classes with an interest in music data analysis. It also could be used in computer science and statistics as well as musicology.
Publisher: CRC Press
ISBN: 1498719570
Category : Business & Economics
Languages : en
Pages : 694
Book Description
This book provides a comprehensive overview of music data analysis, from introductory material to advanced concepts. It covers various applications including transcription and segmentation as well as chord and harmony, instrument and tempo recognition. It also discusses the implementation aspects of music data analysis such as architecture, user interface and hardware. It is ideal for use in university classes with an interest in music data analysis. It also could be used in computer science and statistics as well as musicology.
An Introduction to Statistical Learning
Author: Gareth James
Publisher: Springer Nature
ISBN: 3031387473
Category : Mathematics
Languages : en
Pages : 617
Book Description
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance, marketing, and astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. This book is targeted at statisticians and non-statisticians alike, who wish to use cutting-edge statistical learning techniques to analyze their data. Four of the authors co-wrote An Introduction to Statistical Learning, With Applications in R (ISLR), which has become a mainstay of undergraduate and graduate classrooms worldwide, as well as an important reference book for data scientists. One of the keys to its success was that each chapter contains a tutorial on implementing the analyses and methods presented in the R scientific computing environment. However, in recent years Python has become a popular language for data science, and there has been increasing demand for a Python-based alternative to ISLR. Hence, this book (ISLP) covers the same materials as ISLR but with labs implemented in Python. These labs will be useful both for Python novices, as well as experienced users.
Publisher: Springer Nature
ISBN: 3031387473
Category : Mathematics
Languages : en
Pages : 617
Book Description
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance, marketing, and astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. This book is targeted at statisticians and non-statisticians alike, who wish to use cutting-edge statistical learning techniques to analyze their data. Four of the authors co-wrote An Introduction to Statistical Learning, With Applications in R (ISLR), which has become a mainstay of undergraduate and graduate classrooms worldwide, as well as an important reference book for data scientists. One of the keys to its success was that each chapter contains a tutorial on implementing the analyses and methods presented in the R scientific computing environment. However, in recent years Python has become a popular language for data science, and there has been increasing demand for a Python-based alternative to ISLR. Hence, this book (ISLP) covers the same materials as ISLR but with labs implemented in Python. These labs will be useful both for Python novices, as well as experienced users.
Probabilistic Foundations of Statistical Network Analysis
Author: Harry Crane
Publisher: CRC Press
ISBN: 1351807331
Category : Business & Economics
Languages : en
Pages : 257
Book Description
Probabilistic Foundations of Statistical Network Analysis presents a fresh and insightful perspective on the fundamental tenets and major challenges of modern network analysis. Its lucid exposition provides necessary background for understanding the essential ideas behind exchangeable and dynamic network models, network sampling, and network statistics such as sparsity and power law, all of which play a central role in contemporary data science and machine learning applications. The book rewards readers with a clear and intuitive understanding of the subtle interplay between basic principles of statistical inference, empirical properties of network data, and technical concepts from probability theory. Its mathematically rigorous, yet non-technical, exposition makes the book accessible to professional data scientists, statisticians, and computer scientists as well as practitioners and researchers in substantive fields. Newcomers and non-quantitative researchers will find its conceptual approach invaluable for developing intuition about technical ideas from statistics and probability, while experts and graduate students will find the book a handy reference for a wide range of new topics, including edge exchangeability, relative exchangeability, graphon and graphex models, and graph-valued Levy process and rewiring models for dynamic networks. The author’s incisive commentary supplements these core concepts, challenging the reader to push beyond the current limitations of this emerging discipline. With an approachable exposition and more than 50 open research problems and exercises with solutions, this book is ideal for advanced undergraduate and graduate students interested in modern network analysis, data science, machine learning, and statistics.
Publisher: CRC Press
ISBN: 1351807331
Category : Business & Economics
Languages : en
Pages : 257
Book Description
Probabilistic Foundations of Statistical Network Analysis presents a fresh and insightful perspective on the fundamental tenets and major challenges of modern network analysis. Its lucid exposition provides necessary background for understanding the essential ideas behind exchangeable and dynamic network models, network sampling, and network statistics such as sparsity and power law, all of which play a central role in contemporary data science and machine learning applications. The book rewards readers with a clear and intuitive understanding of the subtle interplay between basic principles of statistical inference, empirical properties of network data, and technical concepts from probability theory. Its mathematically rigorous, yet non-technical, exposition makes the book accessible to professional data scientists, statisticians, and computer scientists as well as practitioners and researchers in substantive fields. Newcomers and non-quantitative researchers will find its conceptual approach invaluable for developing intuition about technical ideas from statistics and probability, while experts and graduate students will find the book a handy reference for a wide range of new topics, including edge exchangeability, relative exchangeability, graphon and graphex models, and graph-valued Levy process and rewiring models for dynamic networks. The author’s incisive commentary supplements these core concepts, challenging the reader to push beyond the current limitations of this emerging discipline. With an approachable exposition and more than 50 open research problems and exercises with solutions, this book is ideal for advanced undergraduate and graduate students interested in modern network analysis, data science, machine learning, and statistics.
Foundations of Data Science
Author: Avrim Blum
Publisher: Cambridge University Press
ISBN: 1108617360
Category : Computers
Languages : en
Pages : 433
Book Description
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
Publisher: Cambridge University Press
ISBN: 1108617360
Category : Computers
Languages : en
Pages : 433
Book Description
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.