Applied Data Science
Author: Martin Braschler
Publisher: Springer
ISBN: 3030118215
Category : Computers
Languages : en
Pages : 464
Book Description
This book has two main goals: to define data science through the work of data scientists and their results, namely data products, while simultaneously providing the reader with relevant lessons learned from applied data science projects at the intersection of academia and industry. As such, it is not a replacement for a classical textbook (i.e., it does not elaborate on fundamentals of methods and principles described elsewhere), but systematically highlights the connection between theory, on the one hand, and its application in specific use cases, on the other. With these goals in mind, the book is divided into three parts: Part I pays tribute to the interdisciplinary nature of data science and provides a common understanding of data science terminology for readers with different backgrounds. These six chapters are geared towards drawing a consistent picture of data science and were predominantly written by the editors themselves. Part II then broadens the spectrum by presenting views and insights from diverse authors – some from academia and some from industry, ranging from financial to health and from manufacturing to e-commerce. Each of these chapters describes a fundamental principle, method or tool in data science by analyzing specific use cases and drawing concrete conclusions from them. The case studies presented, and the methods and tools applied, represent the nuts and bolts of data science. Finally, Part III was again written from the perspective of the editors and summarizes the lessons learned that have been distilled from the case studies in Part II. The section can be viewed as a meta-study on data science across a broad range of domains, viewpoints and fields. Moreover, it provides answers to the question of what the mission-critical factors for success in different data science undertakings are. 
The book targets professionals as well as students of data science: first, practicing data scientists in industry and academia who want to broaden their scope and expand their knowledge by drawing on the authors’ combined experience. Second, decision makers in businesses who face the challenge of creating or implementing a data-driven strategy and who want to learn from success stories spanning a range of industries. Third, students of data science who want to understand both the theoretical and practical aspects of data science, vetted by real-world case studies at the intersection of academia and industry.
Product Analytics
Author: Joanne Rodrigues
Publisher: Addison-Wesley Professional
ISBN: 0135258634
Category : Computers
Languages : en
Pages : 735
Book Description
Use Product Analytics to Understand Consumer Behavior and Change It at Scale
Product Analytics is a complete, hands-on guide to generating actionable business insights from customer data. Experienced data scientist and enterprise manager Joanne Rodrigues introduces practical statistical techniques for determining why things happen and how to change what people do at scale. She complements these with powerful social science techniques for creating better theories, designing better metrics, and driving more rapid and sustained behavior change. Writing for entrepreneurs, product managers/marketers, and other business practitioners, Rodrigues teaches through intuitive examples from both web and offline environments. Avoiding math-heavy explanations, she guides you step by step through choosing the right techniques and algorithms for each application, running analyses in R, and getting answers you can trust.
- Develop core metrics and effective KPIs for user analytics in any web product
- Truly understand statistical inference, and the differences between correlation and causation
- Conduct more effective A/B tests
- Build intuitive predictive models to capture user behavior in products
- Use modern, quasi-experimental designs and statistical matching to tease out causal effects from observational data
- Improve response through uplift modeling and other sophisticated targeting methods
- Project business costs/subgroup population changes via advanced demographic projection
Whatever your product or service, this guide can help you create precision-targeted marketing campaigns, improve consumer satisfaction and engagement, and grow revenue and profits. Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.
Applied Data Science in Tourism
Author: Roman Egger
Publisher: Springer Nature
ISBN: 3030883892
Category : Business & Economics
Languages : en
Pages : 647
Book Description
Access to large data sets has led to a paradigm shift in the tourism research landscape. Big data is enabling a new form of knowledge gain, while at the same time shaking the epistemological foundations and requiring new methods and analysis approaches. It allows for interdisciplinary cooperation between computer sciences and social and economic sciences, and complements the traditional research approaches. This book provides a broad basis for the practical application of data science approaches such as machine learning, text mining, social network analysis, and many more, which are essential for interdisciplinary tourism research. Each method is presented in principle and viewed analytically; its advantages and disadvantages are weighed up, and typical fields of application are presented. The correct methodical application is presented with a "how-to" approach, together with code examples, allowing a wider reader base including researchers, practitioners, and students entering the field.
"The book is a very well-structured introduction to data science – not only in tourism – and its methodological foundations, accompanied by well-chosen practical cases. It underlines an important insight: data are only representations of reality, you need methodological skills and domain background to derive knowledge from them." - Hannes Werthner, Vienna University of Technology
"Roman Egger has accomplished a difficult but necessary task: make clear how data science can practically support and foster travel and tourism research and applications. The book offers a well-taught collection of chapters giving a comprehensive and deep account of AI and data science for tourism." - Francesco Ricci, Free University of Bozen-Bolzano
"This well-structured and easy-to-read book provides a comprehensive overview of data science in tourism. It contributes largely to the methodological repository beyond traditional methods." - Rob Law, University of Macau
Applied Data Analysis and Modeling for Energy Engineers and Scientists
Author: T. Agami Reddy
Publisher: Springer Science & Business Media
ISBN: 1441996133
Category : Technology & Engineering
Languages : en
Pages : 446
Book Description
Applied Data Analysis and Modeling for Energy Engineers and Scientists fills an identified gap in engineering and science education and practice for both students and practitioners. It demonstrates how to apply concepts and methods learned in disparate courses such as mathematical modeling, probability, statistics, experimental design, regression, model building, optimization, risk analysis and decision-making to actual engineering processes and systems. The text provides a formal structure that offers a basic, broad and unified perspective, while imparting the knowledge, skills and confidence to work in data analysis and modeling. This volume uses numerous solved examples, published case studies from the author’s own research, and well-conceived problems in order to enhance comprehension levels among readers and their understanding of the “processes” along with the tools.
Data Science Applied to Sustainability Analysis
Author: Jennifer Dunn
Publisher: Elsevier
ISBN: 0128179775
Category : Science
Languages : en
Pages : 312
Book Description
Data Science Applied to Sustainability Analysis focuses on the methodological considerations associated with applying data science in analysis techniques such as lifecycle assessment and materials flow analysis. Sustainability analysts need examples of big data applications that are defensible and practical in sustainability analyses and that yield actionable results to inform policy development, corporate supply chain management strategy, or non-governmental organization positions; this book helps answer the underlying questions. In addition, it addresses the need of data science experts looking for routes to apply their skills and knowledge to domain areas.
- Presents data sources that are available for application in sustainability analyses, such as market information, environmental monitoring data, social media data and satellite imagery
- Includes considerations sustainability analysts must evaluate when applying big data
- Features case studies illustrating the application of data science in sustainability analyses
Handbook of Research on Applied Data Science and Artificial Intelligence in Business and Industry
Author: Chkoniya, Valentina
Publisher: IGI Global
ISBN: 1799869865
Category : Computers
Languages : en
Pages : 653
Book Description
The contemporary world lives on the data produced at an unprecedented speed through social networks and the internet of things (IoT). Data has been called the new global currency, and its rise is transforming entire industries, providing a wealth of opportunities. Applied data science research is necessary to derive useful information from big data and to use it effectively and efficiently to solve real-world problems. A broad analytical set allied with strong business logic is fundamental in today’s corporations. Organizations work to obtain competitive advantage by analyzing the data produced within and outside their organizational limits to support their decision-making processes. This book aims to provide an overview of the concepts, tools, and techniques behind the fields of data science and artificial intelligence (AI) applied to business and industries. The Handbook of Research on Applied Data Science and Artificial Intelligence in Business and Industry discusses all stages from data science to AI and their application to real problems across industries, from science and engineering to academia and commerce. This book brings together practice and science to build successful data solutions, showing how to uncover hidden patterns and leverage them to improve all aspects of business performance by making sense of data from both web and offline environments. Covering topics including applied AI, consumer behavior analytics, and machine learning, this text is essential for data scientists, IT specialists, managers, executives, software and computer engineers, researchers, practitioners, academicians, and students.
Applying Data Science
Author: Gerhard Svolba
Publisher: SAS Institute
ISBN: 1635260566
Category : Computers
Languages : en
Pages : 490
Book Description
See how data science can answer the questions your business faces! Applying Data Science: Business Case Studies Using SAS, by Gerhard Svolba, shows you the benefits of analytics, how to gain more insight into your data, and how to make better decisions. In eight entertaining and real-world case studies, Svolba combines data science and advanced analytics with business questions, illustrating them with data and SAS code. The case studies span a variety of fields and include headcount survival analysis for employee retention, forecasting the demand for new projects, and Monte Carlo simulation to understand outcome distribution, among other topics. The data science methods covered include Kaplan-Meier estimates, Cox Proportional Hazard Regression, ARIMA models, Poisson regression, imputation of missing values, variable clustering, and much more! Written for business analysts, statisticians, data miners, data scientists, and SAS programmers, Applying Data Science bridges the gap between high-level, business-focused books that skimp on the details and technical books that only show SAS code with no business context.
Applied Data Science with Python and Jupyter
Author: Alex Galea
Publisher: Packt Publishing Ltd
ISBN: 1789951925
Category : Computers
Languages : en
Pages : 192
Book Description
Become the master player of data exploration by creating reproducible data processing pipelines, visualizations, and prediction models for your applications.
Key Features
- Get up and running with the Jupyter ecosystem and some example datasets
- Learn about key machine learning concepts such as SVM, KNN classifiers, and Random Forests
- Discover how you can use web scraping to gather and parse your own bespoke datasets
Getting started with data science doesn't have to be an uphill battle. Applied Data Science with Python and Jupyter is a step-by-step guide ideal for beginners who know a little Python and are looking for a quick, fast-paced introduction to these concepts. In this book, you'll learn every aspect of the standard data workflow process, including collecting, cleaning, investigating, visualizing, and modeling data. You'll start with the basics of Jupyter, which will be the backbone of the book. After familiarizing yourself with its standard features, you'll look at an example of it in practice with your first analysis. In the next lesson, you'll dive right into predictive analytics, where multiple classification algorithms are implemented. Finally, the book ends by looking at data collection techniques. You'll see how web data can be acquired with scraping techniques and via APIs, and then briefly explore interactive visualizations.
What you will learn
- Get up and running with the Jupyter ecosystem
- Identify potential areas of investigation and perform exploratory data analysis
- Plan a machine learning classification strategy and train classification models
- Use validation curves and dimensionality reduction to tune and enhance your models
- Scrape tabular data from web pages and transform it into Pandas DataFrames
- Create interactive, web-friendly visualizations to clearly communicate your findings
Who this book is for
Applied Data Science with Python and Jupyter is ideal for professionals with a variety of job descriptions across a large range of industries, given the rising popularity and accessibility of data science. You'll need some prior experience with Python, with any prior work with libraries such as Pandas and Matplotlib providing you a useful head start.
Applying Data Science
Author: Arthur K. Kordon
Publisher: Springer
ISBN: 9783030363772
Category : Computers
Languages : en
Pages : 494
Book Description
This book offers practical guidelines on creating value from the application of data science based on selected artificial intelligence methods. In Part I, the author introduces a problem-driven approach to implementing AI-based data science and offers practical explanations of key technologies: machine learning, deep learning, decision trees and random forests, evolutionary computation, swarm intelligence, and intelligent agents. In Part II, he describes the main steps in creating AI-based data science solutions for business problems, including problem knowledge acquisition, data preparation, data analysis, model development, and model deployment lifecycle. Finally, in Part III the author illustrates the power of AI-based data science with successful applications in manufacturing and business. He also shows how to introduce this technology in a business setting and guides the reader on how to build the appropriate infrastructure and develop the required skillsets. The book is ideal for data scientists who will implement the proposed methodology and techniques in their projects. It is also intended to help business leaders and entrepreneurs who want to create competitive advantage by using AI-based data science, as well as academics and students looking for an industrial view of this discipline.
Applied Data Science Using PySpark
Author: Ramcharan Kakarla
Publisher: Apress
ISBN: 9781484264997
Category : Computers
Languages : en
Pages : 410
Book Description
Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade. Applied Data Science Using PySpark is divided into six sections. In section 1, you start with the basics of PySpark, focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection, where we demonstrate various selection techniques available in PySpark. In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker or an API. In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines. By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets.
What You Will Learn
Build an end-to-end predictive model
Implement multiple variable selection techniques
Operationalize models
Master multiple algorithms and implementations
Who This Book Is For
Data scientists and machine learning and deep learning engineers who want to learn and use PySpark for real-time analysis of streaming data.