A Sequence to Image Transformation Technique for Anomaly Detection in Drifting Data Streams

Author: Sid Ryan
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
In many real-world applications, the characteristics of data change over time. This behavior is known as concept drift. Maintaining optimal algorithms and their hyperparameters in such applications becomes cumbersome, as models become outdated very quickly. Although the data often consist of one-dimensional streams (e.g. collected from activity logs, sensors and mobile devices), at a higher level the aggregated sources produce multiple streams. Machine learning therefore requires univariate and multivariate analysis of long-term dependencies to create valuable insights. In this thesis, we assess hundreds of combinations of data characteristics and methods on sequential data. In particular, we use real-life anomalous instances from the network traffic domain and, to increase complexity, combine them with synthesized drifting data. Our preliminary evaluation compares the generalization performance of conventional machine learning, meta-learning and deep learning methods in the presence of concept drift; the results show that deep learning outperforms all other tested methods. Although one-dimensional Convolutional Neural Networks (1D-CNN) produced the highest performance in image classification, they, like the other models, can label only whether a sliding window as a whole is anomalous or not. However, in the majority of real-life applications, it is crucial to find the individual instances that resulted in an anomalous pattern. Therefore, we introduce a method that transforms the representation of the data into tensors of two-dimensional images, making modern deep learning methods directly applicable to sequential data. We propose the Sequential Mask Convolutional Neural Network (SMCNN), which pinpoints the location of anomalous patterns. The SMCNN model transforms sequential data by means of a specialized filter that produces flexible shape forms and detects multiple types of outliers simultaneously.
In addition, to address the high ratio of false positives produced by unsupervised Generative Adversarial Networks (GAN) under concept drift, we introduce a method for finding optimal sliding windows that automatically removes normal repetitive patterns. We introduce the DriftGAN architecture, which discriminates between normal and anomalous patterns. Our SMCNN and DriftGAN methods significantly outperform prior endeavours and generalize well across a wide array of one-dimensional data characteristics of a repetitive nature.
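The description above does not spell out how a sequence is turned into an image. As a generic illustration only (not the thesis's SMCNN filter), one common way to give a one-dimensional stream a two-dimensional shape is to min-max scale it and stack overlapping sliding windows as rows. The function name `sequence_to_image` and the `window` and `stride` parameters are hypothetical names chosen for this sketch.

```python
def sequence_to_image(series, window, stride=1):
    """Stack overlapping windows of a 1-D series as rows of a 2-D grid.

    Illustrative assumption, not the method from the thesis: values are
    min-max scaled to [0, 1] so each cell can be read as a pixel intensity.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(series) < window:
        raise ValueError("series is shorter than the window")

    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant series
    scaled = [(x - lo) / span for x in series]

    # Each row is one window; consecutive rows start `stride` samples apart.
    last_start = len(scaled) - window
    return [scaled[i:i + window] for i in range(0, last_start + 1, stride)]


# A short series with one spike (the value 10).
image = sequence_to_image([0, 1, 2, 3, 10, 3, 2, 1], window=4, stride=2)
for row in image:
    print(row)
```

In this layout the spike appears in more than one row, at a different column each time, which is what lets a two-dimensional model relate a window-level pattern back to a position in the original stream.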

Anomaly Detection and Complex Event Processing Over IoT Data Streams

Author: Patrick Schneider
Publisher: Academic Press
ISBN: 0128238194
Category : Computers
Languages : en
Pages : 408

Book Description
Anomaly Detection and Complex Event Processing over IoT Data Streams: With Application to eHealth and Patient Data Monitoring presents advanced processing techniques for IoT data streams and anomaly detection algorithms that operate over them. The book brings new advances and generalized techniques for processing IoT data streams, semantic data enrichment with contextual information at Edge, Fog and Cloud, and complex event processing in IoT applications. It comprises fundamental models, concepts and algorithms, architectures and technological solutions, as well as their application to eHealth. Case studies such as biometric signal stream processing are presented: the massive amount of raw ECG signals from the sensors is processed dynamically across the data pipeline and classified with modern machine learning approaches, including Hierarchical Temporal Memory and deep learning algorithms. The book discusses adaptive solutions to IoT stream processing that can be extended to different use cases from different fields of eHealth, enabling complex analysis of patient data in historical, predictive and even prescriptive application scenarios. The book ends with a discussion of ethics, emerging research trends, issues and challenges of IoT data stream processing.
- Provides the state-of-the-art in IoT Data Stream Processing, Semantic Data Enrichment, Reasoning and Knowledge
- Covers extraction (Anomaly Detection)
- Illustrates new, scalable and reliable processing techniques based on IoT stream technologies
- Offers applications to new, real-time anomaly detection scenarios in the health domain

Optimum-Path Forest

Author: Alexandre Xavier Falcao
Publisher: Elsevier
ISBN: 0128226889
Category : Computers
Languages : en
Pages : 244

Book Description
Optimum-Path Forest: Theory, Algorithms, and Applications was first published in 2008 in its supervised and unsupervised versions with applications in medicine and image classification. Since then, it has expanded to a variety of other applications such as remote sensing, electrical and petroleum engineering, and biology. In recent years, multi-label and semi-supervised versions were also developed to handle video classification problems. The book presents the principles, algorithms and applications of Optimum-Path Forest, giving the theory and state-of-the-art as well as insights into future directions.
- Presents the first book on Optimum-Path Forest
- Shows how it can be used with Deep Learning
- Gives a wide range of applications
- Includes the methods, underlying theory and applications of Optimum-Path Forest (OPF)

Knowledge Discovery from Data Streams

Author: Joao Gama
Publisher: CRC Press
ISBN: 1439826129
Category : Business & Economics
Languages : en
Pages : 256

Book Description
Since the beginning of the Internet age and the increased use of ubiquitous computing devices, the large volume and continuous flow of distributed data have imposed new constraints on the design of learning algorithms. Exploring how to extract knowledge structures from evolving and time-changing data, Knowledge Discovery from Data Streams presents

Neural Information Processing

Author: Teddy Mantoro
Publisher: Springer Nature
ISBN: 303092307X
Category : Computers
Languages : en
Pages : 802

Book Description
The two-volume set CCIS 1516 and 1517 constitutes the thoroughly refereed short papers presented at the 28th International Conference on Neural Information Processing, ICONIP 2021, held in Sanur, Bali, Indonesia, in December 2021.* The volume also presents papers from the workshop on Artificial Intelligence and Cyber Security, held during ICONIP 2021. The 176 short and workshop papers presented in this volume were carefully reviewed and selected for publication out of 1093 submissions. The papers are organized in topical sections as follows: theory and algorithms; AI and cybersecurity; cognitive neurosciences; human centred computing; advances in deep and shallow machine learning algorithms for biomedical data and imaging; reliable, robust, and secure machine learning algorithms; theory and applications of natural computing paradigms; applications. * The conference was held virtually due to the COVID-19 pandemic.

Machine Learning for Data Streams

Author: Albert Bifet
Publisher: MIT Press
ISBN: 0262346052
Category : Computers
Languages : en
Pages : 255

Book Description
A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.

Outlier Detection for Temporal Data

Author: Manish Gupta
Publisher: Springer
ISBN: 9783031007774
Category : Computers
Languages : en
Pages : 110

Book Description
Outlier (or anomaly) detection is a very broad field which has been studied in the context of a large number of research areas like statistics, data mining, sensor networks, environmental science, distributed systems, spatio-temporal mining, etc. Initial research in outlier detection focused on time series-based outliers (in statistics). Since then, outlier detection has been studied on a large variety of data types including high-dimensional data, uncertain data, stream data, network data, time series data, spatial data, and spatio-temporal data. While there have been many tutorials and surveys for general outlier detection, we focus on outlier detection for temporal data in this book. A large number of applications generate temporal datasets. For example, in our everyday life, various kinds of records like credit, personnel, financial, judicial, medical, etc., are all temporal. This stresses the need for an organized and detailed study of outliers with respect to such temporal data. In the past decade, there has been a lot of research on various forms of temporal data including consecutive data snapshots, series of data snapshots and data streams. Besides the initial work on time series, researchers have focused on rich forms of data including multiple data streams, spatio-temporal data, network data, community distribution data, etc. Compared to general outlier detection, techniques for temporal outlier detection are very different. In this book, we present an organized picture of both recent and past research in temporal outlier detection. We start with the basics and then build up to the main ideas in state-of-the-art outlier detection techniques. We motivate the importance of temporal outlier detection and outline the challenges beyond usual outlier detection. Then, we lay out a taxonomy of proposed techniques for temporal outlier detection.
Such techniques broadly include statistical techniques (like AR models, Markov models, histograms, neural networks), distance- and density-based approaches, grouping-based approaches (clustering, community detection), network-based approaches, and spatio-temporal outlier detection approaches. We conclude by presenting a wide collection of applications where temporal outlier detection techniques have been applied to discover interesting outliers. Table of Contents: Preface / Acknowledgments / Figure Credits / Introduction and Challenges / Outlier Detection for Time Series and Data Sequences / Outlier Detection for Data Streams / Outlier Detection for Distributed Data Streams / Outlier Detection for Spatio-Temporal Data / Outlier Detection for Temporal Network Data / Applications of Outlier Detection for Temporal Data / Conclusions and Research Directions / Bibliography / Authors' Biographies

Machine Learning and Knowledge Discovery in Databases. Research Track

Author: Nuria Oliver
Publisher: Springer Nature
ISBN: 3030864863
Category : Computers
Languages : en
Pages : 838

Book Description
The multi-volume set LNAI 12975 to 12979 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021, which was held during September 13-17, 2021. The conference was originally planned to take place in Bilbao, Spain, but changed to an online event due to the COVID-19 pandemic. The 210 full papers presented in these proceedings were carefully reviewed and selected from a total of 869 submissions. The volumes are organized in topical sections as follows: Research Track: Part I: Online learning; reinforcement learning; time series, streams, and sequence models; transfer and multi-task learning; semi-supervised and few-shot learning; learning algorithms and applications. Part II: Generative models; algorithms and learning theory; graphs and networks; interpretation, explainability, transparency, safety. Part III: Generative models; search and optimization; supervised learning; text mining and natural language processing; image processing, computer vision and visual analytics. Applied Data Science Track: Part IV: Anomaly detection and malware; spatio-temporal data; e-commerce and finance; healthcare and medical applications (including Covid); mobility and transportation. Part V: Automating machine learning, optimization, and feature engineering; machine learning based simulations and knowledge discovery; recommender systems and behavior modeling; natural language processing; remote sensing, image and video processing; social media.

Outlier Ensembles

Author: Charu C. Aggarwal
Publisher: Springer
ISBN: 3319547658
Category : Computers
Languages : en
Pages : 288

Book Description
This book discusses a variety of methods for outlier ensembles and organizes them by the specific principles with which accuracy improvements are achieved. In addition, it covers the techniques with which such methods can be made more effective. A formal classification of these methods is provided, and the circumstances in which they work well are examined. The authors cover how outlier ensembles relate (both theoretically and practically) to the ensemble techniques commonly used for other data mining problems such as classification. The similarities and (subtle) differences in the ensemble techniques for the classification and outlier detection problems are explored. These subtle differences do impact the design of ensemble algorithms for the latter problem. This book can be used for courses in data mining and related curricula. Many illustrative examples and exercises are provided to facilitate classroom teaching. Familiarity is assumed with the outlier detection problem and with the generic problem of ensemble analysis in classification, because many of the ensemble methods discussed in this book are adaptations of their counterparts in the classification domain. Some techniques explained in this book, such as wagging, randomized feature weighting, and geometric subsampling, provide new insights that are not available elsewhere. Also included is an analysis of the performance of various types of base detectors and their relative effectiveness. The book is valuable for researchers and practitioners who want to leverage ensemble methods in optimal algorithmic design.

Conformal Prediction for Reliable Machine Learning

Author: Vineeth Balasubramanian
Publisher: Newnes
ISBN: 0124017150
Category : Computers
Languages : en
Pages : 323

Book Description
The conformal prediction framework is a recent development in machine learning that can associate a reliable measure of confidence with a prediction in any real-world pattern recognition application, including risk-sensitive applications such as medical diagnosis, face recognition, and financial risk prediction. Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications captures the basic theory of the framework, demonstrates how to apply it to real-world problems, and presents several adaptations, including active learning, change detection, and anomaly detection. As practitioners and researchers around the world apply and adapt the framework, this edited volume brings together these bodies of work, providing a springboard for further research as well as a handbook for application in real-world problems.
- Understand the theoretical foundations of this important framework that can provide a reliable measure of confidence with predictions in machine learning
- Be able to apply this framework to real-world problems in different machine learning settings, including classification, regression, and clustering
- Learn effective ways of adapting the framework to newer problem settings, such as active learning, model selection, or change detection
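To make the idea of a "reliable measure of confidence" concrete, the sketch below computes a standard inductive conformal p-value. It is a generic illustration rather than code from the book: the function name `conformal_p_value` and the example scores are hypothetical, and the nonconformity scores are assumed to have been produced elsewhere by some scoring rule, with higher meaning stranger.

```python
def conformal_p_value(calibration_scores, test_score):
    """Inductive conformal p-value for one test example.

    Counts how many held-out calibration scores are at least as
    nonconforming as the test score; the +1 terms count the test
    example itself.
    """
    n = len(calibration_scores)
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (n + 1)


# Hypothetical calibration scores (higher means stranger).
calibration = [0.1, 0.2, 0.3, 0.4]

typical = conformal_p_value(calibration, 0.35)  # only 0.4 is as strange
extreme = conformal_p_value(calibration, 5.0)   # nothing is as strange
print(typical, extreme)
```

A small p-value means the example looks unlike the calibration data. For anomaly detection, one would flag examples whose p-value falls below a chosen significance level.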