Inference and Estimation in High-dimensional Data Analysis

Author: Adel Javanmard
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Modern technologies generate vast amounts of fine-grained data at an unprecedented speed. Nowadays, high-dimensional data, where the number of variables is much larger than the sample size, occur in many applications, such as healthcare, social networks, and recommendation systems, among others. The ubiquitous interest in these applications has spurred remarkable progress in the area of high-dimensional data analysis in terms of point estimation and computation. However, one of the fundamental inference tasks, namely quantifying uncertainty or assessing statistical significance, is still in its infancy for such models. In the first part of this dissertation, we present efficient procedures and corresponding theory for constructing classical uncertainty measures like confidence intervals and p-values for single regression coefficients in high-dimensional settings. In the second part, we study the compressed sensing reconstruction problem, a well-known example of estimation in high-dimensional settings. We propose a new approach to this problem that is drastically different from the classical wisdom in this area. Our construction of the sensing matrix is inspired by the idea of spatial coupling in coding theory and similar ideas in statistical physics. For reconstruction, we use an approximate message passing algorithm. This is an iterative algorithm that takes advantage of the statistical properties of the problem to improve the convergence rate. Finally, we prove that our method can effectively solve the reconstruction problem at the (information-theoretically) optimal undersampling rate and show its robustness to measurement noise.

Statistics for High-Dimensional Data

Author: Peter Bühlmann
Publisher: Springer Science & Business Media
ISBN: 364220192X
Category : Mathematics
Languages : en
Pages : 568

Book Description
Modern statistics deals with large and complex data sets, and consequently with models containing a large number of parameters. This book presents a detailed account of recently developed approaches, including the Lasso and versions of it for various models, boosting methods, undirected graphical modeling, and procedures controlling false positive selections. A special characteristic of the book is that it contains comprehensive mathematical theory on high-dimensional statistics combined with methodology, algorithms and illustrations with real data examples. This in-depth approach highlights the methods’ great potential and practical applicability in a variety of settings. As such, it is a valuable resource for researchers, graduate students and experts in statistics, applied mathematics and computer science.

High-dimensional Data Analysis

Author: Tianwen Tony Cai
Publisher: World Scientific Publishing Company Incorporated
ISBN: 9789814324854
Category : Mathematics
Languages : en
Pages : 307

Book Description
Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, classification, dimension reduction, as well as applications in survival analysis and biomedical research. The book will appeal to graduate students and new researchers interested in the plethora of opportunities available in high-dimensional data analysis.

Statistical Inference from High Dimensional Data

Author: Carlos Fernandez-Lozano
Publisher: MDPI
ISBN: 3036509445
Category : Science
Languages : en
Pages : 314

Book Description
• Real-world problems can be high-dimensional, complex, and noisy.
• More data does not imply more information.
• Different approaches deal with the so-called curse of dimensionality to reduce irrelevant information.
• A process with multidimensional information is not necessarily easy to interpret or to process.
• In some real-world applications, one class has clearly fewer elements than the other. Models tend to assume that the majority class is the one that matters for the analysis, which is usually not the case.
• The analysis of complex diseases such as cancer focuses on multidimensional omic data.
• The increasing amount of data, made possible by the falling cost of high-throughput experiments, opens up a new era for integrative data-driven approaches.
• Entropy-based approaches are of interest for reducing the dimensionality of high-dimensional data.

Introduction to High-Dimensional Statistics

Author: Christophe Giraud
Publisher: CRC Press
ISBN: 1000408353
Category : Computers
Languages : en
Pages : 410

Book Description
Praise for the first edition: "[This book] succeeds singularly at providing a structured introduction to this active field of research. ... it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. ... recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research." (Journal of the American Statistical Association)

Introduction to High-Dimensional Statistics, Second Edition preserves the philosophy of the first edition: to be a concise guide for students and researchers discovering the area and interested in the mathematics involved. The main concepts and ideas are presented in simple settings, thereby avoiding unessential technicalities. High-dimensional statistics is a fast-evolving field, and much progress has been made on a large variety of topics, providing new insights and methods. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this new edition:
• Offers revised chapters from the previous edition, with additional material on important topics including compressed sensing, estimation with convex constraints, the slope estimator, simultaneously low-rank and row-sparse linear regression, and aggregation of a continuous set of estimators.
• Introduces three new chapters on iterative algorithms, clustering, and minimax lower bounds.
• Provides enhanced appendices, mainly with the addition of the Davis-Kahan perturbation bound and of two simple versions of the Hanson-Wright concentration inequality.
• Covers cutting-edge statistical methods including model selection, sparsity and the Lasso, iterative hard thresholding, aggregation, support vector machines, and learning theory.
• Provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site.
• Illustrates concepts with simple but clear practical examples.

High-Dimensional Statistics

Author: Martin J. Wainwright
Publisher: Cambridge University Press
ISBN: 1108498027
Category : Business & Economics
Languages : en
Pages : 571

Book Description
A coherent introductory text from a groundbreaking researcher, focusing on clarity and motivation to build intuition and understanding.

High-dimensional Data Analysis

Author: Tony Cai; Xiaotong Shen
Publisher:
ISBN: 9787894236326
Category :
Languages : en
Pages : 318

Book Description
Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, cla.

Functional and High-Dimensional Statistics and Related Fields

Author: Germán Aneiros
Publisher: Springer Nature
ISBN: 3030477568
Category : Mathematics
Languages : en
Pages : 254

Book Description
This book presents the latest research on the statistical analysis of functional, high-dimensional and other complex data, addressing methodological and computational aspects, as well as real-world applications. It covers topics like classification, confidence bands, density estimation, depth, diagnostic tests, dimension reduction, estimation on manifolds, high- and infinite-dimensional statistics, inference on functional data, networks, operatorial statistics, prediction, regression, robustness, sequential learning, small-ball probability, smoothing, spatial data, testing, and topological object data analysis, and includes applications in automobile engineering, criminology, drawing recognition, economics, environmetrics, medicine, mobile phone data, spectrometrics and urban environments. The book gathers selected, refereed contributions presented at the Fifth International Workshop on Functional and Operatorial Statistics (IWFOS) in Brno, Czech Republic. The workshop was originally to be held on June 24-26, 2020, but had to be postponed as a consequence of the COVID-19 pandemic. Initiated by the Working Group on Functional and Operatorial Statistics at the University of Toulouse in 2008, the IWFOS workshops provide a forum to discuss the latest trends and advances in functional statistics and related fields, and foster the exchange of ideas and international collaboration in the field.

Role of Sparsity in High Dimensional Signal Detection and Estimation

Author: Manqi Zhao
Publisher:
ISBN:
Category :
Languages : en
Pages : 414

Book Description
Abstract: Processing high-dimensional data arises in a number of real-world applications such as financial data analysis, hyperspectral imagery, and video surveillance. The data are organized in a rectangular array with n rows and p columns, where the rows represent different measurements and the columns represent different features. High-dimensional statistical inference studies signal detection and estimation problems in the scenario where n ≪ p. The main challenge of high-dimensional statistical inference is the curse of dimensionality, which makes it intractable to approximate high-dimensional density functions accurately. Nevertheless, data samples in many high-dimensional problems come from an underlying low-dimensional space or manifold. This limits the degrees of freedom (DOF) in the ambient space, and this structure can be exploited for statistical inference. Another feature of high-dimensional data is the concentration of measure phenomenon, which states that certain smooth random functions in high-dimensional space are nearly constant. The philosophy is that, under mild conditions, it is easy to predict the behavior of high-dimensional data.

In this thesis, we exploit the DOF structure in detection and estimation of high-dimensional data, together with concentration of measure inequalities, to obtain new results. In particular, we consider the sparsity model for compressed sensing, the joint sparse and Markov structure for blind deconvolution, the manifold model for outlier detection, and the temporally local anomaly structure for time-series anomaly detection. We present a linear programming solution for signal support recovery from noisy measurements that leverages a sparsity constraint. We simultaneously reconstruct the unknown autoregressive filter and the driving process in light of the joint structure on sparsity and the Markov property. We develop a novel non-parametric adaptive anomaly detection algorithm for high-dimensional data that can adapt to local sparse manifold structure. We develop a clustering algorithm that accounts for highly unbalanced, proximal, and complex-shaped clusters based on the scheme of reweighting the graph edge similarity. We propose a new paradigm for time-series anomaly detection that exploits the local anomaly structure.

Our analysis in compressed sensing shows that the achievable bound in terms of SNR, the number of measurements, and the admissible sparsity level of a linear programming solution matches the optimal information-theoretic bound in an order-wise sense. Our result in anomaly detection suggests that estimating a high-dimensional level set can be avoided by computing a sufficient p-value statistic. The resulting anomaly detector is asymptotically uniformly most powerful against any uniformly mixing density. We also provide a generalization of this p-value statistic in time-series anomaly detection with false alarm control.

High-Dimensional Covariance Matrix Estimation

Author: Aygul Zagidullina
Publisher: Springer Nature
ISBN: 3030800652
Category : Business & Economics
Languages : en
Pages : 123

Book Description
This book presents covariance matrix estimation and related aspects of random matrix theory. It focuses on the sample covariance matrix estimator and provides a holistic description of its properties under two asymptotic regimes: the traditional one, and the high-dimensional regime that better fits the big data context. It draws attention to the deficiencies of standard statistical tools when used in the high-dimensional setting, and introduces the basic concepts and major results related to spectral statistics and random matrix theory under high-dimensional asymptotics in an understandable and reader-friendly way. The aim of this book is to inspire applied statisticians, econometricians, and machine learning practitioners who analyze high-dimensional data to apply the recent developments in their work.