Author: Marianna Bolla
Publisher: John Wiley & Sons
ISBN: 1118650719
Category : Mathematics
Languages : en
Pages : 229
Book Description
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; but by skipping the proofs, the algorithms can also be used by specialists who just want to retrieve information from their data when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.
Spectral Clustering and Biclustering
Author: Marianna Bolla
Publisher: John Wiley & Sons
ISBN: 1118650719
Category : Mathematics
Languages : en
Pages : 229
Book Description
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; but by skipping the proofs, the algorithms can also be used by specialists who just want to retrieve information from their data when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.
Publisher: John Wiley & Sons
ISBN: 1118650719
Category : Mathematics
Languages : en
Pages : 229
Book Description
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; but by skipping the proofs, the algorithms can also be used by specialists who just want to retrieve information from their data when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.
Spectral Clustering and Biclustering
Author: Marianna Bolla
Publisher: Wiley
ISBN: 9781118344927
Category : Mathematics
Languages : en
Pages : 292
Book Description
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; but by skipping the proofs, the algorithms can also be used by specialists who just want to retrieve information from their data when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.
Publisher: Wiley
ISBN: 9781118344927
Category : Mathematics
Languages : en
Pages : 292
Book Description
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; but by skipping the proofs, the algorithms can also be used by specialists who just want to retrieve information from their data when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.
Clustering: Theoretical And Practical Aspects
Author: Dan A Simovici
Publisher: World Scientific
ISBN: 981124121X
Category : Computers
Languages : en
Pages : 882
Book Description
This unique compendium gives an updated presentation of clustering, one of the most challenging tasks in machine learning. The book provides a unitary presentation of classical and contemporary algorithms ranging from partitional and hierarchical clustering up to density-based clustering, clustering of categorical data, and spectral clustering.Most of the mathematical background is provided in appendices, highlighting algebraic and complexity theory, in order to make this volume as self-contained as possible. A substantial number of exercises and supplements makes this a useful reference textbook for researchers and students.
Publisher: World Scientific
ISBN: 981124121X
Category : Computers
Languages : en
Pages : 882
Book Description
This unique compendium gives an updated presentation of clustering, one of the most challenging tasks in machine learning. The book provides a unitary presentation of classical and contemporary algorithms ranging from partitional and hierarchical clustering up to density-based clustering, clustering of categorical data, and spectral clustering.Most of the mathematical background is provided in appendices, highlighting algebraic and complexity theory, in order to make this volume as self-contained as possible. A substantial number of exercises and supplements makes this a useful reference textbook for researchers and students.
Proceedings of the Fifth SIAM International Conference on Data Mining
Author: Hillol Kargupta
Publisher: SIAM
ISBN: 9780898715934
Category : Mathematics
Languages : en
Pages : 670
Book Description
The Fifth SIAM International Conference on Data Mining continues the tradition of providing an open forum for the presentation and discussion of innovative algorithms as well as novel applications of data mining. Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data. This conference results in data mining, including applications, algorithms, software, and systems.
Publisher: SIAM
ISBN: 9780898715934
Category : Mathematics
Languages : en
Pages : 670
Book Description
The Fifth SIAM International Conference on Data Mining continues the tradition of providing an open forum for the presentation and discussion of innovative algorithms as well as novel applications of data mining. Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data. This conference results in data mining, including applications, algorithms, software, and systems.
Data Clustering: Theory, Algorithms, and Applications, Second Edition
Author: Guojun Gan
Publisher: SIAM
ISBN: 1611976332
Category : Mathematics
Languages : en
Pages : 430
Book Description
Data clustering, also known as cluster analysis, is an unsupervised process that divides a set of objects into homogeneous groups. Since the publication of the first edition of this monograph in 2007, development in the area has exploded, especially in clustering algorithms for big data and open-source software for cluster analysis. This second edition reflects these new developments, covers the basics of data clustering, includes a list of popular clustering algorithms, and provides program code that helps users implement clustering algorithms. Data Clustering: Theory, Algorithms and Applications, Second Edition will be of interest to researchers, practitioners, and data scientists as well as undergraduate and graduate students.
Publisher: SIAM
ISBN: 1611976332
Category : Mathematics
Languages : en
Pages : 430
Book Description
Data clustering, also known as cluster analysis, is an unsupervised process that divides a set of objects into homogeneous groups. Since the publication of the first edition of this monograph in 2007, development in the area has exploded, especially in clustering algorithms for big data and open-source software for cluster analysis. This second edition reflects these new developments, covers the basics of data clustering, includes a list of popular clustering algorithms, and provides program code that helps users implement clustering algorithms. Data Clustering: Theory, Algorithms and Applications, Second Edition will be of interest to researchers, practitioners, and data scientists as well as undergraduate and graduate students.
Co-Clustering
Author: Gérard Govaert
Publisher: John Wiley & Sons
ISBN: 1118649508
Category : Computers
Languages : en
Pages : 246
Book Description
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described and related works with different classical methods are presented and commented upon. This chapter is useful in tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its interest regarding co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriated latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation. Contents 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data. About the Authors Gérard Govaert is Professor at the University of Technology of Compiègne, France. He is also a member of the CNRS Laboratory Heudiasyc (Heuristic and diagnostic of complex systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXtureMODelling) software. Mohamed Nadif is Professor at the University of Paris-Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Science department. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis. Cluster Analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents a state of the art of already well-established as well more recent methods. The hierarchical, partitioning and fuzzy approaches will be discussed amongst others. The authors review the difficulty of these classical methods in tackling the high dimensionality, sparsity and scalability. Chapter 2 discusses the interests of coclustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as a simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers co-clustering as a model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied on simulated and real data. Chapter 4 considers co-clustering as a matrix approximation. The trifactorization approach is considered and algorithms based on update rules are described. Links with numerical and probabilistic approaches are established. A combination of algorithms are proposed and evaluated on simulated and real data. Chapter 5 considers a co-clustering or bi-clustering as the search for coherent co-clusters in biological terms or the extraction of co-clusters under conditions. Classical algorithms will be described and evaluated on simulated and real data. Different indices to evaluate the quality of coclusters are noted and used in numerical experiments.
Publisher: John Wiley & Sons
ISBN: 1118649508
Category : Computers
Languages : en
Pages : 246
Book Description
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described and related works with different classical methods are presented and commented upon. This chapter is useful in tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its interest regarding co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriated latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation. Contents 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data. About the Authors Gérard Govaert is Professor at the University of Technology of Compiègne, France. He is also a member of the CNRS Laboratory Heudiasyc (Heuristic and diagnostic of complex systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXtureMODelling) software. Mohamed Nadif is Professor at the University of Paris-Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Science department. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis. Cluster Analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents a state of the art of already well-established as well more recent methods. The hierarchical, partitioning and fuzzy approaches will be discussed amongst others. The authors review the difficulty of these classical methods in tackling the high dimensionality, sparsity and scalability. Chapter 2 discusses the interests of coclustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as a simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers co-clustering as a model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied on simulated and real data. Chapter 4 considers co-clustering as a matrix approximation. The trifactorization approach is considered and algorithms based on update rules are described. Links with numerical and probabilistic approaches are established. A combination of algorithms are proposed and evaluated on simulated and real data. Chapter 5 considers a co-clustering or bi-clustering as the search for coherent co-clusters in biological terms or the extraction of co-clusters under conditions. Classical algorithms will be described and evaluated on simulated and real data. Different indices to evaluate the quality of coclusters are noted and used in numerical experiments.
Machine Learning Techniques for Multimedia
Author: Matthieu Cord
Publisher: Springer Science & Business Media
ISBN: 3540751718
Category : Computers
Languages : en
Pages : 297
Book Description
Processing multimedia content has emerged as a key area for the application of machine learning techniques, where the objectives are to provide insight into the domain from which the data is drawn, and to organize that data and improve the performance of the processes manipulating it. Arising from the EU MUSCLE network, this multidisciplinary book provides a comprehensive coverage of the most important machine learning techniques used and their application in this domain.
Publisher: Springer Science & Business Media
ISBN: 3540751718
Category : Computers
Languages : en
Pages : 297
Book Description
Processing multimedia content has emerged as a key area for the application of machine learning techniques, where the objectives are to provide insight into the domain from which the data is drawn, and to organize that data and improve the performance of the processes manipulating it. Arising from the EU MUSCLE network, this multidisciplinary book provides a comprehensive coverage of the most important machine learning techniques used and their application in this domain.
Plaid Models for Gene Expression Data
Author: Laura Lazzeroni
Publisher:
ISBN:
Category :
Languages : en
Pages : 25
Book Description
Publisher:
ISBN:
Category :
Languages : en
Pages : 25
Book Description
Applied Biclustering Methods for Big and High-Dimensional Data Using R
Author: Adetayo Kasim
Publisher: CRC Press
ISBN: 1482208245
Category : Mathematics
Languages : en
Pages : 428
Book Description
Proven Methods for Big Data Analysis As big data has become standard in many application areas, challenges have arisen related to methodology and software development, including how to discover meaningful patterns in the vast amounts of data. Addressing these problems, Applied Biclustering Methods for Big and High-Dimensional Data Using R shows how to apply biclustering methods to find local patterns in a big data matrix. The book presents an overview of data analysis using biclustering methods from a practical point of view. Real case studies in drug discovery, genetics, marketing research, biology, toxicity, and sports illustrate the use of several biclustering methods. References to technical details of the methods are provided for readers who wish to investigate the full theoretical background. All the methods are accompanied with R examples that show how to conduct the analyses. The examples, software, and other materials are available on a supplementary website.
Publisher: CRC Press
ISBN: 1482208245
Category : Mathematics
Languages : en
Pages : 428
Book Description
Proven Methods for Big Data Analysis As big data has become standard in many application areas, challenges have arisen related to methodology and software development, including how to discover meaningful patterns in the vast amounts of data. Addressing these problems, Applied Biclustering Methods for Big and High-Dimensional Data Using R shows how to apply biclustering methods to find local patterns in a big data matrix. The book presents an overview of data analysis using biclustering methods from a practical point of view. Real case studies in drug discovery, genetics, marketing research, biology, toxicity, and sports illustrate the use of several biclustering methods. References to technical details of the methods are provided for readers who wish to investigate the full theoretical background. All the methods are accompanied with R examples that show how to conduct the analyses. The examples, software, and other materials are available on a supplementary website.
Machine Learning and Cybernetics
Author: Xizhao Wang
Publisher: Springer
ISBN: 3662456524
Category : Computers
Languages : en
Pages : 460
Book Description
This book constitutes the refereed proceedings of the 13th International Conference on Machine Learning and Cybernetics, Lanzhou, China, in July 2014. The 45 revised full papers presented were carefully reviewed and selected from 421 submissions. The papers are organized in topical sections on classification and semi-supervised learning; clustering and kernel; application to recognition; sampling and big data; application to detection; decision tree learning; learning and adaptation; similarity and decision making; learning with uncertainty; improved learning algorithms and applications.
Publisher: Springer
ISBN: 3662456524
Category : Computers
Languages : en
Pages : 460
Book Description
This book constitutes the refereed proceedings of the 13th International Conference on Machine Learning and Cybernetics, Lanzhou, China, in July 2014. The 45 revised full papers presented were carefully reviewed and selected from 421 submissions. The papers are organized in topical sections on classification and semi-supervised learning; clustering and kernel; application to recognition; sampling and big data; application to detection; decision tree learning; learning and adaptation; similarity and decision making; learning with uncertainty; improved learning algorithms and applications.