Author: Weili Wu
Publisher: Springer Science & Business Media
ISBN: 1461302277
Category : Computers
Languages : en
Pages : 331
Book Description
Clustering is an important technique for discovering relatively dense sub-regions or sub-spaces of a multi-dimensional data distribution. Clustering has been used in information retrieval for many different purposes, such as query expansion, document grouping, document indexing, and visualization of search results. In this book, we address issues of clustering algorithms, evaluation methodologies, applications, and architectures for information retrieval. The first two chapters discuss clustering algorithms. The chapter from Baeza-Yates et al. describes a clustering method for a general metric space which is a common model of data relevant to information retrieval. The chapter by Guha, Rastogi, and Shim presents a survey as well as detailed discussion of two clustering algorithms: CURE and ROCK for numeric data and categorical data respectively. Evaluation methodologies are addressed in the next two chapters. Ertoz et al. demonstrate the use of text retrieval benchmarks, such as TRECS, to evaluate clustering algorithms. He et al. provide objective measures of clustering quality in their chapter. Applications of clustering methods to information retrieval are addressed in the next four chapters. Chu et al. and Noel et al. explore feature selection using word stems, phrases, and link associations for document clustering and indexing. Wen et al. and Sung et al. discuss applications of clustering to user queries and data cleansing. Finally, we consider the problem of designing architectures for information retrieval. Crichton, Hughes, and Kelly elaborate on the development of a scientific data system architecture for information retrieval.
Clustering and Information Retrieval
Fuzzy Sets in Information Retrieval and Cluster Analysis
Author: S. Miyamoto
Publisher: Springer Science & Business Media
ISBN: 9401578877
Category : Mathematics
Languages : en
Pages : 266
Book Description
The present monograph intends to establish a solid link among three fields: fuzzy set theory, information retrieval, and cluster analysis. Fuzzy set theory supplies new concepts and methods for the other two fields, and provides a common framework within which they can be reorganized. Four principal groups of readers are assumed: researchers or students who are interested in (a) application of fuzzy sets, (b) theory of information retrieval or bibliographic databases, (c) hierarchical clustering, and (d) application of methods in systems science. Readers in group (a) may notice that the fuzzy set theory used here is very simple, since only finite sets are dealt with. This simplification enables the max-min algebra to deal with fuzzy relations and matrices as equivalent entities. Fuzzy graphs are also used for describing theoretical properties of fuzzy relations. This assumption of finite sets is sufficient for applying fuzzy sets to information retrieval and cluster analysis. This means that little theory, beyond the basic theory of fuzzy sets, is required. Although readers in group (b) with little background in the theory of fuzzy sets may have difficulty with a few sections, they will also find enough in this monograph to support an intuitive grasp of this new concept of fuzzy information retrieval. Chapter 4 provides fuzzy retrieval without the use of mathematical symbols. Also, fuzzy graphs will serve as an aid to the intuitive understanding of fuzzy relations.
Introduction to Information Retrieval
Author: Christopher D. Manning
Publisher: Cambridge University Press
ISBN: 1139472100
Category : Computers
Languages : en
Pages :
Book Description
Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.
Information Retrieval Systems
Author: Gerald J. Kowalski
Publisher: Springer
ISBN: 058532090X
Category : Computers
Languages : en
Pages : 291
Book Description
The growth of the Internet and the availability of enormous volumes of data in digital form have necessitated intense interest in techniques to assist the user in locating data of interest. The Internet has over 350 million pages of data and is expected to reach over one billion pages by the year 2000. Buried on the Internet are both valuable nuggets to answer questions as well as a large quantity of information the average person does not care about. The Digital Library effort is also progressing, with the goal of migrating from the traditional book environment to a digital library environment. The challenge to both authors of new publications that will reside on this information domain and developers of systems to locate information is to provide the information and capabilities to sort out the non-relevant items from those desired by the consumer. In effect, as we proceed down this path, it will be the computer, rather than the human being, that determines what we see. The days of going to a library and browsing the new book shelf are being replaced by electronically searching the Internet or the library catalogs. Whatever the search engines return will constrain our knowledge of what information is available. An understanding of Information Retrieval Systems puts this new environment into perspective for both the creator of documents and the consumer trying to locate information.
Survey of Text Mining
Author: Michael W. Berry
Publisher: Springer Science & Business Media
ISBN: 147574305X
Category : Computers
Languages : en
Pages : 251
Book Description
Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. As the volume of digitized textual media continues to grow, so does the need for designing robust, scalable indexing and search strategies (software) to meet a variety of user needs. Knowledge extraction or creation from text requires systematic yet reliable processing that can be codified and adapted for changing needs and environments. This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.
Introduction to Clustering Large and High-Dimensional Data
Author: Jacob Kogan
Publisher: Cambridge University Press
ISBN: 9780521617932
Category : Computers
Languages : en
Pages : 228
Book Description
Focuses on a few of the important clustering algorithms in the context of information retrieval.
Soft Computing in Information Retrieval
Author: Fabio Crestani
Publisher: Physica
ISBN: 3790818496
Category : Computers
Languages : en
Pages : 398
Book Description
Information retrieval (IR) aims at defining systems able to provide a fast and effective content-based access to a large amount of stored information. The aim of an IR system is to estimate the relevance of documents to users' information needs, expressed by means of a query. This is a very difficult and complex task, since it is pervaded with imprecision and uncertainty. Most of the existing IR systems offer a very simple model of IR, which privileges efficiency at the expense of effectiveness. A promising direction to increase the effectiveness of IR is to model the concept of "partially intrinsic" in the IR process and to make the systems adaptive, i.e. able to "learn" the user's concept of relevance. To this aim, the application of soft computing techniques can be of help to obtain greater flexibility in IR systems.
Mining Text Data
Author: Charu C. Aggarwal
Publisher: Springer Science & Business Media
ISBN: 1461432235
Category : Computers
Languages : en
Pages : 527
Book Description
Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. Mining Text Data introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book covers a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including the key research content on the topic, and the future directions of research in the field. There is a special focus on Text Embedded with Heterogeneous and Multimedia Data which makes the mining process much more challenging. A number of methods have been designed such as transfer learning and cross-lingual mining for such cases. Mining Text Data simplifies the content, so that advanced-level students, practitioners and researchers in computer science can benefit from this book. Academic and corporate libraries, as well as ACM, IEEE, and Management Science focused on information security, electronic commerce, databases, data mining, machine learning, and statistics are the primary buyers for this reference book.
Information Retrieval Architecture and Algorithms
Author: Gerald Kowalski
Publisher: Springer Science & Business Media
ISBN: 1441977163
Category : Computers
Languages : en
Pages : 312
Book Description
This text presents a theoretical and practical examination of the latest developments in Information Retrieval and their application to existing systems. By starting with a functional discussion of what is needed for an information system, the reader can grasp the scope of information retrieval problems and discover the tools to resolve them. The book takes a system approach to explore every functional processing step in a system from ingest of an item to be indexed to displaying results, showing how implementation decisions add to the information retrieval goal, and thus providing the user with the needed outcome, while minimizing their resources to obtain those results. The text stresses the current migration of information retrieval from just textual to multimedia, expounding upon multimedia search, retrieval and display, as well as classic and new textual techniques. It also introduces developments in hardware, and more importantly, search architectures, such as those introduced by Google, in order to approach scalability issues. About this textbook: A first course text for advanced level courses, providing a survey of information retrieval system theory and architecture, complete with challenging exercises. Approaches information retrieval from a practical systems view in order for the reader to grasp both scope and solutions. Features what is achievable using existing technologies and investigates what deficiencies warrant additional exploration.
Critical Approaches to Information Retrieval Research
Author: Sarfraz, Muhammad
Publisher: IGI Global
ISBN: 1799810232
Category : Computers
Languages : en
Pages : 374
Book Description
Information retrieval (IR) is considered to be the science of searching for information from a variety of information sources related to texts, images, sounds, or multimedia. With the rise of the internet and digital databases, updated information retrieval methodologies are essential to ensure the continued facilitation and enhancement of information exchange. Critical Approaches to Information Retrieval Research is a critical scholarly publication that provides multidisciplinary examinations of theoretical innovations and methods in information retrieval technologies including search and storage applications for data, text, image, sound, document, and video retrieval. Featuring a wide range of topics including data mining, machine learning, and ontology, this book is ideal for librarians, software engineers, data scientists, professionals, researchers, information engineers, scientists, practitioners, and academicians working in the fields of computer science, information technology, information and communication sciences, education, health, library, and more.