The Probabilistic Relevance Framework

Author: Stephen Robertson
Publisher: Now Publishers Inc
ISBN: 1601983085
Category : Computers
Languages : en
Pages : 69

Book Description
The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970s and 1980s, which led to the development of one of the most successful text-retrieval algorithms, BM25. In recent years, research in the PRF has yielded new retrieval models capable of taking into account structure and link-graph information. Again, this has led to one of the most successful web-search and corporate-search algorithms, BM25F. The Probabilistic Relevance Framework: BM25 and Beyond presents the PRF from a conceptual point of view, describing the probabilistic modelling assumptions behind the framework and the different ranking algorithms that result from its application: the binary independence model, relevance feedback models, BM25, and BM25F. Besides presenting a full derivation of the PRF ranking algorithms, it provides many insights about document retrieval in general, and points to many open challenges in this area. It also discusses the relation between the PRF and other statistical models for IR, and covers some related topics, such as the use of non-textual features and parameter optimization for models with free parameters. The Probabilistic Relevance Framework: BM25 and Beyond is self-contained and accessible to anyone with basic knowledge of probability and inference.

A Generative Theory of Relevance

Author: Victor Lavrenko
Publisher: Springer Science & Business Media
ISBN: 3540893644
Category : Computers
Languages : en
Pages : 211

Book Description
A modern information retrieval system must have the capability to find, organize and present very different manifestations of information – such as text, pictures, videos or database records – any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it's even harder to model in a formal way. Lavrenko does not attempt to bring forth a new definition of relevance, nor provide arguments as to why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance, complementing the two dominant models, i.e., the classical probabilistic model and the language modeling approach, and which explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables which does not make any structural assumptions about the data and which can also handle rare events. Thus his book is of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.

Information Retrieval Models

Author: Thomas Roelleke
Publisher: Springer Nature
ISBN: 3031023285
Category : Computers
Languages : en
Pages : 141

Book Description
Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view on the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters. Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index

Introduction to Information Retrieval and Quantum Mechanics

Author: Massimo Melucci
Publisher: Springer
ISBN: 3662483130
Category : Computers
Languages : en
Pages : 247

Book Description
This book introduces the quantum mechanical framework to information retrieval scientists seeking a new perspective on foundational problems. As such, it concentrates on the main notions of the quantum mechanical framework and describes an innovative range of concepts and tools for modeling information representation and retrieval processes. The book is divided into four chapters. Chapter 1 illustrates the main modeling concepts for information retrieval (including Boolean logic, vector spaces, probabilistic models, and machine-learning based approaches), which will be examined further in subsequent chapters. Next, chapter 2 briefly explains the main concepts of the quantum mechanical framework, focusing on approaches linked to information retrieval such as interference, superposition and entanglement. Chapter 3 then reviews the research conducted at the intersection between information retrieval and the quantum mechanical framework. The chapter is subdivided into a number of topics, and each description ends with a section suggesting the most important reference resources. Lastly, chapter 4 offers suggestions for future research, briefly outlining the most essential and promising research directions to fully leverage the quantum mechanical framework for effective and efficient information retrieval systems. This book is especially intended for researchers working in information retrieval, database systems and machine learning who want to acquire a clear picture of the potential offered by the quantum mechanical framework in their own research area. Above all, the book offers clear guidance on whether, why and when to effectively use the mathematical formalism and the concepts of the quantum mechanical framework to address various foundational issues in information retrieval.

Advances in Databases and Information Systems

Author: Tatjana Welzer
Publisher: Springer Nature
ISBN: 3030287300
Category : Computers
Languages : en
Pages : 463

Book Description
This book constitutes the proceedings of the 23rd European Conference on Advances in Databases and Information Systems, ADBIS 2019, held in Bled, Slovenia, in September 2019. The 27 full papers presented were carefully reviewed and selected from 103 submissions. The papers cover a wide range of topics from different areas of research in database and information systems technologies and their advanced applications from theoretical foundations to optimizing index structures. They focus on data mining and machine learning, data warehouses and big data technologies, semantic data processing, and data modeling. They are organized in the following topical sections: data mining; machine learning; document and text databases; big data; novel applications; ontologies and knowledge management; process mining and stream processing; data quality; optimization; theoretical foundation and new requirements; and data warehouses.

Semantic Keyword-Based Search on Structured Data Sources

Author: Andrea Calì
Publisher: Springer
ISBN: 3319536400
Category : Computers
Languages : en
Pages : 197

Book Description
This book constitutes the thoroughly refereed post-conference proceedings of the Second COST Action IC1302 International KEYSTONE Conference on Semantic Keyword-Based Search on Structured Data Sources, IKC 2016, held in Cluj-Napoca, Romania, in September 2016. The 15 revised full papers and 2 invited papers were reviewed and selected from 18 initial submissions and cover the areas of keyword extraction, natural language searches, graph databases, and information retrieval techniques for keyword search and document retrieval.

Current Challenges in Patent Information Retrieval

Author: Mihai Lupu
Publisher: Springer
ISBN: 3662538172
Category : Computers
Languages : en
Pages : 461

Book Description
This second edition provides a systematic introduction to the work and views of the emerging patent-search research and innovation communities as well as an overview of what has been achieved and, perhaps even more importantly, of what remains to be achieved. It revises many of the contributions of the first edition and adds a significant number of new ones. The first part “Introduction to Patent Searching” includes two overview chapters on the peculiarities of patent searching and on contemporary search technology respectively, and thus sets the scene for the subsequent parts. The second part on “Evaluating Patent Retrieval” then begins with two chapters dedicated to patent evaluation campaigns, followed by two chapters discussing complementary issues from the perspective of patent searchers and from the perspective of related domains, notably legal search. “High Recall Search” includes four completely new chapters dealing with the issue of finding only the relevant documents in a reasonable time span. The last (and with six papers the largest) part on “Special Topics in Patent Information Retrieval” covers a large spectrum of research in the patent field, from classification and image processing to translation. Lastly, the book is completed by an outlook on open issues and future research. Several of the chapters have been jointly written by intellectual property and information retrieval experts. However, members of both communities with a background different to that of the primary author have reviewed the chapters, making the book accessible to both the patent search community and the information retrieval research community. It not only offers the latest findings for academic researchers, but is also a valuable resource for IP professionals wanting to learn about current IR approaches in the patent domain.

Hybrid Artificial Intelligent Systems

Author: Pablo García Bringas
Publisher: Springer Nature
ISBN: 3031154711
Category : Computers
Languages : en
Pages : 523

Book Description
This book constitutes the refereed proceedings of the 17th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2022, held in Salamanca, Spain, in September 2022. The 43 full papers presented in this book were carefully reviewed and selected from 67 submissions. They were organized in topical sections as follows: bioinformatics; data mining and decision support systems; deep learning; evolutionary computation; HAIS applications; image and speech signal processing; and optimization techniques.

Handbook of Probabilistic Models

Author: Pijush Samui
Publisher: Butterworth-Heinemann
ISBN: 0128165464
Category : Computers
Languages : en
Pages : 592

Book Description
Handbook of Probabilistic Models carefully examines the application of advanced probabilistic models in conventional engineering fields. In this comprehensive handbook, practitioners, researchers and scientists will find detailed explanations of technical concepts, applications of the proposed methods, and the respective scientific approaches needed to solve the problem. This book provides an interdisciplinary approach that creates advanced probabilistic models for engineering fields, ranging from conventional fields of mechanical engineering and civil engineering, to electronics, electrical, earth sciences, climate, agriculture, water resource, mathematical sciences and computer sciences. Specific topics covered include minimax probability machine regression, stochastic finite element method, relevance vector machine, logistic regression, Monte Carlo simulations, random matrix, Gaussian process regression, Kalman filter, stochastic optimization, maximum likelihood, Bayesian inference, Bayesian update, kriging, copula-statistical models, and more.
- Explains the application of advanced probabilistic models encompassing multidisciplinary research
- Applies probabilistic modeling to emerging areas in engineering
- Provides an interdisciplinary approach to probabilistic models and their applications, thus solving a wide range of practical problems

Machine Learning and Knowledge Discovery in Databases

Author: Peter A. Flach
Publisher: Springer
ISBN: 3642334601
Category : Computers
Languages : en
Pages : 904

Book Description
This two-volume set LNAI 7523 and LNAI 7524 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2012, held in Bristol, UK, in September 2012. The 105 revised research papers presented together with 5 invited talks were carefully reviewed and selected from 443 submissions. The final sections of the proceedings are devoted to Demo and Nectar papers. The Demo track includes 10 papers (from 19 submissions) and the Nectar track includes 4 papers (from 14 submissions). The papers are grouped in topical sections on association rules and frequent patterns; Bayesian learning and graphical models; classification; dimensionality reduction, feature selection and extraction; distance-based methods and kernels; ensemble methods; graph and tree mining; large-scale, distributed and parallel mining and learning; multi-relational mining and learning; multi-task learning; natural language processing; online learning and data streams; privacy and security; rankings and recommendations; reinforcement learning and planning; rule mining and subgroup discovery; semi-supervised and transductive learning; sensor data; sequence and string mining; social network mining; spatial and geographical data mining; statistical methods and evaluation; time series and temporal data mining; and transfer learning.