Toward robust information extraction models for multimedia documents

Author: Ali Reza Ebadat
Publisher:
ISBN:
Category :
Languages : fr
Pages : 161

Book Description
The enormous quantity of multimedia documents constantly being generated drives the development of automatic analysis methods. In this context, our goal is to facilitate this process by extracting information from any text related to these documents. In addition, we want techniques robust enough to handle noisy and small datasets. To that end, we use simple techniques that require little external knowledge, as a guarantee of robustness. More specifically, we use techniques inspired by information retrieval and statistical analysis. In this thesis, we show experimentally that simple techniques, without prior knowledge, can be useful for extracting information from text effectively. In our case, these good results were obtained by choosing a suitable representation for the data rather than by requiring complex processing.

Information Extraction

Author: Fouad Sabry
Publisher: One Billion Knowledgeable
ISBN:
Category : Computers
Languages : en
Pages : 149

Book Description
What Is Information Extraction

Information extraction (IE) is the process of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In the vast majority of cases, this means processing documents written in human languages by means of natural language processing (NLP). Recent work in multimedia document processing, such as automatic annotation and content extraction from photos, audio, and video, can also be seen as information extraction.

How You Will Benefit

(I) Insights and validations about the following topics:
Chapter 1: Information extraction
Chapter 2: Natural language processing
Chapter 3: Text mining
Chapter 4: Named-entity recognition
Chapter 5: Unstructured data
Chapter 6: Relationship extraction
Chapter 7: Data extraction
Chapter 8: Knowledge extraction
Chapter 9: Entity linking
Chapter 10: Outline of natural language processing
(II) Answers to the public's top questions about information extraction.
(III) Real-world examples of the use of information extraction in many fields.
(IV) 17 appendices that briefly explain 266 emerging technologies in each industry, for a 360-degree understanding of information extraction technologies.

Who This Book Is For

Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of information extraction.

Mining Multimedia Documents

Author: Wahiba Ben Abdessalem Karaa
Publisher: CRC Press
ISBN: 1315399733
Category : Technology & Engineering
Languages : en
Pages : 243

Book Description
The information age has led to an explosion in the amount of information available to the individual and the means by which it is accessed, stored, viewed, and transferred. In particular, the growth of the internet has led to the creation of huge repositories of multimedia documents in a diverse range of scientific and professional fields, as well as the tools to extract useful knowledge from them. Mining Multimedia Documents is a must-read for researchers, practitioners, and students working at the intersection of data mining and multimedia applications. It investigates various techniques related to mining multimedia documents based on text, image, and video features. It provides an insight into the open research problems benefitting advanced undergraduates, graduate students, researchers, scientists and practitioners in the fields of medicine, biology, production, education, government, national security and economics.

Information Extraction: Algorithms and Prospects in a Retrieval Context

Author: Marie-Francine Moens
Publisher: Springer Science & Business Media
ISBN: 1402049935
Category : Language Arts & Disciplines
Languages : en
Pages : 255

Book Description
This book covers content recognition in text, elaborating on the most successful past and current algorithms and their application in a variety of settings: news filtering, mining of biomedical text, intelligence gathering, competitive intelligence, legal information searching, and processing of informal text. Today, there is considerable interest in integrating the results of information extraction into retrieval systems, because of the demand for search engines that return precise answers to flexible information queries.

Knowledge-Driven Multimedia Information Extraction and Ontology Evolution

Author: Georgios Paliouras
Publisher: Springer Science & Business Media
ISBN: 3642207944
Category : Computers
Languages : en
Pages : 251

Book Description
This book presents the state of the art in the areas of ontology evolution and knowledge-driven multimedia information extraction, placing an emphasis on how the two can be combined to bridge the semantic gap. This was also the goal of the EC-sponsored BOEMIE (Bootstrapping Ontology Evolution with Multimedia Information Extraction) project, to which the authors of this book have all contributed. The book addresses researchers and practitioners in the field of computer science and more specifically in knowledge representation and management, ontology evolution, and information extraction from multimedia data. It may also constitute an excellent guide to students attending courses within a computer science study program, addressing information processing and extraction from any type of media (text, images, and video). Among other things, the book gives concrete examples of how several of the methods discussed can be applied to athletics (track and field) events.

Document Analysis and Recognition - ICDAR 2024

Author: Elisa H. Barney Smith
Publisher: Springer Nature
ISBN: 3031705335
Category :
Languages : en
Pages : 500

Book Description


Information Extraction

Author: Maria T. Pazienza
Publisher: Springer Science & Business Media
ISBN: 3540666257
Category : Computers
Languages : en
Pages : 175

Book Description
"By investigating the general structures of natural language and logic as well as relevant software engineering methodologies, the lectures presented in this book attempt the development of principled techniques for domain-independent IE. The book is based on the Second International School on Information Extraction, SCIE-99, held in Frascati near Rome, Italy in June/July 1999."--BOOK JACKET.

Automatic extraction and processing of document references

Author: Kathrin Eichler
Publisher: GRIN Verlag
ISBN: 3640722728
Category : Computers
Languages : en
Pages : 70

Book Description
Master's Thesis from the year 2007 in the subject Computer Science - Applied, grade: 1.0, University of Sunderland (School of Computing and Technology), language: English, abstract: While reading documents, you often encounter text passages advising you to refer to other documents for more information about a specific topic. These references to other documents are particularly common in technical documents, written for the sole purpose of providing the reader with as much relevant information as possible, without rephrasing information that can be found elsewhere. Knowing how the documents in a system are interrelated, i.e. which other documents a document refers to or is referred by, can be extremely helpful when trying to get access to relevant information. A typical example of such a “knowledge net” providing information about document relations is CiteSeer, a digital library of academic literature. For each document in the library system, CiteSeer displays lists of related documents, such as a list of documents that the current document cites as well as a list of documents that the current document is cited by. The assumption that inspired this thesis is that such lists are not only helpful when reading academic literature but could also assist a reader of technical documents stored in a company’s document management system. The idea was thus to extend an existing document management system by displaying, for each document stored in the system, a list of links to documents that the current document refers to. As information about how the documents in this system are interrelated was not available, the focus of the project underlying this thesis was on the first step towards solving this task: automatically analyzing documents in order to extract names of related documents. 
Once all document names mentioned in a document have been extracted, the next step would then be to search for these documents in the system’s database and, in case they have been successfully found, create links to the respective documents. The outcome of the project was a system that performs the extraction task. It is based on Conditional Random Fields, a machine learning technique introduced by Lafferty et al. (2001), and is able to extract document names from unseen documents, achieving high precision scores (88%) and acceptable recall scores (65%) on a test dataset. The implementation is based on a Java package provided by Sarawagi & Cohen (2005), which was adapted and extended to suit the nature of the task. As the approach is based on supervised learning, the project also involved the generation of appropriate training data.

Document Analysis and Recognition - ICDAR 2023

Author: Gernot A. Fink
Publisher: Springer Nature
ISBN: 3031417313
Category : Computers
Languages : en
Pages : 190

Book Description
This six-volume set of LNCS 14187, 14188, 14189, 14190, 14191 and 14192 constitutes the refereed proceedings of the 17th International Conference on Document Analysis and Recognition, ICDAR 2023, held in San José, CA, USA, in August 2023. The 53 full papers were carefully reviewed and selected from 316 submissions, and are presented with 101 poster presentations. The papers are organized into the following topical sections: Graphics Recognition, Frontiers in Handwriting Recognition, Document Analysis and Recognition.

Information Extraction

Author: Maria T. Pazienza
Publisher: Lecture Notes in Artificial Intelligence
ISBN:
Category : Computers
Languages : en
Pages : 236

Book Description
This book constitutes the strictly refereed post-workshop proceedings of the Sixth International Workshop on Logic Program Synthesis and Transformation, LOPSTR'96, held on board a ship sailing from Stockholm to Helsinki, in August 1996. The 17 revised full papers were carefully selected from a total of initially 27 submissions. The topics covered range over the areas of synthesis of programs from specifications, verification, transformation, specialization, and analysis of programs, and the use of program schemata in program development.