Efficient Algorithms for Querying Large-Scale Data in Relational, XML, and Graph-Structured Data Repositories

Languages: en

Book Description
We live in an information age, and data are ubiquitous. Applications ranging from scientific computing, medical research, and bioinformatics to administrative management, commercial sales, and financial marketing generate and use data every day. Many of these applications are data intensive, with the amount of data involved potentially reaching hundreds of thousands of gigabytes. Further, different applications store data using different data models. For example, applications could store and manage structured data using a flat (relational) model, semi-structured data using a hierarchical (XML) model, and less-structured data using a more general and flexible graph model. In this thesis, I report my research results on efficiently querying large-scale data in relational, XML, and graph-structured data repositories. Specifically, this thesis covers three research projects, which I was invited to present at the ACM SIGMOD conference in 2006, 2007, and 2008, respectively. The first project concerns efficient querying of relational data using materialized views and introduces view-based query-optimization algorithms that support a large and practically important subset of SQL queries. The second project focuses on efficiently querying XML data and presents algorithms for evaluating XPath queries over XML streams, which are the first to achieve O(|D||Q|) time performance, where |D| is the XML data size and |Q| is the XPath query size. Our algorithm EQ also achieves optimal space performance. The third project addresses efficient querying of graph-structured data by introducing efficient algorithms for retrieving top-ranked tree-pattern matches from large graphs. While a tree-pattern query could have an extremely large, potentially exponential, number of answer matches in a graph, our algorithms exhibit time and space performance that is linear or sub-linear in the size of the input data.

Efficient Algorithms for Querying Large-scale Data in Relational, XML, and Graph-structured Data Repositories

Author: Gang Gou
Languages: en
Pages: 136

Book Description
Keywords: Databases, Views, SQL, Stream, Top-k, Algorithm, XML, Graph.

Efficient Optimization and Processing of Queries Over Text-rich Graph-structured Data

Author: Günter Ladwig
Publisher: KIT Scientific Publishing
ISBN: 3731500159
Category: Computers
Languages: en
Pages: 254

Book Description
Many databases today capture both structured and unstructured data. Making use of such hybrid data has become an important topic in research and industry. The efficient evaluation of hybrid data queries is the main topic of this thesis. Novel techniques are proposed that improve the whole processing pipeline, from indexes and query optimization to run-time processing. The contributions are evaluated in extensive experiments showing that the proposed techniques improve upon the state of the art.

Linked Data Management

Author: Andreas Harth
Publisher: CRC Press
ISBN: 1466582405
Category: Computers
Languages: en
Pages: 578

Book Description
Linked Data Management presents techniques for querying and managing Linked Data that is available on today’s Web. The book shows how the abundance of Linked Data can serve as fertile ground for research and commercial applications. The text focuses on aspects of managing large-scale collections of Linked Data. It offers a detailed introduction to Linked Data and related standards, including the main principles distinguishing Linked Data from standard database technology. Chapters also describe how to generate links between datasets and explain the overall architecture of data integration systems based on Linked Data. A large part of the text is devoted to query processing in different setups. After presenting methods to publish relational data as Linked Data and efficient centralized processing, the book explores lookup-based, distributed, and parallel solutions. It then addresses advanced topics, such as reasoning, and discusses work related to read-write Linked Data for system interoperation. Despite the publication of many papers since Tim Berners-Lee developed the Linked Data principles in 2006, the field lacks a comprehensive, unified overview of the state of the art. Suitable for both researchers and practitioners, this book provides a thorough, consolidated account of the new data publishing and data integration paradigm. While the book covers query processing extensively, the Linked Data abstraction furnishes more than a mechanism for collecting, integrating, and querying data from the open Web—the Linked Data technology stack also allows for controlled, sophisticated applications deployed in an enterprise environment.

Database and XML Technologies

Author: Zohra Bellahsène
Publisher: Springer
ISBN: 354039429X
Category: Computers
Languages: en
Pages: 293

Book Description
The Extensible Markup Language (XML) is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. The database community is interested in XML because it can be used to represent a variety of data formats originating in different kinds of data repositories while providing structure and the possibility to add type information. The theme of this symposium is the combination of database and XML technologies. Today, we see growing interest in using these technologies together for many Web-based and database-centric applications. XML is being used to publish data from database systems on the Web by providing input to content generators for Web pages, and database systems are increasingly being used to store and query XML data, often by handling queries issued over the Internet. As database systems increasingly start talking to each other over the Web, there is a fast-growing interest in using XML as the standard exchange format for distributed query processing. As a result, many relational database systems export data as XML documents, import data from XML documents, and provide query and update capabilities for XML data. In addition, so-called native XML database and integration systems are appearing on the database market, and it’s claimed that they are especially tailored to store, maintain and easily access XML documents.

Encyclopedia of Information Science and Technology, Third Edition

Author: Khosrow-Pour, Mehdi
Publisher: IGI Global
ISBN: 1466658894
Category: Computers
Languages: en
Pages: 7972

Book Description
"This 10-volume compilation of authoritative, research-based articles contributed by thousands of researchers and experts from all over the world emphasizes modern issues and the presentation of potential opportunities, prospective solutions, and future directions in the field of information science and technology"--Provided by publisher.

Knowledge Graphs and Big Data Processing

Author: Valentina Janev
Publisher: Springer Nature
ISBN: 3030531996
Category: Computers
Languages: en
Pages: 212

Book Description
This open access book is part of the LAMBDA Project (Learning, Applying, Multiplying Big Data Analytics), funded by the European Union, GA No. 809965. Data Analytics involves applying algorithmic processes to derive insights. Nowadays it is used in many industries to allow organizations and companies to make better decisions as well as to verify or disprove existing theories or models. The term data analytics is often used interchangeably with intelligence, statistics, reasoning, data mining, knowledge discovery, and others. The goal of this book is to introduce some of the definitions, methods, tools, frameworks, and solutions for big data processing, starting from the process of information extraction and knowledge representation, via knowledge processing and analytics to visualization, sense-making, and practical applications. Each chapter in this book addresses some pertinent aspect of the data processing chain, with a specific focus on understanding Enterprise Knowledge Graphs, Semantic Big Data Architectures, and Smart Data Analytics solutions. This book is addressed to graduate students from technical disciplines, to professional audiences following continuous education short courses, and to researchers from diverse areas following self-study courses. Basic skills in computer science, mathematics, and statistics are required.

Adaptive Query Processing

Author: Amol Deshpande
Publisher: Now Publishers Inc
ISBN: 1601980345
Category: Computers
Languages: en
Pages: 156

Book Description
Adaptive Query Processing surveys the fundamental issues, techniques, costs, and benefits of adaptive query processing. It begins with a broad overview of the field, identifying the dimensions of adaptive techniques. It then looks at the spectrum of approaches available to adapt query execution at runtime - primarily in a non-streaming context. The emphasis is on simplifying and abstracting the key concepts of each technique, rather than reproducing the full details available in the papers. The authors identify the strengths and limitations of the different techniques, demonstrate when they are most useful, and suggest possible avenues of future research. Adaptive Query Processing serves as a valuable reference for students of databases, providing a thorough survey of the area. Database researchers will benefit from a more complete point of view, including a number of approaches which they may not have focused on within the scope of their own research.

Neural Networks and Statistical Learning

Author: Ke-Lin Du
Publisher: Springer Nature
ISBN: 1447174526
Category: Mathematics
Languages: en
Pages: 988

Book Description
This book provides a broad yet detailed introduction to neural networks and machine learning in a statistical framework. A single, comprehensive resource for study and further research, it explores the major popular neural network models and statistical learning approaches with examples and exercises and allows readers to gain a practical working understanding of the content. This updated new edition presents recently published results and includes six new chapters that correspond to the recent advances in computational learning theory, sparse coding, deep learning, big data and cloud computing. Each chapter features state-of-the-art descriptions and significant research findings. The topics covered include: • multilayer perceptron; • the Hopfield network; • associative memory models; • clustering models and algorithms; • the radial basis function network; • recurrent neural networks; • nonnegative matrix factorization; • independent component analysis; • probabilistic and Bayesian networks; and • fuzzy sets and logic. Focusing on the prominent accomplishments and their practical aspects, this book provides academic and technical staff, as well as graduate students and researchers with a solid foundation and comprehensive reference on the fields of neural networks, pattern recognition, signal processing, and machine learning.

Efficient Algorithms for Learning Correlations in Large-Scale Wireless Data

Author: Abdullah Almutairi
Languages: en
Pages: 121

Book Description
[...] multiple locations. We propose a Global Local model that captures the above ideas and can be used to understand the relationship of different locations based on multiple users' mobile behavior. The study is extended to the temporal attributes of the data to learn both the temporal and spatio-temporal correlations present in the wireless data. To find these correlations we propose a Multi-Dimensional Hierarchical Co-Clustering (MDHCC) method.