Keyword Search in Graphs, Relational Databases and Social Networks

Author: Mehdi Kargar
Languages : en

Book Description
Keyword search, a well-known mechanism for retrieving relevant information from a set of documents, has recently been studied for extracting information from structured data (e.g., relational databases and XML documents). It offers an alternative to query languages (e.g., SQL) for exploring databases, which is effective for lay users who may not be familiar with the database schema or the query language. This dissertation addresses some issues in keyword search in structured data. Namely, novel solutions to existing problems in keyword search in graphs or relational databases are proposed. In addition, a problem related to graph keyword search, team formation in social networks, is studied. The dissertation consists of four parts.
The first part addresses keyword search over a graph, which finds a substructure of the graph containing all or some of the query keywords. Current methods for keyword search over graphs may produce answers in which some content nodes (i.e., nodes that contain input keywords) are not very close to each other. In addition, current methods explore both content and non-content nodes while searching for the result and are thus both time and memory consuming for large graphs. To address the above problems, we propose algorithms for finding r-cliques in graphs. An r-clique is a group of content nodes that cover all the input keywords and in which the distance between each pair of nodes is less than or equal to r. Two approximation algorithms that produce r-cliques with a bounded approximation ratio in polynomial delay are proposed.
In the second part, the problem of duplication-free and minimal keyword search in graphs is studied. Current methods for keyword search in graphs may produce duplicate answers that contain the same set of content nodes. In addition, an answer found by these methods may not be minimal, in the sense that some of the nodes in the answer may contain query keywords that are all covered by other nodes in the answer. Removing these nodes does not change the coverage of the answer but can make the answer more compact. We define the problem of finding duplication-free and minimal answers, and propose algorithms for finding such answers efficiently.
Meaningful keyword search in relational databases is the subject of the third part of this dissertation. Keyword search over relational databases returns a join tree spanning tuples containing the query keywords. As many answers of varying quality can be found, and the user is often only interested in seeing the top-k answers, how to gauge the relevance of answers to rank them is of paramount importance. This becomes more pertinent for databases with large and complex schemas. We focus on the relevance of join trees as the fundamental means to rank the answers. We devise means to measure relevance of relations and foreign keys in the schema over the information content of the database.
The problem of keyword search over graph data is similar to the problem of team formation in social networks. In this setting, keywords represent skills and the nodes in a graph represent the experts that possess skills. Given an expert network, in which a node represents an expert that has a cost for using the expert service and an edge represents the communication cost between the two corresponding experts, we tackle the problem of finding a team of experts that covers a set of required skills and also minimizes the communication cost as well as the personnel cost of the team. We propose two types of approximation algorithms to solve this bi-criteria problem in the fourth part of this dissertation.
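
The r-clique and minimality definitions in the description above can be illustrated with a small check. This is only a sketch of the definitions, not the dissertation's approximation algorithms: it assumes an unweighted graph stored as an adjacency dictionary, uses hop count as the distance, and the names `hop_distances`, `is_r_clique`, `is_minimal`, `graph` and `node_keywords` are all hypothetical.

```python
from collections import deque
from itertools import combinations


def hop_distances(graph, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def is_r_clique(graph, node_keywords, nodes, query, r):
    """True if `nodes` cover every query keyword and each pair is within distance r."""
    query = set(query)
    covered = set()
    for n in nodes:
        covered |= node_keywords.get(n, set()) & query
    if covered != query:
        return False
    for u, v in combinations(nodes, 2):
        # Unreachable pairs count as infinitely far apart.
        if hop_distances(graph, u).get(v, float("inf")) > r:
            return False
    return True


def is_minimal(node_keywords, nodes, query):
    """True if no node's query keywords are all covered by the other nodes."""
    query = set(query)
    for n in nodes:
        own = node_keywords.get(n, set()) & query
        others = set()
        for m in nodes:
            if m != n:
                others |= node_keywords.get(m, set()) & query
        if own <= others:
            return False  # n could be removed without losing coverage
    return True


# Assumed toy data: a path a - b - c - d with keywords on three nodes.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
node_keywords = {"a": {"graph"}, "c": {"search"}, "d": {"graph", "search"}}
query = {"graph", "search"}

print(is_r_clique(graph, node_keywords, ["a", "c"], query, 2))
print(is_minimal(node_keywords, ["a", "c", "d"], query))
```

In this toy data, nodes a and c are two hops apart, so they form an r-clique for r = 2 but not for r = 1. Adding d to the answer makes it non-minimal, because the keyword held by a is already covered by d.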

Keyword Search in Databases

Author: Jeffrey Xu Yu
Publisher: Springer Nature
ISBN: 3031794265
Category : Technology & Engineering
Languages : en
Pages : 143

Book Description
It has become highly desirable to provide users with flexible ways to query and search information in databases that are as simple as a keyword search on Google. This book surveys recent developments in keyword search over databases, and focuses on finding structural information among objects in a database using a set of keywords. The structural information returned can be either trees or subgraphs representing how the objects that contain the required keywords are interconnected in a relational database or in an XML database. Structural keyword search is completely different from finding documents that contain all the user-given keywords. The former focuses on the interconnected object structures, whereas the latter focuses on the object content. The book is organized as follows. In Chapter 1, we highlight the main research issues in structural keyword search in different contexts. In Chapter 2, we focus on supporting structural keyword search in a relational database management system using the SQL query language. We concentrate on how to generate a set of SQL queries that can find all the structural information among records in a relational database completely, and how to evaluate the generated set of SQL queries efficiently. In Chapter 3, we discuss graph algorithms for structural keyword search by treating an entire relational database as a large data graph. In Chapter 4, we discuss structural keyword search in a large tree-structured XML database. In Chapter 5, we highlight several interesting research issues regarding keyword search on databases. The book can be used either as an extended survey for people who are interested in structural keyword search or as a reference book for a postgraduate course on related topics. Table of Contents: Introduction / Schema-Based Keyword Search on Relational Databases / Graph-Based Keyword Search / Keyword Search in XML Databases / Other Topics for Keyword Search on Databases

Advanced Data Mining and Applications

Author: Shuigeng Zhou
Publisher: Springer Science & Business Media
ISBN: 3642355277
Category : Computers
Languages : en
Pages : 812

Book Description
This book constitutes the refereed proceedings of the 8th International Conference on Advanced Data Mining and Applications, ADMA 2012, held in Nanjing, China, in December 2012. The 32 regular papers and 32 short papers presented in this volume were carefully reviewed and selected from 168 submissions. They are organized in topical sections named: social media mining; clustering; machine learning: algorithms and applications; classification; prediction, regression and recognition; optimization and approximation; mining time series and streaming data; Web mining and semantic analysis; data mining applications; search and retrieval; information recommendation and hiding; outlier detection; topic modeling; and data cube computing.

Social Network Data Analytics

Author: Charu C. Aggarwal
Publisher: Springer Science & Business Media
ISBN: 1441984623
Category : Computers
Languages : en
Pages : 508

Book Description
Social network analysis applications have experienced tremendous advances within the last few years due in part to increasing trends towards users interacting with each other on the internet. Social networks are organized as graphs, and the data on social networks takes on the form of massive streams, which are mined for a variety of purposes. Social Network Data Analytics covers an important niche in the social network analytics field. This edited volume, contributed by prominent researchers in this field, presents a wide selection of topics on social network data mining such as Structural Properties of Social Networks, Algorithms for Structural Discovery of Social Networks and Content Analysis in Social Networks. This book is also unique in focussing on the data analytical aspects of social networks in the internet scenario, rather than the traditional sociology-driven emphasis prevalent in the existing books, which do not focus on the unique data-intensive characteristics of online social networks. Emphasis is placed on simplifying the content so that students and practitioners benefit from this book. This book targets advanced level students and researchers concentrating on computer science as a secondary text or reference book. Data mining, database, information security, electronic commerce and machine learning professionals will find this book a valuable asset, as well as primary associations such as ACM, IEEE and Management Science.

Managing and Mining Graph Data

Author: Charu C. Aggarwal
Publisher: Springer Science & Business Media
ISBN: 1441960457
Category : Computers
Languages : en
Pages : 623

Book Description
Managing and Mining Graph Data is a comprehensive survey book in graph management and mining. It contains extensive surveys on a variety of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, and chemical and biological data. The chapters are written by well-known researchers in the field, and provide a broad perspective of the area. This is the first comprehensive survey book on the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science and engineering.

Database Systems for Advanced Applications

Author: Christian S. Jensen
Publisher: Springer Nature
ISBN: 3030731944
Category : Computers
Languages : en
Pages : 683

Book Description
The three-volume set LNCS 12681-12683 constitutes the proceedings of the 26th International Conference on Database Systems for Advanced Applications, DASFAA 2021, held in Taipei, Taiwan, in April 2021. The 156 papers presented in this three-volume set were carefully reviewed and selected from 490 submissions. The topic areas for the selected papers include information retrieval, search and recommendation techniques; RDF, knowledge graphs, semantic web, and knowledge management; and spatial, temporal, sequence, and streaming data management, while the dominant keywords are network, recommendation, graph, learning, and model. These topic areas and keywords shed light on the direction in which research at DASFAA is moving. Due to the COVID-19 pandemic, this event was held virtually.

Concise Guide to Databases

Author: Konstantinos Domdouzis
Publisher: Springer Nature
ISBN: 3030422240
Category : Computers
Languages : en
Pages : 400

Book Description
Modern businesses depend on data for their very survival, creating a need for sophisticated databases and database technologies to help store, organise and transport their valuable data. This updated and expanded, easy-to-read textbook/reference presents a comprehensive introduction to databases, opening with a concise history of databases and of data as an organisational asset. As relational database management systems are no longer the only database solution, the book takes a wider view of database technology, encompassing big data, NoSQL, object and object-relational, and in-memory databases. Presenting both theoretical and practical elements, the new edition also examines the issues of scalability, availability, performance and security encountered when building and running a database in the real world. Topics and features: presents review and discussion questions at the end of each chapter, in addition to skill-building, hands-on exercises; provides new material on database adaptiveness, integration, and efficiency in relation to data growth; introduces a range of commercial databases and encourages the reader to experiment with these in an associated learning environment; reviews use of a variety of databases in business environments, including numerous examples; discusses areas for further research within this fast-moving domain. With its learning-by-doing approach, supported by both theoretical and practical examples, this clearly-structured textbook will be of great value to advanced undergraduate and postgraduate students of computer science, software engineering, and information technology. Practising database professionals and application developers will also find the book an ideal reference that addresses today's business needs.

Proceedings of the 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC – 16’)

Author: V. Vijayakumar
Publisher: Springer
ISBN: 3319303481
Category : Technology & Engineering
Languages : en
Pages : 508

Book Description
This proceedings volume contains selected papers that were presented at the 3rd International Symposium on Big Data and Cloud Computing Challenges, 2016, held at VIT University, India, on March 10 and 11. New research issues, challenges and opportunities shaping the future agenda in the field of Big Data and Cloud Computing are identified and presented throughout the book, which is intended for researchers, scholars, students, software developers and practitioners working at the forefront of their field. This book acts as a platform for exchanging ideas, setting questions for discussion, and sharing experience in the Big Data and Cloud Computing domain.

Graph Mining

Author: Deepayan Chakrabarti
Publisher: Morgan & Claypool Publishers
ISBN: 160845116X
Category : Computers
Languages : en
Pages : 209

Book Description
What does the Web look like? How can we find patterns, communities, and outliers in a social network? Which are the most central nodes in a network? These are the questions that motivate this work. Networks and graphs appear in many diverse settings, for example in social networks, computer-communication networks (intrusion detection, traffic management), protein-protein interaction networks in biology, document-text bipartite graphs in text retrieval, person-account graphs in financial fraud detection, and others. In this work, first we list several surprising patterns that real graphs tend to follow. Then we give a detailed list of generators that try to mirror these patterns. Generators are important because they can help with "what if" scenarios, extrapolations, and anonymization. Then we provide a list of powerful tools for graph analysis, specifically spectral methods (Singular Value Decomposition (SVD)), tensors, and case studies like the famous "PageRank" algorithm and the "HITS" algorithm for ranking web search results. Finally, we conclude with a survey of tools and observations from related fields like sociology, which provide complementary viewpoints. Table of Contents: Introduction / Patterns in Static Graphs / Patterns in Evolving Graphs / Patterns in Weighted Graphs / Discussion: The Structure of Specific Graphs / Discussion: Power Laws and Deviations / Summary of Patterns / Graph Generators / Preferential Attachment and Variants / Incorporating Geographical Information / The RMat / Graph Generation by Kronecker Multiplication / Summary and Practitioner's Guide / SVD, Random Walks, and Tensors / Tensors / Community Detection / Influence/Virus Propagation and Immunization / Case Studies / Social Networks / Other Related Work / Conclusions

Community Search over Big Graphs

Author: Xin Huang
Publisher: Springer Nature
ISBN: 3031018745
Category : Computers
Languages : en
Pages : 188

Book Description
Communities serve as basic structural building blocks for understanding the organization of many real-world networks, including social, biological, collaboration, and communication networks. Recently, community search over graphs has attracted significantly increasing attention, from small, simple, and static graphs to big, evolving, attributed, and location-based graphs. In this book, we first review the basic concepts of networks, communities, and various kinds of dense subgraph models. We then survey the state of the art in community search techniques on various kinds of networks across different application areas. Specifically, we discuss cohesive community search, attributed community search, social circle discovery, and geo-social group search. We highlight the challenges posed by different community search problems. We present their motivations, principles, methodologies, algorithms, and applications, and provide a comprehensive comparison of the existing techniques. This book finally concludes by listing publicly available real-world datasets and useful tools for facilitating further research, and by offering further readings and future directions of research in this important and growing area.