Author: Peter Buneman
Publisher: Assn for Computing Machinery
ISBN: 9780897915922
Category : Computer science
Languages : en
Pages : 566
Book Description
Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data
Author: Peter Buneman
Publisher: Assn for Computing Machinery
ISBN: 9780897915922
Category : Computer science
Languages : en
Pages : 566
Book Description
Learning Structure and Schemas from Documents
Author: Marenglen Biba
Publisher: Springer Science & Business Media
ISBN: 3642229123
Category : Computers
Languages : en
Pages : 449
Book Description
The rapidly growing volume of available digital documents in various formats, and the possibility of accessing them through Internet-based technologies, have made it necessary to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Due to the extremely large volumes of documents and their unstructured form, most research efforts in this direction are dedicated to automatically inferring structure and schemas that can help to better organize huge collections of documents and data. This book covers the latest advances in structure inference in heterogeneous collections of documents and data. The book brings a comprehensive view of the state of the art in the area, presents some lessons learned, and identifies new research issues, challenges and opportunities for further research and development. The selected chapters cover a broad range of research issues, from theoretical approaches to case studies and best practices in the field. Researchers, software developers, practitioners and students interested in the field of learning structure and schemas from documents will find the comprehensive coverage of this book useful for their research, academic, development and practice activities.
Study on Data Placement Strategies in Distributed RDF Stores
Author: D.D. Janke
Publisher: IOS Press
ISBN: 1643680692
Category : Computers
Languages : en
Pages : 312
Book Description
The distributed setting of RDF stores in the cloud poses many challenges, including how to optimize data placement on the compute nodes to improve query performance. In this book, a novel benchmarking methodology is developed for data placement strategies; one that overcomes these limitations by using a data-placement-strategy-independent distributed RDF store to analyze the effect of the data placement strategies on query performance. Frequently used data placement strategies have been evaluated, and this evaluation challenges the commonly held belief that data placement strategies which emphasize local computation lead to faster query executions. Indeed, results indicate that queries with a high workload can be executed faster on hash-based data placement strategies than on, for example, minimal edge-cut covers. The analysis of additional measurements indicates that vertical parallelization (i.e., a well-distributed workload) may be more important than horizontal containment (i.e., minimal data transport) for efficient query processing. Two such data placement strategies are proposed: the first, found in the literature, is entitled overpartitioned minimal edge-cut cover, and the second is the newly developed molecule hash cover. Evaluation revealed a balanced query workload and a high horizontal containment, which led to a high vertical parallelization. As a result, these strategies demonstrated better query performance than other frequently used data placement strategies. The book also tests the hypothesis that collocating small connected triple sets on the same compute node while balancing the number of triples stored on the different compute nodes leads to a high vertical parallelization.
Fast and Scalable Cloud Data Management
Author: Felix Gessert
Publisher: Springer Nature
ISBN: 3030435067
Category : Computers
Languages : en
Pages : 199
Book Description
The unprecedented scale at which data is both produced and consumed today has generated a large demand for scalable data management solutions facilitating fast access from all over the world. As one consequence, a plethora of non-relational, distributed NoSQL database systems have emerged in recent years and today’s data management system landscape has thus become somewhat hard to survey. As another consequence, complex polyglot designs and elaborate schemes for data distribution and delivery have become the norm for building applications that connect users and organizations across the globe – but choosing the right combination of systems for a given use case has become increasingly difficult as well. To help practitioners stay on top of that challenge, this book presents a comprehensive overview and classification of the current system landscape in cloud data management as well as a survey of the state-of-the-art approaches for efficient data distribution and delivery to end-user devices. The topics covered thus range from NoSQL storage systems and polyglot architectures (backend) via distributed transactions and Web caching (network) to data access and rendering performance in the client (end-user). By distinguishing popular data management systems by data model, consistency guarantees, and other dimensions of interest, this book provides an abstract framework for reasoning about the overall design space and the individual positions claimed by each of the systems therein. Building on this classification, this book further presents an application-driven decision guidance tool that breaks the process of choosing a set of viable system candidates for a given application scenario down into a straightforward decision tree.
Benchmarking Transaction and Analytical Processing Systems
Author: Anja Bog
Publisher: Springer Science & Business Media
ISBN: 3642380700
Category : Business & Economics
Languages : en
Pages : 170
Book Description
Systems for Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) are currently separate. The potential of the latest technologies and changes in operational and analytical applications over the last decade have given rise to the unification of these systems, which can be of benefit for both workloads. Research and industry have reacted and prototypes of hybrid database systems are now appearing. Benchmarks are the standard method for evaluating, comparing and supporting the development of new database systems. Because of the separation of OLTP and OLAP systems, existing benchmarks are only focused on one or the other. With the rise of hybrid database systems, benchmarks to assess these systems will be needed as well. Based on the examination of existing benchmarks, a new benchmark for hybrid database systems is introduced in this book. It is furthermore used to determine the effect of adding OLAP to an OLTP workload and is applied to analyze the impact of typically used optimizations in the historically separate OLTP and OLAP domains in mixed-workload scenarios.
Handbook of Research on Cloud Infrastructures for Big Data Analytics
Author: Raj, Pethuru
Publisher: IGI Global
ISBN: 1466658657
Category : Computers
Languages : en
Pages : 592
Book Description
Clouds are being positioned as the next-generation consolidated, centralized, yet federated IT infrastructure for hosting all kinds of IT platforms and for deploying, maintaining, and managing a wider variety of personal, as well as professional applications and services. Handbook of Research on Cloud Infrastructures for Big Data Analytics focuses exclusively on the topic of cloud-sponsored big data analytics for creating flexible and futuristic organizations. This book helps researchers and practitioners, as well as business entrepreneurs, to make informed decisions and consider appropriate action to simplify and streamline the arduous journey towards smarter enterprises.
Handbook of Mobile Data Privacy
Author: Aris Gkoulalas-Divanis
Publisher: Springer
ISBN: 3319981617
Category : Computers
Languages : en
Pages : 426
Book Description
This handbook covers the fundamental principles and theory, and the state-of-the-art research, systems and applications, in the area of mobility data privacy. It is primarily addressed to computer science and statistics researchers and educators, who are interested in topics related to mobility privacy. This handbook will also be valuable to industry developers, as it explains the state-of-the-art algorithms for offering privacy. By discussing a wide range of privacy techniques, providing in-depth coverage of the most important ones, and highlighting promising avenues for future research, this handbook also aims at attracting computer science and statistics students to this interesting field of research. The advances in mobile devices and positioning technologies, together with the progress in spatiotemporal database research, have made possible the tracking of mobile devices (and their human companions) at very high accuracy, while supporting the efficient storage of mobility data in data warehouses, which this handbook illustrates. This has provided the means to collect, store and process mobility data of an unprecedented quantity, quality and timeliness. As ubiquitous computing pervades our society, user mobility data represents a very useful but also extremely sensitive source of information. On one hand, the movement traces that are left behind by the mobile devices of the users can be very useful in a wide spectrum of applications such as urban planning, traffic engineering, and environmental pollution management. On the other hand, the disclosure of mobility data to third parties may severely jeopardize the privacy of the users whose movement is recorded, leading to abuse scenarios such as user tailing and profiling. 
A significant amount of research has been conducted in the last 15 years in the area of mobility data privacy and in important research directions such as privacy-preserving mobility data management, privacy in location sensing technologies and location-based services, privacy in vehicular communication networks, privacy in location-based social networks, and privacy in participatory sensing systems, all of which this handbook addresses. This handbook also identifies important privacy gaps in the use of mobility data. The research has resulted in the adoption of international laws for location privacy protection (e.g., in the EU, US, Canada, Australia, New Zealand, Japan, Singapore), as well as in a large number of interesting technologies for privacy-protecting mobility data, some of which have been made available through open-source systems and featured in real-world applications.
Advanced Applications and Structures in XML Processing: Label Streams, Semantics Utilization and Data Query Technologies
Author: Li, Changqing
Publisher: IGI Global
ISBN: 1615207287
Category : Social Science
Languages : en
Pages : 500
Book Description
"This book is for professionals and researchers working in the field of XML in various disciplines who want to improve their understanding of the XML data management technologies, such as XML models, XML query and update processing, XML query languages and their implementations, keywords search in XML documents, database, web service, publish/subscribe, medical information science, and e-business"--Provided by publisher.
SIGMOD/PODS '18
Author: Christopher Jermaine
Publisher:
ISBN: 9781450347037
Category :
Languages : en
Pages :
Book Description
SIGMOD/PODS '18: International Conference on Management of Data, Jun 03, 2018-Jun 08, 2018, Houston, USA. You can view more information about this proceeding and all of ACM's other published conference proceedings from the ACM Digital Library: http://www.acm.org/dl.
Building a Columnar Database on RAMCloud
Author: Christian Tinnefeld
Publisher: Springer
ISBN: 3319207113
Category : Computers
Languages : en
Pages : 139
Book Description
This book examines the field of parallel database management systems and illustrates the great variety of solutions based on a shared-storage or a shared-nothing architecture. Constantly dropping memory prices and the desire to operate with low-latency responses on large sets of data paved the way for main memory-based parallel database management systems. However, this area is currently dominated by the shared-nothing approach in order to preserve the in-memory performance advantage by processing data locally on each server. The main argument this book makes is that such a unilateral development will cease due to the combination of the following three trends: a) Today’s network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory on a local server and on a remote server to, and even below, a single order of magnitude. b) Modern storage systems scale gracefully, are elastic and provide high availability. c) A modern storage system such as Stanford’s RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main memory-based parallel database management system is desirable. The book demonstrates that the advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.