Theoretical and Empirical Comparisons of Approximate String Matching Algorithms

Author: University of California, Berkeley. Computer Science Division
Publisher:
ISBN:
Category : Computer algorithms
Languages : en
Pages : 14

Book Description
We study in depth a model of non-exact pattern matching based on edit distance, which is the minimum number of substitutions, insertions, and deletions needed to transform one string of symbols into another. More precisely, the k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, and the number k of differences (substitutions, insertions, deletions) allowed in a match, and asks for all locations in the text where a match occurs. We have carefully implemented and analyzed various O(kn) algorithms based on dynamic programming (DP), paying particular attention to dependence on b, the alphabet size. An empirical observation on the average values of the DP tabulation makes apparent each algorithm's dependence on b. A new algorithm is presented that computes far fewer entries of the DP table. In practice, its speedup over the previous fastest algorithm is 2.5X for a binary alphabet, 4X for a four-letter alphabet, and 10X for a twenty-letter alphabet. We give a probabilistic analysis of the DP table in order to prove that the expected running time of our algorithm (as well as an earlier "cut-off" algorithm due to Ukkonen) is O(kn) for random text. Furthermore, we give a heuristic argument that our algorithm is O(kn/(√b - 1)) on the average, when alphabet size is taken into consideration.
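
To make the k differences problem concrete, here is a minimal Python sketch of the classic column-by-column DP, followed by a simplified version of the cut-off idea attributed to Ukkonen in the description. It is not the report's new algorithm, and it is not taken from the report. The function names, the convention of reporting 1-based end positions of matches, and the capping of cell values at k + 1 are assumptions made for illustration.

```python
def k_differences(pattern, text, k):
    """Plain O(mn) DP: return 1-based end positions j in text where some
    substring ending at j is within edit distance k of pattern."""
    m = len(pattern)
    col = list(range(m + 1))          # column 0: D[i][0] = i
    ends = []
    for j, c in enumerate(text, start=1):
        diag = col[0]                 # D[i-1][j-1]; row 0 is always 0
        for i in range(1, m + 1):
            old = col[i]              # D[i][j-1]
            col[i] = min(diag + (pattern[i - 1] != c),  # match / substitution
                         old + 1,                        # skip a text symbol
                         col[i - 1] + 1)                 # skip a pattern symbol
            diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


def k_differences_cutoff(pattern, text, k):
    """Cut-off variant (illustrative): only rows up to one past the last
    row whose value is <= k are recomputed in each column."""
    m = len(pattern)
    cap = k + 1                       # every value above k is stored as k + 1
    col = [min(i, cap) for i in range(m + 1)]
    last = min(k, m)                  # last row with col[row] <= k
    ends = []
    for j, c in enumerate(text, start=1):
        top = min(last + 1, m)        # rows below top cannot drop to <= k yet
        diag = 0
        for i in range(1, top + 1):
            old = col[i]
            col[i] = min(diag + (pattern[i - 1] != c),
                         old + 1,
                         col[i - 1] + 1,
                         cap)
            diag = old
        last = top
        while col[last] > k:          # col[0] == 0, so this always stops
            last -= 1
        if last == m:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(k_differences("survey", "surgery", 2))
    print(k_differences_cutoff("survey", "surgery", 2))
```

For the pattern "survey", the text "surgery", and k = 2, both functions return [5, 6, 7]. The cut-off version relies on the fact that values along a diagonal of the DP table never decrease, so a row more than one past the last row with value at most k cannot hold a match in the next column.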

LATIN'98: Theoretical Informatics

Author: Claudio L. Lucchesi
Publisher: Springer Science & Business Media
ISBN: 9783540642756
Category : Computers
Languages : en
Pages : 408

Book Description
This book constitutes the refereed proceedings of the Third Latin American Symposium on Theoretical Informatics, LATIN'98, held in Campinas, Brazil, in April 1998. The 28 revised full papers presented together with five invited surveys were carefully selected from a total of 53 submissions based on 160 referees' reports. The papers are organized in sections on algorithms and complexity; automata, transition systems and combinatorics on words; computational geometry and graph drawing; cryptography; graph theory and algorithms on graphs; packet routing; parallel algorithms; and pattern matching and browsing.

Combinatorial Pattern Matching

Author: Gregory Kucherov
Publisher: Springer Science & Business Media
ISBN: 3642024408
Category : Computers
Languages : en
Pages : 381

Book Description
This book constitutes the refereed proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching, CPM 2009, held in Lille, France in June 2009. The 27 revised full papers presented together with 3 invited talks were carefully reviewed and selected from 63 submissions. The papers address all areas related to combinatorial pattern matching and its applications, such as coding and data compression, computational biology, data mining, information retrieval, natural language processing, pattern recognition, string algorithms, string processing in databases, symbolic computing and text searching.

String Processing and Information Retrieval

Author: Mario A. Nascimento
Publisher: Springer
ISBN: 3540399844
Category : Computers
Languages : en
Pages : 389

Book Description
This volume of the Lecture Notes in Computer Science series provides a comprehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 10th International Symposium on String Processing and Information Retrieval, SPIRE 2003, held in Manaus, Brazil. SPIRE 2003 received 54 full submissions from 17 countries, namely: Argentina (2), Australia (2), Brazil (9), Canada (1), Chile (4), Colombia (2), Czech Republic (1), Finland (10), France (1), Japan (2), Korea (5), Malaysia (1), Portugal (2), Spain (6), Turkey (1), UK (1), USA (4); the numbers in parentheses indicate the number of submissions from that country. In the nontrivial task of selecting the papers to be published in these proceedings we were fortunate to count on a very international program committee with 43 members, representing all continents but one. These people, in turn, used the help of 40 external referees. During the review process all but a few papers had four reviews instead of the usual three, and at the end 21 submissions were accepted to be published as full papers, yielding an acceptance rate of about 38%. An additional set of six short papers was also accepted. The technical program spans the two well-defined scopes of SPIRE (string processing and information retrieval) with a number of papers also focusing on important application domains such as bioinformatics. SPIRE 2003 also features two invited speakers: Krishna Bharat (Google, Inc.) and João Meidanis (State Univ. of Campinas and Scylla Bioinformatics).

Combinatorial Pattern Matching

Author: Dan Hirschberg
Publisher: Springer Science & Business Media
ISBN: 9783540612582
Category : Computers
Languages : en
Pages : 408

Book Description
This book constitutes the refereed proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching, CPM '96, held in Laguna Beach, California, USA, in June 1996. The 26 revised full papers included were selected from a total of 48 submissions; also included are two invited papers. Combinatorial pattern matching has become a full-fledged area of algorithmics with important applications in recent years. The book addresses all relevant aspects of combinatorial pattern matching and its importance in information retrieval, pattern recognition, compiling, data compression, program analysis, and molecular biology and thus describes the state of the art in the area.

Algorithms and Data Structures

Author: Frank Dehne
Publisher: Springer Science & Business Media
ISBN: 9783540633075
Category : Computers
Languages : en
Pages : 492

Book Description
The book is an introduction to the theory of cubic metaplectic forms on the 3-dimensional hyperbolic space and the author's research on cubic metaplectic forms on special linear and symplectic groups of rank 2. The topics include: Kubota and Bass-Milnor-Serre homomorphisms, cubic metaplectic Eisenstein series, cubic theta functions, Whittaker functions. A special method is developed and applied to find Fourier coefficients of the Eisenstein series and cubic theta functions. The book is intended for readers, with beginning graduate-level background, interested in further research in the theory of metaplectic forms and in possible applications.

Combinatorial Pattern Matching

Author: Martin Farach-Colton
Publisher: Springer Science & Business Media
ISBN: 9783540647393
Category : Computers
Languages : en
Pages : 268

Book Description
This is a fair overview of the basic problems in Solar Physics. The authors address not only the physics that is well understood but also discuss many open questions. The lecturers' involvement in the SOHO mission guarantees a modern and up-to-date analysis of observational data and makes this volume an extremely valuable source for further research.

A Comparison of Approximate String Matching Algorithms

Author: Petteri Jokinen
Publisher:
ISBN: 9789514559761
Category : Pattern recognition systems
Languages : en
Pages : 22

Book Description
Abstract: "Experimental comparison of the running time of approximate string matching algorithms for the k differences problem is presented. Given a pattern string, a text string and an integer k, the task is to find all approximate occurrences of the pattern in the text with at most k differences (insertions, deletions, changes). Besides a new algorithm based on suffix automata, we consider six other algorithms based on different approaches including dynamic programming, Boyer-Moore string matching and the distribution of characters. It turns out that none of the algorithms is the best for all values of the problem parameters, and the speed differences between the methods can be large."

Combinatorial Pattern Matching

Author: Alberto Apostolico
Publisher: Springer Science & Business Media
ISBN: 3540438629
Category : Mathematics
Languages : en
Pages : 298

Book Description
The papers contained in this volume were presented at the 13th Annual Symposium on Combinatorial Pattern Matching, held July 3–5, 2002 at the Hotel Uminonakamichi, in Fukuoka, Japan. They were selected from 37 abstracts submitted in response to the call for papers. In addition, there were invited lectures by Shinichi Morishita (University of Tokyo) and Hiroki Arimura (Kyushu University). Combinatorial Pattern Matching (CPM) addresses issues of searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays, in various formats. The goal is to derive nontrivial combinatorial properties of such structures and to exploit these properties in order to achieve superior performance for the corresponding computational problems. On the other hand, an important goal is to analyze and pinpoint the properties and conditions under which searches cannot be performed efficiently. Over the past decade a steady flow of high-quality research on this subject has changed a sparse set of isolated results into a full-fledged area of algorithmics. This area is continuing to grow even further due to the increasing demand for speed and efficiency that stems from important applications such as the World Wide Web, computational biology, computer vision, and multimedia systems. These involve requirements for information retrieval in heterogeneous databases, data compression, and pattern recognition. The objective of the annual CPM gathering is to provide an international forum for research in combinatorial pattern matching and related applications.

Combinatorial Pattern Matching

Author: Raffaele Giancarlo
Publisher: Springer
ISBN: 3540451234
Category : Computers
Languages : en
Pages : 434

Book Description
This book constitutes the refereed proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000, held in Montreal, Canada, in June 2000. The 29 revised full papers presented together with 3 invited contributions and 2 tutorial lectures were carefully reviewed and selected from 44 submissions. The papers are devoted to current theoretical and algorithmic issues of searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays, as well as to advanced applications of CPM in areas such as the Internet, computational biology, multimedia systems, information retrieval, data compression, and pattern recognition.