A Comparison of Approximate String Matching Algorithms

Author: Petteri Jokinen
Publisher:
ISBN: 9789514559761
Category : Pattern recognition systems
Languages : en
Pages : 22

Book Description
Abstract: "Experimental comparison of the running time of approximate string matching algorithms for the k differences problem is presented. Given a pattern string, a text string and an integer k, the task is to find all approximate occurrences of the pattern in the text with at most k differences (insertions, deletions, changes). Besides a new algorithm based on suffix automata, we consider six other algorithms based on different approaches including dynamic programming, Boyer-Moore string matching and the distribution of characters. It turns out that none of the algorithms is the best for all values of the problem parameters, and the speed differences between the methods can be large."
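
The k differences problem stated in the abstract can be illustrated with the classical dynamic programming approach. The sketch below is not taken from the report; the function name `k_differences` and the convention of reporting 1-based end positions are assumptions made for illustration.

```python
def k_differences(pattern: str, text: str, k: int) -> list[int]:
    """Return 1-based end positions in text where pattern occurs
    with at most k differences (insertions, deletions, changes)."""
    m = len(pattern)
    # col[i] = fewest differences between pattern[:i] and some
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value of row i-1 in the previous column
        # col[0] stays 0 because a match may start anywhere in the text.
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new = min(prev_diag + cost,  # match or change
                      col[i] + 1,        # extra text character
                      col[i - 1] + 1)    # missing pattern character
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            ends.append(j)
    return ends


print(k_differences("abc", "xxabcxx", 0))  # [5]
print(k_differences("abc", "abd", 1))      # [2, 3]
```

This baseline takes time proportional to the pattern length times the text length; the algorithms compared in the report aim to do better for particular parameter ranges.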

Theoretical and Empirical Comparisons of Approximate String Matching Algorithms

Author: University of California, Berkeley. Computer Science Division
Publisher:
ISBN:
Category : Computer algorithms
Languages : en
Pages : 14

Book Description
We study in depth a model of non-exact pattern matching based on edit distance, which is the minimum number of substitutions, insertions, and deletions needed to transform one string of symbols to another. More precisely, the k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, the number k of differences (substitutions, insertions, deletions) allowed in a match, and asks for all locations in the text where a match occurs. We have carefully implemented and analyzed various O(kn) algorithms based on dynamic programming (DP), paying particular attention to dependence on b, the alphabet size. An empirical observation on the average values of the DP tabulation makes apparent each algorithm's dependence on b. A new algorithm is presented that computes far fewer entries of the DP table. In practice, its speedup over the previous fastest algorithm is 2.5X for a binary alphabet, 4X for a four-letter alphabet, and 10X for a twenty-letter alphabet. We give a probabilistic analysis of the DP table in order to prove that the expected running time of our algorithm (as well as an earlier "cut-off" algorithm due to Ukkonen) is O(kn) for random text. Furthermore, we give a heuristic argument that our algorithm is O(kn/(√b - 1)) on the average, when alphabet size is taken into consideration.
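
The "cut-off" idea mentioned above can be sketched as follows. This is a minimal illustration of the general technique (only the part of each DP column that can still hold a value of at most k is computed), not the new algorithm of the report and not necessarily Ukkonen's exact formulation. The function name `k_differences_cutoff` and the capping of stored values at k + 1 are assumptions of this sketch.

```python
def k_differences_cutoff(pattern: str, text: str, k: int) -> list[int]:
    """Return 1-based end positions of matches with at most k differences,
    computing only the 'active' prefix of each DP column."""
    m = len(pattern)
    cap = k + 1  # any stored value of k + 1 means "more than k"
    col = [min(i, cap) for i in range(m + 1)]
    last = min(k, m)  # largest row whose value is still <= k
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = 0                 # col[0] is always 0
        top = min(last + 1, m)        # rows below top cannot drop to <= k
        for i in range(1, top + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new = min(prev_diag + cost, col[i] + 1, col[i - 1] + 1, cap)
            prev_diag = col[i]
            col[i] = new
        # Shrink the active region back to the last row with value <= k.
        last = top
        while col[last] > k:
            last -= 1
        if last == m:
            ends.append(j)
    return ends


print(k_differences_cutoff("abc", "xxabcxx", 0))  # [5]
print(k_differences_cutoff("abc", "abd", 1))      # [2, 3]
```

The active region can grow by at most one row per text character because values along a diagonal of the DP table never decrease, which is what makes skipping the lower rows safe.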

Flexible Pattern Matching in Strings

Author: Gonzalo Navarro
Publisher: Cambridge University Press
ISBN: 9780521813075
Category : Computers
Languages : en
Pages : 236

Book Description
Presents recently developed algorithms for searching for simple, multiple and extended strings, regular expressions, exact and approximate matches.

String Searching Algorithms

Author: Graham A Stephen
Publisher: World Scientific
ISBN: 9814501867
Category : Computers
Languages : en
Pages : 257

Book Description
String searching is a subject of both theoretical and practical interest in computer science. This book presents a bibliographic overview of the field and an anthology of detailed descriptions of the principal algorithms available. The aim is twofold: on the one hand, to provide an easy-to-read comparison of the available techniques in each area, and on the other, to furnish the reader with a reference to in-depth descriptions of the major algorithms. Topics covered include methods for finding exact and approximate string matches, calculating ‘edit’ distances between strings, finding common sequences and finding the longest repetitions within strings. For clarity, all the algorithms are presented in a uniform format and notation.

A Comparison of String Matching Algorithms

Author: Eric Lee Hensley
Publisher:
ISBN:
Category : Algorithms
Languages : en
Pages : 168

Practical Methods for Approximate String Matching

Author: Heikki Hyyrö
Publisher:
ISBN: 9789514458187
Category : Information retrieval
Languages : en
Pages : 105

Book Description
Abstract: "Given a pattern string and a text, the task of approximate string matching is to find all locations in the text that are similar to the pattern. This type of search may be done for example in applications of spelling error correction or bioinformatics. Typically edit distance is used as the measure of similarity (or distance) between two strings. In this thesis we concentrate on unit-cost edit distance that defines the distance between two strings as the minimum number of edit operations that are needed in transforming one of the strings into the other. More specifically, we discuss the Levenshtein and the Damerau edit distances. Aproximate [sic] string matching algorithms can be divided into off-line and on-line algorithms depending on whether they may or may not, respectively, preprocess the text. In this thesis we propose practical algorithms for both types of approximate string matching as well as for computing edit distance. Our main contributions are a new variant of the bit-parallel approximate string matching algorithm of Myers, a method that makes it easy to modify many existing Levenshtein edit distance algorithms into using the Damerau edit distance, a bit-parallel algorithm for computing edit distance, a more error tolerant version of the ABNDM algorithm, a two-phase filtering scheme, a tuned indexed approximate string matching method for genome searching, and an improved and extended version of the hybrid index of Navarro and Baeza-Yates. To evaluate their practicality, we compare most of the proposed methods with previously existing algorithms. The test results support the claim of the title of this thesis that our proposed algorithms work well in practice."
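
The difference between the two unit-cost distances named in the abstract can be illustrated with a plain dynamic programming sketch. It is not one of the thesis's bit-parallel algorithms. The function name `edit_distance` and its `transpositions` flag are hypothetical, and the transposition rule shown is the common restricted variant (a transposed pair is not edited again), which is assumed here to be the intended meaning of Damerau distance.

```python
def edit_distance(a: str, b: str, transpositions: bool = False) -> int:
    """Unit-cost edit distance between a and b.

    With transpositions=False this is the Levenshtein distance;
    with transpositions=True, swapping two adjacent characters
    also counts as a single operation (restricted Damerau variant).
    """
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]


print(edit_distance("abcd", "abdc"))                       # 2
print(edit_distance("abcd", "abdc", transpositions=True))  # 1
```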

Automatic Information Organization and Retrieval

Author: Gerard Salton
Publisher: New York : McGraw-Hill
ISBN:
Category : Computers
Languages : en
Pages : 536

Book Description
Textbook on methodology of automation in documentation work - covers EDP, computerisation, dictionary construction and operations, storage of and research for information, mathematical analysis and statistical method, evaluation of methodology, etc. Bibliography pp. 485 to 498, and flow diagrams.

Pattern Matching

Author: Source Wikipedia
Publisher: University-Press.org
ISBN: 9781230603896
Category :
Languages : en
Pages : 90

Book Description
Please note that the content of this book primarily consists of articles available from Wikipedia or other free sources online. Pages: 38. Chapters: Approximate string matching, Backtracking, Comparison of regular expression engines, Compressed pattern matching, Delimiter, Diff, Escape character, Findstr (computing), Find (command), Glob (programming), International Components for Unicode, List of regular expression software, Metacharacter, Parser Grammar Engine, Perl Compatible Regular Expressions, Ragel, ReDoS, RegexBuddy, ReteOO, Rete algorithm, Terminal and nonterminal symbols, Tom (pattern matching language), Wildcard character, Wildmat.

Combinatorial Pattern Matching

Author: Zvi Galil
Publisher: Lecture Notes in Computer Science
ISBN:
Category : Computers
Languages : en
Pages : 424

Book Description
This volume presents the proceedings of the 6th International Symposium on Combinatorial Pattern Matching, CPM '95, held in Espoo, Finland in July 1995. CPM addresses issues of searching and matching strings and more complicated patterns such as trees, regular expressions, extended expressions, etc. The aim is to derive non-trivial combinatorial properties in order to improve the performance of the corresponding computational problems. This volume presents 27 selected refereed full research papers and two invited papers; it addresses all current aspects of CPM and its applications such as the design and analysis of algorithms for pattern matching problems in strings, graphs, and hypertexts, as well as in biological sequences and molecules.

An Improved Approximate String Matching Algorithm Based Upon the Boyer-Moore Algorithm

Author: 謝一功
Publisher:
ISBN:
Category :
Languages : en
Pages :
