String Matching Algorithms

Does data science also use string matching techniques?
Naive String Algorithm
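
As a point of comparison for what follows, here is a minimal sketch of the naive approach: slide the pattern over the text one position at a time and compare characters at each alignment. The function name `naive_search` and the sample strings are illustrative assumptions, not taken from the article.

```python
def naive_search(text: str, pattern: str) -> list[int]:
    """Return every 0-based index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    matches = []
    # Try each possible alignment of the pattern against the text.
    for s in range(n - m + 1):
        # The slice comparison costs up to m character comparisons.
        if text[s:s + m] == pattern:
            matches.append(s)
    return matches


print(naive_search("abracadabra", "abra"))  # [0, 7]
```

In the worst case this performs on the order of (n - m + 1) * m character comparisons, which is the cost that hashing-based methods try to avoid.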

The Rabin-Karp Algorithm

In the Rabin-Karp algorithm, each character is treated as a decimal digit, and values are computed modulo a prime number, say 13. (a) A text string. A window of length 5 is shown. The numerical value of this window is computed modulo 13, yielding the value 7. (b) The same text string, with values computed modulo 13 for each possible position of a length-5 window.
Using the pattern P = 31415, we look for windows whose value modulo 13 is 7, because 31415 ≡ 7 (mod 13). Two such windows are found. The first, starting at text position 7, is an actual occurrence of the pattern; the second, starting at text position 13, is a false match. (c) Computing the value for a window in constant time, given the value for the previous window. The first window has value 31415. Dropping the high-order digit 3, shifting left (multiplying by 10), and then adding the low-order digit 2 gives the new value 14152. Because all computations are performed modulo 13, however, the value for the first window is 7 and the value for the next window is 8.
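
The constant-time update described above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the names `D`, `Q`, `M`, and `roll` are assumptions introduced for illustration.

```python
D = 10  # radix: each character is a decimal digit
Q = 13  # prime modulus used in the example
M = 5   # window length


def roll(prev: int, old_digit: int, new_digit: int) -> int:
    """Update a window value modulo Q in constant time."""
    high = pow(D, M - 1, Q)  # weight of the high-order digit: 10^4 mod 13
    # Drop the high-order digit, shift left by one digit, add the new digit.
    return ((prev - old_digit * high) * D + new_digit) % Q


first = 31415 % Q
second = roll(first, old_digit=3, new_digit=2)
print(first, second)  # 7 8
print(14152 % Q)      # 8, the same value computed directly
```

Python's `%` always returns a non-negative result for a positive modulus, so the intermediate subtraction can go negative without affecting the answer.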
Rabin-Karp Algorithm
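
A minimal sketch of the matcher, assuming the text and pattern consist only of decimal digits as in the example (radix 10, modulus 13). The function name `rabin_karp` and its return format are assumptions made here, and the text string is also an assumption, chosen so that the windows at positions 7 and 13 behave as described in the example. Positions are reported 1-based to match that description.

```python
def rabin_karp(text: str, pattern: str, d: int = 10, q: int = 13):
    """Return (matches, spurious) as lists of 1-based starting positions."""
    n, m = len(text), len(pattern)
    matches, spurious = [], []
    if m == 0 or m > n:
        return matches, spurious

    h = pow(d, m - 1, q)  # weight of the high-order digit, modulo q
    p = t = 0
    # Hash the pattern and the first window of the text.
    for i in range(m):
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q

    for s in range(n - m + 1):
        if p == t:
            # Equal hashes only suggest a match; verify character by character.
            if text[s:s + m] == pattern:
                matches.append(s + 1)
            else:
                spurious.append(s + 1)
        if s < n - m:
            # Roll the hash: drop text[s], shift, append text[s + m].
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return matches, spurious


TEXT = "2359023141526739921"  # assumed text string, not given in the article
print(rabin_karp(TEXT, "31415"))  # ([7], [13])
```

The explicit verification step is what separates the real occurrence at position 7 from the false match at position 13. A modulus as small as 13 is used only to keep the numbers readable; a larger prime would produce fewer false matches.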

Limitations of Rabin-Karp Algorithm
