Used as notes for ECS-122B-Main

String Matching

Efficient algorithms for finding pattern P (ABC) in text T (EFGABCEFG).

Uses:

  • Finding matches in text documents (text editors, pdf viewers)
  • Motifs within DNA sequences (DNA is a massive string)

Approaches: Length of text (n), Length of pattern (m)

  • Naive String Matching -
    • The case of string matching with no repeated characters in P -
  • Rabin Karp -
    • Naive vs Enhanced
    • Naive: Using straight up representations of substrings for comparison
    • Enhanced: Comparing the hashed value of subtrings (q) and checking if matches are true or spurious hits
  • Z-Algorithm