SIVERSITY ScIE\CE TECH\OLoGY CHINA 常用述语和定义 Parameters.记文本串为T,模式串为P n:the length of the text m: the length of the pattern(string Typically. n>> e.g. n=1 million, m= 1 hundred o: the size of the alphabet ∑: the alphabet Cn: the expected number of comparisons performed by an algorithm while searching the pattern in a text of length n 021/2 &T
2021/2/4 Department of Computer Science & Technology 6 常用述语和定义 ◼ Parameters. 记文本串为 T,模式串为 P ◼ n: the length of the text. ◼ m : the length of the pattern (string). ◼ Typically, n >> m. ◼ e.g., n = 1 million, m = 1 hundred ◼ σ : the size of the alphabet. ◼ ∑ : the alphabet. ◼ Cn : the expected number of comparisons performed by an algorithm while searching the pattern in a text of length n
IVERSITY ScIE\CE TECH\OLoGY 串匹配算法概述 CHINA 口目前教科书上所介绍的串匹配算法基本原理是: 利用一个大小等同于模式长度的 windo对文本串进行扫描; 2.首先将模式串与文本串的左端对齐; 3.对模式串与文本串的对应字符进行对比--称为一次 attempt 4.在每次成功匹配或每次失配之后,将 window右移; 5.重复3,4两步直到 window的右端超出文本串的右端。 口这种方法称为 sliding window mechanism 在将文本串中的当前 indow部份与模式串对比时可以: 从左到右,也可以从右到左,甚至可以用特定次序。 021/2 &T
2021/2/4 Department of Computer Science & Technology 7 串匹配算法概述 目前教科书上所介绍的串匹配算法基本原理是: 1. 利用一个大小等同于模式长度的 window 对文本串进行扫描; 2. 首先将模式串与文本串的左端对齐; 3. 对模式串与文本串的对应字符进行对比----称为一次 attempt 4. 在每次成功匹配或每次失配之后,将 window 右移; 5. 重复3,4两步直到 window 的右端超出文本串的右端。 这种方法称为 sliding window mechanism. ➢ 在将文本串中的当前window部份与模式串对比时可以: 从左到右,也可以从右到左,甚至可以用特定次序
串匹配算法概述(续)8 VERSITI E\CE TECH\OLOGY CHINA 口 From left to right Karp and rabin Knuth, Morris and Pratt 口 From right to left Boyer and moore H orspoo In any order Brute Force Algorithms( Naive Algorithm 021/2 &T
2021/2/4 Department of Computer Science & Technology 8 串匹配算法概述 (续) From left to right ➢ Karp and Rabin ➢ Knuth,Morris and Pratt From right to left ➢ Boyer and Moore ➢ Horspool In any order ➢ Brute Force Algorithms( Naive Algorithm )
IVERSITY ScIE\CE TECH\OLoGY 8.1 Brute force算法 CHINA Brute force Check for pattern starting at every text position, trying to match any substring of length m in the text with the pattern Analysis of brute force Running time depends on pattern and text. can be slow when strings repeat themselves Worst case: O(MN) comparisons. too slow when M and N are large 021/2 &T
2021/2/4 Department of Computer Science & Technology 9 8.1 Brute Force 算法 ◼ Brute force: Check for pattern starting at every text position, trying to match any substring of length m in the text with the pattern 。 Analysis of brute force. ➢ Running time depends on pattern and text. ➢ can be slow when strings repeat themselves Worst case: O(MN) comparisons. ➢ too slow when M and N are large
SIVERSITY ScIE\CE TECH\OLoGY Brute force算法伪代码1 CHINA Brute-Force-I(T, P) while i<n-m do j=0 /*left to right scan ofP while j<m and Plj+1]=t[+j+l do j=j+1 if j=m then Report match at position(i-j+1) i=i+1: Return 021/2 &T
2021/2/4 Department of Computer Science & Technology 10 Brute Force算法伪代码1 Brute-Force-1 (T,P) ; i =0 ; while i≤n-m do j = 0; //* left to right scan of P while j < m and P[j+1] = T[i+j+1] do j = j+1; if j=m then Report_match_at_position(i-j+1); i = i+1; Return