字符串匹配算法Sunday的改进

西安科技大学 计算机科学与技术学院,陕西 西安 710054

Sunday算法; Zhusunday算法; 模式匹配; 坏字符

Improvement of Sunday pattern matching algorithm
ZHU Ning-hong

(College of Computer Science and Engineering,Xi'an University of Science and Technology,Xi'an 710054,China)

Sunday algorithm; Zhusunday algorithm; pattern matching; bad character

DOI: 10.13800/j.cnki.xakjdxxb.2016.0119

备注

字符串的模式匹配应用十分广泛,在信息的搜索查询等方面具有重要作用,研究串匹配算法的效率具有重要的理论价值和实际意义。在分析几种经典模式匹配算法的基础上,对当前应用最广泛的Sunday算法提出了改进的算法Zhusunday.算法主要改进之处是:在字符串从右向左匹配过程中,当文本字符中出现不匹配模式字符串的字符且该文本字符不是坏字符时,算法从右向左搜索当前文本字符在模式串中出现的位置; 找到当前字符在模式串中的位置后继续再向左匹配模式串字符一次,如果仍不匹配时,模式窗口比Sunday算法多向右移动一个字符。改进的算法提高了模式匹配的执行效率,通过大量对比实验证明了该算法的有效性。最后得出结论:在实际应用中,坏字符大量存在的情况下,改进算法的最优时间复杂度可达O(n/m),在同一时间复杂度下,比Sunday算法效率提高25~50%.

String pattern matching is widely used in information search and query,and it is important to study the efficiency of the string matching algorithm with theoretical value and practical significance.On the basis of the analysis of several classical pattern matching algorithms,the paper puts forward a Zhusunday algorithm which is improved from the most widely used Sunday algorithm.The improved position of the algorithm is:in the matching process from the right of the text string to the left,when the character of the text does not match the character of the pattern string,it is not a bad character,the algorithm searches the position of the current text character from the right to the left in the pattern string.After finding the position,the improved algorithm continues to match the two characters in the left for another time.If there is no match,the pattern window will move to right one step more than the movement of Sunday algorithm.The improved algorithm improves the efficiency of the pattern matching,and the effectiveness of the proposed algorithm is proved by a lot of experiments.Finally the paper draws a conclusion:in practical application there are a large number of the bad characters,the optimal time complexity of the improved algorithm is O(n/m),and at the same time complexity,the improved algorithm is more efficient than the Sunday algorithm by 25~50%.