A Memory Efficient DFA using Compression and Pattern Segmentation

論文翻譯標題: 藉由壓縮技術和字串切割實現節省記憶體的 DFA
  • 李 袁碩

學生論文: Master's Thesis


As the traffic of Internet is grow quickly there are more and more malicious attacks and viruses spread over the Internet The role of network security systems such as network intrusion detection system (NIDS) firewalls and antivirus software have become more important These systems usually use regexes to inspect the payload of packet which is one of the most intensive tasks in the systems We have to develop a high-throughput algorithm that requires a small amount of memory to find out the hidden virus in packet payload Regular expressions is more complicated than simple patterns When multiple regular expressions are processed together the corresponding DFA may be so complicated and need a large amount of memory The numbers of states and transitions grow so fast when a new regular expression is added As the virus and malicious packets grow quickly more patterns is needed to be compared Therefore we hope to find a scheme to decrease the memory consumption of DFA and keep the search performance acceptable In this thesis we propose a memory efficient parallel compatible DFA that uses the techniques of compression and pattern segmentation With the idea of PFAC algorithm we propose a new DFA called compressed segmented DFA (CS-DFA) that needs less number of states and transitions than δFA Without considering the symbols of “ *” in the beginning of the regular expression we notice that a DFA state transitions for most symbols to the same next state As a result the transition table can be compressed by run-length encoding The number of transitions in the proposed CS-DFA is about a half of the transitions need in δFA and has 74% of the memory consumed by δFA In addition we focus on two conditions “Counting Constraints” and “Kleene Star” which cause state blowup frequently in pattern sets Then the technique of pattern segmentation is applied The memory consumption is smaller than pure CS-DFA without pattern segmentation by about 19% The search performance is still acceptable when using OpenMP
獎項日期2014 九月 3
監督員Yeim-Kuan Chang (Supervisor)