A novel algorithm for pattern matching with back references

Liu Yang, Vinod Ganapathy, Pratyusa Manadhata, Ye Wu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

Modern network security applications, such as network-based intrusion detection systems (NIDS) and firewalls, routinely employ deep packet inspection to identify malicious traffic. In deep packet inspection, the contents of network traffic are matched against patterns of malicious traffic to identify attack-carrying packets. The pattern matching algorithms employed for deep packet inspection must be fast, as the algorithms are often implemented on middle-boxes residing on high-speed gigabits per second links. The majority of patterns employed in network security applications are regular languages. However, regular language-based patterns have limited expressive power and are not capable of describing some complex features in network payload. Back reference is an important feature provided by many pattern matching tools, e.g., PCRE, the regular expression libraries of Java, Perl, and Python. Back references are used to identify repeated patterns in input strings. Patterns containing back-references are non-regular languages. Very little work has been done to improve the time-efficiency of back reference-based pattern matching. The de facto algorithm to implement back reference is recursive backtracking, but it is vulnerable to algorithmic complexity attacks. In this paper, we present a novel approach to implement back references. The basic idea of our approach is to transform a back reference problem to a conditional submatch problem, and represent it with a Non-deterministic Finite Automata (NFA)-like machine subject to some constraints. Our experimental results show that our approach resists known algorithmic complexity attacks, and is faster than PCRE by up to three orders of magnitude for certain types of patterns.

Original languageEnglish (US)
Title of host publication2015 IEEE 34th International Performance Computing and Communications Conference, IPCCC 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781467385909
DOIs
StatePublished - Feb 17 2016
Event34th IEEE International Performance Computing and Communications Conference, IPCCC 2015 - Nanjing, China
Duration: Dec 14 2015Dec 16 2015

Publication series

Name2015 IEEE 34th International Performance Computing and Communications Conference, IPCCC 2015

Other

Other34th IEEE International Performance Computing and Communications Conference, IPCCC 2015
Country/TerritoryChina
CityNanjing
Period12/14/1512/16/15

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture

Keywords

  • Back reference
  • Finite automaton
  • Network-based Intrusion Detection System
  • Pattern matching

Fingerprint

Dive into the research topics of 'A novel algorithm for pattern matching with back references'. Together they form a unique fingerprint.

Cite this