ADC

Pattern sets and data sets

Policy expressions that perform string matching against a large set of string patterns tend to become long and complex. Evaluating such expressions consumes significant processing cycles and memory, and the expressions themselves inflate the configuration size. You can create simpler, less resource-intensive expressions by using pattern matching.

Depending on the type of patterns that you want to match, you can use one of the following features to implement pattern matching:

  • A pattern set is an array of indexed patterns used for string matching during default syntax policy evaluation. Example of a pattern set: image types {svg, bmp, PNG, GIF, tiff, jpg}.
  • A data set is a specialized form of pattern set. It is an array of patterns of types number (integer), IPv4 address, or IPv6 address.

The difference between a pattern set (patset) and a data set (dataset) is that data set evaluation also applies a boundary check. For example, suppose that the pattern 1.1.1.1 is bound to both a patset and a dataset of type IPv4, that each is configured to check whether the IP address is present in the request, and that the input string is 1.1.1.11. The patset evaluation reports that 1.1.1.1 is present in the input, but the dataset evaluation returns false. The reason is the boundary check, which ensures that the matched IP address is not part of some other IP address: the bound pattern must not be followed immediately by another digit.
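The boundary check can be illustrated with a small model. The following Python sketch is not the appliance's implementation; the function names `patset_contains` and `dataset_contains_ipv4` are hypothetical, and the sketch models only the trailing-digit rule described above.

```python
import re


def patset_contains(text: str, patterns: list[str]) -> bool:
    """Model of a pattern set: a plain substring search.

    Any bound pattern that appears anywhere in the input counts as a match.
    """
    return any(p in text for p in patterns)


def dataset_contains_ipv4(text: str, patterns: list[str]) -> bool:
    """Model of an IPv4 data set: substring search plus a boundary check.

    Assumption: the boundary check only requires that the matched pattern
    is not immediately followed by another digit.
    """
    for p in patterns:
        # (?!\d) rejects a match that runs on into further digits.
        if re.search(re.escape(p) + r"(?!\d)", text):
            return True
    return False


bound = ["1.1.1.1"]
print(patset_contains("1.1.1.11", bound))        # True
print(dataset_contains_ipv4("1.1.1.11", bound))  # False
print(dataset_contains_ipv4("1.1.1.1", bound))   # True
```

In this model, the input 1.1.1.11 matches the pattern set but not the data set, while the input 1.1.1.1 matches both.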

Often, you can use either pattern sets or data sets. However, in cases where you want specific matches for numerical data or IPv4 and IPv6 addresses, you must use data sets.

Notes:

  • Pattern sets and data sets can be used only in default syntax policies.
  • From release 13.1 build 42.x onward, you can bind up to 50000 patterns to a pattern set. When a pattern set file is used, only 10000 patterns can be bound to a pattern set. If the pattern set is used in streaming, only 5000 patterns can be bound to it. A streaming pattern set is one that is used in a rewrite action search parameter or in an HTTP body or TCP payload based expression.
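
The binding limits in the last note can be summarized as a small lookup. This Python sketch is illustrative only: the helper `max_bindable_patterns` is hypothetical, and the source does not say what happens when a pattern set is both loaded from a file and used in streaming, so taking the strictest limit is an assumption.

```python
# Limits quoted in the note above (release 13.1 build 42.x and later).
MAX_PATTERNS = 50000
MAX_PATTERNS_FROM_FILE = 10000
MAX_PATTERNS_STREAMING = 5000


def max_bindable_patterns(from_file: bool = False, streaming: bool = False) -> int:
    """Return the number of patterns that can be bound to one pattern set."""
    limits = [MAX_PATTERNS]
    if from_file:
        limits.append(MAX_PATTERNS_FROM_FILE)
    if streaming:
        limits.append(MAX_PATTERNS_STREAMING)
    # Assumption: when more than one condition applies, the strictest wins.
    return min(limits)
```

A check such as `len(patterns) <= max_bindable_patterns(streaming=True)` could then be run before a large set of patterns is bound.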