Operations for HTTP, HTML, and XML Encoding and “Safe” Characters

Nov 27, 2014

The following operations work with the encoding of HTML data in a request or response and XML data in a POST body.

Table 1. Operations That Evaluate HTML and XML Encoding

HTML or XML Operation

Description

<text>.HTML_XML_SAFE

Transforms special characters into an XML-safe format, as in the following examples:

  • A left-pointing angle bracket (<) is converted to &lt;
  • A right-pointing angle bracket (>) is converted to &gt;
  • An ampersand (&) is converted to &amp;

This operation safeguards against cross-site scripting attacks. Maximum length of the transformed text is 2048 bytes. This is a read-only operation.

After applying the transformation, additional operators that you specify in the expression are applied to the selected text. Following is an example:

http.req.url.query.html_xml_safe.contains("myQueryString")
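The character substitutions listed above can be approximated outside the appliance with a short Python sketch. The function name html_xml_safe is hypothetical, the sketch covers only the three characters named in the examples, and it does not model the 2048-byte limit.

```python
def html_xml_safe(text: str) -> str:
    """Hypothetical stand-in for <text>.HTML_XML_SAFE (three listed characters only)."""
    # Replace '&' first so the entities produced below are not escaped a second time.
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


if __name__ == "__main__":
    print(html_xml_safe("<script>a&b</script>"))
```

Replacing the ampersand first matters: if it were done last, the "&" in "&lt;" and "&gt;" would itself be rewritten.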

<text>.HTTP_HEADER_SAFE

Converts all newline ('\n') characters in the input text to '%0A' so that the input can be used safely in HTTP headers.

This operation safeguards against response-splitting attacks.

The maximum length of the transformed text is 2048 bytes. This is a read-only operation.
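As a rough illustration of why this conversion blocks response splitting, the following Python sketch replaces newline characters the way the description states. The function name http_header_safe is hypothetical, and the sketch assumes that only '\n' is rewritten, as documented above.

```python
def http_header_safe(text: str) -> str:
    """Hypothetical stand-in for <text>.HTTP_HEADER_SAFE."""
    # Only '\n' is rewritten, as the description states; other characters pass through.
    return text.replace("\n", "%0A")


if __name__ == "__main__":
    # An attacker-supplied value that tries to inject a second header line.
    print(http_header_safe("en\nSet-Cookie: session=evil"))
```

With the newline encoded, the injected text stays inside the original header value instead of starting a new header line.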

<text>.HTTP_URL_SAFE

Converts unsafe URL characters to '%xx' values, where “xx” is a hex-based representation of the input character. For example, the ampersand (&) is represented as %26 in URL-safe encoding. The maximum length of the transformed text is 2048 bytes. This is a read-only operation.

The following characters are URL-safe. All others are unsafe:

  • Alpha-numeric characters: a-z, A-Z, 0-9
  • Asterisk: "*"
  • Ampersand: "&"
  • At-sign: "@"
  • Colon: ":"
  • Comma: ","
  • Dollar: "$"
  • Dot: "."
  • Equals: "="
  • Exclamation mark: "!"
  • Hyphen: "-"
  • Open and close parentheses: "(", ")"
  • Percent: "%"
  • Plus: "+"
  • Semicolon: ";"
  • Single quote: "'"
  • Slash: "/"
  • Question mark: "?"
  • Tilde: "~"
  • Underscore: "_"
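The safe-character list can be turned into a small Python sketch of the conversion. The function name http_url_safe is hypothetical. The sketch assumes uppercase hex digits and UTF-8 byte encoding for non-ASCII input, neither of which is stated in the description. It follows the list literally, so an ampersand passes through unchanged.

```python
import string

# Safe set copied from the list above.
SAFE_CHARS = set(string.ascii_letters + string.digits + "*&@:,$.=!-()%+;'/?~_")


def http_url_safe(text: str) -> str:
    """Hypothetical stand-in for <text>.HTTP_URL_SAFE."""
    out = []
    for ch in text:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            # Assumption: each byte of the character becomes %XX with uppercase hex.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(http_url_safe("a b<c>"))
```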

<text>.MARK_SAFE

Marks the text as safe without applying any type of data transformation.

<text>.SET_TEXT_MODE(URLENCODED|NOURLENCODED)

Transforms all %HH encoding in the byte stream. This operation works with characters (not bytes). By default, a single byte represents a character in ASCII encoding. However, if you specify URLENCODED mode, a single character can be represented by three bytes (a %HH sequence).

In the following example, a PREFIX(3) operation selects the first 3 characters in a target.

http.req.url.hostname.prefix(3)

In the following example, the NetScaler can select up to 9 bytes from the target:

http.req.url.hostname.set_text_mode(urlencoded).prefix(3)
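The difference between the two PREFIX(3) examples can be sketched in Python. The helper names are hypothetical, and the sketch assumes that the selection covers the raw, still-encoded bytes of the first n characters.

```python
import re

# In URLENCODED mode, one character is either a %HH triplet or a single byte.
_CHAR = re.compile(rb"%[0-9A-Fa-f]{2}|.", re.DOTALL)


def prefix_plain(raw: bytes, n: int) -> bytes:
    """Default mode: one byte per character."""
    return raw[:n]


def prefix_urlencoded(raw: bytes, n: int) -> bytes:
    """URLENCODED mode: return the raw bytes spanning the first n characters."""
    return b"".join(_CHAR.findall(raw)[:n])


if __name__ == "__main__":
    host = b"%77%77%77.example.com"
    print(prefix_plain(host, 3))       # 3 bytes selected
    print(prefix_urlencoded(host, 3))  # 9 bytes selected
```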

<text>.SET_TEXT_MODE(PLUS_AS_SPACE|NO_PLUS_AS_SPACE)

Specifies how to treat the plus character (+). The PLUS_AS_SPACE option replaces a plus character with white space. For example, the text “hello+world” becomes “hello world.” The NO_PLUS_AS_SPACE option leaves plus characters as they are.
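A minimal Python sketch of the two options follows. The function name apply_plus_mode and its boolean parameter are hypothetical.

```python
def apply_plus_mode(text: str, plus_as_space: bool) -> str:
    """Hypothetical stand-in for SET_TEXT_MODE(PLUS_AS_SPACE|NO_PLUS_AS_SPACE)."""
    # PLUS_AS_SPACE replaces every '+' with a space; NO_PLUS_AS_SPACE leaves the text as is.
    return text.replace("+", " ") if plus_as_space else text


if __name__ == "__main__":
    print(apply_plus_mode("hello+world", plus_as_space=True))
    print(apply_plus_mode("hello+world", plus_as_space=False))
```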

<text>.SET_TEXT_MODE(BACKSLASH_ENCODED|NO_BACKSLASH_ENCODED)

Specifies whether or not backslash decoding is performed on the text object represented by <text>.

If BACKSLASH_ENCODED is specified, the SET_TEXT_MODE operator performs the following operations on the text object:

  • All occurrences of “\XXX” are replaced with the character “Y” (where XXX represents a number in the octal system and Y represents the ASCII equivalent of XXX). The valid range of octal values for this type of encoding is 0 to 377. For example, the encoded text "http\72//" and "http\072//" are both decoded to "http://", where the colon (:) is the ASCII equivalent of the octal value “72”.
  • All occurrences of “\xHH” are replaced with the character “Y” (where HH represents a number in the hexadecimal system and Y represents the ASCII equivalent of HH). For example, the encoded text "http\x3a//" is decoded to "http://", where the colon (:) is the ASCII equivalent of the hexadecimal value “3a”.
  • All occurrences of “\uWWXX” are replaced with the character sequence “YZ” (where WW and XX represent two distinct hexadecimal values and Y and Z represent the ASCII equivalents of WW and XX, respectively). For example, the encoded text "http\u3a2f/" and "http\u003a//" are both decoded to "http://", where “3a” and “2f” are two hexadecimal values and the colon (:) and forward slash (/) are their respective ASCII equivalents.
  • All occurrences of "\b", "\n", "\t", "\f", and "\r" are replaced with the corresponding ASCII characters.

If NO_BACKSLASH_ENCODED is specified, backslash decoding is not performed on the text object.
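The four decoding rules can be approximated with a regular-expression-based Python sketch. The function name backslash_decode is hypothetical. One behavior here is an assumption rather than something the description states: a 00 byte inside a four-digit Unicode escape is dropped, which is one way to make the "003a" example decode to a single colon.

```python
import re

_SIMPLE = {"b": "\b", "n": "\n", "t": "\t", "f": "\f", "r": "\r"}

_ESCAPE = re.compile(
    r"\\(?:"
    r"x(?P<hex>[0-9A-Fa-f]{2})"                        # \xHH
    r"|u(?P<u1>[0-9A-Fa-f]{2})(?P<u2>[0-9A-Fa-f]{2})"  # \uWWXX
    r"|(?P<oct>[0-3][0-7]{2}|[0-7]{1,2})"              # \XXX, octal 0 to 377
    r"|(?P<simple>[bntfr])"                            # \b \n \t \f \r
    r")"
)


def backslash_decode(text: str) -> str:
    """Hypothetical stand-in for SET_TEXT_MODE(BACKSLASH_ENCODED)."""
    def repl(m: re.Match) -> str:
        if m.group("hex") is not None:
            return chr(int(m.group("hex"), 16))
        if m.group("u1") is not None:
            values = (int(m.group("u1"), 16), int(m.group("u2"), 16))
            # Assumption: a 00 byte contributes no character.
            return "".join(chr(v) for v in values if v != 0)
        if m.group("oct") is not None:
            return chr(int(m.group("oct"), 8))
        return _SIMPLE[m.group("simple")]

    return _ESCAPE.sub(repl, text)


if __name__ == "__main__":
    for sample in (r"http\72//", r"http\072//", r"http\x3a//", r"http\u3a2f/"):
        print(sample, "->", backslash_decode(sample))
```

Sequences that match none of the patterns, such as a backslash followed by an unlisted letter, are left untouched in this sketch.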

<text>.SET_TEXT_MODE(BAD_ENCODE_RAISE_UNDEF|NO_BAD_ENCODE_RAISE_UNDEF)

Performs the associated undefined action if either the URLENCODED or the BACKSLASH_ENCODED mode is set and bad encoding corresponding to the specified encoding mode is encountered in the text object represented by <text>.

If NO_BAD_ENCODE_RAISE_UNDEF is specified, the associated undefined action is not performed when bad encoding is encountered in the text object represented by <text>.
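The effect of the two options can be sketched in Python for the URLENCODED case. All names here are hypothetical: the UndefCondition exception stands in for the policy's undefined action, and leaving a malformed sequence untouched in the NO_BAD_ENCODE_RAISE_UNDEF case is an assumption, because the description says only that the undefined action is not performed.

```python
import re


class UndefCondition(Exception):
    """Hypothetical stand-in for the policy's undefined action."""


_VALID = re.compile(r"%[0-9A-Fa-f]{2}")


def url_decode(text: str, bad_encode_raise_undef: bool) -> str:
    """Decode %HH sequences, optionally signalling bad encoding."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%":
            if _VALID.match(text, i):
                out.append(chr(int(text[i + 1:i + 3], 16)))
                i += 3
                continue
            if bad_encode_raise_undef:
                raise UndefCondition(f"bad %HH encoding at offset {i}")
        # Ordinary character, or a malformed '%' that is passed through (assumption).
        out.append(text[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(url_decode("a%20b", bad_encode_raise_undef=True))
    print(url_decode("a%zzb", bad_encode_raise_undef=False))
```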