
Operations for HTTP, HTML, and XML encoding and “safe” characters

The following operations work with the encoding of HTML data in a request or response and XML data in a POST body.

  • <text>.HTML_XML_SAFE: Transforms special characters into XML safe format, as in the following examples:

    • A left-pointing angle bracket (<) is converted to &lt;
    • A right-pointing angle bracket (>) is converted to &gt;
    • An ampersand (&) is converted to &amp;

    This operation safeguards against cross-site scripting attacks. The maximum length of the transformed text is 2048 bytes. This is a read-only operation.

    After applying the transformation, additional operators that you specify in the expression are applied to the selected text. Following is an example:

    http.req.url.query.html_xml_safe.contains("myQueryString")
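
The character mapping described for HTML_XML_SAFE can be sketched in Python as follows. This is only an illustration of the three substitutions, not the appliance's implementation; the function name `html_xml_safe` is hypothetical, and the 2048-byte limit is not modeled because the text does not say how longer output is handled.

```python
def html_xml_safe(text: str) -> str:
    """Escape <, >, and & as listed in the HTML_XML_SAFE description."""
    # Replace the ampersand first so that the entities added afterwards
    # are not escaped a second time.
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


print(html_xml_safe("<script>alert(1)</script>&x=1"))
```

Escaping the ampersand before the angle brackets matters: in the reverse order, the `&` inside each newly inserted entity would itself be escaped.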

  • <text>.HTTP_HEADER_SAFE: Converts all new line (‘\n’) characters in the input text to ‘%0A’ to enable the input to be used safely in HTTP headers.

    This operation safeguards against response-splitting attacks.

    The maximum length of the transformed text is 2048 bytes. This is a read-only operation.
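
To make the response-splitting protection concrete, the following Python sketch applies the substitution described for HTTP_HEADER_SAFE. The function name `http_header_safe` and the sample header value are hypothetical; only the newline replacement stated above is modeled.

```python
def http_header_safe(text: str) -> str:
    """Replace each newline character with %0A, per the description above."""
    return text.replace("\n", "%0A")


# A value that tries to smuggle a second header line into a response.
value = "en-US\nSet-Cookie: injected=1"
print(http_header_safe(value))
```

After the transformation the value occupies a single line, so it can no longer terminate the header in which it is placed.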

  • <text>.HTTP_URL_SAFE: Converts unsafe URL characters to ‘%xx’ values, where “xx” is a hex-based representation of the input character. For example, the ampersand (&) is represented as %26 in URL-safe encoding. The maximum length of the transformed text is 2048 bytes. This is a read-only operation.

    The following characters are URL-safe. All others are unsafe:

    • Alpha-numeric characters: a-z, A-Z, 0-9
    • Asterisk: “*”
    • Ampersand: “&”
    • At-sign: “@”
    • Colon: “:”
    • Comma: “,”
    • Dollar: “$”
    • Dot: “.”
    • Equals: “=”
    • Exclamation mark: “!”
    • Hyphen: “-”
    • Open and close parentheses: “(”, “)”
    • Percent: “%”
    • Plus: “+”
    • Semicolon: “;”
    • Single quote: “’”
    • Slash: “/”
    • Question mark: “?”
    • Tilde: “~”
    • Underscore: “_”
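
As a rough sketch of how the safe-character list drives HTTP_URL_SAFE, the Python below percent-encodes every byte that is not in the list. The function name `http_url_safe`, the UTF-8 encoding of the input, and the uppercase hex digits are assumptions made for illustration; they are not stated in this documentation.

```python
import string

# Safe characters exactly as listed above; every other byte is encoded.
SAFE_CHARS = set(string.ascii_letters + string.digits + "*&@:,$.=!-()%+;'/?~_")


def http_url_safe(text: str) -> str:
    """Percent-encode each byte whose character is not in SAFE_CHARS."""
    out = []
    for byte in text.encode("utf-8"):  # assumption: input is treated as UTF-8
        ch = chr(byte)
        out.append(ch if ch in SAFE_CHARS else f"%{byte:02X}")
    return "".join(out)


print(http_url_safe("a b<c>"))
print(http_url_safe("/x?y=1&z=2"))
```

Because the sketch follows the list literally, characters such as the ampersand and the percent sign pass through unchanged, while the space and the angle brackets are encoded.
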
  • <text>.MARK_SAFE: Marks the text as safe without applying any type of data transformation.

  • <text>.SET_TEXT_MODE(URLENCODED|NOURLENCODED): Transforms all %HH encoding in the byte stream. This operation works with characters (not bytes). By default, a single byte represents a character in ASCII encoding. However, if you specify URLENCODED mode, three bytes can represent a character.

    In the following example, a PREFIX(3) operation selects the first 3 characters in a target.

    http.req.url.hostname.prefix(3)

    In the following example, the Citrix ADC can select up to 9 bytes from the target, because each of the 3 characters can be a three-byte %HH sequence:

    http.req.url.hostname.set_text_mode(urlencoded).prefix(3)
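
The difference between counting bytes and counting characters can be sketched in Python. The helper `prefix_urlencoded` is hypothetical, and the sketch assumes the selection covers the raw text span of the first n characters; it illustrates only the counting rule, not how the appliance stores or returns the result.

```python
import re

# In URLENCODED mode one "character" is either a %HH triple or a single character.
_TOKEN = re.compile(r"%[0-9A-Fa-f]{2}|.", re.DOTALL)


def prefix_urlencoded(raw: str, n: int) -> str:
    """Return the raw span covering the first n characters, counting %HH as one."""
    tokens = _TOKEN.findall(raw)
    return "".join(tokens[:n])


print(prefix_urlencoded("www.example.com", 3))        # 3 bytes selected
print(prefix_urlencoded("%77%77%77.example.com", 3))  # 9 bytes selected
```

With a fully percent-encoded hostname prefix, three characters span nine bytes, which is the upper bound mentioned above.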

  • <text>.SET_TEXT_MODE(PLUS_AS_SPACE|NO_PLUS_AS_SPACE): Specifies how to treat the plus character (+). The PLUS_AS_SPACE option replaces a plus character with white space. For example, the text “hello+world” becomes “hello world.” The NO_PLUS_AS_SPACE option leaves plus characters as they are.

  • <text>.SET_TEXT_MODE(BACKSLASH_ENCODED|NO_BACKSLASH_ENCODED): Specifies whether or not backslash decoding is performed on the text object represented by <text>.

    If BACKSLASH_ENCODED is specified, the SET_TEXT_MODE operator performs the following operations on the text object:

    • All occurrences of “\XXX” are replaced with the character “Y” (where XXX represents a number in the octal system and Y represents the ASCII equivalent of XXX). The valid range of octal values for this type of encoding is 0 to 377. For example, the encoded text “http\72//” and “http\072//” are both decoded to http://, where the colon (:) is the ASCII equivalent of the octal value “72”.
    • All occurrences of “\xHH” are replaced with the character “Y” (where HH represents a number in the hexadecimal system and Y represents the ASCII equivalent of HH). For example, the encoded text “http\x3a//” is decoded to http://, where the colon (:) is the ASCII equivalent of the hexadecimal value “3a”.
    • All occurrences of “\uWWXX” are replaced with the character sequence “YZ” (where WW and XX represent two distinct hexadecimal values and Y and Z represent the ASCII equivalents of WW and XX respectively). For example, the encoded text “http\u3a2f/” and “http\u003a//” are both decoded to http://, where “3a” and “2f” are two hexadecimal values and the colon (:) and forward slash (/) are their respective ASCII equivalents.
    • All occurrences of “\b”, “\n”, “\t”, “\f”, and “\r” are replaced with the corresponding ASCII characters.

    If NO_BACKSLASH_ENCODED is specified, backslash decoding is not performed on the text object.
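
The backslash decoding rules listed above can be sketched in Python with a single regular expression. This is a minimal illustration under stated assumptions, not the appliance's decoder: the function name `backslash_decode` is hypothetical, one to three octal digits are accepted, and a zero high byte in a “\uWWXX” sequence is dropped so that “\u003a” yields only a colon.

```python
import re

# Single-letter escapes and the ASCII characters they stand for.
_SIMPLE = {"b": "\b", "n": "\n", "t": "\t", "f": "\f", "r": "\r"}

_ESCAPE = re.compile(
    r"\\(?:"
    r"(?P<oct>[0-3][0-7]{2}|[0-7]{1,2})"  # \XXX, octal 0 to 377
    r"|x(?P<hex>[0-9A-Fa-f]{2})"          # \xHH
    r"|u(?P<uni>[0-9A-Fa-f]{4})"          # \uWWXX
    r"|(?P<simple>[bntfr])"               # \b \n \t \f \r
    r")"
)


def _replace(match: re.Match) -> str:
    if match.group("oct") is not None:
        return chr(int(match.group("oct"), 8))
    if match.group("hex") is not None:
        return chr(int(match.group("hex"), 16))
    if match.group("uni") is not None:
        digits = match.group("uni")
        high, low = int(digits[:2], 16), int(digits[2:], 16)
        # Assumption: a zero high byte contributes no character.
        return (chr(high) if high else "") + chr(low)
    return _SIMPLE[match.group("simple")]


def backslash_decode(text: str) -> str:
    """Decode the backslash escapes described above; leave other text as is."""
    return _ESCAPE.sub(_replace, text)


for sample in (r"http\72//", r"http\072//", r"http\x3a//", r"http\u3a2f/"):
    print(backslash_decode(sample))
```

Each of the four sample inputs prints http://. A backslash that is not followed by one of the recognized forms is left unchanged in this sketch.
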
  • <text>.SET_TEXT_MODE(BAD_ENCODE_RAISE_UNDEF|NO_BAD_ENCODE_RAISE_UNDEF): Performs the associated undefined action if either the URLENCODED or the BACKSLASH_ENCODED mode is set and bad encoding corresponding to the specified encoding mode is encountered in the text object represented by <text>.

    If NO_BAD_ENCODE_RAISE_UNDEF is specified, the associated undefined action is not performed when bad encoding is encountered in the text object represented by <text>.
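
The effect of the two modes can be illustrated with a small Python sketch of a %HH decoder. Everything here is hypothetical: the exception `UndefEncodingError` stands in for the policy's undefined condition, and passing a malformed sequence through unchanged when `raise_undef` is false is an assumption, since the text does not say what the text object contains in that case.

```python
import re


class UndefEncodingError(ValueError):
    """Hypothetical stand-in for the policy's undefined (UNDEF) condition."""


_VALID = re.compile(r"%[0-9A-Fa-f]{2}")


def url_decode(text: str, raise_undef: bool = True) -> str:
    """Decode %HH sequences; optionally raise when a sequence is malformed."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%":
            if _VALID.match(text, i):
                out.append(chr(int(text[i + 1:i + 3], 16)))
                i += 3
                continue
            if raise_undef:
                raise UndefEncodingError(f"bad %HH sequence at offset {i}")
        # Ordinary character, or a malformed sequence passed through as is.
        out.append(text[i])
        i += 1
    return "".join(out)


print(url_decode("a%3Ab"))
print(url_decode("a%zzb", raise_undef=False))
```

Calling `url_decode("a%zzb")` with the default setting raises `UndefEncodingError`, which mirrors the case where the undefined action is triggered.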
