Operations for HTTP, HTML, and XML encoding and “safe” characters
The following operations work with the encoding of HTML data in a request or response and XML data in a POST body.
<text>.HTML_XML_SAFE: Transforms special characters into an XML-safe format, as in the following examples:
- A left-pointing angle bracket (<) is converted to &lt;
- A right-pointing angle bracket (>) is converted to &gt;
- An ampersand (&) is converted to &amp;
This operation safeguards against cross-site scripting attacks. The maximum length of the transformed text is 2048 bytes. This is a read-only operation.
After applying the transformation, additional operators that you specify in the expression are applied to the selected text. Following is an example:
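As a rough Python model of this ordering (a sketch, not Citrix ADC policy syntax), the transformation runs first and any later operator sees the escaped text. The helper names `html_xml_safe` and `after_str` are hypothetical, only the three characters listed above are escaped, and the 2048-byte limit is not modeled.

```python
# Illustrative model only; this is not Citrix ADC policy syntax.
XML_SAFE_MAP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def html_xml_safe(text: str) -> str:
    """Escape <, > and & one character at a time (hypothetical helper)."""
    return "".join(XML_SAFE_MAP.get(ch, ch) for ch in text)


def after_str(text: str, marker: str) -> str:
    """Return the text after the first occurrence of marker (hypothetical helper)."""
    _, found, rest = text.partition(marker)
    return rest if found else ""


query = "name=<b>&id=7"
safe = html_xml_safe(query)         # transformation is applied first
print(safe)                         # name=&lt;b&gt;&amp;id=7
print(after_str(safe, "name="))     # the later operator works on the escaped text
```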
<text>.HTTP_HEADER_SAFE: Converts all new line (‘\n’) characters in the input text to ‘%0A’ to enable the input to be used safely in HTTP headers.
This operation safeguards against response-splitting attacks.
The maximum length of the transformed text is 2048 bytes. This is a read-only operation.
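The effect on a header value can be sketched in Python as follows. This is a minimal model, assuming that only the new line character is rewritten, as the description states; `http_header_safe` is a hypothetical helper name and the 2048-byte limit is not modeled.

```python
def http_header_safe(text: str) -> str:
    """Replace each new line character with the literal text %0A (hypothetical helper)."""
    return text.replace("\n", "%0A")


# An injected line break can no longer start a new header line.
value = "en\nSet-Cookie: session=attacker"
print(http_header_safe(value))
```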
<text>.HTTP_URL_SAFE: Converts unsafe URL characters to ‘%xx’ values, where “xx” is a hex-based representation of the input character. For example, the ampersand (&) is represented as %26 in URL-safe encoding. The maximum length of the transformed text is 2048 bytes. This is a read-only operation.
The following characters are URL safe. All others are unsafe:
- Alpha-numeric characters: a-z, A-Z, 0-9
- Asterisk: “*”
- Ampersand: “&”
- At-sign: “@”
- Colon: “:”
- Comma: “,”
- Dollar: “$”
- Dot: “.”
- Equals: “=”
- Exclamation mark: “!”
- Hyphen: “-”
- Open and close parentheses: “(”, “)”
- Percent: “%”
- Plus: “+”
- Semicolon: “;”
- Single quote: “’”
- Slash: “/”
- Question mark: “?”
- Tilde: “~”
- Underscore: “_”
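A minimal Python sketch of this rule, built directly from the list above, might look like the following. The helper name `http_url_safe` is hypothetical. The uppercase hex digits and the byte-by-byte UTF-8 handling of non-ASCII input are assumptions, and the single quote is taken to be the ASCII apostrophe. Because the list treats the ampersand and the percent sign as safe, the sketch leaves both unchanged.

```python
import string

# Safe set transcribed from the list above.
URL_SAFE = set(string.ascii_letters + string.digits + "*&@:,$.=!-()%+;'/?~_")


def http_url_safe(text: str) -> str:
    """Percent-encode every character that is not in URL_SAFE (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in URL_SAFE:
            out.append(ch)
        else:
            # Assumption: non-ASCII characters are encoded byte by byte as UTF-8.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(http_url_safe("/search?q=<a b>"))  # /search?q=%3Ca%20b%3E
```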
<text>.MARK_SAFE: Marks the text as safe without applying any type of data transformation.
<text>.SET_TEXT_MODE(URLENCODED|NOURLENCODED): Transforms all %HH encoding in the byte stream. This operation works with characters (not bytes). By default, a single byte represents a character in ASCII encoding. However, if you specify URLENCODED mode, three bytes can represent a character.
In the following example, a PREFIX(3) operation selects the first 3 characters in a target.
In the following example, the Citrix ADC can select up to 9 bytes from the target:
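To illustrate the character-versus-byte distinction outside of policy syntax, here is a Python sketch. The function `prefix` and its `urlencoded` flag are hypothetical stand-ins for PREFIX and SET_TEXT_MODE(URLENCODED). Returning the raw, still-encoded bytes of the selection is an assumption made for the illustration.

```python
import re

# One "character" is either a well-formed %HH triplet or any single byte.
_TOKEN = re.compile(r"%[0-9A-Fa-f]{2}|.", re.DOTALL)


def prefix(text: str, n: int, urlencoded: bool = False) -> str:
    """Select the first n characters of text (hypothetical model of PREFIX)."""
    if not urlencoded:
        return text[:n]  # one byte per character
    tokens = _TOKEN.finditer(text)
    return "".join(m.group(0) for _, m in zip(range(n), tokens))


target = "%41%42%43def"
print(prefix(target, 3))                   # 3 bytes selected
print(prefix(target, 3, urlencoded=True))  # 3 characters, 9 bytes selected
```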
<text>.SET_TEXT_MODE(PLUS_AS_SPACE|NO_PLUS_AS_SPACE): Specifies how to treat the plus character (+). The PLUS_AS_SPACE option replaces a plus character with white space. For example, the text “hello+world” becomes “hello world.” The NO_PLUS_AS_SPACE option leaves plus characters as they are.
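The two options can be modeled in a few lines of Python. This is only a sketch: `apply_plus_mode` is a hypothetical helper, and it shows the effect on the text rather than how the Citrix ADC applies the mode internally.

```python
def apply_plus_mode(text: str, plus_as_space: bool) -> str:
    """Model PLUS_AS_SPACE (True) and NO_PLUS_AS_SPACE (False); hypothetical helper."""
    return text.replace("+", " ") if plus_as_space else text


print(apply_plus_mode("hello+world", plus_as_space=True))   # hello world
print(apply_plus_mode("hello+world", plus_as_space=False))  # hello+world
```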
<text>.SET_TEXT_MODE(BACKSLASH_ENCODED|NO_BACKSLASH_ENCODED): Specifies whether or not backslash decoding is performed on the text object represented by <text>.
If BACKSLASH_ENCODED is specified, the SET_TEXT_MODE operator performs the following operations on the text object:
- All occurrences of “\XXX” are replaced with the character “Y”, where XXX represents a number in the octal system and Y represents the ASCII equivalent of XXX. The valid range of octal values for this type of encoding is 0 to 377. For example, the encoded text “http\72//” and “http\072//” are both decoded to http://, where the colon (:) is the ASCII equivalent of the octal value “72”.
- All occurrences of “\xHH” are replaced with the character “Y”, where HH represents a number in the hexadecimal system and Y denotes the ASCII equivalent of HH. For example, the encoded text “http\x3a//” is decoded to http://, where the colon (:) is the ASCII equivalent of the hexadecimal value “3a”.
- All occurrences of “\uWWXX” are replaced with the character sequence “YZ”, where WW and XX represent two distinct hexadecimal values and Y and Z represent the ASCII equivalents of WW and XX respectively. For example, the encoded text “http\u3a2f/” and “http\u003a//” are both decoded to http://, where “3a” and “2f” are two hexadecimal values and the colon (:) and forward slash (/) represent their ASCII equivalents respectively.
- All occurrences of “\b”, “\n”, “\t”, “\f”, and “\r” are replaced with the corresponding ASCII characters.
If NO_BACKSLASH_ENCODED is specified, backslash decoding is not performed on the text object.
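The decoding rules above can be approximated with a short Python sketch. The function name `backslash_decode` is hypothetical. Two behaviors are assumptions made so that the sketch reproduces the documented examples: a zero high byte in a \u sequence is dropped, and an octal sequence takes three digits only when the value stays within 377.

```python
import re

SIMPLE = {"b": "\b", "n": "\n", "t": "\t", "f": "\f", "r": "\r"}

PATTERN = re.compile(
    r"\\(?:"
    r"u(?P<u>[0-9A-Fa-f]{4})"             # \uWWXX: two hexadecimal byte values
    r"|x(?P<x>[0-9A-Fa-f]{2})"            # \xHH: one hexadecimal byte value
    r"|(?P<o>[0-3][0-7]{2}|[0-7]{1,2})"   # \XXX: octal value from 0 to 377
    r"|(?P<s>[bntfr])"                    # single-letter escapes
    r")"
)


def _replace(m: re.Match) -> str:
    if m.group("u") is not None:
        hi = int(m.group("u")[:2], 16)
        lo = int(m.group("u")[2:], 16)
        # Assumption: a zero high byte is dropped, so \u003a yields ":".
        return (chr(hi) if hi else "") + chr(lo)
    if m.group("x") is not None:
        return chr(int(m.group("x"), 16))
    if m.group("o") is not None:
        return chr(int(m.group("o"), 8))
    return SIMPLE[m.group("s")]


def backslash_decode(text: str) -> str:
    """Apply the BACKSLASH_ENCODED rules to text (hypothetical helper)."""
    return PATTERN.sub(_replace, text)


for encoded in (r"http\72//", r"http\072//", r"http\x3a//", r"http\u3a2f/"):
    print(backslash_decode(encoded))  # each line prints http://
```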
<text>.SET_TEXT_MODE(BAD_ENCODE_RAISE_UNDEF|NO_BAD_ENCODE_RAISE_UNDEF): Performs the associated undefined action if either the URLENCODED or the BACKSLASH_ENCODED mode is set and bad encoding corresponding to the specified encoding mode is encountered in the text object represented by <text>.
If NO_BAD_ENCODE_RAISE_UNDEF is specified, the associated undefined action is not performed when bad encoding is encountered in the text object represented by <text>.
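The difference between the two options can be sketched in Python for the URLENCODED case. Everything named here is hypothetical: `UndefError` stands in for the undefined condition and `url_decode` for the decoding step. Keeping a malformed sequence unchanged when the flag is off is an assumption, since the text above says only that the undefined action is not performed.

```python
import re


class UndefError(Exception):
    """Stands in for the policy undefined condition (hypothetical name)."""


def url_decode(text: str, raise_undef: bool) -> str:
    """Decode %HH sequences; optionally raise on a malformed one (hypothetical helper)."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%":
            hexpart = text[i + 1:i + 3]
            if re.fullmatch(r"[0-9A-Fa-f]{2}", hexpart):
                out.append(chr(int(hexpart, 16)))
                i += 3
                continue
            if raise_undef:
                raise UndefError(f"bad %HH encoding at offset {i}")
            # Assumption: without the flag, the malformed sequence is kept as-is.
        out.append(text[i])
        i += 1
    return "".join(out)


print(url_decode("a%41", raise_undef=True))   # aA
print(url_decode("a%4", raise_undef=False))   # a%4
```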