Product Documentation

Compression

May 23, 2013

CloudBridge compression uses breakthrough technology to provide transparent multilevel compression. It is true compression that acts on arbitrary byte streams. It is not application-aware, is indifferent to connection boundaries, and can compress a string optimally the second time it appears in the data. CloudBridge compression works at any link speed.

The compression engine is very fast, allowing the speedup factor for compression to approach the compression ratio. For example, a bulk transfer monopolizing a 1.5 Mbps T1 link and achieving a 100:1 compression ratio can deliver a speedup ratio of almost 100x, or 150 Mbps, provided that the WAN bandwidth is the only bottleneck in the transfer.

Unlike with most compression methods, CloudBridge compression history is shared between all connections that pass between the same two appliances. Data sent hours, days, or even weeks earlier by connection A can be referred to later by connection B, and receive the full speedup benefit of compression. The resulting performance is much higher than can be achieved by conventional methods.

Compression can use the appliance's disk as well as memory, providing up to terabytes of compression history.

How Compression Works

All compression algorithms scan the data to be compressed, searching for strings of data that match strings that have been sent before. If no such matches are found, the literal data is sent. If a match is found, the matching data is replaced with a pointer to the previous occurrence. In a very large matching string, megabytes or even gigabytes of data can be represented by a pointer containing only a few bytes, and only those few bytes need be sent over the link.

Disk Based Compression

The disk-based compression engine uses anywhere between tens of gigabytes and terabytes of memory to store compression history, allowing more and better compression matches. The disk-based compression engine is very fast but sometimes has a higher latency than the memory-based engines, and is often chosen automatically for bulk transfers.

Compression engines are limited by the size of their compression history. Traditional compression algorithms, such as LZS and ZLIB, use compression histories of 64 KB or less. CloudBridge appliances maintain at least 100 GB of compression history. With more than a million times the compression history of traditional algorithms, the CloudBridge algorithm finds more matches and longer matches, resulting in superior compression ratios.

The CloudBridge compression algorithm is very fast, so that even the entry-level appliances can saturate a 100 Mbps LAN with the output of the compressor. The highest-performance models can deliver well over 1 Gbps of throughput.

Only payload data is compressed. However, headers are compressed indirectly. For example, if a connection achieves 4:1 compression, only one full-sized output packet is sent for every four full-sized input packets. Thus, the amount of header data is also reduced by 4:1.

Compression as a General-Purpose Optimization

CloudBridge compression is application-independent: it can compress data from any non-encrypted TCP connection.

Unlike caching, compression performance is robust in the face of changing data. With caching, changing a single byte of a file invalidates the entire copy in the cache. With compression, changing a single byte in the middle of a file just creates two large matches separated by a single byte of nonmatching data, and the resulting transfer time is only slightly greater than before. Therefore, the compression ratio degrades gracefully with the amount of change. If you download a file, change 1% of it, and upload it again, expect a 99:1 compression ratio on the upload.

Another advantage of a large compression history is that precompressed data compresses easily with CloudBridge technology. A JPEG image or a YouTube video, for example, is precompressed, leaving little possibility for additional compression the first time it is sent over the link. But whenever it is sent again, the entire transfer is reduced to just a handful of bytes, even if it is sent by different users or with different protocols, such as by FTP the first time and HTTP the next.

In practice, compression performance depends on how much of the data traversing the link is the same as data that has previously traversed the link. The amount varies from application to application, from day to day, and even from moment to moment. When looking at a list of active accelerated connections, expect to see ratios anywhere from 1:1 to 10,000:1.

Compressing Encrypted Protocols

Many connections showing poor compression performance do so because they are encrypted. Encrypted traffic is normally uncompressible, but CloudBridge appliances can compress encrypted connections when the appliances join the security infrastructure. CloudBridge appliances join the security infrastructure automatically with Citrix XenApp and XenDesktop, and can join the security infrastructure of SSL, Windows file system (CIFS/SMB), and Outlook/Exchange (MAPI) servers with manual configuration.

Adaptive, Zero-Config Operation

To serve the different needs of different kinds of traffic, CloudBridge appliances use not one but five compression engines, so the needs of everything from the most massive bulk transfer to the most latency-sensitive interactive traffic can be accommodated with ease. The compression engine is matched dynamically to the changing needs of individual connections, so that compression is automatically optimized. An added benefit is that the compression engine requires no configuration.

Memory Based Compression

Most of the compression engines use RAM to store their compression history. This is called memory-based compression. Some appliances devote gigabytes of memory to these compression engines. Memory-based compression has a low latency and is often chosen automatically for interactive tasks such as XenApp/XenDesktop traffic.