Product Documentation

Measuring Disk Based Compression Performance

Dec 07, 2012
The Compression Status tab of the Reports: Compression page reports the system compression performance since the system was started or since the Clear button was used to reset the statistics. Compression for individual connections is reported in the connection closure messages in the system log. For example:

Compression performance varies with a number of factors, including the amount of redundancy in the data stream and, to a lesser extent, the structure of the data protocol.

Some applications, such as FTP, send pure data streams; the TCP connection payload is always byte-for-byte identical to the original data file. Others, such as CIFS or NFS, do not send pure data streams, but mix commands, metadata, and data in the same stream. The compression engine distinguishes the file data by parsing the connection payload in real time. Such data streams can easily produce compression ratios between 100:1 and 10,000:1 on the second pass.

Average compression ratios for the link depend on the relative prevalence of long matches, short matches, and no matches. This ratio is dependent on the traffic and is difficult to predict in practice.

Test results show the effect of multi-level compression as a whole, with memory based and disk based compression each making its contribution.

Maximum compression performance is not achieved until the storage space available for disk based compression is filled, providing a maximum amount of previous data to match with new data. In a perfect world, testing would not conclude until the appliance's disks had not only been filled, but filled and overwritten at least once, to ensure that steady-state operation has been reached. However, few administrators have that much representative data at their disposal.

Another difficulty in performance testing is that acceleration often exposes weak links in the network, typically in the performance of the client, the server, or the LAN, and these are sometimes misdiagnosed as disappointing acceleration performance.

You can use Iperf or FTP for preliminary and initial testing. Iperf is useful for preliminary testing. It is extremely compressible (even on the first pass) and uses relatively little CPU and no disk resources on the two endpoint systems. Compressed performance with Iperf should send more than 200 Mbps over a T1 link if the LANs on both sides use Gigabit Ethernet, or slightly less than 100 Mbps if there is any Fast Ethernet equipment in the LAN paths between endpoints and appliances.

Iperf is preinstalled on the appliances (under the Diagnostics menu) and is available fromhttp://iperf.sourceforge.net/. Ideally, it should be installed and run from the endpoint systems, so that the network is tested from end to end, not just from appliance to appliance.

FTP is useful for more realistic testing than is possible with Iperf. FTP is simple and familiar, and its results are easy to interpret. Second-pass performance should be roughly the same as with Iperf. If not, the limiting factor is probably the disk subsystem on one of the endpoint systems.

To test the disk based compression system
  1. Transfer a multiple-gigabyte data stream between two appliances with disk based compression enabled. Note the compression achieved during this transfer. Depending on the nature of the data, considerable compression may be seen on the first pass.
  2. Transfer the same data stream a second time and note the effect on compression.