Session Recording is a highly scalable system that handles thousands or tens of thousands of sessions. Installing and running Session Recording requires few extra resources beyond what is necessary to run Citrix Virtual Apps and Desktops or Citrix DaaS (formerly Citrix Virtual Apps and Desktops service). However, we still recommend that you consider the performance of your system if you plan to record many sessions, or if the sessions you plan to record might result in large session files (for example, sessions running graphically intense applications).
This article explains how Session Recording achieves high scalability and how you can get the most out of your recording system at the lowest cost.
Why Session Recording scales well
There are two major reasons that Session Recording scales well compared with competitive products:
Small file size
A recorded session file made with Session Recording is highly compact. It is many orders of magnitude smaller than an equivalent video recording made with solutions that screen-scrape. The network bandwidth, disk space, and disk IOPS required to transport and store a recorded session file are typically at least 10 times lower than those required for an equivalent video file.
The small size of recorded session files means faster and smoother rendering of video frames. Recordings are also lossless and have no pixelation that is common in most compact video formats. Text in recordings is easy to read during playback as it is in the original sessions. To maintain small file sizes, Session Recording does not record key frames within the files. Session Recording can drop H.264 packages while recording sessions that have videos running and thus reduce the recording file sizes. To use this functionality, set
1 on the Session Recording agent and set the value of Use video codec for compression to For actively changing regions.
Low processing required to generate files
A recorded session file contains the ICA protocol data for a session that is extracted virtually in its native format. The file captures the ICA protocol data stream that is used to communicate with Citrix Workspace app. There is no need to run expensive transcoding or encoding software components to change the format of data in real time. The low amount of processing is also important for VDA scalability. It ensures the end-user experience is maintained when many sessions are recorded from the same VDA.
Moreover, only those ICA virtual channels that can be played back are recorded, which results in a further optimization. For example, the printer and client drive mapping channels aren’t recorded. The channels can generate high volumes of data without any benefit in video playback.
Estimate data input and processing rates
The Session Recording server is the central collection point for recorded session files. Each machine that is running a multi-session OS VDA with Session Recording enabled sends recorded session data to the Session Recording server. Session Recording can handle high volumes of data and can tolerate bursts and faults. But there are physical limits on how much data any one server can handle.
Consider how much data you send to each Session Recording server. Estimate how quickly the servers can process and store the data. The rate at which your system can store incoming data must be higher than the data input rate.
To estimate your data input rate, do the following calculation:
- Multiply the number of recorded sessions by the average session size.
- Divide the product by the time for which you are recording sessions.
For example, you might record 5,000 Microsoft Outlook sessions of 20 MB each over an 8-hour work day. In this case, the data input rate is approximately 3.5 MBps (5,000 sessions times 20 MB, divided by 8 hours, divided by 3,600 seconds per hour). A typical Session Recording server connected to a 100 Mbps LAN with sufficient disk space to store the recorded data can process data at around 5.0 MBps. This processing rate is based on the physical limits imposed by disk and network IOPS. In the example, the processing rate (5.0 MBps) is higher than the input rate (3.5 MBps), so recording the 5,000 Outlook sessions is feasible.
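The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and both rates are treated as megabytes per second, which is what the arithmetic in the example yields.

```python
def data_input_rate_mb_per_s(sessions: int, avg_session_mb: float, hours: float) -> float:
    """Average data input rate in megabytes per second."""
    total_mb = sessions * avg_session_mb      # total data recorded over the period
    return total_mb / (hours * 3600)          # spread evenly over the recording time


# Figures from the Outlook example in the text.
rate = data_input_rate_mb_per_s(sessions=5000, avg_session_mb=20, hours=8)
processing_rate = 5.0  # MB/s, the example server processing rate

print(f"input rate: {rate:.2f} MB/s (about {rate * 8:.1f} Mbit/s on the wire)")
print("feasible" if rate < processing_rate else "not feasible")
```

The result is an average only. It says nothing about short peaks, which is why the comparison with the processing rate is a necessary condition rather than a guarantee.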
The amount of data per session varies greatly depending on what is being recorded. Other factors such as screen resolution, color depth, and graphics mode also have impacts. A session where CAD is running likely generates a much larger recording than a session where the user sends and receives emails in Outlook. Therefore, recording the same number of CAD sessions can generate a high input rate and require the use of more Session Recording servers.
Bursts and faults
The previous example assumes a simple uniform throughput of data but doesn’t explain how the system deals with short periods of higher activity, known as bursts. A burst might occur when all users log on at the same time in the morning, known as the 9 o’clock rush. It can also occur when they receive the same email in their Outlook inbox at once. The 5.0 MBps processing rate of the Session Recording server is highly inadequate for dealing with this sudden demand.
The Session Recording agent running on each VDA uses Microsoft Message Queuing (MSMQ) to send recorded data to the Storage Manager running on the central Session Recording server. The data is sent in a store-and-forward manner similar to how an email is delivered between the sender, mail server, and receiver. If the Session Recording server or network can’t handle a high rate of data in bursts, the recorded data is temporarily stored. The data message might be temporarily stored in the outgoing queue on the VDA if the network is congested. The other case is that the data has traversed the network but the Storage Manager is busy processing other messages. In this case, the data message is stored on the Session Recording server’s receiving queue.
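To make the store-and-forward behavior concrete, the following sketch models the queue as a simple backlog that grows whenever data arrives faster than it can be processed. It is a simplified model under stated assumptions, not a description of MSMQ internals: the load profile is hypothetical, rates are in megabytes per second, and the extra disk IOPS incurred while a backlog exists is ignored.

```python
def simulate_backlog(input_mb_per_s, processing_mb_per_s):
    """Return the queued backlog (MB) at the end of each one-second step."""
    backlog = 0.0
    history = []
    for incoming in input_mb_per_s:
        # Data arrives, then the Storage Manager drains as much as it can.
        backlog = max(0.0, backlog + incoming - processing_mb_per_s)
        history.append(backlog)
    return history


# Hypothetical load: steady 3.5 MB/s, a 60-second burst at 12 MB/s, then steady again.
load = [3.5] * 30 + [12.0] * 60 + [3.5] * 300
history = simulate_backlog(load, processing_mb_per_s=5.0)

burst_end = 30 + 60 - 1  # index of the last second of the burst
peak = max(history)
drained_at = next(i for i, b in enumerate(history) if i > burst_end and b <= 0.0)

print(f"peak backlog: {peak:.0f} MB")
print(f"backlog cleared {drained_at - burst_end} s after the burst ended")
```

In this model a one-minute burst takes several minutes to clear, because only the headroom between the processing rate and the steady input rate is available to drain the queue.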
MSMQ also serves as a fault tolerance mechanism. If the Session Recording server goes down or the link is broken, recorded data stays in the outgoing queue on each VDA. When the fault is rectified, all queued data is sent together. MSMQ also allows you to take a server offline for upgrade or maintenance without interrupting session recording and losing data.
The main limitation of MSMQ is that disk space for the temporary storage of data messages is finite. This restricts how long a burst, fault, or maintenance event can last before data is eventually lost. The overall system can continue after data loss, but in this situation, individual recordings have chunks of data missing. A file with missing data is still playable but only up to the point where data was first lost. Note the following:
- Adding more disk space to each server, especially the Session Recording server, and making it available to MSMQ can increase the tolerance to bursts and faults.
- It is important to configure the Message Life setting for each Session Recording agent to an appropriate level (on the Connections tab in Session Recording agent Properties). The default value is 7,200 seconds (two hours). It means that each recorded data message has two hours to reach the Storage Manager before it is discarded and the recording file is damaged. With more disk space available (or fewer sessions to record), you can choose to increase this value. The maximum value is 365 days.
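The two limits described above (finite queue disk space and the Message Life setting) can be combined into a rough estimate of how long an outage a VDA can ride out. The sketch below is an approximation under assumptions: the function name, the per-VDA data rate, and the queue disk size are hypothetical, and data is assumed to arrive at a constant rate.

```python
def outage_tolerance_s(queue_disk_mb: float, input_mb_per_s: float, message_life_s: float) -> float:
    """Longest outage (seconds) before queued data starts to be lost.

    Data is lost when either the queue disk fills up or the oldest
    message exceeds its Message Life, whichever happens first.
    """
    time_to_fill_s = queue_disk_mb / input_mb_per_s
    return min(time_to_fill_s, message_life_s)


# Hypothetical VDA: 9,000 MB available to the queue, 0.5 MB/s of recorded data.
default_life = outage_tolerance_s(9000, 0.5, message_life_s=7200)    # default Message Life
longer_life = outage_tolerance_s(9000, 0.5, message_life_s=86400)    # Message Life raised to one day

print(f"default Message Life: tolerates {default_life / 3600:.1f} h")
print(f"one-day Message Life: tolerates {longer_life / 3600:.1f} h")
```

With the default setting the Message Life is the limiting factor. Once it is raised, disk space becomes the limit, which is why the two settings are best tuned together.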
The other limitation with MSMQ is that when data backlogs, there is extra disk IOPS in the queue to read and write data messages. Normally, the Storage Manager receives and processes data from the network directly, without data messages ever being written to disk. Storing the data involves a single write operation to disk that appends the recorded session file. When data is backlogged, the disk IOPS is tripled: each message must be written to disk, read from disk, and written to file. As the Storage Manager is heavily IOPS bound, the processing rate of the Session Recording server drops until the backlog of messages is cleared. To mitigate the effects of this extra IOPS, adopt the following recommendations:
- Make sure that the disk on which MSMQ stores messages is different from the recording file storage folders. Even though IOPS bus traffic is tripled, the drop in the true processing rate is never as severe.
- Plan outages at off-peak times only. Depending on budget constraints, follow recognized approaches to building high availability servers. The approaches include the use of Uninterruptible Power Supply (UPS), dual NICs, redundant switches, and hot swappable memory and disks.
Design for spare capacity
The data rate of recorded session data is unlikely to be uniform, bursts and faults might occur, and the clearing of message backlogs is expensive in IOPS. For this reason, design each Session Recording server with plenty of spare capacity. Adding more servers or improving the specification of existing servers, as described in later sections, always gains you extra capacity. The general rule of thumb is to run each Session Recording server at a maximum of 50% of its total capacity. In the earlier example, if the server can process 5.0 MBps, target the system to run only at 2.5 MBps. Instead of recording 5,000 Outlook sessions that generate 3.5 MBps on one Session Recording server, reduce to 3,500 sessions that generate only about 2.5 MBps.
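The 50% rule of thumb can be turned into a quick sizing check. The following sketch uses a hypothetical helper name and treats the rates as megabytes per second. It computes how many sessions fit within a target utilization.

```python
def max_sessions(processing_mb_per_s: float, utilization: float,
                 avg_session_mb: float, hours: float) -> int:
    """Largest number of sessions whose average input rate stays within the target."""
    target_rate = processing_mb_per_s * utilization      # for example, 5.0 * 0.5 = 2.5 MB/s
    return int(target_rate * hours * 3600 // avg_session_mb)


limit = max_sessions(processing_mb_per_s=5.0, utilization=0.5, avg_session_mb=20, hours=8)
print(f"sessions per server at 50% capacity: {limit}")
```

For the Outlook example this gives a ceiling of 3,600 sessions, in line with the rounded figure of 3,500 sessions used in the text.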
Backlogs and live playback
Live playback is when a reviewer opens a session recording for playback while the session is still active. During live playback, the responsible Session Recording agent switches to a streaming mode for that session. Recording data is sent immediately to the Storage Manager without internal buffering. Because the recording file is constantly updated, the player can continue to be fed with the latest data from the live session. However, data sent from the agent to the Storage Manager is through MSMQ, so the queuing rules described earlier apply. A problem can occur in this scenario. When MSMQ is backlogged, the new recorded data available for live playback is queued like all other data messages. The reviewer can still play the file, but viewing the latest live recorded data is delayed. If live playback is an important feature for reviewers, ensure a low probability of backlog. You can design spare capacity and fault tolerance into your deployment.
Session Recording never reduces session performance and never stops sessions in response to recorded data backlogs. Maintaining the end-user experience and single-server scalability is paramount in the design of the Session Recording system. If the recording system becomes irreversibly overloaded, recorded session data is discarded. Recording ICA sessions has a low impact on the performance and scalability of VDAs. The size of the impact depends on the platform, the memory available, and the graphical nature of the sessions being recorded. With the following configuration, you can expect a single-server scalability impact of between 1% and 5%. In other words, if a server can host 100 users without Session Recording installed, it can host 95–99 users after installation:
- 64-bit server with 8 GB RAM running a multi-session OS VDA
- All sessions running Office productivity applications, such as Outlook and Excel
- The use of applications is active and sustained
- All sessions are recorded as configured by the Session Recording policies
With fewer sessions recorded or session activity less sustained and more sporadic, the impact is less. Often, the scalability impact is negligible and user density per server remains the same. As mentioned earlier, the low impact results from the simple processing requirements of the Session Recording components on each VDA. Recorded data is extracted from the ICA session stack and sent as-is to the Session Recording server through MSMQ. There is no expensive encoding of data.
There is a minor overhead of using Session Recording even when no sessions are recorded. If you are not going to record any sessions from a particular server, you can disable recording on that server. Removing Session Recording is one way. A less invasive approach is to clear the Enable session recording for this VDA machine check box on the Session Recording tab in Session Recording Agent Properties. If session recording is required in future, reselect this check box.
You can measure the throughput of recorded session data from the sending VDA to the receiving Session Recording server. A simple and effective approach is to observe the size of recording files and the rate at which disk space on the Session Recording server is being consumed. The volume of data written to disk closely reflects the volume of network traffic being generated. The Windows Performance Monitor tool (perfmon.exe) has standard system counters that you can observe in addition to some counters provided by Session Recording. Counters can be used to measure throughput, and identify bottlenecks and system problems. The following table outlines some of the most useful performance counters.
|Performance Object||Counter Name||Description|
|Citrix Session Recording Agent|| ||The number of sessions that are currently being recorded on a particular VDA.|
|Citrix Session Recording Agent|| ||The number of bytes read from the kernel components responsible for acquiring session data. Useful for determining how much data a single VDA generates for all sessions recorded on that server.|
|Citrix Session Recording Storage Manager|| ||Similar to the Citrix Session Recording agent counter except for the Session Recording server. Indicates the total number of sessions currently being recorded for all servers.|
|Citrix Session Recording Storage Manager|| ||The throughput of all recorded sessions. Can be used to determine the rate at which the Storage Manager is processing data. If MSMQ is backlogged with messages, the Storage Manager runs at full speed. This value can be used to indicate the maximum processing rate of the Storage Manager.|
| || ||Can be used to measure disk write-through performance, which is important in achieving high scalability for the Session Recording server. Performance of individual drives can also be observed.|
| || ||Can be used to determine the amount of data backlogged in the CitrixSmAudData message queue. If this value increases over time, the rate of recorded data received from the network is greater than the rate at which the Storage Manager can process data. This counter is useful for observing the effect of data bursts and faults.|
| || ||Similar to the Bytes in Queue counter but measures the number of messages.|
| || ||Can be used to measure on both sides of the link to observe how much data is generated when sessions are recorded. When measured on the Session Recording server, this counter indicates the rate at which incoming data is received. Contrasts with the Citrix Session Recording Storage Manager|
| || ||Worth monitoring even though CPU is unlikely to be a bottleneck.|
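The table notes that a queue size that increases over time means data is arriving faster than the Storage Manager can process it. One way to check this from sampled counter values is to fit a straight line through the samples and look at the slope. The sketch below assumes you have already exported periodic readings of the queue size in bytes (for example, from Performance Monitor). The sample values and the function name are hypothetical.

```python
def backlog_trend(samples, interval_s):
    """Least-squares slope of queue size over time, in bytes per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    times = [i * interval_s for i in range(n)]
    mean_t = sum(times) / n
    mean_y = sum(samples) / n
    covariance = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, samples))
    variance = sum((t - mean_t) ** 2 for t in times)
    return covariance / variance


# Hypothetical queue-size readings (bytes), taken every 15 seconds.
readings = [0, 1_000_000, 2_500_000, 4_000_000, 6_000_000]
slope = backlog_trend(readings, interval_s=15)

if slope > 0:
    print(f"backlog growing by about {slope:,.0f} bytes/s: input exceeds processing rate")
else:
    print("backlog stable or draining")
```

A sustained positive slope over a long window is more meaningful than a short spike, which might only reflect a burst that the queue absorbs as designed.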
Session Recording server hardware
You can increase the capacity of your deployment by carefully selecting the Session Recording server hardware. You have two choices: scaling up (by increasing the capacity of each server) or scaling out (by adding more servers). With either choice, your aim is to increase scalability at the lowest cost.
When examining a single Session Recording server, consider the following best practices to ensure optimal performance for available budgets. The system depends on IOPS that can ensure a high throughput of recorded data from the network onto the disk. So it is important to invest in appropriate network and disk hardware. For a high-performance Session Recording server, a dual CPU or dual core CPU is recommended but little is gained from any higher specification. 64-bit processor architecture is recommended but an x86 processor type is also suitable. 4 GB of RAM is recommended but again there is little benefit from adding more.
Even with the best scaling up practices, there are limits to performance and scalability that can be reached with a single Session Recording server when recording many sessions. It might be necessary to add extra servers to meet the load. You can install more Session Recording servers on different machines to have the Session Recording servers work as a load balancing pool. In this type of deployment, the Session Recording servers share the storage and the database. To distribute the load, point the Session Recording agents to the load balancer that is responsible for the workload distribution.
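When scaling out, a first estimate of the number of servers in the pool follows from the expected input rate and the usable capacity of each server. The sketch below is a rough sizing aid under assumptions: it uses a hypothetical function name, applies the 50% rule of thumb from earlier in this article, assumes the load balancer spreads sessions evenly, and assumes the shared storage and database are not the bottleneck.

```python
import math


def servers_needed(input_mb_per_s: float, per_server_mb_per_s: float,
                   max_utilization: float = 0.5) -> int:
    """Minimum number of servers so that each stays within the target utilization."""
    usable_per_server = per_server_mb_per_s * max_utilization
    return math.ceil(input_mb_per_s / usable_per_server)


# Outlook example from the text: about 3.5 MB/s of recorded data.
outlook_pool = servers_needed(3.5, per_server_mb_per_s=5.0)

# Hypothetical CAD workload generating ten times as much data.
cad_pool = servers_needed(35.0, per_server_mb_per_s=5.0)

print(f"Outlook workload: {outlook_pool} servers")
print(f"CAD workload: {cad_pool} servers")
```

Treat the result as a starting point for scale testing, because real throughput depends on the storage and network shared by the pool.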
A 100 Mbps network link is suitable for connecting a Session Recording server. A Gb Ethernet connection might improve performance, but does not result in 10 times greater performance than a 100 Mbps link. In practice, the gain in throughput is less.
Ensure that network switches used by Session Recording are not shared with third-party applications that might compete for available network bandwidth. Ideally, network switches are dedicated for use with the Session Recording server. If network congestion proves to be the bottleneck, a network upgrade is a relatively inexpensive way to increase the scalability of the system.
Investment in disk and storage hardware is the single most important factor in server scalability. The faster that data can be written to disk, the higher the performance of the overall system. When selecting a storage solution, take more note of the write performance than the read performance.
Store data on a RAID or a SAN.
Storing data on a NAS, based on file-based protocols such as SMB and NFS, might have performance and security implications. Use the latest version of the protocol in place to avoid security implications and perform scale testing to ensure proper performance.
For a local drive setup, aim for a disk controller with built-in cache memory. Caching allows the controller to use elevator sorting during write-back. It minimizes disk head movement and ensures that write operations are completed without waiting for the physical disk operation to complete. It can improve write performance significantly at a minimal extra cost. Caching does however raise the problem of data loss after a power failure. To ensure the integrity of data and the file system, consider a battery backup facility for the caching disk controller.
Consider using a suitable RAID storage solution. There are many RAID levels available depending on performance and redundancy requirements. The following table specifies each of the RAID levels and how applicable each standard is to Session Recording.
|RAID Level||Type||Minimum Number of Disks||Description|
|RAID 0||Striped set without parity||2||Provides high performance but no redundancy. Loss of any disk destroys the array. RAID 0 is a low cost solution for storing recorded session files where the impact of data loss is low. Easy to scale up performance by adding more disks.|
|RAID 1||Mirrored set without parity||2||No performance gain over one disk, making it a relatively expensive solution. Use this solution only if a high level of redundancy is required.|
|RAID 3||Striped set with dedicated parity||3||Provides high write performance with redundancy characteristics similar to RAID 5. RAID 3 is recommended for video production and live streaming applications. As Session Recording is this type of application, RAID 3 is most highly recommended but it is not common.|
|RAID 5||Striped set with distributed parity||3||Provides high read performance with redundancy but at the cost of slower write performance. RAID 5 is the most common for general purpose usages. But due to the slow write performance, RAID 5 is not recommended for Session Recording. RAID 3 can be deployed at a similar cost but with better write performance.|
|RAID 10||Mirrored set and striped set||4||Provides performance characteristics of RAID 0 with redundancy benefits of RAID 1. An expensive solution that is not recommended for Session Recording.|
RAID 0 and RAID 3 are the most recommended RAID levels. RAID 1 and RAID 5 are popular standards but are not recommended for Session Recording. RAID 10 does provide some performance benefits but is too expensive for the additional gain.
Decide on the type and specification of disk drives. IDE/ATA drives and external USB or Firewire drives are not suitable for use in Session Recording. The main choice is between SATA and SCSI. SATA drives provide reasonably high transfer rates at a reduced cost per MB compared with SCSI drives. However, SCSI drives provide better performance and are more common in server deployments. Server RAID solutions mostly support SCSI drives but some SATA RAID products are now available. When evaluating the specifications of disk drive products, consider the rotational speed of disk and other performance characteristics.
Because the recording of thousands of sessions per day can consume significant amounts of disk space, you must choose between overall capacity and performance. From the earlier example, recording 5,000 Outlook sessions over an 8-hour work day consumes about 100 GB of storage space. To store 10 days’ worth of recordings (that is, 50,000 recorded session files), you need 1,000 GB (1 TB). This pressure on disk space can be eased by shortening the retention period before archiving or deleting old recordings. If 1 TB of disk space is available, a seven-day retention period is reasonable, ensuring disk space usage remains around 700 GB, with 300 GB remaining as a buffer for busy days. In Session Recording, the archiving and deleting of files is supported with the ICLDB utility. It has a minimum retention period of two days. You can schedule a background task to run once a day at some off-peak time. For more information about the ICLDB commands and archiving, see Manage your database records.
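The storage arithmetic in this paragraph can be reproduced with a short calculation. The helper names below are hypothetical, and decimal units (1 GB = 1,000 MB) are used to match the figures in the text.

```python
def storage_needed_gb(sessions_per_day: int, avg_session_mb: float, retention_days: int) -> float:
    """Disk space (GB) consumed by recordings kept for the retention period."""
    return sessions_per_day * avg_session_mb * retention_days / 1000


def max_retention_days(disk_gb: float, buffer_gb: float,
                       sessions_per_day: int, avg_session_mb: float) -> int:
    """Longest whole-day retention period that leaves the buffer free."""
    daily_gb = sessions_per_day * avg_session_mb / 1000
    return int((disk_gb - buffer_gb) // daily_gb)


ten_days = storage_needed_gb(5000, 20, retention_days=10)
seven_days = storage_needed_gb(5000, 20, retention_days=7)
retention = max_retention_days(disk_gb=1000, buffer_gb=300,
                               sessions_per_day=5000, avg_session_mb=20)

print(f"10 days of recordings: {ten_days:.0f} GB")
print(f"7 days of recordings: {seven_days:.0f} GB")
print(f"retention with 1 TB and a 300 GB buffer: {retention} days")
```

Rerunning the calculation with your own session counts and sizes gives a quick check on whether a planned retention period fits the available disk.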
The alternative to using local drive and controllers is to use a SAN storage solution based on block-level disk access. To the Session Recording server, the disk array appears as a local drive. SANs are more expensive to set up, but as the disk array is shared, SANs do have the advantage of simplified and centralized management. There are two main types of SAN: Fibre Channel and iSCSI. iSCSI is essentially SCSI over TCP/IP and is gaining popularity over Fibre Channel since the introduction of Gb Ethernet.
The volume of data sent to the Session Recording database is small because the database stores only metadata about the recorded sessions. The files of the recorded sessions themselves are written to a separate disk. Typically, each recorded session requires only about 1 KB of space in the database, unless the Session Recording Event API is used to insert searchable events to the session.
The Express Editions of Microsoft SQL Server 2019, Microsoft SQL Server 2017, Microsoft SQL Server 2016, Microsoft SQL Server 2014, Microsoft SQL Server 2012, and Microsoft SQL Server 2008 R2 impose a database size limitation of 10 GB. At 1 KB per recorded session, the database can catalog about 10,000,000 sessions. Other editions of Microsoft SQL Server have no database size restrictions and are limited only by available disk space. As the number of sessions in the database increases, the performance of the database and the speed of searches diminish only negligibly.
If you are not making customizations through the Session Recording Event API, each recorded session generates four database transactions: two when recording starts, one when the user logs on to the session being recorded, and one when recording ends. If you use the Session Recording Event API to customize sessions, each searchable event recorded generates one transaction. Because even the most basic database deployment can handle hundreds of transactions per second, the database is unlikely to be stressed by this load. The impact is light enough that the Session Recording database can run on the same SQL Server as other databases, including the Citrix Virtual Apps and Desktops data store database.
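To see how light the database load is, the transaction counts above can be converted to an average rate. The sketch below uses a hypothetical function name and assumes that transactions are spread evenly over the recording period. The number of events per session is an illustrative value.

```python
def db_transactions_per_s(sessions: int, hours: float, events_per_session: int = 0) -> float:
    """Average database transactions per second generated by recorded sessions."""
    per_session = 4 + events_per_session   # four baseline transactions plus one per searchable event
    return sessions * per_session / (hours * 3600)


baseline = db_transactions_per_s(sessions=5000, hours=8)
with_events = db_transactions_per_s(sessions=5000, hours=8, events_per_session=50)

print(f"no custom events: {baseline:.2f} transactions/s")
print(f"50 events per session: {with_events:.2f} transactions/s")
```

Even with dozens of searchable events per session, the average stays far below the hundreds of transactions per second mentioned above, although logon peaks can concentrate the load into shorter windows.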
If your Session Recording deployment requires many millions of recorded sessions to be cataloged in the database, follow Microsoft guidelines for SQL Server scalability.