Citrix Virtual Apps and Desktops

Audio features

You can configure and add the following Citrix policy settings to a policy that optimizes HDX audio features. For usage details plus relationships and dependencies with other policy settings, see Audio policy settings, Bandwidth policy settings, and Multi-stream connections policy settings.

Adaptive audio

With adaptive audio, you don’t need to manually configure the audio quality policies on the VDA. Adaptive audio optimizes settings for your environment and replaces obsolete audio compression formats to provide an excellent user experience.

Adaptive audio is enabled by default. To disable adaptive audio, see Audio policy settings.


Citrix recommends delivering audio using User Datagram Protocol (UDP) rather than TCP when real-time audio applications are required. The following audio transport options are available over UDP:

  • Audio over UDP
  • HDX Adaptive Transport (Enlightened Data Transport)

UDP audio encryption using DTLS is available only between Citrix Gateway and Citrix Workspace app. Therefore, sometimes it might be preferable to use TCP transport. TCP supports end-to-end TLS encryption from the VDA to Citrix Workspace app.

For more information on adaptive audio and UDP audio, see Audio over UDP Real-time Transport and audio UDP port range.

Loss tolerant mode for audio

Loss tolerant mode supports audio. This feature improves the user experience for real-time streaming and improves audio quality over EDT when users connect through networks with high latency and packet loss.

This feature is disabled by default. To use it, enable the Loss tolerant mode for audio policy. HDX Adaptive Transport (EDT) must also be enabled for this feature to work.

System requirements

Ensure that the following products are on the minimum versions that support loss tolerant mode:

  • Citrix Virtual Delivery Agent (VDA) 2308
  • Citrix Workspace app for Windows 2309

In addition, the following features must be enabled:


If the conditions above are not met, audio is sent over the EDT Reliable transport.
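The version and policy conditions above can be sketched as a simple check. This is an illustrative sketch, not Citrix code: the function and parameter names are hypothetical, version numbers are treated as YYMM integers, and the session is assumed to already use HDX Adaptive Transport (EDT).

```python
# Minimum versions from the list above, treated here as YYMM integers (assumption).
MIN_VDA_VERSION = 2308
MIN_WORKSPACE_APP_WINDOWS_VERSION = 2309


def expected_audio_transport(vda_version: int,
                             workspace_app_version: int,
                             loss_tolerant_audio_policy_enabled: bool) -> str:
    """Return the transport that audio is expected to use (hypothetical helper).

    Assumes the session already runs over EDT.
    """
    meets_requirements = (
        vda_version >= MIN_VDA_VERSION
        and workspace_app_version >= MIN_WORKSPACE_APP_WINDOWS_VERSION
        and loss_tolerant_audio_policy_enabled
    )
    # When any condition fails, audio falls back to EDT Reliable.
    return "loss tolerant mode" if meets_requirements else "EDT Reliable"


print(expected_audio_transport(2308, 2309, True))
print(expected_audio_transport(2305, 2309, True))
```

A check like this can help explain why a given session falls back to EDT Reliable, for example when the VDA is older than 2308.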

Additional information

Loss tolerant mode uses a loss-tolerant transport protocol that allows packet loss in transmission without resending multimedia content, resulting in a more real-time experience for users.

Enlightened Data Transport (EDT) is a Citrix-proprietary transport protocol that delivers a superior user experience on challenging long-haul connections while maintaining server scalability. Loss tolerant mode is a feature of Citrix Gateway service that maintains a stable connection even when the network is congested, which gives remote workers a consistent experience. Under normal conditions, EDT and loss tolerant mode provide similar results. On networks with packet loss, however, loss tolerant mode provides a better audio experience than EDT, which makes it valuable for remote workers who rely on real-time multimedia for their work.

Audio quality

In general, higher sound quality consumes more bandwidth and server CPU because more audio data is sent to user devices. Sound compression allows you to balance sound quality against overall session performance. Use Citrix policy settings to configure the compression levels to apply to sound files.

By default, the Audio quality policy setting is High - high definition audio when TCP transport is used, and Medium - optimized-for-speech when UDP transport (recommended) is used. The High Definition audio setting provides high-fidelity stereo audio but consumes more bandwidth than other quality settings. Do not use this audio quality for non-optimized voice chat or video chat applications (such as softphones), because it might introduce latency into the audio path that is not suitable for real-time communications. We recommend the optimized-for-speech policy setting for real-time audio, regardless of the selected transport protocol.

When bandwidth is limited, for example on satellite or dial-up connections, reducing audio quality to Low consumes the least possible bandwidth. In this situation, create separate policies for users on low-bandwidth connections so that users on high-bandwidth connections are not adversely impacted.

For setting details, see Audio policy settings. Remember to enable Client audio settings on the user device.

Bandwidth guidelines for audio playback and recording:

  • Adaptive audio (default)
    • Bitrate: variable adaptive
    • Number of channels: 2 (Stereo) for playback, 1 (mono) for microphone capture
    • Frequency: 48000 Hz
    • Bit-depth: 16-bit
  • High quality
    • Bitrate: ~100 kbps (min 75, max 175 kbps) for playback / ~70 kbps for microphone capture
    • Number of Channels: 2 (Stereo) for playback, 1 (mono) for microphone capture
    • Frequency: 44100 Hz
    • Bit-depth: 16-bit
  • Medium quality (recommended for VoIP)
    • Bitrate: ~16 kbps (min 20, max 40 kbps) for playback, ~16 kbps for microphone capture
    • Number of Channels: 1 (Mono) for both playback and capture
    • Frequency: 16000 Hz (wideband)
    • Bit-depth: 16-bit
  • Low quality
    • Bitrate: ~11 kbps (min 10, max 25 kbps) for playback, ~11 kbps for microphone capture
    • Number of Channels: 1 (Mono) for both playback and capture
    • Frequency: 8000 Hz (narrowband)
    • Bit-depth: 16-bit
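
As a rough planning aid, the typical bitrates above can be multiplied by the number of concurrent sessions. The following sketch assumes that every session plays audio and captures microphone input at the typical ("~") rate and ignores protocol overhead. The dictionary and function names are illustrative only.

```python
# Typical bitrates in kbps, copied from the guidelines above:
# (playback, microphone capture). Adaptive audio is variable and is omitted.
TYPICAL_KBPS = {
    "high": (100, 70),
    "medium": (16, 16),
    "low": (11, 11),
}


def estimate_audio_kbps(quality: str, sessions: int,
                        playback: bool = True, capture: bool = True) -> int:
    """Estimate the total audio bandwidth in kbps for a number of sessions."""
    playback_kbps, capture_kbps = TYPICAL_KBPS[quality]
    per_session = (playback_kbps if playback else 0) + (capture_kbps if capture else 0)
    return per_session * sessions


# 50 sessions at Medium quality, each with playback and microphone capture.
print(estimate_audio_kbps("medium", 50))  # 1600
```

For sizing under worst-case conditions, the maximum values from the list can be substituted for the typical ones.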

Client audio redirection

To allow users to receive audio from an application on a server through speakers or other sound devices on the user device, leave the Client audio redirection setting at Allowed. This is the default.

Client audio mapping puts extra load on the servers and the network. However, prohibiting client audio redirection disables all HDX audio functionality.

For setting details, see Audio policy settings. Remember to enable client audio settings on the user device.

Client microphone redirection

To allow users to record audio using input devices such as microphones on the user device, leave the Client microphone redirection setting at its default (Allowed).

For security, user devices alert their users when servers they don’t trust try to access microphones. Users can choose to accept or reject access before using the microphone. Users can disable this alert in Citrix Workspace app.

For setting details, see Audio policy settings. Remember to enable Client audio settings on the user device.

Audio Plug N Play

The Audio Plug N Play policy setting allows or prevents the use of multiple audio devices to record and play sound. This setting is Enabled by default. Audio Plug N Play enables audio devices to be recognized. The devices are recognized even if they are not plugged in until after the user session has started.

This setting applies only to Windows Multi-session OS machines.

For setting details, see Audio policy settings.

Audio redirection bandwidth limit and audio redirection bandwidth limit percent

The Audio redirection bandwidth limit policy setting specifies the maximum bandwidth (in kilobits per second) for playing and recording audio in a session.

The Audio redirection bandwidth limit percent setting specifies the maximum bandwidth for audio redirection as a percentage of the total available bandwidth.

By default, zero (no maximum) is specified for both settings. If both settings are configured, the one with the lowest bandwidth limit is used.
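
The interaction between the two settings can be sketched as follows. This is a minimal illustration of the "lowest limit wins" rule described above, not the product's implementation. The function name and the `total_bandwidth_kbps` parameter (the total available session bandwidth) are assumptions for the example.

```python
from typing import Optional


def effective_audio_limit_kbps(limit_kbps: int, limit_percent: int,
                               total_bandwidth_kbps: float) -> Optional[float]:
    """Return the effective audio limit in kbps, or None when there is no maximum."""
    candidates = []
    if limit_kbps > 0:  # zero means "no maximum"
        candidates.append(float(limit_kbps))
    if limit_percent > 0:  # zero means "no maximum"
        candidates.append(total_bandwidth_kbps * limit_percent / 100)
    # If both settings are configured, the lower limit applies.
    return min(candidates) if candidates else None


print(effective_audio_limit_kbps(64, 5, 1000))  # 50.0 (5% of 1000 kbps is lower)
print(effective_audio_limit_kbps(0, 0, 1000))   # None (defaults, no maximum)
```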

For setting details, see Bandwidth policy settings. Remember to enable Client audio settings on the user device.

Audio over UDP Real-time Transport and Audio UDP port range

By default, Audio over User Datagram Protocol (UDP) Real-time Transport is allowed (when selected at the time of installation). It opens a UDP port on the server for connections that use Audio over UDP Real-time Transport. If there is network congestion or packet loss, we recommend configuring UDP/RTP for audio to ensure the best possible user experience. For any real-time audio, such as softphone applications, UDP audio is preferred to EDT. UDP allows for packet loss without retransmission, ensuring that no latency is added on connections with high packet loss.


When Citrix Gateway is not in the path, audio data transmitted with UDP is not encrypted. If Citrix Gateway is configured to access Citrix Virtual Apps and Desktops resources, then audio traffic between the endpoint device and Citrix Gateway is secured using DTLS protocol.

The Audio UDP port range specifies the range of port numbers that the Windows VDA uses to exchange audio packet data with the user device.

By default, the range is 16500 through 16509.


If Audio over UDP Real-time Transport is not required for adaptive audio, Citrix recommends configuring the policy setting to Disabled. This helps avoid Citrix Workspace app clients requesting open UDP connections or triggering unwanted Citrix Workspace app client firewall configuration dialog windows to appear.

For setting details about Audio over UDP Real-time Transport, see Audio policy settings. For details about Audio UDP port range, see Multi-stream connections policy settings. Remember to enable Client audio settings on the user device.

Audio over UDP requires the Windows VDA. For supported policies on the Linux VDA, see Policy support list.

Audio setting policies for user devices

  1. Load the group policy templates by following Configuring the Group Policy Object administrative template.
  2. In the Group Policy Editor, expand Administrative Templates > Citrix Components > Citrix Workspace > User Experience.
  3. For Client audio settings, select Not Configured, Enabled, or Disabled.
    • Not Configured. By default, Audio Redirection is enabled using high quality audio or the previously configured custom audio settings.
    • Enabled. Enables audio redirection using the selected options.
    • Disabled. Disables audio redirection.
  4. If you select Enabled, choose a sound quality. For UDP audio, use Medium (default).
  5. For UDP audio only, select Enable Real-Time Transport and then set the range of incoming ports to open in the local Windows firewall.
  6. To use UDP Audio with Citrix Gateway, select Allow Real-Time Transport Through gateway. Configure Citrix Gateway with DTLS. For more information, see this article.

If, as an administrator, you do not have control over the endpoint devices to make these changes (for example, on bring-your-own devices or home computers), use the default.ica attributes from StoreFront to enable UDP audio.

  1. On the StoreFront machine, open C:\inetpub\wwwroot\Citrix\<Store Name>\App_Data\default.ica with an editor such as Notepad.
  2. Make the following entries under the [Application] section.

    ; This text enables Real-Time Transport


    ; This text allows Real-Time Transport Through gateway


    ; This text sets audio quality to Medium


    ; UDP Port range



If you enable User Datagram Protocol (UDP) audio by editing default.ica, then UDP audio is enabled for all users who are using that store.
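
If you prefer to script the edit rather than use a text editor, the following sketch shows one way to insert or update entries under the [Application] section of an ICA file's text. It is a hedged example, not a Citrix tool: the function name is hypothetical, and the entry names in the usage lines are placeholders, not real default.ica settings. Substitute the entries from the steps above.

```python
def set_application_entries(ica_text: str, entries: dict) -> str:
    """Insert or update Key=Value lines in the [Application] section."""
    remaining = dict(entries)
    out = []
    in_application = False
    found_section = False
    for line in ica_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_application:
                # Leaving [Application]: append entries that were not present.
                out.extend(f"{k}={v}" for k, v in remaining.items())
                remaining.clear()
            in_application = stripped.lower() == "[application]"
            found_section = found_section or in_application
        elif in_application and "=" in stripped and not stripped.startswith(";"):
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                # Overwrite the existing value for this key.
                line = f"{key}={remaining.pop(key)}"
        out.append(line)
    if in_application:
        # [Application] was the last section in the file.
        out.extend(f"{k}={v}" for k, v in remaining.items())
    elif not found_section:
        raise ValueError("no [Application] section found")
    return "\n".join(out) + "\n"


# Placeholder entry names, for illustration only.
sample = "[Application]\nPlaceholderKey=old\n\n[Other]\nX=1\n"
print(set_application_entries(sample, {"PlaceholderKey": "new", "AnotherPlaceholder": "1"}))
```

Back up default.ica before writing any changes to it, because the file applies to every user of the store.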

Avoid echo during multimedia conferences

Users in audio or video conferences might hear an echo. Echoes usually occur when speakers and microphones are too close to each other. For that reason, we recommend the use of headsets for audio and video conferences.

HDX provides an echo cancellation option (enabled by default) that minimizes any echo. The effectiveness of echo cancellation is sensitive to the distance between the speakers and the microphone. Ensure that the devices aren’t too close or too far away from each other.

You can change a registry setting to disable echo cancellation. For information, see Avoid echo during multimedia conferences in the list of features managed through the registry.


Softphones

A softphone is software acting as a phone interface. You use a softphone to make calls over the internet from a computer or other smart device. By using a softphone, you can dial phone numbers and carry out other phone-related functions using a screen.

Citrix Virtual Apps and Desktops support several alternatives for delivering softphones.

Generic softphone support

Generic softphone support enables you to host an unmodified softphone on XenApp or XenDesktop in the data center. The audio traffic goes over the Citrix ICA protocol (preferably using UDP/RTP) to the user device running the Citrix Workspace app.

Generic softphone support is a feature of HDX RealTime. This approach to softphone delivery is especially useful when:

  • An optimized solution for delivering the softphone is not available and the user is not on a Windows device where Local App Access can be used.
  • The media engine that is needed for optimized delivery of the softphone isn’t installed on the user device or isn’t available for the operating system version running on the user device. In this scenario, Generic HDX RealTime provides a valuable fallback solution.

There are two softphone delivery considerations using Citrix Virtual Apps and Desktops:

  • How the softphone application is delivered to the virtual/published desktop.
  • How the audio is delivered to and from the user headset, microphone, and speakers, or USB telephone set.

Citrix Virtual Apps and Desktops include numerous technologies to support generic softphone delivery:

  • Optimized-for-Speech codec for fast encode of the real-time audio and bandwidth efficiency.
  • Low latency audio stack.
  • Server-side jitter buffer to smooth out the audio when the network latency fluctuates.
  • Packet tagging (DSCP and WMM) for Quality of Service.
    • DSCP tagging for RTP packets (Layer 3)
    • WMM tagging for Wi-Fi

The Citrix Workspace app versions for Windows, Linux, Chrome, and Mac are also Voice over Internet Protocol (VoIP) capable. Citrix Workspace app for Windows offers these features:

  • Client-side jitter buffer - Ensures smooth audio even when the network latency fluctuates.
  • Echo cancellation - Allows for greater variation in the distance between microphone and speakers for workers who do not use a headset.
  • Audio plug-n-play - Audio devices do not need to be plugged in before starting a session. They can be plugged in at any time.
  • Audio device routing - Users can direct ringtone to speakers but the voice path to their headset.
  • Multi-stream ICA - Enables flexible Quality of Service-based routing over the network.
  • ICA supports four TCP and two UDP streams. One of the UDP streams supports the real-time audio over RTP.

For a summary of Citrix Workspace app capabilities, see Citrix Receiver Feature Matrix.

System configuration recommendations

Client Hardware and Software: For optimal audio quality, we recommend the latest version of Citrix Workspace app and a good quality headset that has acoustic echo cancellation (AEC). Citrix Workspace app versions for Windows, Linux, and Mac support Voice over Internet Protocol. Also, Dell Wyse offers Voice over Internet Protocol support for ThinOS (WTOS).

CPU Considerations: Monitor CPU usage on the VDA to determine if it is necessary to assign two virtual CPUs to each virtual machine. Real-time voice and video are data intensive. Configuring two virtual CPUs reduces the thread switching latency. Therefore, we recommend that you configure two vCPUs in a Citrix Virtual Desktops VDI environment.

Having two virtual CPUs does not necessarily mean doubling the number of physical CPUs, because physical CPUs can be shared across sessions.

Citrix Gateway Protocol (CGP), which is used for the Session Reliability feature, also increases CPU consumption. On high-quality network connections, you can disable this feature to reduce CPU consumption on the VDA. Neither of the preceding steps might be necessary on a powerful server.

UDP Audio: Audio over UDP provides excellent tolerance of network congestion and packet loss. We recommend it instead of TCP when available.

LAN/WAN configuration: Proper configuration of the network is critical for good real-time audio quality. Typically, you must configure virtual LANs (VLANs) because excessive broadcast packets can introduce jitter. IPv6-enabled devices might generate many broadcast packets. If IPv6 support is not needed, you can disable IPv6 on those devices. Configure the network to support Quality of Service.

Settings for use with WAN connections: You can use voice chat over LAN and WAN connections. On a WAN connection, audio quality depends on the latency, packet loss, and jitter on the connection. If delivering softphones to users on a WAN connection, we recommend using NetScaler SD-WAN between the data center and the remote office. Doing so maintains a high Quality of Service. NetScaler SD-WAN supports Multi-Stream ICA, including UDP. Also, for a single TCP stream, it’s possible to distinguish the priorities of various ICA virtual channels to ensure that high-priority real-time audio data receives preferential treatment.

Use Director or the HDX Monitor to validate your HDX configuration.

Remote user connections: Citrix Gateway supports DTLS to deliver UDP/RTP traffic natively (without encapsulation in TCP). Open firewalls bidirectionally for UDP traffic over Port 443.

Codec selection and bandwidth consumption: Between the user device and the VDA in the data center, we recommend using the Optimized-for-Speech codec setting, also known as Medium Quality audio. Between the VDA platform and the IP-PBX, the softphone uses whatever codec is configured or negotiated. For example:

  • G.711 provides good voice quality but requires 80 to 100 kilobits per second per call (depending on network Layer 2 overheads).
  • G.729 provides good voice quality and has a low bandwidth requirement of 30 to 40 kilobits per second per call (depending on network Layer 2 overheads).
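
The per-call figures above can be approximated from the codec bitrate plus packet header overhead. The following sketch is illustrative and rests on assumptions that are not stated in this article: a 20 ms packetization interval, 40 bytes of IPv4, UDP, and RTP headers per packet, and 18 bytes of Ethernet framing as the Layer 2 overhead. The codec payload rates (64 kbps for G.711, 8 kbps for G.729) are the standard values for those codecs.

```python
def voip_kbps(codec_kbps: float, packet_ms: float = 20.0,
              l3_header_bytes: int = 40, l2_overhead_bytes: int = 0) -> float:
    """Per-call, one-way bandwidth in kbps, including packet headers."""
    packets_per_second = 1000.0 / packet_ms
    # Codec payload carried in each packet, in bytes.
    payload_bytes = codec_kbps * 1000.0 / 8.0 / packets_per_second
    packet_bytes = payload_bytes + l3_header_bytes + l2_overhead_bytes
    return packet_bytes * 8.0 * packets_per_second / 1000.0


for name, rate in (("G.711", 64), ("G.729", 8)):
    ip_only = voip_kbps(rate)
    ethernet = voip_kbps(rate, l2_overhead_bytes=18)
    print(f"{name}: {ip_only:.1f} kbps at the IP layer, {ethernet:.1f} kbps on Ethernet")
```

Under these assumptions, G.711 comes to 80 kbps at the IP layer and about 87 kbps on Ethernet, and G.729 to 24 kbps and about 31 kbps, which is roughly in line with the ranges quoted above. Other Layer 2 technologies add different overheads.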

Delivering softphone applications to the virtual desktop

There are two methods by which you can deliver a softphone to the XenDesktop virtual desktop:

  • The application can be installed in the virtual desktop image.
  • The application can be streamed to the virtual desktop using Microsoft App‑V. This approach has manageability advantages because the virtual desktop image is kept uncluttered. After being streamed to the virtual desktop, the application runs in that environment as if it were installed in the usual manner. Not all applications are compatible with App-V.

Delivering audio to and from the user device

Generic HDX RealTime supports two methods of delivering audio to and from the user device:

  • Citrix Audio Virtual Channel. We generally recommend the Citrix Audio Virtual Channel because it’s designed specifically for audio transport.
  • Generic USB Redirection. Supports audio devices that have buttons or a display (or both), that is, devices with human interface device (HID) functions, if the user device is on a LAN or LAN-like connection back to the Citrix Virtual Apps and Desktops server.

Citrix audio virtual channel

The bidirectional Citrix Audio Virtual Channel (CTXCAM) enables audio to be delivered efficiently over the network. Generic HDX RealTime takes the audio from the user headset or microphone, compresses it, and sends it over ICA to the softphone application on the virtual desktop. Likewise, the audio output of the softphone is compressed and sent in the other direction to the user headset or speakers. This compression is independent of the compression used by the softphone itself (such as G.729 or G.711). It is done using the Optimized-for-Speech codec (Medium Quality), whose characteristics are ideal for Voice over Internet Protocol: it encodes quickly and consumes only approximately 56 kilobits per second of network bandwidth at peak (28 Kbps in each direction). This codec must be explicitly selected in the Studio console because it is not the default audio codec. The default is the HD Audio codec (High Quality), which is excellent for high-fidelity stereo soundtracks but is slower to encode than the Optimized-for-Speech codec.

Generic USB Redirection

Citrix Generic USB Redirection technology (CTXGUSB virtual channel) provides a generic means of remoting USB devices, including composite devices (audio plus HID) and isochronous USB devices. This approach is limited to LAN-connected users because the USB protocol tends to be sensitive to network latency and requires considerable network bandwidth. Isochronous USB redirection works well with some softphones and provides excellent voice quality and low latency. However, Citrix Audio Virtual Channel is preferred because it is optimized for audio traffic. The primary exception is when you’re using an audio device with buttons, for example, a USB telephone attached to a user device that is LAN-connected to the data center. In this case, Generic USB Redirection supports buttons on the phone set or headset that control features by sending a signal back to the softphone. Buttons that work locally on the device are not affected.

Audio diagnostic command line tool

The audio diagnostic command line tool on the VDA can be used to query session data related to audio policies, configuration, and data transport.


Open a command prompt and run CtxAudio.exe from the C:\Program Files\Citrix\HDX\bin folder.

  • Running the tool as an administrator displays audio information for all active ICA sessions.
  • Running the tool as a non-administrator displays audio information for the current user’s ICA session only.
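
To collect the tool's output from a script, for example to archive it while troubleshooting, you could wrap the call as sketched below. This is a hedged example that assumes Python is available on the VDA and that the tool writes its report to standard output. The helper name is hypothetical, and no command-line arguments are passed because none are described here.

```python
import subprocess
from pathlib import PureWindowsPath

# Folder and file name as given above.
CTXAUDIO = PureWindowsPath(r"C:\Program Files\Citrix\HDX\bin") / "CtxAudio.exe"


def run_ctxaudio() -> str:
    """Run CtxAudio.exe with no arguments and return what it prints."""
    completed = subprocess.run([str(CTXAUDIO)], capture_output=True,
                               text=True, check=False)
    return completed.stdout


print(CTXAUDIO)
```

Call `run_ctxaudio()` on the VDA itself. As with the interactive tool, the sessions that are reported depend on whether the script runs with administrator rights.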


The tool outputs various configuration settings that can help diagnose audio-related issues within a session.

Section                 Description
Policy information      Audio policies applied to the current session(s).
Settings information    Audio-related configuration settings stored in the registry.
State information       Audio state, version, codecs, and transport applied to the current session(s).
Devices information     Device names, their roles, and their statuses used in the session.


The output varies depending on whether you run the tool on a multi-session VDA (TS VDA) or a single-session VDA (WSVDA).


If you install an audio device on your client, enable audio redirection, and start an RDS session, the audio files might fail to play and an error message appears.

As a workaround, add the registry key on the RDS machine, and then restart the machine. For information, see Audio limitation in the list of features managed through the registry.