Citrix DaaS

Size and scale considerations for Cloud Connectors

When evaluating Citrix DaaS (formerly Citrix Virtual Apps and Desktops service) for sizing and scalability, consider all the components. Research and test the configuration of Citrix Cloud Connectors and StoreFront for your specific requirements. Providing insufficient resources for sizing and scalability negatively affects the performance of your deployment.

Note:

These recommendations apply to Citrix DaaS Standard for Azure in addition to Citrix DaaS.

This article provides details of the tested maximum capacities and best practice recommendations for Cloud Connector machine configuration.

The information provided applies to deployments in which each resource location contains either VDI workloads or RDS workloads. For resource locations that contain mixed VDI and RDS workloads, contact Citrix Consulting Services.

Separate information is provided for customers using Citrix Workspace and customers using StoreFront. Smaller workloads were tested with Citrix Workspace. Larger workloads were tested with StoreFront. Citrix Workspace was tested without the service continuity feature enabled. Size and scalability recommendations for the service continuity feature are planned for a future version of this article.

The Cloud Connector links your workloads to Citrix DaaS in the following ways:

  • Provides a proxy for communication between your VDAs and Citrix DaaS
  • Provides a proxy for communication between Citrix DaaS and your Active Directory (AD) and hypervisors
  • In deployments that include StoreFront servers, the Cloud Connector serves as a temporary session broker during cloud outages, providing users with continued access to resources

It is important to have your Cloud Connectors properly sized and configured to meet your specific needs.

Each set of Cloud Connectors is assigned to a resource location (also known as a zone). A resource location is a logical separation that specifies which resources communicate with that set of Cloud Connectors. At least one resource location is required per domain to communicate with the Active Directory (AD).

Each machine catalog and hosting connection is assigned to a resource location.

For deployments with more than one resource location, assign machine catalogs and VDAs to the resource locations to optimize the ability of Local Host Cache (LHC) to broker connections during outages. For more information on creating and managing resource locations, see Connect to Citrix Cloud. For optimum performance, configure your Cloud Connectors on low-latency connections to VDAs, AD servers, and hypervisors.

For performance similar to that seen in these tests, use modern processors that support SHA extensions. SHA extensions reduce the cryptographic load on the CPU. Recommended processors include:

  • Advanced Micro Devices (AMD) Zen and newer processors
  • Intel Ice Lake and newer processors

The tests described in this article were performed with AMD EPYC and Intel Cascade Lake processors.

Cloud Connectors carry a heavy cryptographic load when communicating with the cloud. Cloud Connectors that use processors with SHA extensions experience a lower CPU load, reflected in lower CPU usage by the Windows Local Security Authority Subsystem Service (LSASS).

Citrix recommends using modern storage with adequate I/O operations per second (IOPS), especially for deployments that use LHC. Solid state drives (SSDs) are suggested, but premium cloud storage tiers are not needed. Higher IOPS are needed for LHC scenarios, where the Cloud Connector runs a small copy of the database. This database is updated regularly with site configuration changes and provides brokering capabilities to the resource location during Citrix Cloud outages.

Cloud Connectors run Microsoft SQL Server Express LocalDB, which is automatically installed when you install the Cloud Connector. For deployments that use LHC, the CPU configuration of the Cloud Connector, especially the number of cores available to SQL Server Express LocalDB, directly affects LHC performance. The number of CPU cores available to LocalDB affects LHC performance even more than memory allocation does. This CPU overhead is observed only in LHC mode, when Citrix DaaS is not reachable and the LHC broker is active. For any deployment using LHC, Citrix recommends four cores per socket, with a minimum of four CPU cores per Cloud Connector. For information on configuring compute resources for SQL Server Express LocalDB, see Compute capacity limits by edition of SQL Server.

If the compute resources available to SQL Server Express LocalDB are misconfigured, configuration synchronization times might increase and performance during outages might be reduced. In some virtualized environments, compute capacity might depend on the number of logical processors rather than CPU cores.
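The socket and core limits matter because, per the Microsoft capacity limits documentation, the Express edition uses at most the lesser of one socket or four cores. The following sketch illustrates how a Cloud Connector's virtual CPU topology changes the cores LocalDB can actually use; the helper function and topology values are illustrative assumptions, not a Citrix tool:

```python
# Sketch: estimate the cores SQL Server Express LocalDB can use, given
# the Express edition limit (lesser of 1 socket or 4 cores).
# The topologies below are illustrative assumptions.

def localdb_usable_cores(sockets: int, cores_per_socket: int) -> int:
    """Express uses a single socket, capped at 4 cores."""
    if sockets < 1 or cores_per_socket < 1:
        raise ValueError("topology needs at least one socket and one core")
    return min(cores_per_socket, 4)

# A 4-vCPU Cloud Connector presented as 4 sockets x 1 core leaves
# LocalDB with only 1 core; 1 socket x 4 cores lets it use all 4.
print(localdb_usable_cores(sockets=4, cores_per_socket=1))  # 1
print(localdb_usable_cores(sockets=1, cores_per_socket=4))  # 4
```

This is why the recommendation above is phrased as cores per socket rather than a raw vCPU count: the same vCPU total can yield very different LocalDB performance depending on how the hypervisor presents it.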

Summary of test findings

All results in this summary are based on the findings from a test environment as configured in the detailed sections of this article. Different system configurations might yield different results.

This illustration gives a graphical overview of the tested configuration.

Tested configuration overview

This table provides a quick guide to sizing your Cloud Connectors. Results are based on Citrix internal testing. The configurations described were tested with varying workloads, including high-rate session launch tests and registration storms.

Each configuration shown has two Cloud Connectors, the minimum required for each resource location to ensure high availability. Citrix recommends using the N+1 redundancy model when deploying Cloud Connectors to maintain a highly available connection with Citrix Cloud.

|                                 | Minimum             | Small               | Medium                     | Large                      | Maximum                    |
|---------------------------------|---------------------|---------------------|----------------------------|----------------------------|----------------------------|
| VDAs                            | 500 VDI or 50 RDS   | 1000 VDI or 100 RDS | 1000 VDI or 100 RDS        | 5000 VDI or 500 RDS        | 10,000 VDI or 1000 RDS     |
| Hosting connections             | 10                  | 10                  | 20                         | 40                         | 40                         |
| Workspace or StoreFront         | Workspace           | Workspace           | StoreFront with Citrix ADC | StoreFront with Citrix ADC | StoreFront with Citrix ADC |
| NetScaler Gateway service proxy | Yes                 | Yes                 | No                         | No                         | No                         |
| Rendezvous v1                   | Yes                 | Yes                 | No                         | No                         | No                         |
| Local Host Cache                | No                  | No                  | Yes                        | Yes                        | Yes                        |
| CPUs for Connectors             | 2 vCPU              | 4 vCPU              | 4 vCPU                     | 4 vCPU                     | 8 vCPU                     |
| Memory for Connectors           | 4 GB                | 4 GB                | 6 GB                       | 8 GB                       | 10 GB                      |
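The sizing table can be expressed as a simple lookup. The sketch below is illustrative only: the tier names, VDI capacities, and resource figures come from the table, but the helper function is an assumption, not a Citrix tool, and it deliberately ignores the RDS, Workspace/StoreFront, and LHC distinctions the table also captures:

```python
# Sketch: pick the smallest tested Cloud Connector tier that covers a
# given VDI workload. Tier data (VDI capacity, vCPUs, memory in GB)
# mirrors the tested configurations in the table above.

TIERS = [
    ("Minimum", 500, 2, 4),
    ("Small", 1000, 4, 4),
    ("Medium", 1000, 4, 6),
    ("Large", 5000, 4, 8),
    ("Maximum", 10_000, 8, 10),
]

def pick_tier(vdi_count: int):
    """Return (name, vcpus, memory_gb) of the smallest covering tier."""
    for name, max_vdi, vcpus, mem_gb in TIERS:
        if vdi_count <= max_vdi:
            return name, vcpus, mem_gb
    raise ValueError("workload exceeds tested maximum; contact Citrix Consulting")

print(pick_tier(3000))  # ('Large', 4, 8)
```

Remember that each tier assumes two Cloud Connectors per resource location for high availability, so the vCPU and memory figures apply to each connector, not to the pair combined.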

About these test configurations

  • CPU and memory requirements are for the base OS and Citrix services only. Third-party apps and services might require additional resources.
  • VDAs are any virtual or physical machines running Citrix Virtual Delivery Agent.
  • All VDAs tested were power-managed using Citrix DaaS.
  • RDS sessions were tested up to 20,000 per resource location.
  • Citrix Workspace was tested using the Rendezvous v1 protocol. Citrix recommends using the Rendezvous protocol for deployments using Citrix Workspace. The Rendezvous protocol reduces CPU load on the Cloud Connector by handing off HDX traffic to the Citrix Gateway Service. For more information on the Rendezvous protocol, see Rendezvous protocol.
  • Tested configuration did not have the Workspace service continuity feature enabled.

Test methodology

Tests were conducted to add load and to measure the performance of the environment components. The components were monitored by collecting performance data and procedure timing, such as logon time and registration time. In some cases, proprietary Citrix simulation tools were used to simulate VDAs and sessions. These tools are designed to exercise Citrix components the same way that traditional VDAs and sessions do, without the same resource requirements to host real sessions and VDAs. Tests were conducted in both cloud brokering and Local Host Cache mode for scenarios with Citrix StoreFront.

Recommendations for Cloud Connector sizing in this article are based on data gathered from these tests.

The following tests were run:

  • Session logon/launch storm: a test that simulates high-volume logon periods.
  • VDA registration storm: a test that simulates high-volume VDA registration periods, such as after an upgrade cycle or a transition between cloud brokering and Local Host Cache mode.
  • VDA power action storm: a test that simulates a high volume of VDA power actions.

Citrix Workspace scenarios (minimum and small workloads)

Citrix Workspace is a digital workspace solution that delivers secure and unified access to apps, desktops, and content (resources) from anywhere, on any device. Unless the Citrix Workspace service continuity feature is enabled, Citrix Workspace does not use LHC to make resources available to users during outages. When service continuity is not enabled, the Citrix High Availability Service and Microsoft SQL Server Express LocalDB are disabled. Service continuity was not enabled for these tests.

Workloads of up to 1000 VDI or up to 200 RDS were tested using Citrix Workspace.

To ensure high availability, deploy a minimum of two Cloud Connectors for each resource location, following the N+1 redundancy model. Because Cloud Connectors might be restarted or taken down for maintenance, these tests were performed using one Cloud Connector. Using two Cloud Connectors might produce slightly better performance than these test results.

In configurations that use Citrix Workspace, the Cloud Connector handles:

  • Communication between VDAs and Citrix DaaS
  • Requests from Citrix DaaS to on-premises AD
  • Proxying of power actions to hypervisors
  • Session launch requests
  • VDA registration

Test conditions:

  • Tested using one Cloud Connector. Two Cloud Connectors are required for high availability.
  • Tested with the Cloud Connector configured with Intel Cascade Lake processors.
  • RDS session counts are a recommendation and not a limit. Test your own RDS session limit in your environment.
  • Sessions were launched via Citrix Workspace using Rendezvous v1 protocol.
  • Tested without service continuity enabled.

Test results are summarized in the following tables.

Minimal workloads

These workloads were tested with 2 vCPUs and 4 GB memory.

| Test workloads        | VDA registration time | Registration CPU and memory usage                             | Launch test length | Session launch CPU and memory usage                            | Launch rate    |
|-----------------------|-----------------------|---------------------------------------------------------------|--------------------|----------------------------------------------------------------|----------------|
| 500 VDI               | 5 minutes             | CPU maximum = 16%, CPU average = 4%, memory maximum = 2.5 GB  | 3 minutes          | CPU maximum = 45%, CPU average = 40%, memory maximum = 3.0 GB  | 150 per minute |
| 50 RDS, 1000 sessions | 2 minutes             | CPU maximum = 15%, CPU average = 3%, memory maximum = 2.3 GB  | 6 minutes          | CPU maximum = 25%, CPU average = 15%, memory maximum = 2.9 GB  | 166 per minute |

Small workloads

These workloads were tested with 4 vCPUs and 4 GB memory.

| Test workloads         | VDA registration time | Registration CPU and memory usage                             | Launch test length | Session launch CPU and memory usage                            | Launch rate    |
|------------------------|-----------------------|---------------------------------------------------------------|--------------------|----------------------------------------------------------------|----------------|
| 1000 VDI               | 5 minutes             | CPU maximum = 15%, CPU average = 5%, memory maximum = 3.5 GB  | 6 minutes          | CPU maximum = 48%, CPU average = 33%, memory maximum = 3.4 GB  | 166 per minute |
| 200 RDS, 5000 sessions | 3 minutes             | CPU maximum = 5%, CPU average = 2%, memory maximum = 3.5 GB   | 26 minutes         | CPU maximum = 18%, CPU average = 3%, memory maximum = 3.2 GB   | 192 per minute |
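The launch rate columns in these tables follow directly from the session count and the launch test length. A quick arithmetic check against the RDS rows, sketched with an assumed helper function:

```python
# Sketch: launch rate is sessions launched divided by the test length.
# Values come from the RDS rows of the minimal and small workload tables.

def launch_rate(sessions: int, minutes: int) -> int:
    """Average session launches per minute, rounded down."""
    return sessions // minutes

print(launch_rate(1000, 6))   # 50 RDS, 1000 sessions in 6 minutes -> 166
print(launch_rate(5000, 26))  # 200 RDS, 5000 sessions in 26 minutes -> 192
```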

Citrix StoreFront scenarios (medium, large, and maximum workloads)

For larger workloads, Citrix recommends using LHC for high availability. For more information about using LHC, see the Local Host Cache article. LHC requires an on-premises StoreFront server. For detailed information about StoreFront, see the StoreFront product documentation.

Workloads of 1000 to 10,000 VDI or 200 to 1000 RDS were tested using StoreFront.

Recommendations for StoreFront configurations:

  • If you have multiple resource locations with a single StoreFront server or server group, enable the advanced health check option for the StoreFront store. See StoreFront requirement in the Local Host Cache article.
  • For higher session launch rates, use a StoreFront server group. See Configure server groups in the StoreFront product documentation.

Test conditions:

  • Tested using one Cloud Connector. Two Cloud Connectors are required for high availability.
  • Tested with the Cloud Connector configured with Intel Cascade Lake processors.
  • RDS session counts are a recommendation and not a limit. Test your own RDS session limit in your environment.
  • Sessions were launched via a single Citrix StoreFront server.
  • LHC outage session launch tests were conducted after machines had re-registered.

Test results are summarized in the following tables.

Medium workloads

These workloads were tested with 4 vCPUs and 6 GB memory.

| Test workloads         | Site condition | VDA registration time | Registration CPU and memory usage                             | Launch test length | Session launch CPU and memory usage                            | Launch rate    |
|------------------------|----------------|-----------------------|---------------------------------------------------------------|--------------------|----------------------------------------------------------------|----------------|
| 1000 VDI               | Online         | 5 minutes             | CPU maximum = 36%, CPU average = 33%, memory maximum = 5.3 GB | 2 minutes          | CPU maximum = 29%, CPU average = 27%, memory maximum = 3.7 GB  | 500 per minute |
| 1000 VDI               | Outage         | 4 minutes             | CPU maximum = 11%, CPU average = 10%, memory maximum = 4.5 GB | 2 minutes          | CPU maximum = 42%, CPU average = 28%, memory maximum = 4.0 GB  | 500 per minute |
| 200 RDS, 5000 sessions | Online         | 3 minutes             | CPU maximum = 14%, CPU average = 4%, memory maximum = 3.5 GB  | 9 minutes          | CPU maximum = 46%, CPU average = 21%, memory maximum = 3.7 GB  | 555 per minute |
| 200 RDS, 5000 sessions | Outage         | 3 minutes             | CPU maximum = 15%, CPU average = 5%, memory maximum = 3.7 GB  | 9 minutes          | CPU maximum = 51%, CPU average = 32%, memory maximum = 4.2 GB  | 555 per minute |

Large workloads

These workloads were tested with 4 vCPUs and 8 GB memory.

| Test workloads          | Site condition | VDA registration time | Registration CPU and memory usage                             | Launch test length | Session launch CPU and memory usage                            | Launch rate     |
|-------------------------|----------------|-----------------------|---------------------------------------------------------------|--------------------|----------------------------------------------------------------|-----------------|
| 5000 VDI                | Online         | 3–4 minutes           | CPU maximum = 45%, CPU average = 25%, memory maximum = 7.0 GB | 5 minutes          | CPU maximum = 75%, CPU average = 55%, memory maximum = 7.0 GB  | 1000 per minute |
| 5000 VDI                | Outage         | 4–6 minutes           | CPU maximum = 15%, CPU average = 5%, memory maximum = 7.5 GB  | 5 minutes          | CPU maximum = 45%, CPU average = 40%, memory maximum = 7.5 GB  | 1000 per minute |
| 500 RDS, 10,000 sessions | Online        | 3 minutes             | CPU maximum = 45%, CPU average = 25%, memory maximum = 7.0 GB | 10 minutes         | CPU maximum = 75%, CPU average = 55%, memory maximum = 7.0 GB  | 1000 per minute |
| 500 RDS, 10,000 sessions | Outage        | 3 minutes             | CPU maximum = 15%, CPU average = 5%, memory maximum = 7.5 GB  | 10 minutes         | CPU maximum = 45%, CPU average = 40%, memory maximum = 7.5 GB  | 1000 per minute |

Maximum workloads

These workloads were tested with 8 vCPUs and 10 GB memory.

| Test workloads            | Site condition | VDA registration time | Registration CPU and memory usage                             | Launch test length | Session launch CPU and memory usage                            | Launch rate     |
|---------------------------|----------------|-----------------------|---------------------------------------------------------------|--------------------|----------------------------------------------------------------|-----------------|
| 10,000 VDI                | Online         | 3–4 minutes           | CPU maximum = 85%, CPU average = 10%, memory maximum = 8.5 GB | 7 minutes          | CPU maximum = 66%, CPU average = 28%, memory maximum = 7.0 GB  | 1400 per minute |
| 10,000 VDI                | Outage         | 4–5 minutes           | CPU maximum = 90%, CPU average = 17%, memory maximum = 8.2 GB | 5 minutes          | CPU maximum = 90%, CPU average = 45%, memory maximum = 8.5 GB  | 2000 per minute |
| 1000 RDS, 20,000 sessions | Online         | 1–2 minutes           | CPU maximum = 60%, CPU average = 20%, memory maximum = 8.6 GB | 17 minutes         | CPU maximum = 66%, CPU average = 25%, memory maximum = 6.8 GB  | 1200 per minute |
| 1000 RDS, 20,000 sessions | Outage         | 3–4 minutes           | CPU maximum = 22%, CPU average = 10%, memory maximum = 8.5 GB | 21 minutes         | CPU maximum = 90%, CPU average = 50%, memory maximum = 7.5 GB  | 1000 per minute |

Configuration synchronization resource usages

The configuration synchronization process keeps the Cloud Connectors up to date with Citrix DaaS. Updates are sent to the Cloud Connectors automatically so that they are ready to take over brokering if an outage occurs. Configuration synchronization updates the LHC database (Microsoft SQL Server Express LocalDB). The process imports the data into a temporary database, then switches to that database once the import completes. This ensures that an LHC database is always ready to take over.

CPU, memory, and disk usage are temporarily increased while data is imported to the temporary database.

Test results:

  • Data import time: 7–10 minutes
  • CPU usage:
    • maximum = 25%
    • average = 15%
  • Memory usage:
    • maximum = 9 GB
    • increase of approximately 2 GB to 3 GB
  • Disk usage:
    • 4 MB/s disk read spike
    • 18 MB/s disk write spike
    • 70 MB/s disk write spike during downloading and writing of XML configuration files
    • 4 MB/s disk read spike at the completion of import
  • LHC database size:
    • 400–500 MB database file
    • 200–300 MB database log file

Test conditions:

  • Tested on an 8 vCPU AMD EPYC
  • The imported site configuration database was for an environment with site-wide total of 80,000 VDAs and 300,000 users (three shifts of 100,000 users)
  • Data import time was tested on a resource location with 10,000 VDI

Additional resource usage considerations:

  • During import, the full site configuration data is downloaded. This download might cause a memory spike, depending on the site size.
  • The tested site used approximately 800 MB for the database and database log files combined. During a configuration synchronization, these files are duplicated with a maximum combined size of approximately 1600 MB. Ensure that your Cloud Connector has enough disk space for the duplicated files. The configuration synchronization process fails if the disk is full.
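Because configuration synchronization duplicates the LHC database and log files, it can be worth checking free disk space on the Cloud Connector before relying on LHC. A minimal sketch using the Python standard library; the 1600 MB figure comes from the tested site above, while the safety margin and default path are assumptions:

```python
import os
import shutil

# Sketch: verify the Cloud Connector volume has headroom for the
# duplicated LHC database files created during configuration sync.
# REQUIRED_MB reflects the ~1600 MB duplicated-file peak from the
# tested site; the 2x margin and default path are assumptions.

REQUIRED_MB = 1600      # peak combined size of duplicated DB + log files
SAFETY_FACTOR = 2       # assumed margin for growth and other writes

def has_sync_headroom(path: str = os.path.abspath(os.sep)) -> bool:
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= REQUIRED_MB * SAFETY_FACTOR

if has_sync_headroom():
    print("Sufficient free space for configuration synchronization.")
else:
    print("Warning: low disk space; configuration synchronization may fail.")
```

A check like this matters because, as noted above, the configuration synchronization process fails outright when the disk is full.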