XenDesktop 7.6: Connection leasing design considerations
May 23, 2016
Description: The connection leasing feature in XenDesktop 7.6 allows users to connect and reconnect to their most recently used applications and desktops, even when the Site database is not available. This article provides design considerations, configuration parameters, and limitations for this feature.
Connection leasing overview
The best and recommended way to make sure users have reliable access to their resources is a XenDesktop Site database hosted on a highly available SQL Server configuration. In situations where this is not possible or if the whole SQL cluster has failed, connection leasing lets users connect to recently accessed resources, provided the VDAs for those connections are still accessible. By default, users must have connected to the resources within the past 14 days. An administrator can configure this time period. (See a full list of configurable parameters below.)
Connection leasing creates lease files (XML files) for applications and desktops, and caches them on each Delivery Controller in the Site. Each of these XML files contains information about the user, their available resources, and the resources they have previously connected to.
Connection leasing is not intended to be a replacement for a SQL Server high-availability configuration. It is only applicable to XenApp/RDS and XenDesktop-based assigned desktops. It is not applicable to XenDesktop-based pooled desktops.
Remote PC deployments are considered assigned desktops and are therefore covered by connection leasing.
Connection leasing has certain limitations when the Site database has failed or is inaccessible:
- Citrix Studio and Director operations are unavailable.
- Citrix PowerShell cmdlets requiring database access will not work.
- No VDA load balancing will occur.
- Users can connect only to the last host they connected to when the Site database was available.
- There is a small two-minute window during which no sessions will be brokered when the Site database becomes unavailable or is restored. This is to allow for environments with SQL HA enabled to fail over, such that leasing does not become enabled when there is only a short window where site database connectivity is interrupted.
- Users must have logged on to the resources within the default 14-day period. You can configure this time period via a registry setting. (See a full list of configurable parameters below.)
- Anonymous users are not supported.
If the Site database and resource hosts are not available, then users will not be redirected to alternative hosts if the last host they connected to is unavailable. In this respect, the lease files are a snapshot of the last successful result of a user connecting to resources.
By default, connection leasing is enabled in XenDesktop 7.6.
When a user connects to a resource for the first time, the Controller collects information about the connection and sends this information to the Site SQL database. At regular 10-second intervals, the information is then synchronized down to each Controller. If the database fails in between synchronization intervals, the Controller won’t be able to pull down the lease file.
With connection leasing, each Controller synchronizes with the SQL database; there is no direct Controller-to-Controller communication.
Delivery Controller requirements and scalability considerations
During the initial connection lease creation phase, there will be a significant number of small files written to each Delivery Controller, dependent on user count, application count, and deployment type. The default configuration is such that systems should not be impacted in terms of IOPS, CPU, or session launch rates. However, each Controller in the Site needs to have additional space for lease file storage, so systems with shared persistent storage for the Controllers may require reassessment. An additional 2 GB of disk space on each Controller is a good starting point for lease storage.
The lease files are created in timed batches to distribute the required disk I/O. By default, each Controller will synchronize at a rate of up to 1,000 leases every 10 seconds until all leases have been synchronized. You can configure this via a registry setting. (See a full list of configurable parameters below.)
Adding, editing, or retiring user resources will also cause updates to the lease files, with the default rate of 1,000 leases being updated every 10 seconds.
In previous versions of XenDesktop, a Site database failure would impact all users trying to access resources, and brokering of new sessions wouldn’t occur until full connectivity was restored. With connection leasing, when a database failure is detected, there is a configurable two-minute period before leasing becomes active; after that, the locally cached leases are used to broker sessions. During this time, VDAs will start to deregister with the Controller. When database connectivity is restored, there will be another two-minute window before sessions are brokered normally.
As with previous versions of XenDesktop, VDAs will then start to reregister with the Controllers. This process will consume some CPU depending on how many VDAs are attempting to reregister. For large-scale VDI deployments with thousands of desktops, the time it takes for all VDAs to reregister will depend on how many reregistration requests each Controller can handle.
For Controllers using physical hosts, a battery-backed caching disk controller can smooth the impact in environments where relatively large numbers of users are logging on in relatively short time windows. For virtual hosts and those using shared storage, make sure that sufficient IOPS and disk space are available to the Controllers.
In normal Site operations, connection leasing causes minimal impact on overall system operations. But for environments with a likelihood of Site database failure, you may need extra Controllers to handle the VDA reregistration load that occurs during Site database loss, because load balancing is not available while connection leasing is active.
When connection leasing is active, VDA registration storms can cause significant CPU usage depending on the total number of VDAs in the environment. Intermittent database problems can exacerbate this problem as VDAs register and deregister as connectivity to the database is restored and lost again. In such a scenario, it is worth considering removing database connectivity completely until the database issues are fully resolved.
Multiple Controllers and partial connectivity loss
In multiple-Controller environments, connection leasing will only activate if all Controllers lose database connectivity. If one or more Controllers are still connected, VDAs will attempt to register with one of those connected Controllers.
This increase in registration demand will require CPU for processing. You should examine partial-failure scenarios and how many VDA registration requests a single Controller will be required to meet if all others lose database connectivity.
Lease layout on the Controller
Each lease file is relatively small, typically less than 1KB. But because each user will have several lease files depending on how many applications and desktops they have access to, the number of files created on each Controller can be significant and must be planned for accordingly.
By default, lease files are stored in subdirectories of %programdata%\Citrix\Broker\Cache. Each subdirectory is described below.
You can change the leases’ location by modifying a registry key. (See the full list of configurable parameters below.)
The Apps directory contains information about published applications. There is one file per published app per delivery group. As such, this subdirectory should remain relatively small in terms of size and number of objects.
The Desktops directory contains an entry per user VDA. In a VDI environment, this will be one for every user-assigned VDI desktop, and in an RDS environment, there will be one entry per published desktop. A VDI environment normally requires much more disk space than an RDS environment because there is a one-to-one mapping between users and their assigned desktops, rather than many users sharing one RDS host desktop.
The Icons directory will have one entry per unique published application and one for desktops. Desktops typically share one standard icon unless otherwise configured. If a published application shares the same base executable as another, only one icon entry will be created. Icons will tend to be larger than lease files because they contain the raw bitmap information on how to draw the icon for the application or desktop, but a typical icon should still be only a few hundred kilobytes at most.
The Leases\Enumeration directory contains an entry for the resources available to each user; there is one entry per user. The size of the file will depend on the number of resources available to the user.
The Leases\Launch directory contains an entry for each successful user VDA logon, one for each desktop that the user is entitled to (and has launched), and one for applications. Only a single application lease file is created no matter how many are available to the user, as session sharing will normally direct the user to the same host in normal operations. The user can launch any app published from a delivery group from which they have previously launched an app, even if it’s not the same one. It is possible that the enumeration lease may include details of apps and desktops that are no longer available to the user. For example, in a scenario where the Controller or desktop that hosts the resources is unavailable, only the previously connected host will be used, as load balancing is not active during connection leasing.
The Workers directory contains one entry per VDA. Like the Desktops directory, a VDI environment will generally contain more lease files than an RDS one; each assigned desktop has data associated with the user, rather than many users accessing the same RDS host.
Note that published applications launched inside a desktop session will create their own lease files. In some Site database failure scenarios, like a partial network failure for example, a user may be able to connect to a desktop whose host is connected but not the published applications available on that desktop if they’re hosted on a separate server affected by a network outage.
Calculating the number of lease files
The number of lease files created can be calculated as follows:
Apps + unique icons + Users + (Users * Desktops) + (Users * Application delivery groups) + VDAs (for RDS, one per hosted desktop; for VDI, one for each user desktop)
In this instance, “users” refers to the number of individual users who use the XenDesktop Site.
If a resource is accessible from a LAN and a WAN via Access Gateway, then the calculation is:
(Users * 2) + (Users * Desktops) + (Users * Application delivery groups) + VDAs (for RDS, one per hosted desktop; for VDI, one for each user desktop)
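The two formulas above can be transcribed into a small planning helper. This is a minimal sketch, not a Citrix tool: the function name `estimate_lease_files`, its parameter names, and the sample Site figures are assumptions made for illustration. Setting `enumeration_leases_per_user` to 2 models the LAN-plus-WAN case. The second formula as printed leaves out the Apps and unique-icon terms; that reading can be reproduced by passing zero for both.

```python
def estimate_lease_files(apps, unique_icons, users, desktops_per_user,
                         app_delivery_groups, vdas,
                         enumeration_leases_per_user=1):
    """Transcribe the article's lease-count formula (hypothetical helper)."""
    enumeration = users * enumeration_leases_per_user   # the "Users" term
    desktop_launch = users * desktops_per_user          # Users * Desktops
    app_launch = users * app_delivery_groups            # Users * Application delivery groups
    return (apps + unique_icons + enumeration
            + desktop_launch + app_launch + vdas)


# Hypothetical small Site: 10 apps, 8 unique icons, 500 users with one
# desktop each, 2 application delivery groups, and 20 VDA entries.
lan_only = estimate_lease_files(10, 8, 500, 1, 2, 20)
lan_and_wan = estimate_lease_files(10, 8, 500, 1, 2, 20,
                                   enumeration_leases_per_user=2)
print(lan_only, lan_and_wan)
```

The only difference between the two results is the doubled per-user enumeration term.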
Lease file size
The size of the lease files will vary between environments, but the following figures may be used as a guide:
Average Application entry = 0.5K
Average Desktop entry ~1K
Average Icon (>1K<512K) ~256K
Enumeration Lease ~0.5K per available desktop + ~0.5K per available application
Launch Lease ~0.5K per desktop, 0.5K if the users have applications published.
For a normal NTFS file system with a default block size of 4K, each lease will consume a 4K block; icons, being larger, will be rounded up to the next 4K block. As the number of lease files grows, there will be a disparity between the “size” and “size on disk” properties of the cache directory, because each lease consumes 4K even though it is much smaller.
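The rounding described above can be sketched as follows. This is an illustration only: the helper name `size_on_disk` is an assumption, and it treats every file as occupying whole 4K blocks, which is the planning assumption this article recommends even for very small lease files.

```python
def size_on_disk(size_bytes, block_bytes=4096):
    """Round a file size up to a whole number of blocks (4K by default)."""
    blocks = (size_bytes + block_bytes - 1) // block_bytes  # integer ceiling
    return blocks * block_bytes


launch_lease = size_on_disk(512)         # a ~0.5K lease still occupies one block
average_icon = size_on_disk(256 * 1024)  # already a multiple of 4K
odd_sized_icon = size_on_disk(5000)      # rounded up to two blocks
print(launch_lease, average_icon, odd_sized_icon)
```

Multiplying the per-file figure by the expected file count gives a conservative "size on disk" estimate for each cache subdirectory.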
For RDS desktops and applications
Consider an environment with one desktop, six applications, and 18 RDS hosts, serving 100,000 users with a single client per user. In this example, each user connected to a desktop, and 10,000 users launched a single application.
The size of the icons folder will depend on the number of unique icons for the published applications and their size. Publishing 100 Notepads, for example, will need only one icon lease file, but 100 unique application executables will require 100 separate icon files.
When a user connects to a published application, normally each subsequent application launch will use the same host and session share, so one lease will be created for applications.
|Folder||Files||Folders||Size||Size on Disk (4K block size)|
* Lease launch files are sparse files, see below.
For an assigned VDI desktop environment with 40,000 users, single client per user
|Folder||Files||Folders||Size||Size on Disk (4K block size)|
*Lease launch files are sparse files; see below.
The apparent lack of space consumed by the VDI Leases\Launch directory is due to the very small size of the lease XML files created within. These files are less than 512 bytes in size, enabling NTFS to utilize a feature called sparse files and store their data inside the Master File Table (MFT) itself. You should, however, plan for them as if they were consuming a 4K block.
Because of the potential large number of files being created, connection leasing limits the number of files that it will create in a single directory, creating new subdirectories as required.
It is important to remember that lease creation and synchronization occurs when:
- Users access their resources
- Application details, icons, or desktops are changed or updated
- Leases expire (default is two weeks since last access)
- Leasing is manually disabled and re-enabled
- Long lived sessions receive updates
For a new environment, when a user logs on to a resource for the first time, the Controller sends the information to the Site database and then syncs down the lease files at a rate of 1,000 leases every 10 seconds. During the initial synchronization of leases, there will be considerable I/O on the Controller and SQL Server, depending on the rate at which users are connecting to resources.
Consider an environment supporting 40,000 users with a single VDI desktop:
- Each assigned VDI desktop has an entry in the Desktops subdirectory.
- Upon logon, the Leases\Enumeration subdirectory will have one entry per user to determine which resources the user can access.
- An entry in the Workers subdirectory is created for the user desktop.
- When the user launches their desktop, one entry per user is created in the Leases\Launch directory.
There will be a total of 160,000 XML files created once all users have logged in. If we take the pathological extreme, where the environment was created and 40,000 users were able to launch almost instantaneously, for the default 1,000 leases per 10s we would need:
- 160,000/1,000 = 160 synchronization operations, spaced 10 seconds apart
- 1,600 seconds = 26.67 minutes before all the leases were fully synchronized from the database
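The arithmetic above generalizes to other lease counts and sync settings. The sketch below is a rough estimate only: the function name `full_sync_seconds` is an assumption, its parameters mirror the MaxItemsPerSyncCycle and SyncIntervalSecs settings, and it assumes every cycle transfers a full batch with no disk backlog.

```python
import math


def full_sync_seconds(total_leases, max_items_per_sync_cycle=1000,
                      sync_interval_secs=10):
    """Return (cycles, seconds) needed to sync all leases at the given rate."""
    cycles = math.ceil(total_leases / max_items_per_sync_cycle)
    return cycles, cycles * sync_interval_secs


cycles, seconds = full_sync_seconds(160_000)
print(cycles, seconds, round(seconds / 60, 2))  # 160 1600 26.67
```

Raising the batch size or shortening the interval reduces the estimate proportionally, at the cost of more disk I/O per unit of time.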
Depending on your disk subsystem, the 1,000 lease files may take a number of seconds for the disk to commit them because of IOPS restrictions.
While 1,000 leases per 10 seconds averages to 100 per second, systems will not necessarily handle the load linearly, so perfmon counters should be used to observe how the load is distributed across the disk system. The data itself is relatively small (116MB in our 40,000-user single-desktop VDI example), but the large number of files will tax the IOPS of the disk.
If you consider the average IOPS of a single traditional spinning disk with no caching Controller, the IOPS available will be as follows:
- 15k rpm: 180-210 IOPS
- 10k rpm: 130-150 IOPS
- 7,200 rpm: 80-100 IOPS
- 5,400 rpm: 50-80 IOPS
A low-end blade server with single SATA 7200RPM disk and no caching Controller is going to struggle somewhat in this extreme example, as this load is in addition to any IOPS required by normal Windows operations. The leases will be synchronized, but there will be a backlog in the disk queue. In the unlikely event that this level of storage was being used for a relatively large number of users, the synchronization rate may have to be lowered, which will mean a longer time for all of the leases to be cached on the Controller.
In enterprise systems with battery-backed caching Controllers, even a modest amount of cache (128MB) would soak the IOPS over the 26-minute period, and a more aggressive number of leases per sync or shorter sync period could be set.
For environments employing shared storage, whether the Controllers are physical machines or VMs, the shared storage will experience bursts that depend on the number of Controllers using it, but once again, battery-backed cache and tiered storage will alleviate this load.
In reality, the desktop and enumeration files would not be created at the same time as the launch leases, and user logon times would have a wider spread.
For the following test, 40,000 users were launched at a rate of 60 logons every 1.08 seconds, with the last logon completing at 11:32. The Controllers used in this environment were physical machines with a single local SAS disk and no RAID card.
During this test run, the Controller is pulling down the leases at the default rate of 1,000 every 10 seconds, and the leases are keeping the disk system busy until all 160,000 are synchronized from the database. As expected, the disk queue and IOPS vary as users are logged on, but all the leases are synced within a few minutes of the last successful logon.
If your Controllers are using shared storage, it is important to note that each one will have its own copy of the leases. So in an environment with five Controllers and the worst-case scenario of no desktops being registered, there will be a total of 800K files created on the storage (assuming no inline de-duplication or other advanced storage technology in operation).
In our example RDS host setup, we have ~110K files to synchronize with the Site database, which will take about 18 minutes, assuming that number of users could launch at that rate. The amount of data is larger at 820MB, but over an 18-minute interval, this is still quite modest at less than 1 MB per second. A caching Controller will once again limit the impact on the system. As for VDI environments, this is for each Controller, so consider the impact for shared storage environments.
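The RDS figures above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function name is hypothetical, the default sync rate of 1,000 leases every 10 seconds is used, and the five-Controller count is borrowed from the earlier shared-storage example, with all Controllers assumed to sync concurrently onto the same storage.

```python
def sync_throughput(total_files, total_mb, leases_per_cycle=1000,
                    interval_secs=10):
    """Return (seconds, MB per second) for one Controller's full sync."""
    cycles = -(-total_files // leases_per_cycle)  # integer ceiling
    seconds = cycles * interval_secs
    return seconds, total_mb / seconds


seconds, mb_per_sec = sync_throughput(110_000, 820)
controllers = 5  # assumption: Controllers sharing one storage system
aggregate_mb_per_sec = mb_per_sec * controllers
print(seconds / 60, mb_per_sec, aggregate_mb_per_sec)
```

The per-Controller rate stays under 1 MB per second, but the aggregate rate on shared storage scales with the number of Controllers.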
Disk systems with compression enabled
With the possible exception of the icon XML files, the lease files are text information that is likely to compress fairly well on systems with compression enabled.
Forcing lease refresh
You can force leases to be re-created on one or all Controllers in the Site. For a single Controller, use the following PowerShell cmdlet:
Update-BrokerLocalLeaseCache [-Workers] [-Applications] [-Icons] [-Desktops] [-Leases] [-LoggingId <Guid>] [-AdminAddress <String>] [<CommonParameters>]
This will remove all or the specified leases from the local or target address.
If you want to force all Controllers to refresh their leases, you can disable then re-enable connection leasing using the following PowerShell cmdlets:
Set-BrokerSite -ConnectionLeasingEnabled $false
The leases will start to be deleted Site-wide from each Controller; this will take some time depending on the number of leases present.
To enable leasing again:
Set-BrokerSite -ConnectionLeasingEnabled $true
This will force synchronization with the Site database for each Controller in the environment.
Disk activity when leasing is active
If connectivity to the Site database is lost or it becomes otherwise unavailable, the Controllers will use the lease files to broker connections. For each user, the Controller will have to:
- Read the enumeration lease to verify what resources are available to the user
- Read the launch lease
- Read the worker lease to direct the user to the appropriate RDS host or desktop
In most scenarios, the lease files haven’t been accessed since their creation, and it’s not likely that they will be sitting in any cache, so disk I/O will be required to broker the session. If a large number of users try to connect in a short period of time, the disk subsystems will see a large number of IOPS requests.
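A rough read-load estimate follows from the three lease reads listed above. This sketch assumes one uncached disk read per lease file and ignores directory lookups and file-system metadata I/O, so real figures are likely to be higher. The function name is hypothetical, and the logon rate of 60 logons every 1.08 seconds is borrowed from the test runs described elsewhere in this article.

```python
LEASE_READS_PER_SESSION = 3  # enumeration, launch, and worker leases


def lease_reads_per_second(logons, per_seconds,
                           reads_per_session=LEASE_READS_PER_SESSION):
    """Estimate lease-file reads per second for a given logon rate."""
    return logons / per_seconds * reads_per_session


rate = lease_reads_per_second(60, 1.08)
print(round(rate, 1))  # 166.7
```

Comparing this figure with the read IOPS the Controller's disk can sustain indicates whether brokering from leases is likely to queue during a logon storm.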
Cached lease file updates
Each Controller checks with the Site database every 10 seconds to see if changed or updated lease files are available. The new or updated leases are then synchronized at the normal rate of up to 1,000 per 10 seconds.
By default, leases have a two-week timeout. If a user hasn’t accessed a resource in longer than two weeks, their associated lease files will be cleaned up and new leases created on their next logon, subject to the Site database being accessible at this time.
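The two-week rule can be modeled as a simple age check. This is a minimal sketch of the stated rule only, not the broker's actual logic: the function name and the sample dates are assumptions, and the constant mirrors the default of the LeaseExpirationTimeInMins setting.

```python
from datetime import datetime, timedelta

LEASE_EXPIRATION_TIME_IN_MINS = 20160  # default: 14 days


def lease_expired(last_access, now,
                  expiration_mins=LEASE_EXPIRATION_TIME_IN_MINS):
    """Return True if the lease is older than the expiration period."""
    return now - last_access > timedelta(minutes=expiration_mins)


now = datetime(2016, 5, 23, 9, 0)  # arbitrary sample timestamp
print(lease_expired(now - timedelta(days=13), now))  # False
print(lease_expired(now - timedelta(days=15), now))  # True
```

A user whose last access falls outside the window has no usable lease if the Site database is down at their next logon.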
Controller CPU load
During the logon phase, the broker service will be busy servicing requests, but lease creation runs on a background thread, syncing every 10 seconds. As a result, the additional CPU load should be minimal. During 40,000 logon testing, there was minimal difference between Controller CPU during initial logon seed leasing and logon after seed leasing. However, if a Site database problem occurs, the brokers will have to handle load caused by the VDAs.
CPU load on Controllers when leasing is active
The additional CPU load caused by leases being used to broker sessions has not been significant compared to normal brokering operations. When the Site database goes down, as in previous releases of XenDesktop, VDA workers will start to deregister with the Controllers and then try to reregister, causing a spike in reregistration requests. This is particularly relevant to the VDI case, where there will be a significant number of desktops trying to reregister against the Controllers in the Site until reregistration is successful. In our 40,000-desktop scenario, peaks of 178 reregistration requests were being processed per second, with CPU load peaking at 100 percent.
In previous versions of XenDesktop, the Controller CPU load caused by Site database failure was a lesser issue than restoring the database connectivity. For extended database outages, where all VDAs have become unregistered, it may take some time for all VDAs to become registered again.
In an RDS host environment, there are usually fewer VDAs than in the VDI environment (for example, 100,000 users on 1,000 RDS hosts), so the registration storm is less severe, and the Controllers will not be as stressed.
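The difference between the VDI and RDS cases can be put into rough numbers. This sketch is an optimistic lower bound under a strong assumption: it treats the observed peak of 178 reregistrations per second as a sustained rate, which the article does not claim, and it does not distinguish per-Controller from Site-wide throughput. The function name is hypothetical.

```python
def min_reregistration_seconds(vda_count, requests_per_second=178):
    """Lower-bound time to reregister all VDAs at a fixed request rate."""
    return vda_count / requests_per_second


vdi_seconds = min_reregistration_seconds(40_000)  # one VDA per assigned desktop
rds_seconds = min_reregistration_seconds(1_000)   # one VDA per RDS host
print(round(vdi_seconds), round(rds_seconds, 1))
```

Even on this best-case basis, the VDI Site needs minutes to recover where the RDS Site needs seconds, which is why registration storms matter most for large VDI deployments.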
With connection leasing enabled, the extra load will be dependent on the rate at which users are brokering sessions while the database is down. But the load is relatively small compared to the load caused by the registration storm, and logon rates should approach those of normal environment operations.
For Controllers deployed in a virtual environment where CPU resources have been heavily overcommitted, logon and reregistration rates may be impacted. Using hypervisor and Windows performance counters is recommended to characterize the load.
Connection launch rates with connection leasing active
During a test run, a target of 40,000 VDI logons was attempted at a rate of 60 logons every 1.08 seconds. During the logon phase, the Site database was deliberately disconnected and then reconnected twice. The transition periods, from the database going down to connection leasing becoming active and from the database being reconnected to normal brokering resuming, are visible as flat lines. The launch rate takes a couple of minutes to recover, but it does return to a rate comparable to the pre-failure rate. The second outage is more problematic because it occurs while VDAs are still in a registration phase caused by the first outage.
In situations where intermittent database issues are occurring, it is worth considering removing database connectivity until all issues have been resolved to alleviate VDA registration storms and some users being unable to log on during the transition periods.
Configuring connection leasing parameters
The default values used by connection leasing should work well with many environments. If you want to change the default settings, you can either:
- Use Group Policy via GPOs
- Manually configure the registry on each Controller
If you are using GPOs to configure your Controllers, the GPO will store the configuration details in each Controller registry under:
You should not change the registry values directly if using Group Policy to configure the settings.
If you are not using a GPO, the configuration is stored under:
If you are not using GPOs, in a multi-Controller Site, you will need to edit the registry on each Controller. Note that connection leasing does not populate the registry keys by default, so you will need to create them if you want to change the default values.
As with any changes to the registry, care should be taken when modifying parameters, and a rollback/recovery plan should be in place before making changes.
|Setting||Type||Default||Unit or range||Description|
|DeletionCheckItemLimitPerCycle||int||100||Minimum=1||Controls the number of items (lease files, directories) deleted each time the regular cleanup process runs; the interval between runs is set by SyncCleanupDelaySecs. The default is 100 objects every 2 minutes. Items are checked by whole subdirectories, so this limit may be exceeded by the size of the last subdirectory encountered.|
|EnumerationLeaseKeyMask||int||9||-||Controls the components of the enumeration lease key. Bit 0: User SID. Bit 1: Client Name. Bit 2: Client IP Address. Bit 3: ViaAG flag. Bit 4: AccessTags.|
|LaunchRefCacheExpiryMaxMins||int||3||Minutes||Controls the maximum time the logon tickets are cached in memory before they are discarded.|
|LeaseExpirationTimeInMins||int||20160||Minutes||Specifies the time in minutes after which the lease will expire once it’s stored in the database. The default value of 20,160 minutes corresponds to 14 days. The Site service removes expired leases every 30 minutes. If the lease needs to be removed earlier than that, you need to change the Site service frequency. The Controller won’t use expired leases, even if they are still present in the lease cache.|
|LeaseMarkedDeletedTimeInMins||int||30||Minutes||The maximum time a lease will remain in the deleted state before it’s purged.|
|MaxItemsPerSyncCycle||int||1000||-||The maximum number of leases to sync per sync cycle. This helps to throttle and restrict the number of disk writes generated every time the sync runs.|
|MaxRetryDuringLocalCacheDeletion||int||5||-||Controls the maximum number of attempts to delete the local cache directory in case of I/O exceptions.|
|MinLeaseLifetimeFractionBeforeRefresh||int||10||-||Specifies the lifetime of an unchanged lease before its expiration time is refreshed. The value is specified as a fraction of LeaseExpirationTimeInMins.|
|PendingFailureMaxSecs||int||90||Seconds||Specifies the maximum time in seconds to wait before entering leasing mode on hitting a pending failure state in the DAL layer. Note that there is a default timeout of 30 seconds for SQL Server queries; this timeout is in addition to that, so the total time for connection leasing to become active will be 120 seconds.|
|SyncCleanupDelaySecs||int||120||Seconds||Controls the time between stale lease and cached data cleanup cycles.|
|SyncIntervalSecs||int||10||Seconds||Specifies the interval at which to check for leases to sync. A sane value must be larger than SiteDynamicDataRefreshPeriodMs, because the Site dynamic data refresh indicates when lease and other data last changed.|
|SyncLocation||string||%ProgramData%\Citrix\Broker\Cache||-||The location on the local disk where the leases are cached.|
|SyncStartDelayMins||int||1||Minutes||Controls the elapsed time before the first sync can run after the Controller service has been started.|
|UploadQueueIdleMaxSecs||int||10||Seconds||Controls the maximum time to wait for the upload queue to be idle. Once the queue idle time passes this limit, the contents of the lease queue will be uploaded for sync, even if the queue item threshold has not been reached.|
|UploadQueueMaxItems||int||100||-||Controls the maximum number of items to queue before lease upload is triggered.|
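The EnumerationLeaseKeyMask value is a bit field, so the default of 9 can be decoded bit by bit. The sketch below is for illustration only: the class and member names are assumptions that follow the bit descriptions in the table, not identifiers from the product.

```python
from enum import IntFlag


class EnumerationLeaseKey(IntFlag):
    """Bit positions as listed for EnumerationLeaseKeyMask (names assumed)."""
    USER_SID = 1 << 0
    CLIENT_NAME = 1 << 1
    CLIENT_IP_ADDRESS = 1 << 2
    VIA_AG = 1 << 3
    ACCESS_TAGS = 1 << 4


def key_components(mask):
    """Return the names of the components enabled in a mask value."""
    return [flag.name for flag in EnumerationLeaseKey if mask & flag]


print(key_components(9))  # ['USER_SID', 'VIA_AG']
```

The default therefore keys enumeration leases on the user SID and on whether the connection came through Access Gateway, which is consistent with the per-user term being doubled in the LAN-plus-WAN lease count formula.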
This article has been modified from a white paper written by Joe Deller and posted on the Citrix blog. To download the original document and see the blog comments, go to https://www.citrix.com/blogs/2014/11/11/xendesktop-7-6-connection-leasing-design-considerations/.