Local Host Cache
To ensure that the Citrix Virtual Apps and Desktops Site database is always available, Citrix recommends starting with a fault-tolerant SQL Server deployment, by following high availability best practices from Microsoft. (For supported SQL Server high availability features, see Databases.) However, network issues and interruptions may result in users not being able to connect to their applications or desktops.
The Local Host Cache (LHC) feature allows connection brokering operations in a Site to continue when an outage occurs. An outage occurs when the connection between a Delivery Controller and the Site database fails in an on-premises Citrix environment. Local Host Cache engages when the site database is inaccessible for 90 seconds.
As of XenApp and XenDesktop 7.16, the connection leasing feature (a predecessor high availability feature in earlier releases) was removed from the product, and is no longer available.
Local Host Cache includes the following information, which is a subset of the information in the main database:
- Identities of users and groups who are specifically assigned rights to resources published from the Site.
- Identities of users who are currently using, or who have recently used, published resources from the Site.
- Identities of VDA machines (including Remote PC Access machines) configured in the Site.
- Identities (names and IP addresses) of client Citrix Receiver machines being actively used to connect to published resources.
It also contains information for currently active connections that were established while the main database was unavailable:
- Results of any client machine endpoint analysis performed by Citrix Receiver.
- Identities of infrastructure machines (such as NetScaler Gateway and StoreFront servers) involved with the Site.
- Dates and times and types of recent activity by users.
How it works
The following graphic illustrates the Local Host Cache components and communication paths during normal operations.
During normal operations:
- The principal broker (Citrix Broker Service) on a Controller accepts connection requests from StoreFront, and communicates with the Site database to connect users with VDAs that are registered with the Controller.
- A check is made periodically (one minute after the previous check finished) to determine whether changes have been made to the principal broker’s configuration. Those changes could have been initiated by PowerShell/Studio actions (such as changing a Delivery Group property) or system actions (such as machine assignments).
- If a change has been made since the last check, the Citrix Config Synchronizer Service (CSS) synchronizes (copies) information to the Citrix High Availability Service on the Controller. (In some documentation, the High Availability Service is referred to as the secondary broker.) All broker configuration data is copied, not just items that have changed since the previous check. The High Availability Service imports the data into a Microsoft SQL Server Express LocalDB database on the Controller. The CSS ensures that the information in the LocalDB database matches the information in the Site database. The LocalDB database is re-created each time synchronization occurs.
- If no changes have occurred since the last check, no data is copied.
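The check-and-copy behavior described above can be modeled in a few lines. The following Python sketch is illustrative only: the LocalDb class, the synchronize function, and the integer version counter used to detect a change are hypothetical and are not part of any Citrix API. It shows only the two rules stated above: nothing is copied when nothing has changed, and the full configuration is copied (and the local copy re-created) when anything has changed.

```python
from dataclasses import dataclass, field


@dataclass
class LocalDb:
    """Hypothetical stand-in for the LocalDB copy on a Controller."""
    data: dict = field(default_factory=dict)
    rebuild_count: int = 0


def synchronize(site_config: dict, site_version: int,
                last_synced_version: int, local_db: LocalDb) -> int:
    """Run one periodic check and return the version now held in local_db.

    site_version is an assumed change marker; the real change-detection
    mechanism is not described in this article.
    """
    if site_version == last_synced_version:
        # No change since the last check: no data is copied.
        return last_synced_version
    # A change was found: copy ALL configuration data, not just the delta,
    # and re-create the local copy from scratch.
    local_db.data = dict(site_config)
    local_db.rebuild_count += 1
    return site_version


db = LocalDb()
version = synchronize({"DeliveryGroup": "Finance"}, 1, 0, db)   # change found
version = synchronize({"DeliveryGroup": "Finance"}, 1, version, db)  # no change
print(db.rebuild_count)
```

In this model the second call leaves the local copy untouched, because the version marker has not moved since the first call.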
The following graphic illustrates the changes in communications paths if the principal broker loses contact with the Site database (an outage begins).
When an outage begins:
- The principal broker can no longer communicate with the Site database, and stops listening for StoreFront and VDA information (marked X in the graphic). The principal broker then instructs the High Availability Service to start listening for and processing connection requests (marked with a red dashed line in the graphic). The High Availability Service discards all calls from the CSS.
- When the outage begins, the High Availability Service has no current VDA registration data, but as soon as a VDA communicates with it, a re-registration process is triggered. During that process, the High Availability Service also gets current session information about that VDA.
- While the High Availability Service is handling connections, the principal broker continues to monitor the connection to the Site database. When the connection is restored, the principal broker instructs the High Availability Service to stop listening for connection information, and the principal broker resumes brokering operations. The next time a VDA communicates with the principal broker, a re-registration process is triggered. The High Availability Service removes any remaining VDA registrations from the previous outage, and resumes updating the LocalDB database with configuration changes received from the CSS.
The transition between normal and outage mode does not affect existing sessions; it affects only the launching of new sessions.
In the unlikely event that an outage begins during a synchronization, the current import is discarded and the last known configuration is used.
The event log provides information about synchronizations and outages. See the “Monitor” section below for details.
You can also intentionally trigger an outage; see the “Force an outage” section below for details about why and how to do this.
Among its other tasks, the CSS routinely provides the High Availability Service with information about all Controllers in the zone. (If your deployment does not contain multiple zones, this action affects all Controllers in the Site.) Having that information, each High Availability Service knows about all peer High Availability Services.
The High Availability Services communicate with each other on a separate channel. They use an alphabetical list of FQDN names of the machines they’re running on to determine (elect) which High Availability Service will be in charge of brokering operations in the zone if an outage occurs. During the outage, all VDAs re-register with the elected High Availability Service. The non-elected High Availability Services in the zone will actively reject incoming connection and VDA registration requests.
If an elected High Availability Service fails during an outage, another High Availability Service is elected to take over, and VDAs will re-register with the newly-elected High Availability Service.
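The election described above can be sketched as a sort over Controller names. This Python example is a simplified model, not the product's implementation. It assumes that the first name in the alphabetical list wins and that names are compared case-insensitively; the article states only that an alphabetical list of FQDNs is used. The elect function and the example host names are hypothetical.

```python
from typing import Iterable, Optional


def elect(fqdns: Iterable[str], failed: Iterable[str] = ()) -> Optional[str]:
    """Return the FQDN assumed to be elected, or None if none is available.

    Assumption: the alphabetically first available name wins.
    """
    unavailable = {name.lower() for name in failed}
    candidates = sorted(
        name.lower() for name in fqdns if name.lower() not in unavailable
    )
    return candidates[0] if candidates else None


zone = ["ddc02.example.local", "ddc01.example.local", "ddc03.example.local"]
print(elect(zone))                                  # initial election
print(elect(zone, failed=["ddc01.example.local"]))  # re-election after a failure
```

Because every High Availability Service in the zone holds the same list of peers, each one can reach the same result independently in this model.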
During an outage, if a Controller is restarted:
- If that Controller is not the elected primary broker, the restart has no impact.
- If that Controller is the elected primary broker, a different Controller is elected, causing VDAs to re-register. After the restarted Controller powers on, it automatically takes over brokering, which causes VDAs to re-register again. In this scenario, performance may be affected during the re-registrations.
If you power off a Controller during normal operations and then power it on during an outage, Local Host Cache cannot be used on that Controller if it is elected as the primary broker.
The event log provides information about elections. See the “Monitor” section below.
Design considerations and requirements
There is no time limit imposed for operating in outage mode. However, restore the Site to normal operation as quickly as possible, because the following restrictions apply during an outage:
- You cannot use Studio or run PowerShell cmdlets.
- Hypervisor credentials cannot be obtained from the Host Service. All machines are in the unknown power state, and no power operations can be issued. However, VMs on the host that are powered-on can be used for connection requests.
- An assigned machine can be used only if the assignment occurred during normal operations. New assignments cannot be made during an outage.
- Automatic enrollment and configuration of Remote PC Access machines is not possible. However, machines that were enrolled and configured during normal operation are usable.
- Server-hosted applications and desktop users may use more sessions than their configured session limits, if the resources are in different zones.
- Users can launch applications and desktops only from registered VDAs in the zone containing the currently active/elected High Availability Service. Launches across zones (from a High Availability Service in one zone to a VDA in a different zone) are not supported during an outage.
- If a site database outage occurs before a scheduled restart begins for VDAs in a Delivery Group, the restarts begin when the outage ends. This can have unintended results. For more information, see Scheduled restarts delayed due to database outage.
- Tag restrictions where tags are used to designate zones are not supported for session launches. When such tag restrictions are configured, and a StoreFront store’s advanced health check option is enabled, sessions might intermittently fail to launch.
Local Host Cache is supported for server-hosted applications and desktops, and static (assigned) desktops.
By default, power-managed desktop VDAs in pooled delivery groups (created by MCS or Citrix Provisioning) that have the ShutdownDesktopsAfterUse property enabled are not available for new connections during a Local Host Cache event. You can change this default, to allow those desktops to be used during Local Host Cache. However, you cannot rely on the power management during the outage. (Power management resumes after normal operations resume.) Also, those desktops might contain data from the previous user, because they have not been restarted.
To override the default behavior, you must enable it site-wide and for each affected Delivery Group. Run the following PowerShell cmdlets.
Set-BrokerSite -ReuseMachinesWithoutShutdownInOutageAllowed $true
Set-BrokerDesktopGroup -Name "name" -ReuseMachinesWithoutShutdownInOutage $true
Enabling this feature in the Site and the Delivery Groups does not affect how the configured “ShutdownDesktopsAfterUse” property works during normal operations.
Without enabling ReuseMachinesWithoutShutdownInOutageAllowed at the Site level and ReuseMachinesWithoutShutdownInOutage at the delivery group level, all session launch attempts to power-managed desktop VDAs in pooled delivery groups will fail during a Local Host Cache event.
The LocalDB service can use approximately 1.2 GB of RAM (up to 1 GB for the database cache, plus 200 MB for running SQL Server Express LocalDB). The High Availability Service can use up to 1 GB of RAM if an outage lasts for an extended interval with many logons occurring (for example, 12 hours with 10K users). These memory requirements are in addition to the normal RAM requirements for the Controller, so you might need to increase the total amount of RAM capacity.
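As a rough planning aid, the figures above can be added to a Controller's existing RAM allocation. The Python sketch below does only that arithmetic. The function name and the 8 GB baseline in the example are hypothetical, and the result is an upper-bound estimate, not a Citrix sizing recommendation.

```python
# Figures taken from the paragraph above, in GB.
LOCALDB_CACHE_GB = 1.0     # up to 1 GB for the database cache
LOCALDB_ENGINE_GB = 0.2    # about 200 MB for running SQL Server Express LocalDB
HA_SERVICE_PEAK_GB = 1.0   # up to 1 GB during a long outage with many logons


def controller_ram_gb(baseline_gb: float) -> float:
    """Return the baseline RAM plus the Local Host Cache overhead, in GB."""
    overhead = LOCALDB_CACHE_GB + LOCALDB_ENGINE_GB + HA_SERVICE_PEAK_GB
    return round(baseline_gb + overhead, 1)


# Hypothetical Controller that normally runs with 8 GB of RAM.
print(controller_ram_gb(8.0))
```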
Note that if you use a SQL Server Express installation for the Site database, the server will have two sqlservr.exe processes.
A Controller’s CPU configuration, particularly the number of cores available to the SQL Server Express LocalDB, directly affects Local Host Cache performance, even more than memory allocation. This CPU overhead is observed only during the outage period when the database is unreachable and the High Availability service is active.
While LocalDB can use multiple cores (up to 4), it’s limited to only a single socket. Adding more sockets will not improve the performance (for example, having 4 sockets with 1 core each). Instead, Citrix recommends using multiple sockets with multiple cores. In Citrix testing, a 2x3 (2 sockets, 3 cores) configuration provided better performance than 4x1 and 6x1 configurations.
As users access resources during an outage, the LocalDB grows. For example, during a logon/logoff test running at 10 logons per second, the database grew by one MB every 2-3 minutes. When normal operation resumes, the local database is recreated and the space is returned. However, sufficient space must be available on the drive where the LocalDB is installed to allow for the database growth during an outage. Local Host Cache also incurs additional I/O during an outage: approximately 3 MB of writes per second, with several hundred thousand reads.
During an outage, one High Availability Service handles all the connections, so in Sites (or zones) that load balance among multiple Controllers during normal operations, the elected High Availability Service might need to handle many more requests than normal during an outage. Therefore, CPU demands will be higher. Every High Availability Service in the Site (zone) must be able to handle the additional load imposed by LocalDB and all of the affected VDAs, because the High Availability Service elected during an outage could change.
- In a single-zone VDI deployment, up to 10,000 VDAs can be handled effectively during an outage.
- In a multi-zone VDI deployment, up to 10,000 VDAs in each zone can be handled effectively during an outage, to a maximum of 40,000 VDAs in the site. For example, each of the following sites can be handled effectively during an outage:
- A site with four zones, each containing 10,000 VDAs.
- A site with seven zones, one containing 10,000 VDAs, and six containing 5,000 VDAs each.
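The two limits above (per zone and per site) can be checked together. The following Python sketch is a minimal illustration. The function name and the zone names are hypothetical, and the limits are copied from the list above.

```python
from typing import Mapping

MAX_VDAS_PER_ZONE = 10_000
MAX_VDAS_PER_SITE = 40_000


def within_outage_limits(vdas_per_zone: Mapping[str, int]) -> bool:
    """Return True if every zone and the site as a whole stay within the limits."""
    counts = list(vdas_per_zone.values())
    zones_ok = all(count <= MAX_VDAS_PER_ZONE for count in counts)
    site_ok = sum(counts) <= MAX_VDAS_PER_SITE
    return zones_ok and site_ok


# The two example sites from the list above.
four_zones = {f"zone{i}": 10_000 for i in range(1, 5)}
seven_zones = {"zone1": 10_000, **{f"zone{i}": 5_000 for i in range(2, 8)}}
print(within_outage_limits(four_zones), within_outage_limits(seven_zones))
```

Both example sites total exactly 40,000 VDAs, so a fifth 10,000-VDA zone in the first site, or any single zone above 10,000 VDAs, would fall outside the limits.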
During an outage, load management within the Site may be affected. Load evaluators (and especially, session count rules) may be exceeded.
During the time it takes all VDAs to re-register with a High Availability Service, that service might not have complete information about current sessions. So, a user connection request during that interval could result in a new session being launched, even though reconnection to an existing session was possible. This interval (while the “new” High Availability Service acquires session information from all VDAs during re-registration) is unavoidable. Note that sessions that are connected when an outage starts are not impacted during the transition interval, but new sessions and session reconnections could be.
This interval occurs whenever VDAs must re-register:
- An outage starts: When migrating from a principal broker to a High Availability Service.
- High Availability Service failure during an outage: When migrating from a High Availability Service that failed to a newly-elected High Availability Service.
- Recovery from an outage: When normal operations resume, and the principal broker resumes control.
You can decrease the interval by lowering the Citrix Broker Protocol’s HeartbeatPeriodMs registry value (default = 600000 ms, which is 10 minutes). This heartbeat value is double the interval the VDA uses for pings, so the default value results in a ping every 5 minutes.
For example, the following command changes the heartbeat to five minutes (300000 milliseconds), which results in a ping every 2.5 minutes:
New-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer -Name HeartbeatPeriodMs -PropertyType DWORD -Value 300000
Use caution when changing the heartbeat value. Increasing the frequency results in greater load on the Controllers during both normal and outage modes.
The interval cannot be eliminated entirely, no matter how quickly the VDAs register.
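The relationship between HeartbeatPeriodMs and the VDA ping interval described above is a simple halving. The Python sketch below reproduces the two values mentioned in this section. The function name is hypothetical, and the code only performs the arithmetic: it does not read or write the registry.

```python
def ping_interval_minutes(heartbeat_period_ms: int) -> float:
    """Return the VDA ping interval in minutes for a given HeartbeatPeriodMs.

    The heartbeat value is double the ping interval, so the interval is half
    of the heartbeat, converted from milliseconds to minutes.
    """
    return heartbeat_period_ms / 2 / 60_000


print(ping_interval_minutes(600_000))  # default heartbeat of 10 minutes
print(ping_interval_minutes(300_000))  # heartbeat lowered to 5 minutes
```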
The time it takes to synchronize between High Availability Services increases with the number of objects (such as VDAs, applications, groups). For example, synchronizing 5000 VDAs might take ten minutes or more to complete. See Monitor for information about synchronization entries in the event log.
Although this Local Host Cache implementation shares the name of the Local Host Cache feature in XenApp 6.x and earlier XenApp releases, there are significant improvements. This implementation is more robust and immune to corruption. Maintenance requirements are minimized, such as eliminating the need for periodic dsmaint commands. This Local Host Cache is an entirely different implementation technically.
Manage Local Host Cache
For Local Host Cache to work correctly, the PowerShell execution policy on each Controller must be set to RemoteSigned, Unrestricted, or Bypass.
The Microsoft SQL Server Express LocalDB that Local Host Cache uses is installed automatically when you install a Controller or upgrade a Controller from a version earlier than 7.9. There is no administrator maintenance needed for the LocalDB. Only the High Availability Service communicates with this database. You cannot use PowerShell cmdlets to change anything about this database. The LocalDB cannot be shared across Controllers.
The SQL Server Express LocalDB database software is installed regardless of whether Local Host Cache is enabled.
To prevent its installation, install or upgrade the Controller using the XenDesktopServerSetup.exe command, and include the /exclude "Local Host Cache Storage (LocalDB)" option. However, keep in mind that the Local Host Cache feature will not work without the database, and you cannot use a different database with the High Availability Service.
Installation of this LocalDB database has no effect on whether or not you install SQL Server Express for use as the site database.
For information about replacing an earlier SQL Server Express LocalDB version with a newer version, see Replace SQL Server Express LocalDB.
During a new installation of Citrix Virtual Apps and Desktops (minimum version 7.16), Local Host Cache is enabled. After an upgrade (to version 7.16 or later), Local Host Cache is enabled if there are fewer than 10,000 VDAs in the entire deployment.
To enable Local Host Cache, enter:
Set-BrokerSite -LocalHostCacheEnabled $true
To determine whether Local Host Cache is enabled, enter:
Get-BrokerSite
Check that the LocalHostCacheEnabled property is True.
To disable Local Host Cache, enter:
Set-BrokerSite -LocalHostCacheEnabled $false
Remember: As of XenApp and XenDesktop 7.16, connection leasing (the feature that preceded Local Host Cache beginning with version 7.6) was removed from the product, and is no longer available.
To verify that Local Host Cache is set up and working correctly:
- Ensure that synchronization imports complete successfully. Check the event logs.
- Ensure that the SQL Server Express LocalDB database was created on each Delivery Controller. This confirms that the High Availability Service can take over, if needed.
- On the Delivery Controller server, browse to C:\Windows\ServiceProfiles\NetworkService.
- Verify that HaDatabaseName.mdf and HaDatabaseName_log.ldf are created.
- Force an outage on the Delivery Controllers. After you’ve verified that Local Host Cache works, remember to place all of the Controllers back into normal mode. This can take approximately 15 minutes, to avoid VDA registration storms.
Force an outage
You might want to deliberately force a database outage in the following situations:
- If your network is going up and down repeatedly. Forcing an outage until the network issues resolve prevents continuous transition between normal and outage modes.
- To test a disaster recovery plan.
- While replacing or servicing the site database server.
To force an outage, edit the registry of each server containing a Delivery Controller. In
REG_DWORD to 1. This instructs the broker to enter outage mode, regardless of the state of the database. Setting the value to 0 takes the server out of outage mode.
Monitor
Event logs indicate when synchronizations and outages occur.
Config Synchronizer Service:
During normal operations, the following events can occur when the CSS copies and exports the broker configuration and imports it to the LocalDB using the High Availability Service.
- 503: A change was found in the principal broker configuration, and an import is starting.
- 504: The broker configuration was copied, exported, and then imported successfully to the LocalDB.
- 505: An import to the LocalDB failed; see below for more information.
- 507: An import was abandoned due to a pending outage. When an outage begins during a synchronization, the current import is discarded and the last known configuration is used.
- 510: No Configuration Service configuration data received from primary Configuration Service.
- 517: There was a problem communicating with the primary Broker.
- 518: Config Sync script aborted because the secondary Broker (High Availability Service) is not running.
High Availability Service:
- 3502: An outage occurred and the High Availability Service is performing brokering operations.
- 3503: An outage has been resolved and normal operations have resumed.
- 3504: Indicates which High Availability Service is elected, plus others involved in the election.
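When you review exported event data, it can help to keep the event IDs above in a lookup table. The following Python sketch is one way to do that. The dictionary and function names are hypothetical, the descriptions are shortened from the lists above, and the code does not query the Windows event log.

```python
# Event IDs and meanings, shortened from the lists above.
CSS_EVENTS = {
    503: "configuration change found; import starting",
    504: "configuration imported successfully to the LocalDB",
    505: "import to the LocalDB failed",
    507: "import abandoned due to a pending outage",
    510: "no configuration data received from primary Configuration Service",
    517: "problem communicating with the primary Broker",
    518: "script aborted because the secondary Broker is not running",
}
HA_EVENTS = {
    3502: "outage occurred; High Availability Service is brokering",
    3503: "outage resolved; normal operations resumed",
    3504: "election result for the High Availability Service",
}


def describe(event_id: int) -> str:
    """Return a one-line description of a Local Host Cache event ID."""
    if event_id in CSS_EVENTS:
        return f"Config Synchronizer Service {event_id}: {CSS_EVENTS[event_id]}"
    if event_id in HA_EVENTS:
        return f"High Availability Service {event_id}: {HA_EVENTS[event_id]}"
    return f"{event_id}: not listed in this article"


for event_id in (504, 3502, 9999):
    print(describe(event_id))
```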
Several troubleshooting tools are available when a synchronization import to the LocalDB fails and a 505 event is posted.
CDF tracing: Contains options for the ConfigSyncServer and BrokerLHC modules. Those options, along with other broker modules, will likely identify the problem.
Report: You can generate and provide a report that details the failure point. This report feature affects synchronization speed, so Citrix recommends disabling it when not in use.
To enable and produce a CSS trace report, enter:
New-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -PropertyType DWORD -Value 1
The HTML report is posted at C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp\CitrixBrokerConfigSyncReport.html
After the report is generated, disable the reporting feature:
Set-ItemProperty -Path HKLM:\SOFTWARE\Citrix\DesktopServer\LHC -Name EnableCssTraceMode -Value 0
Export the broker configuration: Provides the exact configuration for debugging purposes.
Export-BrokerConfiguration | Out-File <file-pathname>
For example:
Export-BrokerConfiguration | Out-File C:\BrokerConfig.xml