Recently, I’ve written several blogs about the various optimization technologies XenServer provides for Citrix Machine Creation Services (MCS) workloads. You might wonder which ones you should use and how these all fit together.
The previous blogs have already explained the XenServer IntelliCache and Read Caching techniques; in addition, MCS provides a Write Back Cache (WBC) mechanism. So it might be unclear how these all fit together, and whether you should use only one of them or all of them.
Each of these techniques optimizes different aspects of an MCS-delivered machine, and the advantages vary depending on whether you use thick or thin provisioning. This blog explains which combination of techniques is likely to give you the best outcome.
Overview of the different scenarios and matching recommendations:
Pooled Machines with WBC
The recommendation in this case would be to use the XenServer Read Cache to optimize these workloads in addition to the Citrix WBC.
The Citrix WBC will cache the data changes made during the session, while the XenServer optimizations will cache the base image, thus improving boot performance and reducing load.
- Either IntelliCache or the Read Cache will enable shared storage load to be significantly reduced, but only one of these techniques should be used.
- If dom0 memory is available, the Read Cache will provide the best improvement (since the caching is only for read requests).
- If there are many different images being booted, IntelliCache might be the better option. The different images may cause the cache contents to be flushed too frequently, rendering the Read Cache inefficient.
Pooled Machines without WBC
In this case I would recommend using IntelliCache to optimize these workloads.
- IntelliCache will optimize all OS reads as well as user reads and writes, and will reduce shared storage use significantly.
- Additionally, using the Read Cache may reduce boot times; provided the memory allocated on the XenServer host is sufficient, it will prevent boot-storm issues on the shared storage.
- If there are many different images being booted then the Read Cache will most likely not give any performance benefit.
Dedicated machines (thin provisioned)
The recommendation would be to use IntelliCache to optimize these workloads.
- IntelliCache will optimize all OS reads as well as user reads and writes, and will reduce shared storage use significantly.
- Additionally, using the Read Cache may speed up boot times, especially in scenarios where the number of images is low and there is enough free memory available in dom0.
Dedicated machines (full clones)
No optimizations are available for this scenario.
Behind the scenes: A detailed discussion of all the scenarios:
Pooled machines
These machines are designed for end users who get a random machine for each session they start. The machine is reset to a known state on each reboot and does not persist any changes made during the session.
With WBC
If you use the Citrix WBC, all user changes in the session are redirected through a cache. This is efficient as it uses the VM’s memory where possible and overflows to disk as needed. The disk cache can use any storage, so reads and writes from within the user’s session can be redirected to local or remote storage as required, tiering storage use.
The WBC, however, does nothing to optimize reads from the Master Image (OS disk). Especially during boot there is a high volume of reads from this disk, and the WBC does nothing to resolve potential bottlenecks caused by boot storms.
Both the XenServer Read Cache and IntelliCache mechanisms can be used, as described in the previous blogs, to optimize this. IntelliCache redirects the reads to local storage, while the Read Cache uses memory on the hypervisor hosts. Depending on your resource availability, you can choose one of these mechanisms to significantly reduce the load on the shared storage. Both mechanisms allow the VMs to run on any host in a XenServer pool, enabling migrations as needed to load-balance the machines across hosts.
Note: The XenServer Read Cache uses a single allocation of memory per host to cache the image, whereas the Citrix WBC uses memory from within each running VM.
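If you opt for IntelliCache, it is enabled per host against a thin-provisioned local SR. A minimal sketch using the xe CLI, where `<host-uuid>` and `<local-sr-uuid>` are placeholders for your own environment:

```shell
# Enable IntelliCache on a host (the host must be disabled first;
# the local SR must be a thin-provisioned EXT SR).
xe host-disable uuid=<host-uuid>
xe host-enable-local-storage-caching sr-uuid=<local-sr-uuid>
xe host-enable uuid=<host-uuid>
```

Repeat this on every host in the pool that should cache locally; VMs remain agile across hosts once caching is enabled.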
Without WBC
In this mode, all user changes are stored in the differencing disk provided by the hypervisor, and the OS data is read from the master image disk.
There are therefore two things XenServer can optimize:
- User changes (Delta disk)
- OS reads (Master Image)
User changes
Both the Read Cache and the IntelliCache techniques will improve performance here.
IntelliCache can offload all reads and writes on the user disk to local storage, and these reads can optionally be further optimized through the Read Cache if desired. IntelliCache will permanently cache all this data to avoid it ever hitting the shared storage. The Read Cache will cache reads, but since the reads are likely to differ from machine to machine, cached data may well be flushed by other reads before it is read again, so it may not be particularly beneficial here.
The OS disk
Especially during boot there is a high volume of reads from this disk. Both the XenServer Read Cache and IntelliCache mechanisms can be used, as described in the previous blogs, to optimize this. IntelliCache uses local storage on the hypervisor, while the Read Cache uses memory. So depending on your resource availability, you can choose one of these mechanisms to significantly reduce the load on the shared storage. Both mechanisms allow the VMs to run on any host in a XenServer pool, enabling migrations as needed to load-balance the machines across hosts.
Dedicated machines
These machines are persistent and often user-specific, so each time the user starts a session they continue to use the same machine, with changes persisted between sessions and reboots.
Dedicated machines cannot use the Citrix WBC mechanism, as these caches are not designed to persist across reboots of the machines.
Dedicated machines can be provided using thin provisioning or full clone provisioning methods.
Thin provisioned
This structure is the same as for pooled machines; the only difference is that the user changes are persisted, so the same optimizations apply as for pooled machines without WBC.
Full clones
These machines do not separate out a common base image: each VM’s disk is fully recreated, and all user changes are merged into the disk image.
There is no caching at the hypervisor layer that can optimize these workloads.
Limitations of the different cache mechanisms:
Citrix Write Back Cache
- Can only be used with non-persistent machines (i.e., machines that reset their state on each boot)
- Customers must size the memory and disk caches correctly, which depends on the workload. Making them too large wastes resources, and making them too small will result in VM issues. There is no easy way to determine the optimum size.
- Will require more memory use in the VM
- Will support all hypervisor storage types (file and block based)
- Does not require any special hypervisor configuration to use and works in the same way across any virtualization provider
XenServer IntelliCache
- Requires ‘thin-provisioned’ local storage (i.e. EXT3/EXT4)
- If local storage fills up, read/write paths fall back to shared storage, which puts unanticipated load on it.
- Only supports VM disks on NFS storage
- Works for both pooled and dedicated desktops and enables them to remain agile
- Pooled desktops may crash if local disk space is exhausted
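IntelliCache behaviour is also controlled per disk, which is how the pooled and dedicated semantics above are expressed. A sketch with a placeholder VDI UUID:

```shell
# Per-VDI IntelliCache settings (<vdi-uuid> is a placeholder):
# allow-caching=true -> cache this disk's data on local storage
xe vdi-param-set uuid=<vdi-uuid> allow-caching=true

# on-boot=reset   -> pooled-style: discard changes on every boot
# on-boot=persist -> dedicated-style: keep changes across reboots
xe vdi-param-set uuid=<vdi-uuid> on-boot=reset
```

With `on-boot=reset`, writes never need to reach shared storage at all, which is why pooled desktops are exposed to the local-disk-exhaustion risk noted above.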
XenServer Read Cache
- Read caching only caches the read-only shared master image disks
- For file-based SRs, such as NFS and EXT3/EXT4 SR types, read-caching is enabled by default. Read-caching is disabled and not available for other SR Types.
- Performance improvements depend on the amount of free memory available in the host’s control domain (dom0). Increasing the amount of dom0 memory allows more memory to be allocated to the read-cache.
- If there are many different images using read caching then the results might be indeterminate as data in the cache may be flushed too frequently.
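The two tuning knobs mentioned above can both be set from the host. The dom0 memory values and the SR UUID below are placeholder examples:

```shell
# Give dom0 more memory so a larger read cache can be allocated
# (example sets 4 GiB; a host reboot is required to take effect):
/opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=4096M,max:4096M

# Read caching can be disabled per SR by forcing O_DIRECT on the
# datapath (<sr-uuid> is a placeholder for your file-based SR):
xe sr-param-set uuid=<sr-uuid> other-config:o_direct=true
```

Disabling via `o_direct` is mainly useful in the many-images case above, where the cache churns too much to pay for the dom0 memory it consumes.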
Related reading:
- IntelliCache
- XenServer Storage Read Caching
- XenServer YouTube page