Since this is a short week for us, with Thanksgiving tomorrow, we decided to work with our Disaster Recovery specialist on certifying our XenDesktop, XenApp, and Provisioning Server infrastructure. Let me note that most DR tests conducted here usually take a full week. Being the overconfident guys that we are, we scheduled three days. This is where the story starts to unfold, as I explain an issue we ran into during our testing.
Like most infrastructure builds, one piece is put together at a time to ensure that each component works. In this scenario, we built one Provisioning Server and streamed 10 XenDesktops from it to test functionality, performance, and the build process. All was great! We then built the second Provisioning Server and added it to the same farm as the first, copied the VHD and PVP files from one side to the other, checked the vDisk's load-balancing setting in the console and, voilà... load balanced. We thought we had everything locked down for our testing.
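As an aside, the copy step itself is simple, but both files matter: the .vhd is the disk image and the .pvp carries the vDisk properties, and the second server needs both in its store before it can serve the same vDisk. Here is a minimal Python sketch of that step; the store paths and vDisk name below are hypothetical placeholders, not our actual environment.

```python
import shutil
from pathlib import Path

# Hypothetical store paths and vDisk name; substitute your own.
SOURCE_STORE = Path(r"\\PVS01\D$\Store")
DEST_STORE = Path(r"\\PVS02\D$\Store")
VDISK_NAME = "XenDesktopImage"

# Copy the disk image (.vhd) and its properties file (.pvp) together;
# without the PVP, the second server won't pick up the vDisk settings.
for ext in (".vhd", ".pvp"):
    src = SOURCE_STORE / (VDISK_NAME + ext)
    dst = DEST_STORE / (VDISK_NAME + ext)
    shutil.copy2(src, dst)
    print("Copied", src, "->", dst)
```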
On the first day, we had people launch a XenDesktop and leave the session running. We then stopped the Stream Service on one of the Provisioning Servers... and not all of the XenDesktops failed over to the other node. We scoured the Event Viewer for messages and could not find anything to indicate a cause. We then started the Stream Service back up, and those XenDesktops still showed on that node. Next, we tried failing over in the other direction... it worked successfully. Now we were really confused and started running through the configuration to explain how this could be happening. When we contacted one of our consultants about the situation, he directed us to look into a known Citrix bug as a possible culprit. I must give him the benefit of the doubt, though, since he did not get the opportunity to look through our environment to determine the root cause.
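For anyone trying to reproduce this kind of failover drill, it is easier to script the service bounce than to click through services.msc on each round. Below is a rough Python sketch, run on the Provisioning Server itself; note that "StreamService" is my assumption for the Stream Service's internal name, so confirm it first with `sc query` on your own server.

```python
import subprocess
import time

SERVICE = "StreamService"  # assumed service name; verify with `sc query`

def control_service(action):
    """Stop or start the Stream Service via sc.exe ('stop' or 'start')."""
    result = subprocess.run(["sc", action, SERVICE],
                            capture_output=True, text=True)
    print(result.stdout)

# Simulate a server failure: stop streaming on this node, give the
# targets time to reconnect, then watch the console for failover.
control_service("stop")
time.sleep(300)  # observation window before restoring the service
control_service("start")
```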