I am developing an application which is deployed to Azure App Service. It runs on .NET 5.0 on Linux. I have set up a simple DevOps process so that committing changes to GitHub runs an Azure DevOps pipeline that deploys the application to a staging slot on Azure App Service for Linux. Then I can use Swap in the Azure portal to update the production slot. Swap simply exchanges the content of the staging slot with that in the production slot, so there is a route back in the event of disaster. Swap also restarts the application and forces users to log back in.
Yesterday I fixed a bug, deployed the change to the staging slot, and performed a swap. I logged back into the application, but the bug was still there, though now intermittent. That was the bit I could not figure out: what was causing the code to behave differently on different requests? I became suspicious that it was sometimes serving the old version. I proved this by repeatedly refreshing a page that demonstrated the bug. My page has an application version in the footer, and I could see that whenever the bug appeared, the footer showed the older version.
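The refresh test can be automated. What follows is only a rough sketch of how I might script it, not something I ran at the time: it requests a page a number of times and counts which version string comes back. The footer format in `VERSION_PATTERN` is an assumption for illustration, and the function names are made up, so adjust both to suit your own page.

```python
import re
import urllib.request
from collections import Counter
from typing import Callable

# Assumed footer format, e.g. "Version 1.2.3". This is hypothetical;
# change the pattern to match whatever your page actually renders.
VERSION_PATTERN = re.compile(r"Version\s+(\d+(?:\.\d+)+)")


def fetch_page(url: str) -> str:
    """Download a page. urlopen keeps no cookies between calls, so each
    request arrives without any state from an earlier response."""
    request = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_version(html: str) -> str:
    """Return the version found in the page, or "unknown" if none matches."""
    match = VERSION_PATTERN.search(html)
    return match.group(1) if match else "unknown"


def tally_versions(
    url: str,
    attempts: int = 20,
    fetch: Callable[[str], str] = fetch_page,
) -> Counter:
    """Request the page `attempts` times and count each version seen."""
    counts: Counter = Counter()
    for _ in range(attempts):
        counts[extract_version(fetch(url))] += 1
    return counts
```

Calling `tally_versions` with the production URL and printing the result gives a count per version. More than one key in that count means more than one build is answering requests. The `fetch` parameter is there so the counting logic can be tried against canned HTML without touching the network.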
Well, this is odd. In the App Service deployment slot settings I have traffic set to 100% for the production slot.
In general I tend to assume a bug in my code or an error in my configuration settings is more likely than an issue with the Azure App Service. This does look odd though: why, if traffic is going 100% to the production slot, does the application sometimes serve the old version?
The pragmatic fix was easy. A second deployment to the staging slot means that both slots now have code that works. The bug no longer appears, but I have kept the version numbers different and can see that the underlying issue is still occurring: some requests are still served by the other build.
I will update this post when I have more information, just in case anyone else hits this issue.