Fully Automated Deployments to a Running Enterprise Data Warehouse

For fully automated deployment to the data warehouse, all of the configuration and scripts need to be transferred from source control to the Linux server that controls the data warehouse, and any changes need to be applied to the database where necessary (see the separate post about DbAlign).

 

Because we use git flow, we have a formal hotfix process that is compatible with our regular end-to-end testing process. The approach described here is used for everything from the smallest hotfixes to the largest changes going through the regular release process.

 

It's not as simple as providing a button that gets the latest code from the repository, puts it on the Linux data warehouse "orchestrator" and runs DbAlign. A running process might be interrupted by the replacement of the relevant configuration or SQL files, or by the changes made by DbAlign.

 

We can't simply wait until nothing is running and then push the button, because such a gap might not occur when we want it, and someone or something could still initiate a new process while the deployment is in progress.

 

For these reasons we used a simple blocking mechanism as follows: 

  1. All processes that can't run across a deployment are run via a manager mechanism on the orchestrator

  2. Initiating a deployment puts a block file on the orchestrator

  3. Deployments then wait for any running processes to complete before proceeding

  4. The manager mechanism recognises the block file and forces any new processes to wait rather than start immediately, meaning no new processes can interrupt the deployment

  5. Once nothing is running, the deployment proceeds  

  6. When the code and database changes are complete, the deployment process removes the block file 

  7. The manager mechanism recognises that there is now no block file, and allows the queued processes to start immediately 
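
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code we actually ran: the class name DeployGate, the file name deploy.block, the running/ marker directory and the polling interval are all hypothetical. It assumes the manager and the deployment share a directory on the orchestrator, and that each managed process registers a marker file while it runs. The manager registers its marker before checking for the block file, so a deployment that starts at the same moment either sees the marker and waits, or the manager sees the block and backs off.

```python
import os
import subprocess
import time
import uuid
from contextlib import contextmanager
from pathlib import Path

POLL_SECONDS = 0.1  # hypothetical polling interval


class DeployGate:
    """File-based gate shared by the process manager and the deployment."""

    def __init__(self, state_dir):
        self.block_file = Path(state_dir) / "deploy.block"  # hypothetical name
        self.running_dir = Path(state_dir) / "running"      # hypothetical name
        self.running_dir.mkdir(parents=True, exist_ok=True)

    @contextmanager
    def managed_process(self):
        """Manager side (steps 1, 4 and 7): hold a marker while a process runs."""
        marker = self.running_dir / uuid.uuid4().hex
        while True:
            # Register first, then check for the block file, so a deployment
            # starting concurrently either sees this marker or makes us back off.
            marker.touch()
            if not self.block_file.exists():
                break
            marker.unlink()
            time.sleep(POLL_SECONDS)
        try:
            yield
        finally:
            marker.unlink()

    @contextmanager
    def deployment(self):
        """Deployment side (steps 2, 3, 5 and 6)."""
        # O_EXCL makes this fail if another deployment already holds the block.
        fd = os.open(self.block_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        try:
            # Wait for in-flight processes to finish before changing anything.
            while any(self.running_dir.iterdir()):
                time.sleep(POLL_SECONDS)
            yield
        finally:
            self.block_file.unlink()


def run_managed(gate, command):
    """Run a command through the gate; it queues while a deployment is active."""
    with gate.managed_process():
        return subprocess.run(command, check=True)
```

A deployment script would then wrap the code transfer and the DbAlign run in "with gate.deployment():", while every job that must not span a deployment is started through run_managed. Polling keeps the sketch simple; a real manager might prefer file locks or a proper queue.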

 

The mechanism effectively creates a clean gap for the deployment, immediately after the currently in-flight job component has completed. The subsequent job then picks up immediately after the deployment, so minimal time is wasted. For example, there was no need to declare that the data warehouse would be unavailable for a window of x minutes.

 

This allowed us to push the deployment button whenever we needed to, without thinking about what was actually running. It saved a lot of time and effort, and reliably got changes into production more quickly.