
Improve Code Env Rebuild Process for On Prem Upgrades

When upgrading Dataiku in place, it can take 10+ hours to rebuild images and all code envs. Workarounds such as baking core packages into the base images or caching PyPI indices help only marginally. Upgrade time scales linearly with the number of code envs on the nodes, which is unfortunate. We cannot run a true blue/green upgrade because Dataiku doesn't support copying the data dir over after a version upgrade.

If Dataiku could somehow allow blue/green upgrade cycles for minor versions, it would be fantastic.

 

 

4 Comments
Turribeach
Level 6

How exactly are you upgrading Dataiku in place? We upgrade Dataiku all the time and we never rebuild any code environments. Never faced an issue. Not sure what you mean by rebuilding “images”. 

importthepandas
Level 5

Hi @Turribeach 

We deploy Dataiku in AWS and maintain CentOS images for containerized execution in EKS. Per the upgrade documentation, images need to be rebuilt and all code envs need to be updated after the image rebuild. Once you rebuild the images, the old code envs no longer work.

Rebuilding 150+ code envs is quite cumbersome, but it would be fine if we could port the DATADIR over post-upgrade in a blue/green deployment (which is not supported).
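To make the linear scaling concrete, here is a rough back-of-the-envelope sketch. The 4 minutes per env is only an assumption, back-derived from roughly 10 hours for about 150 envs (and it ignores the image rebuild itself); `estimated_rebuild_hours` and `parallel_workers` are hypothetical names for illustration, not anything Dataiku provides.

```python
import math


def estimated_rebuild_hours(n_envs, minutes_per_env, parallel_workers=1):
    """Estimate wall-clock hours to rebuild n_envs code envs.

    Envs are assumed to rebuild in batches of `parallel_workers`,
    each env taking the same `minutes_per_env` (a simplification).
    """
    batches = math.ceil(n_envs / parallel_workers)
    return batches * minutes_per_env / 60


# Serial rebuild: 150 envs at an assumed 4 minutes each.
print(estimated_rebuild_hours(150, 4))  # 10.0

# Same workload if 4 rebuilds could run side by side (an assumption).
print(estimated_rebuild_hours(150, 4, parallel_workers=4))
```

Doubling the number of envs doubles the serial estimate, which is why the upgrade window keeps growing as teams add envs.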

June
Level 3

We have this problem as well. Upvoting this.

importthepandas
Level 5

@June our DevOps team is going to attempt to build the images and envs in a clone and register them in ECR, then upgrade in place and point to ECR with the new stuff already in place. It'll still take us a day or two to get all of the images and envs up and running, but it'll save us a Saturday... hopefully.
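For anyone scripting the rebuild step on the clone, a minimal sketch of looping over every code env follows. It assumes the `dataikuapi` Python client exposes `list_code_envs()`, `get_code_env(lang, name)`, `update_packages(force_rebuild_env=True)` and `update_images()` as I recall them, so check these against the API reference for your DSS version. `rebuild_code_env`, `rebuild_all_code_envs` and `max_workers` are hypothetical names, and whether DSS tolerates concurrent rebuilds is untested here, hence the serial default.

```python
from concurrent.futures import ThreadPoolExecutor


def rebuild_code_env(client, env_lang, env_name):
    """Rebuild one code env and its container images.

    Returns (env_name, succeeded, error_message) so one failing env
    does not abort the whole run.
    """
    try:
        env = client.get_code_env(env_lang, env_name)
        env.update_packages(force_rebuild_env=True)  # assumed dataikuapi call
        env.update_images()                          # assumed dataikuapi call
        return (env_name, True, None)
    except Exception as exc:
        return (env_name, False, str(exc))


def rebuild_all_code_envs(client, max_workers=1):
    """Rebuild every code env the client lists, in listing order.

    max_workers=1 keeps the rebuild serial; raising it is an experiment,
    not a documented DSS feature.
    """
    envs = client.list_code_envs()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(rebuild_code_env, client, e["envLang"], e["envName"])
            for e in envs
        ]
        return [f.result() for f in futures]
```

In real use, `client` would be something like `dataikuapi.DSSClient(host, api_key)` pointed at the cloned node, and the returned list gives a quick report of which envs still need attention before cutting over.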