
Migrating a custom linux install to cloudstacks

yashpuranik
Neuron

Hi Team,

We have DSS 10.0.4 Design, Automation, API and Govern nodes installed on Linux VMs by way of custom installs. The original install (DSS 8, I think) was carried out when Cloudstacks was not yet available as an installation option. Since then, we have carried out manual upgrades a couple of times.

We would like to move everything over to a Cloudstacks installation to make it easier for future upgrades and better management. Our existing instance has ~40 users, 100+ projects, custom plugins, apps and more. Manually importing everything after creating a new Cloudstacks installation is not an option.

Is there a programmatic/systematic way to migrate our setup?

yashpuranik
4 Replies
Turribeach
Level 6

I am not familiar with the Cloud Stacks process for upgrading Dataiku, but if you have developed your own custom installation script I would stay with that unless there are specific pitfalls you want to address. We have a custom GCP VM startup script that builds all our DSS nodes automatically. We have upgraded very frequently from v8 through v10 and found no issues with it. I am not sure if you are aware, but the process of upgrading a DSS node can be equated with a full disaster recovery. In effect, you build a new VM, install all the required OS packages, configure the OS and then install the latest DSS version you want. Once you have a vanilla DSS install completed, all you need to do is shut down the DSS node, restore the full DSS data dir from your backup and run the upgrade script. It's as simple as that. All your projects, users, files and DSS configuration are restored. The only thing that is not restored, obviously, is data held in external databases and storage layers like cloud buckets.
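To make the sequence above concrete, here is a minimal dry-run sketch in Python. It is not our actual startup script: the paths, the function name `plan_restore_upgrade` and the assumption that the backup is a `tar.gz` of the data dir contents are all hypothetical. The `bin/dss stop|start` and `installer.sh -d DATA_DIR -u` invocations follow the upgrade procedure as I understand it from the Dataiku docs, so check them against the documentation for your version. The function only builds and prints the commands, it does not execute anything.

```python
from pathlib import PurePosixPath
import shlex


def plan_restore_upgrade(data_dir, backup_archive, installer_dir):
    """Return the ordered commands for a restore-then-upgrade run (dry run only).

    All arguments are hypothetical example paths on the freshly built VM.
    """
    data_dir = PurePosixPath(data_dir)
    dss = data_dir / "bin" / "dss"
    installer = PurePosixPath(installer_dir) / "installer.sh"
    vanilla_copy = str(data_dir) + ".vanilla"

    return [
        # 1. Stop the vanilla DSS node that was just installed.
        [str(dss), "stop"],
        # 2. Move the vanilla data dir aside and restore the backup in its place
        #    (assumes the archive holds the contents of the old data dir).
        ["mv", str(data_dir), vanilla_copy],
        ["mkdir", "-p", str(data_dir)],
        ["tar", "-xzf", str(backup_archive), "-C", str(data_dir)],
        # 3. Run the new version's installer in upgrade mode on the restored data dir.
        [str(installer), "-d", str(data_dir), "-u"],
        # 4. Start the upgraded node.
        [str(dss), "start"],
    ]


if __name__ == "__main__":
    steps = plan_restore_upgrade(
        "/data/dss",                     # hypothetical data dir
        "/backups/dss-data-dir.tar.gz",  # hypothetical backup archive
        "/opt/dataiku-dss-NEWVERSION",   # hypothetical unpacked installer kit
    )
    for step in steps:
        print(shlex.join(step))
```

Printing the plan first makes it easy to review the commands before wiring them into an IaC pipeline; picking which backup to restore stays a manual input.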

I like this process a lot because it gives me confidence in our DR solution: in effect, I am always restoring from a backup when I am doing DSS upgrades. The process is semi-automatic, with only a few manual steps like picking which backup should be restored and uploading the new DSS version to our Artifactory server. It is a bit nerve-racking to “destroy” the existing VM via our IaC to do an upgrade, but you have to have confidence in your IaC pipeline and scripts, otherwise what’s the point in using them?

Occasionally we have done some in-place upgrades as well, which are a bit quicker, but we prefer a full rebuild since it’s a proper repeatable process.

Hope it helps

Thanks 

0 Kudos
AlexT
Dataiker

Hi @yashpuranik ,

A migration plan for Custom Install to Cloudstacks can be tailored to your needs with the help of our Cloud Architects who have experience with this.

The migration path will vary on a case-by-case basis.

I would recommend you reach out to your Customer Success Manager or Partner Success Manager to start this conversation and learn more about the benefits of Cloud Stacks.

Turribeach
Level 6

So I just had a look at what Dataiku Cloud Stacks is in the documentation:

https://doc.dataiku.com/dss/latest/installation/cloudstacks-aws/index.html

While I can see the benefit of such a managed setup, I don't think it's for everyone. For a start, we are on GCP, not AWS or Azure, so it would not even be available to us if we wanted it. Then, looking at all the steps that Cloud Stacks covers, I am pretty sure it would be a significant challenge for us to have Dataiku Cloud Stacks work in tandem with our GCP IaC implementation and within the limits of our GCP project restrictions.

So while I don't deny that Dataiku Cloud Stacks may be suitable in some scenarios, it's certainly not a silver bullet, in particular for more complex enterprise installations. AlexT's suggestion is a good start, but for us this looks like re-engineering for little ROI, so it's not for us...

0 Kudos
Turribeach
Level 6

PS: Personally, I think the Cloud Stacks use case is for a small/medium company starting to deploy Dataiku on AWS/Azure in their own cloud projects rather than using the managed SaaS Dataiku Cloud option. Complex enterprise setups will still be better served by a custom setup matching their environment's needs.

0 Kudos
