Added on June 12, 2022 9:03PM
Hello all,
I've been building a large data pipeline, and the project is starting to get messy because I create a new branch each time I develop a new version of the pipeline. So I want to ask: what are the best practices for separating projects into development, staging, and production?
Should I keep development, staging, and production as separate branches in the same project, create separate projects, or use separate DSS servers?
Also, if I separate them into distinct projects, should they all use a shared data source, or should I re-import the input data source for each one?
Operating system used: pop-os
Hi @Fahraynk,
Our approach is to utilize separate DSS instances for development, testing / staging, and production. Our processes reside within one project. We develop in that project on the development DSS instance and then deploy it to our test and ultimately our production instance when ready.
Some of our data source connections are set up so that they point to different databases on each DSS instance. This allows us to automatically separate dev and test data from production data. Other connections are set to point to the same data tables across dev, test, and prod.
We also make use of instance-level variables (defined differently on each DSS instance). Another mechanism we use is project variable overrides (i.e., "local variables" on the project variables screen). We set the project variable to the production value, and then on the development instance's version of the project we override that variable with the development value.
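To illustrate the override idea, here is a minimal Python sketch of how the three layers could combine. It is not DSS code: the function `resolve_variables` and the variable names (`env`, `target_schema`, `analytics_prod`, `analytics_dev`) are made up for the example, and the precedence shown (instance-level, then project, then local override) is my understanding of how DSS resolves variables, so please verify it on your own instance.

```python
def resolve_variables(instance_vars, project_vars, local_overrides):
    """Merge variable layers; later layers win over earlier ones.

    Assumed precedence (illustrative): instance-level < project < local override.
    """
    resolved = dict(instance_vars)
    resolved.update(project_vars)
    resolved.update(local_overrides)
    return resolved


# The project variable carries the production value and travels with the project.
project_vars = {"target_schema": "analytics_prod"}

# On the development instance, a local override replaces the production value.
dev_view = resolve_variables(
    instance_vars={"env": "dev"},
    project_vars=project_vars,
    local_overrides={"target_schema": "analytics_dev"},
)

# On the production instance there is no override, so the project value applies.
prod_view = resolve_variables(
    instance_vars={"env": "prod"},
    project_vars=project_vars,
    local_overrides={},
)

print(dev_view["target_schema"])   # analytics_dev
print(prod_view["target_schema"])  # analytics_prod
```

Inside a DSS recipe or notebook, the resolved values should be readable with `dataiku.get_custom_variables()`, so the same recipe code can run unchanged on each instance.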
Our development and test instances are on one server and the production instance is on another server.
We also run our production projects under a service account.
This all works quite nicely.
Hope this is helpful.
Marlan
Thank you. How do you go about deploying from development to production? Is it an easy export/upload, or are you manually copying the files over?
Hi @Fahraynk,
We use the bundle functionality as described here: https://doc.dataiku.com/dss/latest/deployment/index.html
That documentation refers to the Project Deployer, which we haven't set up yet, so we are deploying bundles manually. Even so, it takes just a couple of minutes to deploy a project to another instance.
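For anyone who would rather script these steps than click through them, the manual flow could be sketched roughly as below. This is not what we run ourselves. The helper `deploy_bundle` is hypothetical, and it assumes the `dataikuapi` project methods `export_bundle`, `download_exported_bundle_archive_to_file`, `import_bundle_from_stream`, and `activate_bundle` as I recall them from the public API client, so check them against the documentation for your DSS version. It also assumes the project already exists on the target instance.

```python
def deploy_bundle(source_client, target_client, project_key, bundle_id, archive_path):
    """Export a bundle on the source instance, then import and activate it on the target.

    Both clients are expected to behave like dataikuapi.DSSClient objects.
    """
    # Create the bundle on the source (design) instance and download it as a zip.
    source_project = source_client.get_project(project_key)
    source_project.export_bundle(bundle_id)
    source_project.download_exported_bundle_archive_to_file(bundle_id, archive_path)

    # Upload the archive to the target instance and make it the active bundle.
    target_project = target_client.get_project(project_key)
    with open(archive_path, "rb") as archive:
        target_project.import_bundle_from_stream(archive)
    target_project.activate_bundle(bundle_id)
    return bundle_id


# Example usage (hosts, API keys, project key, and bundle id are placeholders):
#
#   import dataikuapi
#   dev = dataikuapi.DSSClient("https://dev-dss.example.com", "DEV_API_KEY")
#   prod = dataikuapi.DSSClient("https://prod-dss.example.com", "PROD_API_KEY")
#   deploy_bundle(dev, prod, "MY_PROJECT", "v1", "/tmp/MY_PROJECT_v1.zip")
```

Passing the clients in as arguments keeps the function independent of how each instance is authenticated.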
It all works quite nicely.
Marlan