Migration & synchronization issue with the automation node
Hi, I'm using Dataiku 12.5.2 at my workplace, and I've found that keeping the design node and the automation node synchronized is quite hard.
Minor debugging issues that occur during ingestion are all handled directly on the automation node,
while architectural modifications are built and tested on the design node.
As time goes on, the differences between the two nodes' projects become more noticeable. Is there any way to minimize the effort required to reconcile them when deploying a new bundle?
Answers
-
Alexandru (Dataiker)
Hi,
We recommend deploying new bundles from the Design node to the Automation node frequently, so that the two copies of the project stay as close together as possible.
This can be automated with the Python API, scenario steps, or a CI/CD pipeline.
Python API:
https://developer.dataiku.com/latest/api-reference/python/project-deployer.html#dataikuapi.dss.projectdeployer.DSSProjectDeployer.create_deployment
CI/CD:
https://knowledge.dataiku.com/latest/mlops-o16n/ci-cd/tutorial-jenkins-pipeline-project-deployer.html
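As a rough illustration of the Python API route, here is a minimal sketch that publishes a bundle from the Design node and rolls it out through the Project Deployer. It assumes `client` behaves like a `dataikuapi.DSSClient` connected to the Design node, and that the methods used (`export_bundle`, `publish_bundle`, `get_projectdeployer`, `create_deployment`, `get_deployment`, `get_settings`, `start_update`) match your DSS version; please check them against the API reference linked above. The function name, the `first_time` flag, and all IDs are hypothetical.

```python
def deploy_bundle(client, project_key, bundle_id, infra_id, deployment_id, first_time=False):
    """Publish a bundle and deploy it via the Project Deployer (sketch, not official code).

    `client` is assumed to behave like a dataikuapi.DSSClient on the Design node.
    """
    project = client.get_project(project_key)

    # Create the bundle on the Design node, then publish it to the Project Deployer.
    project.export_bundle(bundle_id)
    project.publish_bundle(bundle_id)

    deployer = client.get_projectdeployer()
    if first_time:
        # First rollout: create the deployment on the target infrastructure.
        deployment = deployer.create_deployment(deployment_id, project_key, infra_id, bundle_id)
    else:
        # Later rollouts: point the existing deployment at the new bundle.
        deployment = deployer.get_deployment(deployment_id)
        settings = deployment.get_settings()
        settings.get_raw()["bundleId"] = bundle_id  # assumed key name in the raw settings
        settings.save()

    # Push the bundle to the Automation node and wait for the update to finish.
    update = deployment.start_update()
    update.wait_for_result()
    return deployment
```

In practice you would build the client with something like `dataikuapi.DSSClient(host, api_key)` and call `deploy_bundle(...)` from a scenario step or a CI/CD job, using a new bundle ID for each release.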
Kind Regards,
-
I think the best approach is to consistently use project packages and Dataiku version control: develop and test on the design node, then create a package for deployment on the automation node. Keeping both nodes synchronized through Git or regular package deployments minimizes drift and reduces the number of manual fixes.
