Bulk replace a connection for datasets

GirishLakade
Level 1
Hello 

We want to import multiple projects, and while importing them into a new environment we would like to bulk replace the connection from "file system" to "HDFS".
The standard export/import only allows choosing or replacing connections of the same type, but in our case we would like to change the connection type itself.
Since there are multiple projects, each with multiple datasets, doing this manually would be very time-consuming and error-prone.


I thought of modifying the bundle zip file, eventually changing the dataset and recipe references to switch the connection type.
But the moment we unzip and re-zip, without modifying ANYTHING, the import fails with an error saying the bundle zip is not valid.

What kind of ZIP format is Dataiku using? Why does the import fail if all I did was unzip and re-zip the exact same project content/bundle, without making a single change?

Any help here would be greatly appreciated!

Regards,

Girish 

AlexT
Dataiker

Hi Girish,

The error is most likely caused by an extra root directory in the re-created zip, which means we are unable to find the manifest.json at the top level of the archive.

By default, extracting the zip places its contents under a root directory named PROJECTID.

When you re-create the zip, make sure to exclude that parent directory. One way of doing this would be:

cd PROJECTID
zip -r ../PROJECTIDv2.zip .

This should allow you to reimport the project export zip. Hope this helps!
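
If the zip command line tool is not available (for example on Windows), the same idea can be sketched in Python with only the standard library. This is a minimal sketch, not an official Dataiku tool: the function names rezip_without_parent and has_root_entry are made up for illustration, and the only thing taken from this thread is that manifest.json must sit at the root of the archive rather than under a PROJECTID folder.

```python
import zipfile
from pathlib import Path


def rezip_without_parent(src_dir, out_zip):
    """Zip the *contents* of src_dir so that entries sit at the archive root.

    Hypothetical helper: names are stored relative to src_dir, so no extra
    parent directory (e.g. PROJECTID/) is added. Empty directories are skipped.
    """
    src = Path(src_dir).resolve()
    out = Path(out_zip).resolve()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.resolve() == out:
                continue  # do not add the archive to itself if it lives in src_dir
            if path.is_file():
                # as_posix() keeps forward slashes in entry names on every OS
                zf.write(path, path.relative_to(src).as_posix())
    return out


def has_root_entry(zip_path, name="manifest.json"):
    """Return True if `name` is stored at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return name in zf.namelist()
```

For example, rezip_without_parent("PROJECTID", "PROJECTIDv2.zip") followed by has_root_entry("PROJECTIDv2.zip") gives a quick check before attempting the import. Whether the import then succeeds still depends on the archive contents being otherwise unchanged.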

Manuel
Dataiker

Even if there are multiple projects, at the project level it is fairly easy to change the connection in bulk:

  • In your flow,
  • (bottom left) View > Connections > connection > Select all checked
  • (bottom right) Other Actions > Change Connection

See the image below. 

I hope this helps.
