Bulk replace a connection for datasets

GirishLakade Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 1 ✭✭✭✭

Hello

We want to import multiple projects and, while importing them into a new environment, bulk replace the connection from "file system" to "HDFS".
The standard export/import only lets us replace a connection with another one of the same type, but in our case we need to change the connection type itself.
Since there are multiple projects, each with multiple datasets, doing this manually would be very time-consuming and error-prone.


I thought of modifying the bundle zip file, that is, editing the dataset and recipe references to change the connection type.
But the moment we unzip and re-zip it, without modifying ANYTHING, the import fails with an error saying that the bundle zip is not valid.

What kind of ZIP format does Dataiku use? Why does the import fail when all I did was unzip and re-zip the exact same project content/bundle, without making a single change?

Any help here would be greatly appreciated!

Regards,

Girish

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
    edited July 17

    Hi Girish,

    The error is most likely caused by an extra root directory in the new zip, so we are unable to find manifest.json at the top level of the archive.

    When you extract the zip, the contents land by default in a root directory named PROJECTID.

    When you recreate the zip, make sure to exclude that parent directory. One way of doing this is:

    cd PROJECTID
    zip -r PROJECTIDv2.zip ./*

    This should allow you to reimport the project export zip. Hope this helps!
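
For anyone scripting this across many projects, the same idea can be sketched in Python with the standard library only. This is a minimal sketch, not a Dataiku tool: the helper names `zip_without_parent` and `top_level_entries` are made up for illustration, and the only thing taken from the answer above is that manifest.json must end up at the root of the archive rather than under PROJECTID/.

```python
import zipfile
from pathlib import Path


def zip_without_parent(src_dir, zip_path):
    """Zip the contents of src_dir so that entries sit at the archive root."""
    src = Path(src_dir).resolve()
    target = Path(zip_path).resolve()
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself in case it is
            # being written inside src_dir.
            if path.is_file() and path != target:
                # Names are stored relative to src_dir, so no PROJECTID/
                # prefix is added in front of manifest.json.
                zf.write(path, path.relative_to(src).as_posix())


def top_level_entries(zip_path):
    """Return the set of first path components found in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return {name.split("/", 1)[0] for name in zf.namelist()}
```

Before importing, checking that `"manifest.json" in top_level_entries("PROJECTIDv2.zip")` holds is a quick way to confirm that the parent directory was not included.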

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Even if there are multiple projects, at the project level it is fairly easy to change the connection in bulk:

    • In your flow,
    • (bottom left) View > Connections > connection > Select all checked
    • (bottom right) Other Actions > Change Connection

    See the image below.

    I hope this helps.
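
If the same change has to be repeated across many projects, it could also be scripted. The sketch below is hedged: it assumes that a dataset definition is a dict with a top-level "type" and a "params" dict holding "connection", and that "Filesystem" and "HDFS" are the relevant type names. The connection names and both helper names are hypothetical. Changing the type does not move any data, and other parameters (paths, formats) may still need adjusting by hand.

```python
def retarget_dataset(raw, old_connection, new_connection, new_type):
    """Point one dataset definition (a dict) at another connection, in place.

    Returns True if the definition was changed, False if it was left alone.
    The "type" / "params" / "connection" keys are an assumed layout.
    """
    params = raw.get("params") or {}
    if params.get("connection") != old_connection:
        return False  # dataset uses some other connection; do not touch it
    raw["type"] = new_type
    params["connection"] = new_connection
    return True


def retarget_project(client, project_key, old_connection, new_connection, new_type):
    """Apply retarget_dataset to every dataset of a project.

    `client` is expected to behave like a dataikuapi DSSClient; this part
    needs a live DSS instance and should be tried on a test project first.
    """
    project = client.get_project(project_key)
    changed = []
    for item in project.list_datasets():
        settings = project.get_dataset(item["name"]).get_settings()
        if retarget_dataset(settings.get_raw(), old_connection,
                            new_connection, new_type):
            settings.save()
            changed.append(item["name"])
    return changed
```

A call might look like `retarget_project(client, "PROJECTID", "filesystem_managed", "hdfs_managed", "HDFS")`, where both connection names are placeholders for whatever exists on the target instance.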
