Project duplicate modifies SharePoint files

SanderVW
SanderVW Registered Posts: 47 ✭✭✭✭

Hello, to migrate a project from our development environment to, for instance, our acceptance environment, we use a Python script that duplicates the project (and then changes the connections, configurations, etc.).
However, when we run this, any SharePoint files used in the flow get modified (within SharePoint itself). We're very confused as to why this is: there should not be any write permissions at all, and I believe a duplication should not interfere with the SharePoint environment in the first place.

We were able to narrow the cause down to the following code:
project = client.get_project("project_name")
project.duplicate()  # <-- the SharePoint files are overwritten during this call

The modification is a bit weird: all sheets but one (the first?) are removed from each file, even though the script contains no reference to duplicating SharePoint files, or even to reading from or writing to them.

Is anyone familiar with this issue or have an idea on how to resolve it?

Thank you in advance!


Operating system used: Windows

Best Answer

  • SanderVW
    SanderVW Registered Posts: 47 ✭✭✭✭
    Answer ✓

    I found the answer: in the duplication script, the duplication_mode parameter of duplicate() needs to be set to "NONE". With that setting, the input files are no longer modified.
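
For anyone landing here later, the fix above could be sketched roughly as follows. This is a minimal sketch, not the original script: the helper name `duplicate_without_data`, the target project key and name, and the host and API key in the usage comment are all hypothetical. It assumes the `dataikuapi` signature `duplicate(target_project_key, target_project_name, duplication_mode=...)`; check it against your DSS version.

```python
def duplicate_without_data(client, source_key, target_key, target_name):
    """Duplicate a DSS project with duplication_mode="NONE".

    `client` is expected to be a dataikuapi.DSSClient (or a compatible object).
    The helper name and its arguments are illustrative, not from the thread.
    """
    project = client.get_project(source_key)
    # Pass duplication_mode explicitly instead of relying on the default;
    # per the accepted answer, "NONE" stopped the input files being modified.
    return project.duplicate(
        target_key,
        target_name,
        duplication_mode="NONE",
    )


# Usage sketch (hypothetical host, API key, and target names):
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://dss.example.com:11200", "API_KEY")
#   duplicate_without_data(client, "project_name", "PROJECT_NAME_ACC",
#                          "project_name (acceptance)")
```

Passing the mode by keyword keeps the intent visible in the script, so the data-copy behaviour does not change silently if someone later edits the call.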

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,165 Neuron

    Irrespective of why the duplicate() method behaves this way, what you are doing is not best practice and you should move away from it. Projects should move between the different Automation nodes (Development, UAT, Prod) via the Deployer, where you can have all the connections remapped as part of the deployment configuration. If you want to automate this via the Python API, that’s fine: all of the Deployer’s capabilities are available via the API. But duplicating a project is not the right approach. Finally, with regard to SharePoint, the only way to achieve full separation between environments is to have different SharePoint areas for each environment. This in turn means you need to handle the data in those areas yourself as you move between environments. This can be handled in the site preferences of each DSS environment.

    Using the same SharePoint area for different Dataiku environments can cause you to miss bugs, or let data from one environment cross-contaminate another. It’s also dangerous for business users, as they don’t necessarily know which environment generated an output file, if that’s what you are doing.
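
As a rough illustration of the Deployer route described above, a bundle-based deployment through the Python API might look like the sketch below. The helper name `deploy_bundle`, the bundle id, and the deployment id are hypothetical. It assumes that a Project Deployer is configured, that the deployment already exists, and that the `dataikuapi` calls shown behave as in recent DSS versions. Connection remapping would live in the deployment settings and is not shown.

```python
def deploy_bundle(client, project_key, bundle_id, deployment_id):
    """Export a bundle, publish it to the Deployer, and update a deployment.

    `client` is expected to be a dataikuapi.DSSClient for the Design node.
    All identifiers passed in are placeholders chosen by the caller.
    """
    project = client.get_project(project_key)
    project.export_bundle(bundle_id)    # create the bundle on the Design node
    project.publish_bundle(bundle_id)   # push it to the Project Deployer

    deployer = client.get_projectdeployer()
    deployment = deployer.get_deployment(deployment_id)

    # Point the existing deployment at the new bundle and save the settings.
    settings = deployment.get_settings()
    settings.get_raw()["bundleId"] = bundle_id
    settings.save()

    # Apply the change on the target node and wait for it to finish.
    return deployment.start_update().wait_for_result()
```

The point of this shape is that the project is never duplicated: the same bundle moves between nodes, and each node applies its own remapping.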

  • SanderVW
    SanderVW Registered Posts: 47 ✭✭✭✭

    Thank you for your answer! Unfortunately, changing the process is not something we have time for, though I agree it's not best practice. We want (and need) to preserve the current method as much as possible, but we do need to resolve the issue of the files being modified. I should mention that our SharePoint already has separate environments.

    The migration between areas also should not affect SharePoint at all: nothing is being written to or read from SharePoint, yet duplicating the project somehow modifies all SharePoint files used in that project.


  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,165 Neuron

    Many thanks for posting back the solution for others to see and benefit.
