project duplicate modifies sharepoint files

Solved!
SanderVW
Level 3
Hello, to migrate our development environment to, for instance, our acceptance environment, we use a Python script that duplicates the project (and then changes the connections, configurations, etc.).
However, when we run this, any SharePoint files used in the flow get modified (within SharePoint itself). We're very confused as to why this happens: there should not be any write permissions at all, and a duplication should not interfere with the SharePoint environment anyway, I believe.

We were able to narrow the cause down to the following code:
project = client.get_project("project_name")
project.duplicate()  # <-- here SharePoint files are being overwritten

The modification is a bit weird: all sheets but one (the first?) are removed, even though nothing in the script refers to duplicating SharePoint files, or even to reading from or writing to them.

Is anyone familiar with this issue, or does anyone have an idea how to resolve it?

Thank you in advance!


Operating system used: Windows

1 Solution
SanderVW
Level 3
Author

I found the answer: in the duplication script, the duplication_mode parameter needs to be set to "NONE". That way, the input files are no longer modified.
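
For anyone landing here later, the fix can be sketched roughly as below. This is only an illustration, not the exact script from this thread: the helper name duplicate_without_data, the target key and name, and the host and API key in the usage comment are all made up, and the keyword names target_project_key and target_project_name are how I understand duplicate() in the dataikuapi client, so please check them against your DSS version.

```python
def duplicate_without_data(project, target_project_key, target_project_name):
    """Duplicate a DSS project while asking DSS not to copy dataset data.

    `project` is expected to be a project handle such as the one returned
    by client.get_project(...). The helper name is hypothetical.
    """
    # duplication_mode="NONE" is the setting reported in this thread to
    # stop the SharePoint input files from being modified.
    return project.duplicate(
        target_project_key=target_project_key,    # assumed keyword name
        target_project_name=target_project_name,  # assumed keyword name
        duplication_mode="NONE",
    )


# Usage (needs the dataikuapi package and a reachable DSS instance;
# host, API key and target names are placeholders):
#
#   import dataikuapi
#   client = dataikuapi.DSSClient("https://dss.example.com", "YOUR_API_KEY")
#   project = client.get_project("project_name")
#   duplicate_without_data(project, "PROJECT_NAME_ACC", "project_name (acceptance)")
```

Wrapping the call in a small helper keeps the duplication mode in one place, so the rest of the migration script (remapping connections and so on) does not need to change.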

4 Replies
Turribeach

Irrespective of why the duplicate() method behaves this way, what you are doing is not best practice and you should move away from it. Projects should move between the different Automation nodes (Development, UAT, Prod) via the Deployer, where you can have all the connections remapped as part of the deployment configuration. If you want to automate this via the Python API, that's all good: all of the capabilities of the Deployer are available via the API. But duplicating a project is not the right approach. Finally, with regard to SharePoint, the only way to achieve full separation between environments is to have different SharePoint areas for each environment. This in turn means you need to handle the data in those areas yourself as you move between environments. This can be handled in the site preferences of each DSS environment.

Using the same SharePoint area for different Dataiku environments can cause you to miss bugs or to have data from one environment cross-contaminate another environment. It's also dangerous for business users, as they don't necessarily know which environment generates an output file, if that's what you are doing.

SanderVW
Level 3
Author

Thank you for your answer! Unfortunately, changing the process is not something we can do (we don't have the time available), though I agree it's not best practice. We want/need to preserve the current method as much as possible, but we need to resolve the issue of the files being modified. I can mention that SharePoint already has separate environments.

The migration between areas also should not impact SharePoint at all: nothing is being written to or read from SharePoint, yet duplicating the project somehow modifies all SharePoint files used in that project.


Turribeach

Many thanks for posting back the solution for others to see and benefit. 
