How to upload a folder into the Dataiku library

Solved!
epsi95
Level 3

I am using the Dataiku library to add custom packages.

I have a few constraints:

  • I can't use Git.
  • My packages contain a long list of subpackages, like:

package
    |--subpackage 1
    |--subpackage 2
    |--subpackage 3

and so on.

 

Currently, I create the folders and upload the files manually. This is very tedious: every time I change some files I have to re-upload all of them, because I don't keep track of which files changed.

 

Questions:

  1. Why can't we upload a folder? It seems odd to me that Dataiku does not allow uploading a folder to the library. Can anyone help me with this?
  2. Can I write a Python script, or use FTP, to upload or create the files and folders on Dataiku?

Operating system used: Windows
1 Solution
VitaliyD
Dataiker

Hi,

Unfortunately, as you mentioned, it is not currently possible to upload a folder to the libraries. I see this feature request in our backlog, but we can't provide a timeline for when it might be implemented.

If your instance is not UIF (User Isolation Framework) enabled, you can try writing a script in a Python notebook to do it manually. Please proceed with caution, though: it is never a good idea to modify files in the data_dir directory directly, because if you break something it may lead to downtime (so make sure you have a backup), and this is certainly not a recommended way of doing it.

You can reach the project's lib directory as shown below and then add your folder there:

import os
import dataiku
# Project key and DSS data directory, read from the DSS custom variables
project_name = dataiku.get_custom_variables()["projectKey"]
dip_home = dataiku.get_custom_variables()["dip.home"]
# Library directory of the current project inside the data directory
libraries_path = os.path.join(dip_home, 'config/projects/' + project_name + '/lib')
print(libraries_path)
print(os.listdir(libraries_path))

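
As a rough illustration of the "add your folder there" step, here is a minimal sketch that copies a package tree into a destination directory while keeping the subpackage layout. The helper name copy_package_tree is hypothetical (it is not part of the dataiku package), and it assumes the source folder is already on the DSS server and that the notebook process can write to the destination, so it will not work with UIF enabled. The same caution about touching the data directory applies.

```python
import os
import shutil


def copy_package_tree(src_dir, dest_dir):
    """Copy every file under src_dir into dest_dir, keeping the folder layout.

    Hypothetical helper: existing files with the same relative path are
    overwritten. Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel_root = os.path.relpath(root, src_dir)
        target_root = dest_dir if rel_root == "." else os.path.join(dest_dir, rel_root)
        # Create the matching (sub)folder even if it holds no files
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            shutil.copy2(os.path.join(root, name), os.path.join(target_root, name))
            copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return sorted(copied)
```

With libraries_path from the snippet above, a call could look like copy_package_tree("/some/server/path/package", os.path.join(libraries_path, "package")), where the source path is a placeholder.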

Hope this helps.

-Best.

6 Replies
epsi95
Level 3
Author

Hi @VitaliyD, thanks for the reply. What I am trying to do is upload files from my computer to Dataiku. I can access the Dataiku URL since my computer and Dataiku are on the same network. I am using the `dataikuapi` library; can you guide me on how to proceed with it?

```python
import dataikuapi

# Set Dataiku URL and API Key
host = "https://xxxx:xxxxxx"
apiKey = "xxxxxx"

# Create API client
client = dataikuapi.DSSClient(host, apiKey)

# Ignore SSL checks as these may fail without access to root CA certs
client._session.verify = False

project = client.get_project("xxxx")
```

VitaliyD
Dataiker

Hi, you can learn how to use the Dataiku API remotely from this guide. At a high level, the complete solution looks like this: from your local computer, use the Dataiku API remotely to upload a zipped library to a DSS managed folder on the local filesystem. Then, on the DSS side, write either a macro (as a plugin developer) or a Python notebook that copies the file from the managed folder to the project library directory and unzips it there using the os and zipfile Python packages (I described how to find the project library path earlier).
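
For the local side, a minimal sketch could look like the one below. The helper names zip_package and upload_zip are hypothetical; the upload relies on project.get_managed_folder(...) and put_file(name, f) from dataikuapi, and assumes the managed folder already exists and that your API key is allowed to write to it.

```python
import os
import zipfile


def zip_package(src_dir, zip_path):
    """Zip src_dir so that the archive contains the package folder itself.

    Hypothetical helper. zip_path should live outside src_dir, otherwise
    the archive would try to include itself.
    """
    src_dir = os.path.abspath(src_dir)
    parent = os.path.dirname(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "package/sub/mod.py"
                zf.write(full_path, os.path.relpath(full_path, parent))
    return zip_path


def upload_zip(project, folder_id, zip_path):
    """Upload zip_path to a managed folder through a dataikuapi project handle."""
    folder = project.get_managed_folder(folder_id)
    with open(zip_path, "rb") as f:
        folder.put_file(os.path.basename(zip_path), f)
```

With the project handle from your snippet, the call would be along the lines of upload_zip(project, "folderID", zip_package("package", "package.zip")), where the folder ID and file names are placeholders.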

You can find a managed folder's filesystem path with the code below:

import dataiku
folder = dataiku.Folder("folderID")  # replace with your managed folder ID
folder_path = folder.get_path()  # filesystem path of the folder on the DSS server
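
The unzip step on the DSS side can then be sketched roughly as follows. The helper name extract_zip_into is hypothetical, and the check on archive entries is an extra precaution I added, not something DSS requires. It assumes the notebook user has write access to the destination directory.

```python
import os
import zipfile


def extract_zip_into(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing entries that escape dest_dir.

    Hypothetical helper. Returns the sorted list of archive member names.
    """
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            target = os.path.realpath(os.path.join(dest_root, member))
            # Reject members such as "../x.py" that would land outside dest_dir
            if target != dest_root and not target.startswith(dest_root + os.sep):
                raise ValueError("Unsafe path in archive: " + member)
        zf.extractall(dest_root)
        return sorted(zf.namelist())
```

Using the names from this thread, the call would be something like extract_zip_into(os.path.join(folder_path, "package.zip"), libraries_path), where package.zip is a placeholder for whatever file you uploaded.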

I hope this helps.

-Best

epsi95
Level 3
Author

Hi @VitaliyD, I am getting the following error. It seems I need permission to read from and write to this specific directory. So far I have successfully uploaded the zipped file to the filesystem folder and unzipped it. Once I get the permission, I will overwrite the folders in the library.

```python
---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
<ipython-input-5-7ebe77cfbc2d> in <module>()
      4 libraries_path = os.path.join(dip_home, 'config/projects/' + project_name + '/lib')
      5 print(libraries_path)
----> 6 print(os.listdir(libraries_path))

PermissionError: [Errno 13] Permission denied: '/app/dataiku_design/dssdata-9.0.3/config/projects/xxxx/lib'
```

VitaliyD
Dataiker

Hi, this most likely means that User Isolation (UIF) is enabled on your instance. In that case, the DSS user running this Python code cannot modify any DSS or system directories unless the permissions are changed manually, which defeats the purpose of enabling UIF in the first place. That may be acceptable on a test instance, but it is certainly not recommended for production. You can check whether UIF is enabled in the DSS settings (Administration > Settings > Login (LDAP, SSO) & Security).

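
If you just want to confirm from the notebook whether the user running the code can reach the library directory, without hitting the traceback, a small check with os.access is one option. This is only a sketch: the helper name check_dir_access is hypothetical, and it reports permissions without changing anything.

```python
import os


def check_dir_access(path):
    """Report whether the current process can see, list, and write to path.

    Hypothetical helper. "exists" can also be False when a parent
    directory is not traversable by the current user.
    """
    return {
        "exists": os.path.isdir(path),
        "readable": os.access(path, os.R_OK | os.X_OK),
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

Calling check_dir_access(libraries_path) with the path built earlier in the thread returns a small dict you can print before attempting any copy.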

-Best

epsi95
Level 3
Author

Yes, User Isolation is enabled. I have asked the admin to give me read and write access to that specific library path.
