
issue with reading dataset with python


Hi

I placed some files in a managed folder and tried to read them, but it fails.

 

import dataiku

handle = dataiku.Folder("test") # test is the dataset name
print(handle) # this prints <dataiku.core.managed_folder.Folder object at 0x7fb5f8723c90>

But when I tried:

paths = handle.get_info()

I got: Exception: None: b'Managed folder name not found: test in TESTRUN'

Needless to say, reading the file from the dataset in order to parse it fails as well. Why is that? Please advise.

I also tried creating a Python recipe with a dummy output. The generated code reads the input as a dataframe, which is not what I want: my file is a MATLAB file and it needs some preprocessing first.

# Read recipe inputs
test = dataiku.Dataset("test")
test_df = test.get_dataframe() 

3 Replies
Dataiker

Hi,

The error message means that there is no managed folder named "test" in your TESTRUN project. Note also that Folder and Dataset are two different kinds of object, so you can't access one through the wrapper for the other. The screenshot below shows a folder on the left and a dataset on the right.

Screenshot 2020-10-07 at 07.08.03.png

You should probably use the folder id instead of the folder name in dataiku.Folder(...). Navigate to your test folder in the UI and look at the location bar of your browser: the id is the random identifier that follows the managedfolder/ part. For example, in http://host:port/projects/TESTRUN/managedfolder/RywHS6LY/view/ the folder id is RywHS6LY.
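As a rough sketch of the steps above (not an official recipe), the snippet below pulls the folder id out of such a URL and reads a file from the folder as raw bytes rather than as a dataframe. The helper names extract_folder_id, read_folder_file and read_all_files are made up for illustration. The only Dataiku calls used are dataiku.Folder(...), list_paths_in_folder() and get_download_stream(...), and the last helper only works where the dataiku package is importable, i.e. inside DSS.

```python
import io
import re


def extract_folder_id(url):
    """Return the identifier that follows 'managedfolder/' in a DSS URL."""
    match = re.search(r"/managedfolder/([^/?#]+)", url)
    if match is None:
        raise ValueError("no managed folder id found in URL: %s" % url)
    return match.group(1)


def read_folder_file(folder, path):
    """Read one file from a managed folder and return its raw bytes.

    `folder` is expected to be a dataiku.Folder; any object whose
    get_download_stream(path) returns a file-like context manager works too.
    """
    with folder.get_download_stream(path) as stream:
        return stream.read()


def as_file_object(data):
    """Wrap raw bytes in an in-memory file for parsers that expect a file."""
    return io.BytesIO(data)


def read_all_files(folder_id):
    """Hypothetical driver: read every file of the folder into a dict.

    Only usable inside DSS, so dataiku is imported lazily here.
    """
    import dataiku

    folder = dataiku.Folder(folder_id)
    return {
        path: read_folder_file(folder, path)
        for path in folder.list_paths_in_folder()
    }


# The URL from the explanation above yields the folder id.
folder_id = extract_folder_id(
    "http://host:port/projects/TESTRUN/managedfolder/RywHS6LY/view/"
)
print(folder_id)  # RywHS6LY
```

From there, the bytes of a MATLAB file could be handed to a parser of your choice, for instance scipy.io.loadmat(as_file_object(data)), assuming SciPy is installed in the code environment.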

 

 

Level 2
Author

Yes, I think I confused "dataset" with "folder". It works now that I have created the folder. Is there a page in the Dataiku documentation that explains the difference between the two? I may have missed it. Thank you.
