Issue with reading dataset with Python

lchunleo Registered Posts: 10 ✭✭✭✭

Hi

I placed files in a managed folder and tried to read them, but it fails.

import dataiku

handle = dataiku.Folder("test") # test is the dataset name
print(handle) # prints <dataiku.core.managed_folder.Folder object at 0x7fb5f8723c90>

But when I tried:

paths = handle.get_info()

I got: Exception: None: b'Managed folder name not found: test in TESTRUN'

Needless to say, reading the dataset to get at the file I want to parse fails as well. Why is that? Please advise.

I also tried creating a Python recipe with a dummy output. The generated code reads the input as a dataframe, which is not what I want: my file is a MATLAB file and I need to do some preprocessing on it.

# Read recipe inputs
test = dataiku.Dataset("test")
test_df = test.get_dataframe()

Best Answers

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    Answer ✓

    Hi,

    The error message means that there is no managed folder named "test" in your TESTRUN project. Note also that Folder and Dataset are two different kinds of objects, so you can't access one with the wrapper for the other. In the screenshot below, a folder is on the left and a dataset is on the right.

    Screenshot 2020-10-07 at 07.08.03.png

    You should probably pass the folder id rather than the folder name to dataiku.Folder(...). Navigate to your test folder in the UI and, in your browser's location bar, pick the id, which is the random identifier after the managedfolder/ part. For example, in http://host:port/projects/TESTRUN/managedfolder/RywHS6LY/view/ the folder id is RywHS6LY.
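As a rough sketch of the steps above (not an official recipe): the helper names folder_id_from_url and load_mat_from_folder are made up for illustration, and the second one assumes it runs inside DSS with SciPy installed in the code environment. It relies on dataiku.Folder, list_paths_in_partition() and get_download_stream(), which are part of the managed folder API; the path you pass must be one of the paths that list_paths_in_partition() returns.

```python
import io


def folder_id_from_url(url):
    """Return the path segment that follows 'managedfolder' in a DSS URL."""
    parts = [p for p in url.split("/") if p]
    try:
        return parts[parts.index("managedfolder") + 1]
    except (ValueError, IndexError):
        raise ValueError("no managed folder id found in: " + url)


def load_mat_from_folder(folder_id, path):
    """Read one MATLAB file from a managed folder (only works inside DSS)."""
    # Imported lazily: the dataiku package only exists inside DSS, and
    # SciPy being available in the code environment is an assumption.
    import dataiku
    from scipy.io import loadmat

    folder = dataiku.Folder(folder_id)
    # List what the folder actually contains before trying to read from it
    print(folder.list_paths_in_partition())
    # Stream the file contents instead of going through a Dataset/dataframe
    with folder.get_download_stream(path) as stream:
        return loadmat(io.BytesIO(stream.read()))
```

For the URL above, folder_id_from_url(...) gives RywHS6LY, which is the value to pass to dataiku.Folder. Note that scipy.io.loadmat does not read MATLAB v7.3 files, so those would need a different reader.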

  • lchunleo
    lchunleo Registered Posts: 10 ✭✭✭✭
    Answer ✓

    Yes, I think I confused "dataset" and "folder". It works now that I have created the folder. Is there any documentation that explains the difference between the two in Dataiku? I may have missed it. Thank you.
