
python api to import hdfs

skandagn
Level 2
How do I import a dataset from an HDFS connection using the Python API? I have a set of datasets to import and don't want to do it manually through the UI.

 

Thanks,

Skanda


Operating system used: Linux

JordanB
Dataiker

Hi @skandagn,

We recommend accessing the HDFS connection through a Dataiku managed folder: create a managed folder that points to your HDFS connection, then use the DSS Python API to read from and write to the files stored in that folder on HDFS.
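As a rough illustration of the managed-folder approach, here is a minimal sketch that reads every CSV file in a folder into memory. It assumes the files are UTF-8 CSVs with a header row; the helper name `read_csv_files` and the folder name `hdfs_input` are hypothetical. It relies only on the `list_paths_in_partition()` and `get_download_stream()` methods of `dataiku.Folder`, so please check it against your DSS version before using it.

```python
import csv
import io


def read_csv_files(folder, suffix=".csv"):
    """Return {path: list of row dicts} for each CSV file in a managed folder.

    `folder` is expected to behave like a dataiku.Folder object, i.e. to
    provide list_paths_in_partition() and get_download_stream(path).
    """
    tables = {}
    for path in folder.list_paths_in_partition():
        if not path.endswith(suffix):
            continue  # skip files that are not CSVs
        # Read the whole file as bytes, then decode (assumes UTF-8).
        with folder.get_download_stream(path) as stream:
            data = stream.read()
        reader = csv.DictReader(io.StringIO(data.decode("utf-8")))
        tables[path] = list(reader)
    return tables


# Inside a DSS Python recipe or notebook (the `dataiku` package is only
# available there), usage would look roughly like this:
#
#   import dataiku
#   folder = dataiku.Folder("hdfs_input")  # hypothetical folder name
#   tables = read_csv_files(folder)
#   for path, rows in tables.items():
#       print(path, len(rows))
```

Passing the folder object in as an argument keeps the helper independent of the `dataiku` import, so the same function can loop over several folders when you have many inputs to bring in.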
Please refer to the following link for a Python usage example:
 
Please let us know if you have any questions.
 
Thanks!
Jordan