
Moving a file from a notebook environment into a managed folder

Solved!
ben_p
Neuron

Hi everyone,

I have some Python code in a notebook that generates a PDF, and I would like to copy this PDF into a managed folder. I think I want to use the function `put_file()`, but I cannot get it to work!

I have followed the docs to this stage...

handle = dataiku.Folder("acquisition_reports")
handle.put_file("tuto1.pdf","tuto1.pdf")

But I get this error:

AttributeError: 'Folder' object has no attribute 'put_file'

Where am I going wrong?

The file generated in my notebook is called tuto1.pdf.

Thanks in advance,
Ben

HarizoR
Dataiker

Hi Ben,

If you want to use a `dataiku.Folder` object, the appropriate method to upload a file to the managed folder is `upload_stream()`, as documented here.
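As an illustration of the `upload_stream()` route, here is a minimal sketch rather than a definitive recipe. It assumes `upload_stream()` takes a destination path and a file-like object. The helper `upload_local_file` and the `RecordingFolder` class are hypothetical names: the class only stands in for `dataiku.Folder` so the snippet can run outside DSS.

```python
import os
import tempfile


def upload_local_file(folder, local_path, remote_path):
    """Send a local file to a managed folder through upload_stream()."""
    # "rb" keeps the PDF as raw bytes; no text decoding is attempted.
    with open(local_path, "rb") as f:
        folder.upload_stream(remote_path, f)


class RecordingFolder:
    """Hypothetical stand-in for dataiku.Folder so the sketch runs outside DSS."""

    def __init__(self):
        self.files = {}

    def upload_stream(self, path, f):
        # Keep whatever bytes the caller streamed in, keyed by destination path.
        self.files[path] = f.read()


# Inside DSS this would be: folder = dataiku.Folder("acquisition_reports")
folder = RecordingFolder()

with tempfile.TemporaryDirectory() as workdir:
    # Write a small fake PDF so there is something to upload.
    local_pdf = os.path.join(workdir, "tuto1.pdf")
    with open(local_pdf, "wb") as out:
        out.write(b"%PDF-1.4\n\x9c demo bytes")
    upload_local_file(folder, local_pdf, "tuto1.pdf")

print(sorted(folder.files))
```

In a real notebook, the stand-in would be replaced by the actual folder handle, and the destination path is relative to the managed folder.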

Best,

Harizo

ben_p
Neuron
Author

@HarizoR thanks for your response! 

I tried the following:
with open("tuto1.pdf") as f:
   handle.upload_stream("tuto1.pdf", f)

But got the error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9c in position 139: invalid start byte

I also wondered whether I could just use `upload_file`, but this generated the same error:
handle.upload_file("acquisition_reports/tuto1.pdf","tuto1.pdf")

What am I doing wrong? 🙂

HarizoR
Dataiker

Hi Ben,

Maybe your upload needs the file handled as raw bytes: opening the PDF without "rb" makes Python try to decode it as text, which is what raises the UnicodeDecodeError. Can you try:

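import io

# `folder` is the managed folder handle, e.g. dataiku.Folder("acquisition_reports")
with open("path/to/local.pdf", "rb") as f:  # "rb": read raw bytes, no text decoding
    stream = io.BytesIO(f.read())
    folder.upload_stream("path/to/managed/folder/file.pdf", stream.getvalue())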
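To see why the file mode matters, here is a small self-contained sketch that does not touch Dataiku at all. It writes a few bytes standing in for a PDF (the payload is made up for illustration), then reads them back in text mode and in binary mode. The text-mode read passes `encoding="utf-8"` explicitly, as an assumption that mirrors the codec named in the error message above.

```python
import os
import tempfile

# Stand-in for a PDF: a header plus a byte (0x9c) that is not valid UTF-8 on its own.
payload = b"%PDF-1.4\n\x9c\x00binary data"

with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(payload)
    pdf_path = tmp.name

try:
    # Text mode: Python tries to decode the bytes as UTF-8 and fails on 0x9c.
    try:
        with open(pdf_path, encoding="utf-8") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode: the bytes come back untouched.
    with open(pdf_path, "rb") as f:
        binary_content = f.read()
finally:
    os.remove(pdf_path)

print(type(text_mode_error).__name__)
print(binary_content == payload)
```

The same reasoning applies to any binary format (images, archives, spreadsheets): the file should be opened with "rb" before its contents are handed to an upload call.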

Best,

Harizo

ben_p
Neuron
Author

Thanks @HarizoR, that worked! My image is getting trimmed somewhere, but that seems to happen before the PDF is copied, so it can't be this process affecting it.

Thank you for your help!
Ben
