How to write an excel file in a folder with python where folder use sharepoint plugin
Hi,
Python script is not working when I want to write an excel file to a folder associated to sharepoint plugin.
Python fails to retrieve path needed to excelwriter.
path seems to not be part of folder_info structure of folder when using sharepoint plugin while it is on server's filesystem folder.
Here the code used :
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
import os
print ("FOLDER_INFO")
print (folder_info)
f_path = os.path.join(folder_info['path'] , file_name)
writer = pd.ExcelWriter(f_path,engine='xlsxwriter')
...
error is the following :
Job failed: Error in Python process: At line 24: <class 'KeyError'>: path
LOGS (using sharepoint plugin):
[2021/09/27-10:50:53.849] [null-out-64] [INFO] [dku.utils] - FOLDER_INFO
[2021/09/27-10:50:53.850] [null-out-64] [INFO] [dku.utils] - {'projectKey': 'LPC_PROD_OUT_REF_PREV', 'directoryBasedPartitioning': False, 'name': 'OUTPUT_FOLDER', 'id': 'yrztMCMF', 'accessInfo': {}, 'type': 'fsprovider_sharepoint-server_sharepoint-server_shared-documents'}
LOGS when it's working with server's filesystem folder:
[2021/09/23-15:29:38.739] [null-out-67] [INFO] [dku.utils] - FOLDER_INFO
[2021/09/23-15:29:38.740] [null-out-67] [INFO] [dku.utils] - {'path': '/data/dataiku/dss_design/managed_folders/LPC_PROD_OUT_REF_PREV/ivaCaknz', 'projectKey': 'LPC_PROD_OUT_REF_PREV', 'directoryBasedPartitioning': False, 'name': 'OUT', 'id': 'ivaCaknz', 'accessInfo': {'root': '/data/dataiku/dss_design/managed_folders/LPC_PROD_OUT_REF_PREV/ivaCaknz'}, 'type': 'Filesystem'}
Tell me if there's another way to export an excel file on a sharepoint folder ?
Thanks in advance for your help
Philippe
Best Answer
-
Hi Philippe,
Assuming a simple flow with a python recipe taking a dataset as an input and a folder as output
you can use the following code:
# -*- coding: utf-8 -*- import dataiku import pandas as pd from io import BytesIO import openpyxl # Read recipe inputs source = dataiku.Dataset("Source") source_df = source.get_dataframe() # First, save the data as excel format into a bytes string stream = BytesIO() excel_writer = pd.ExcelWriter(stream, engine='openpyxl') source_df.to_excel(excel_writer, sheet_name='Sheet1') excel_writer.save() # Rewind to the begining of the bytes string stream.seek(0) # Write recipe outputs, with the Dataiku API in case the folder is remote # (note: written this way, the recipe will also work if the target folder is local) target_folder = dataiku.Folder("<<put your folder reference here>>") with target_folder.get_writer("excelfile.xlsx") as writer: writer.write(stream.read())
Best regards,
Alex
Answers
-
Hi Philippe,
When accessing the filesystem folder, you are using a local filesystem. This is why you can obtain the files path and interact directly with it.
However, in the case of the SharePoint folder, you are interacting with a remote folder. This is done via a DSS plugin, and because of that you need to use the DSS API to do so. You can find more information here.
Note that the DSS API allows you to interact with both type of filesystems.
Alex
-
Thanks Alex for this answer.
Documentation links are confirming your message and it is clear to me we can not proceed remote folder the same way we proceed with local folder.
Then it doesn't seems easy to export an excel to a remote folder. It would be very usefull to have a sample ...Best regards,
Philippe
-
Hi Alex,
Sorry for the delay of my feedback : I was not able to adapt my script before.
Script needed many adaptations but your sample was very usefull.
At last it is working as expected.
Thanks a lot for your help,Philippe
-
how would this work with multiple excel files to be written to a folder?