Error 429 while reading managed folder - SharePoint
Hi,
We are having some problems while trying to access a managed folder that points to a SharePoint location. I have re-authenticated the user accounts for SharePoint.
Error message -
Oops: an unexpected error occurred
Listing files failed, caused by: Exception: Error 429 (get_files)
HTTP code: 500, type: java.io.IOException
Best Answer
-
Hi !
Version 1.0.12 of the plugin has been released and should fix this issue.
Answers
-
Hi !
Error 429 occurs when too many requests have been made to SharePoint Online in a short amount of time. You can reduce the occurrence of this problem by using SSO as the form of authentication (option 2 in the plugin documentation).
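If the files are written from a Python recipe, another option is to space out and retry the calls yourself. Below is a minimal sketch of that idea, not something provided by the plugin: the folder id, the output path and the write_with_backoff helper are placeholders for illustration.

import time
import dataiku

out_folder = dataiku.Folder("your_folder_id")  # placeholder folder id

def write_with_backoff(path, data, max_attempts=5):
    # Retry the upload with an increasing delay between attempts,
    # which is the usual way of dealing with HTTP 429 (rate limiting).
    delay = 2
    for attempt in range(max_attempts):
        try:
            with out_folder.get_writer(path) as writer:
                writer.write(data)
            return
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(delay)
            delay = delay * 2  # back off a little more after each failure

# Example call with a placeholder path and payload
write_with_backoff("/reports/output.xlsx", b"...")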
-
Hi,
I am already using SSO as the auth method, via Presets. When I try to write files to a SharePoint location in a loop, I always get this error (the point at which it is thrown varies).
Exception: None: b"Failed to write data : <class 'sharepoint_client.SharePointClientError'> : Error 429 (create_folder)"
Thanks.
-
Could you give us more details about your setup, so that we can reproduce the problem on our side? In particular, how do you loop over the writes? Is it done from Python code? What are the size and number of the files, and the delay between two writes?
-
Yes, it is from Python code; I write 1000+ files.
The loop goes something like this -

import time
import dataiku
import pandas as pd
from io import BytesIO

out_folder = dataiku.Folder("odbID")
for currentCountry in country1:
    mydataset_df_Country = mydataset_df[mydataset_df['COUNTRY'] == currentCountry]
    compCodeFiltered = mydataset_df_Country['COMPANY CODE'].unique()
    # Looping through the company codes in each country
    for compCode in compCodeFiltered:
        mydataset_df_CompCode = mydataset_df_Country[mydataset_df_Country['COMPANY CODE'] == compCode]
        fiscalYrFiltered = mydataset_df_CompCode['COMMENTARY YEAR'].unique()
        # Looping through each year present for the company
        for fiscalYrCurrent in fiscalYrFiltered:
            mydataset_df_FiscalYr = mydataset_df_CompCode[mydataset_df_CompCode['COMMENTARY YEAR'] == fiscalYrCurrent]
            sourceQuarterFiltered = mydataset_df_FiscalYr['SOURCE_QUARTER'].unique()
            # Looping through each quarter within the year to create separate output files
            for quarters in sourceQuarterFiltered:
                df = mydataset_df_FiscalYr[mydataset_df_FiscalYr['SOURCE_QUARTER'] == quarters].copy()
                # formatting
                stream = BytesIO()
                excel_writer = pd.ExcelWriter(stream, engine='xlsxwriter')
                # more formatting
                excel_writer.save()
                # Rewind to the beginning of the byte stream
                stream.seek(0)
                with out_folder.get_writer("/file/%s/%s/%d/static folder/%s_Test_%d_%d_Output.xlsx"
                                           % (currentCountry, compCode, fiscalYrCurrent, compCode, quarters, fiscalYrCurrent)) as writer:
                    print("Time is ", time.time())
                    writer.write(stream.read())
                time.sleep(3)
-
Thank you for sharing this code. Could you also give me a rough idea of the average Excel file size once stored on SharePoint? With this I should have enough information to reproduce the issue on our side...
-
Hi,
The files are under 50 kB once stored on SharePoint.
Thanks
-
Hi,
Were you able to recreate the scenario, or do you have any pointers towards a solution for us?
Thanks.
-
Hi !
Yes, we managed to recreate the problem on our side, and a fix is currently under review. The new version of the plugin should soon be available on the store.
-
Hi !
The latest version of the plugin (1.0.11) is finally available on the store. It should fix error 429 related issues during large folder uploads.
-
Hi,
Thanks for the plugin version update. It has mostly solved the issue, but we are regularly encountering a different error. Stack trace below -
Activity failed com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 268: <class 'dataikuapi.utils.DataikuException'>: com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to write data : <class 'json.decoder.JSONDecodeError'> : Expecting value: line 1 column 1 (char 0)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileIfPossible(JobExecutionResultHandler.java:106)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:39)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:34)
    at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:75)
    at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:57)
    at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:72)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
Any ideas would help.
-
Any ideas on this?
-
Hi,
- Can I assume this is the same setup? (SSO, writing a large number of files to a SharePoint folder using a Python script?)
- How frequently do you have this problem, and what is the overall usage of the script generating the problem (number of files transferred, how often, and average size of the files)?
-
Hi again,
On rare occasions, the SharePoint Online API has temporary problems and sends back an error page in HTML format instead of the expected JSON, which is what produces the JSONDecodeError above. We will get the plugin to handle this in its next version. In the meantime, you can add an exception handler, or some retry logic, in your script. So where you currently have:
writer.write(stream.read())
you can have instead:
successful = False
attempts = 0
while not successful and attempts < 3:
    attempts = attempts + 1
    try:
        stream.seek(0)  # rewind so that a retry does not write an empty payload
        writer.write(stream.read())
        successful = True
    except Exception:
        time.sleep(5)
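One detail to keep in mind: stream.read() consumes the BytesIO buffer, so the stream has to be rewound with stream.seek(0) at the start of each attempt, otherwise a second attempt would write an empty file.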
Hope this helps,
Alex