
Error 429 while reading managed folder - Sharepoint

Solved!
zeno_11
Level 3
 

Hi,

We are having problems accessing a managed folder that points to a SharePoint location. I have already reauthenticated the user accounts for SharePoint.

Error message - 

 

 Oops: an unexpected error occurred
Listing files failed, caused by: Exception: Error 429 (get_files)
Please see our options for getting help

HTTP code: 500, type: java.io.IOException

 

 

 

0 Kudos
1 Solution
AlexB
Dataiker
Dataiker

Hi !

Version 1.0.12 of the plugin has been released and should fix this issue.


14 Replies
AlexB
Dataiker
Dataiker

Hi !

Error 429 occurs when too many requests have been made to SharePoint Online in a short amount of time. You can reduce the occurrence of this problem by using SSO as the form of authentication (option 2 in the plugin documentation).
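
For scripts that talk to a throttled service directly, a common way to cope with a 429 is to wait and retry, using the server's suggested delay if one is provided and an exponentially growing delay otherwise. The sketch below only illustrates that pattern: `ThrottledError` and `call_with_backoff` are hypothetical names invented for this example and are not part of the SharePoint plugin or of Dataiku.

```python
import time


class ThrottledError(Exception):
    """Hypothetical error a client could raise when the server answers 429."""

    def __init__(self, retry_after=None):
        super().__init__("Error 429")
        # Delay in seconds suggested by the server, or None if it gave no hint.
        self.retry_after = retry_after


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on ThrottledError wait and retry, doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            if err.retry_after is not None:
                delay = err.retry_after  # prefer the server's hint
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.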

0 Kudos
zeno_11
Level 3
Author

Hi,

I am already using SSO as the auth method, via presets. When I try to write files to a SharePoint location in a loop, I always get this error (the point at which it is thrown varies).

 Exception: None: b"Failed to write data : <class 'sharepoint_client.SharePointClientError'> : Error 429 (create_folder)"

Thanks. 

0 Kudos
AlexB
Dataiker
Dataiker

Could you give us more details about your setup, so that we can reproduce the problem on our side? In particular, how do you loop the write? Is it done from Python code? What are the size and number of the files, and the delay between two writes?

0 Kudos
zeno_11
Level 3
Author
Yes, it is from Python code; I write 1000+ files.

The loop goes something like this -

out_folder = dataiku.Folder("odbID")
for currentCountry in country1:
    mydataset_df_Country = mydataset_df[mydataset_df['COUNTRY'] == currentCountry]
    compCodeFiltered = mydataset_df_Country['COMPANY CODE'].unique()
    # Looping through the company codes in each country
    for compCode in compCodeFiltered:
        mydataset_df_CompCode = mydataset_df_Country[mydataset_df_Country['COMPANY CODE'] == compCode]
        fiscalYrFiltered = mydataset_df_CompCode['COMMENTARY YEAR'].unique()
        # Looping through each year present for the company
        for fiscalYrCurrent in fiscalYrFiltered:
            mydataset_df_FiscalYr = mydataset_df_CompCode[mydataset_df_CompCode['COMMENTARY YEAR'] == fiscalYrCurrent]
            sourceQuarterFiltered = mydataset_df_FiscalYr['SOURCE_QUARTER'].unique()
            # Looping through each quarter within the year to create separate output files
            for quarters in sourceQuarterFiltered:
                df = mydataset_df_FiscalYr[mydataset_df_FiscalYr['SOURCE_QUARTER'] == quarters].copy()
                # formatting
                stream = BytesIO()

                excel_writer = pd.ExcelWriter(stream, engine='xlsxwriter')
                # more formatting
                excel_writer.save()

                # Rewind to the beginning of the bytes string
                stream.seek(0)

                with out_folder.get_writer("/file/%s/%s/%d/static folder/%s_Test_%d_%d_Output.xlsx" % (currentCountry, compCode, commentaryYear, compCode, sourceQuarter, commentaryYear)) as writer:
                    print("Time is ", time.time())
                    writer.write(stream.read())
                time.sleep(3)

0 Kudos
AlexB
Dataiker
Dataiker

Thank you for sharing this code. Could you also give me a rough idea of the average Excel file size once stored on SharePoint? With this I should have enough information to reproduce the issue on our side...

0 Kudos
zeno_11
Level 3
Author

Hi,

The files are under 50 kB once stored on Sharepoint.

Thanks

0 Kudos
zeno_11
Level 3
Author

Hi,

Were you able to recreate the scenario or do you have any pointers to the solution for us?

Thanks.

0 Kudos
AlexB
Dataiker
Dataiker

Hi !

Yes, we managed to recreate the problem on our side and a fix is currently under review. The new version of the plugin should soon be available on the store.

 

AlexB
Dataiker
Dataiker

Hi !

The latest version of the plugin (1.0.11) is finally available on the store. It should fix the error 429 issues that occur during large folder uploads.

 

0 Kudos
zeno_11
Level 3
Author

Hi, 

Thanks for the version update to the plugin. It has mostly solved the issue, but we are regularly encountering a different error. Stack trace below -

 

Activity failed
com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 268: <class 'dataikuapi.utils.DataikuException'>: com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to write data : <class 'json.decoder.JSONDecodeError'> : Expecting value: line 1 column 1 (char 0)
	at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileIfPossible(JobExecutionResultHandler.java:106)
	at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:39)
	at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:34)
	at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
	at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:75)
	at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:57)
	at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:72)
	at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)

 Any ideas will help.

0 Kudos
zeno_11
Level 3
Author

Are there any ideas?

0 Kudos
AlexB
Dataiker
Dataiker

Hi,

- Can I assume this is the same setup? (SSO, writing a large number of files to a SharePoint folder using a Python script?)

- How frequently do you have this problem, and what is the overall usage of the script generating it (number of files transferred, how often, and average size of the files)?

 

0 Kudos
AlexB
Dataiker
Dataiker

Hi again,

In rare instances, the SharePoint Online API has temporary problems and sends back an error page in HTML format instead of the expected JSON. We will get the plugin to handle this in its next version. In the meantime, you can probably add exception handling, or a retry, in your script. So where you currently have:

writer.write(stream.read())

you can have instead:

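data = stream.read()  # read once, so that every retry writes the same bytes
successful = False
attempts = 0
while not successful and attempts < 3:
    attempts = attempts + 1
    try:
        writer.write(data)
        successful = True
    except Exception:
        # wait a little before the next attempt
        time.sleep(5)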
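
The same retry idea can be wrapped in a small helper. This is only a sketch under two assumptions: that opening a fresh writer for each attempt is acceptable in your setup, and that the file content fits in memory as a bytes object, so every attempt sends the same data. `write_with_retry` and `open_writer` are hypothetical names made up for this example, not part of the Dataiku or plugin API.

```python
import time


def write_with_retry(open_writer, data, max_attempts=3, delay=5, sleep=time.sleep):
    """Write data through a fresh writer, retrying when an exception is raised.

    open_writer is a hypothetical zero-argument callable that returns a
    context manager exposing a write() method.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Errors raised while writing or while closing the writer are both caught.
            with open_writer() as writer:
                writer.write(data)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(delay)
```

In the loop from earlier in this thread, it could be called along the lines of `write_with_retry(lambda: out_folder.get_writer(path), stream.getvalue())`, where `path` stands for the target file path. If the last attempt also fails, the exception is re-raised so the job still reports the failure.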

 Hope this helps,

Alex

0 Kudos
AlexB
Dataiker
Dataiker

Hi !

Version 1.0.12 of the plugin has been released and should fix this issue.