Error 429 while reading managed folder - SharePoint

zeno_11 (Registered, Posts: 15)

Hi,

We are having problems accessing a managed folder that points to a SharePoint location. I have already reauthenticated the SharePoint user accounts.

Error message:

 Oops: an unexpected error occurred
Listing files failed, caused by: Exception: Error 429 (get_files)
Please see our options for getting help

HTTP code: 500, type: java.io.IOException

Answers

  • AlexB (Dataiker, Posts: 67)

    Hi !

Error 429 occurs when too many requests have been made to SharePoint Online in a short amount of time. You can reduce the occurrence of this problem by using SSO as the authentication method (option 2 in the plugin documentation).
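
    If the requests come from your own code, you can also pace them and back off when a 429 comes back. Below is a minimal, generic sketch (not part of the plugin; `request_fn` is a placeholder for whatever SharePoint call you are making):

    import time

    def call_with_backoff(request_fn, max_attempts=5):
        # Retry a throttled call, doubling the wait after each failure.
        delay = 1
        for attempt in range(max_attempts):
            try:
                return request_fn()
            except Exception:  # e.g. an error wrapping HTTP 429
                if attempt == max_attempts - 1:
                    raise  # give up after the last attempt
                time.sleep(delay)
                delay *= 2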

  • zeno_11 (Registered, Posts: 15)

    Hi,

I am already using SSO as the authentication method via Presets. When I write files to a SharePoint location in a loop, I always get this error (the point at which it is thrown varies).

     Exception: None: b"Failed to write data : <class 'sharepoint_client.SharePointClientError'> : Error 429 (create_folder)"

    Thanks.

  • AlexB (Dataiker, Posts: 67)

Could you give us more details about your setup, so that we can reproduce the problem on our side? In particular, how do you loop the write? Is it done from Python code? What are the size and number of the files, and the delay between two writes?

  • zeno_11 (Registered, Posts: 15)
    Yes, it is from Python code; I write 1000+ files.

    The loop goes something like this:

    import time
    from io import BytesIO

    import dataiku
    import pandas as pd

    # country1 and mydataset_df are defined earlier in the recipe
    out_folder = dataiku.Folder("odbID")
    for currentCountry in country1:
        mydataset_df_Country = mydataset_df[mydataset_df['COUNTRY'] == currentCountry]
        compCodeFiltered = mydataset_df_Country['COMPANY CODE'].unique()
        # Looping through the company codes in each country
        for compCode in compCodeFiltered:
            mydataset_df_CompCode = mydataset_df_Country[mydataset_df_Country['COMPANY CODE'] == compCode]
            fiscalYrFiltered = mydataset_df_CompCode['COMMENTARY YEAR'].unique()
            # Looping through each year present for the company
            for fiscalYrCurrent in fiscalYrFiltered:
                mydataset_df_FiscalYr = mydataset_df_CompCode[mydataset_df_CompCode['COMMENTARY YEAR'] == fiscalYrCurrent]
                sourceQuarterFiltered = mydataset_df_FiscalYr['SOURCE_QUARTER'].unique()
                # Looping through each quarter within the year to create separate output files
                for quarters in sourceQuarterFiltered:
                    df = mydataset_df_FiscalYr[mydataset_df_FiscalYr['SOURCE_QUARTER'] == quarters].copy()
                    # formatting

                    stream = BytesIO()
                    excel_writer = pd.ExcelWriter(stream, engine='xlsxwriter')
                    # more formatting
                    excel_writer.save()

                    # Rewind to the beginning of the bytes buffer
                    stream.seek(0)

                    with out_folder.get_writer("/file/%s/%s/%d/static folder/%s_Test_%d_%d_Output.xlsx" % (currentCountry, compCode, fiscalYrCurrent, compCode, quarters, fiscalYrCurrent)) as writer:
                        print("Time is ", time.time())
                        writer.write(stream.read())
                    time.sleep(3)

  • AlexB (Dataiker, Posts: 67)

Thank you for sharing this code. Could you also give me a rough idea of the average Excel file size once stored on SharePoint? With this I should have enough information to reproduce the issue on our side.

  • zeno_11 (Registered, Posts: 15)

    Hi,

The files are under 50 kB once stored on SharePoint.

    Thanks

  • zeno_11 (Registered, Posts: 15)

    Hi,

    Were you able to recreate the scenario or do you have any pointers to the solution for us?

    Thanks.

  • AlexB (Dataiker, Posts: 67)

    Hi !

Yes, we managed to recreate the problem on our side, and a fix is currently under review. The new version of the plugin should soon be available on the store.

  • AlexB (Dataiker, Posts: 67)

    Hi !

The latest version of the plugin (1.0.11) is finally available on the store. It should fix error 429 related issues during large folder uploads.

  • zeno_11 (Registered, Posts: 15)

    Hi,

Thanks for the plugin version update. It has mostly solved the issue, but we are regularly encountering a different error. Stack trace below:

    Activity failed
    com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 268: <class 'dataikuapi.utils.DataikuException'>: com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to write data : <class 'json.decoder.JSONDecodeError'> : Expecting value: line 1 column 1 (char 0)
     at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileIfPossible(JobExecutionResultHandler.java:106)
     at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:39)
     at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.throwFromErrorFileOrLogs(JobExecutionResultHandler.java:34)
     at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
     at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:75)
     at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:57)
     at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:72)
     at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)

    Any ideas will help.

  • zeno_11 (Registered, Posts: 15)

Any ideas on this?

  • AlexB (Dataiker, Posts: 67)

    Hi,

    - Can I assume this is the same setup? (SSO, writing a large number of files to a SharePoint folder using a Python script?)

    - How frequently do you encounter this problem, and what is the overall usage of the script generating it (number of files transferred, how often it runs, and the average file size)?

  • AlexB (Dataiker, Posts: 67)

    Hi again,

In rare instances, the SharePoint Online API has temporary problems and sends back an error page in HTML format instead of the expected JSON. We will get the plugin to handle this in its next version. In the meantime, you can add some exception handling, or a retry, to your script. So where you currently have:

    writer.write(stream.read())

    you can have instead:

    successful = False
    attempts = 0
    while not successful and attempts < 3:
        attempts += 1
        try:
            # Rewind in case a previous attempt already consumed the stream
            stream.seek(0)
            writer.write(stream.read())
            successful = True
        except Exception:
            # Wait a bit before retrying, so the transient API error can clear
            time.sleep(5)
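
    Note that catching `Exception` here will also swallow unrelated errors, so in practice you may want to catch a narrower exception and re-raise after the final attempt; increasing the sleep on each retry (as in the backoff sketch earlier in this thread) also tends to play better with SharePoint throttling.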

    Hope this helps,

    Alex
