Python API equivalent to "CHECK NOW" button

JohnB Registered Posts: 32 ✭✭✭✭✭

The UI provides a "CHECK NOW" button on the schema tab of the dataset settings.

Can this be run in code?

Answers

  • arnaudde
    arnaudde Dataiker Posts: 52 Dataiker
    edited July 17

    Hello,
    You can achieve the same as the dataset schema tab "CHECK NOW" button with the following code sample:

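    import dataiku

    # Connect to DSS and locate the dataset
    client = dataiku.api_client()
    p = client.get_project('PROJECT_KEY')
    # Run the same check as the "CHECK NOW" button and wait for it to finish
    future = p.get_dataset('DATASET_NAME').test_and_detect()
    future.wait_for_result()
    try:
        warningLevel = future.get_result()['format']['schemaDetection']['warningLevel']
        print('Warning level: ' + warningLevel)
    except KeyError:
        # A missing key is treated as "no schema warning"
        print('No warning')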

    The schemaDetection dict contains further information, such as the reasons for the warning in "textReasons" and the type of the warning in "type".
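A minimal sketch of reading those fields in one place, assuming the result has the shape used above. The helper name `summarize_schema_detection` is hypothetical, the sample values are made up for illustration, and treating "textReasons" as a list of strings is an assumption:

```python
def summarize_schema_detection(result):
    """Return (warning_level, warning_type, reasons) from a test_and_detect() result.

    Assumes the layout used above: result['format']['schemaDetection'].
    Missing keys yield None (or an empty list) instead of raising KeyError.
    """
    detection = (result.get('format') or {}).get('schemaDetection') or {}
    return (
        detection.get('warningLevel'),
        detection.get('type'),
        detection.get('textReasons') or [],
    )


# Hypothetical result dict, for illustration only; real values come from DSS.
sample = {'format': {'schemaDetection': {
    'warningLevel': 'WARN',
    'type': 'EXAMPLE_TYPE',
    'textReasons': ['example reason'],
}}}

level, kind, reasons = summarize_schema_detection(sample)
print(level, kind, reasons)
```

With a real dataset you would pass `future.get_result()` instead of `sample`.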

    I hope it helps,
    Arnaud

  • JohnB
    JohnB Registered Posts: 32 ✭✭✭✭✭

    Hi arnaudde,

    Thanks for the response.

    Will this work with custom-defined SQL Server datasets?

    I get this exception on test_and_detect():

    DataikuException: java.lang.ClassCastException: Cannot cast com.dataiku.dip.datasets.sql.ManagedSQLTableDatasetTestHandler to com.dataiku.dip.datasets.sql.ExternalSQLDatasetTestHandler

    I actually raised a support ticket on a similar question, and the response was that this is not currently possible with the API.

    I would need something that also emulates the "Reload schema from table" button when a warning is found.

  • arnaudde
    arnaudde Dataiker Posts: 52 Dataiker
    edited July 17

    Hi,

    Do you get the same "java.lang.ClassCastException" exception when using the "CHECK NOW" button?

    You can achieve the same as the "Reload schema from table" button by extracting the detected schema from the result of the "test_and_detect()" method and setting it on the dataset with the DSSDataset.set_schema method:

    detected_schema = future.get_result()['format']['schemaDetection']['detectedSchema']
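Putting the two steps together, a rough sketch could look like the following. The function name `reload_schema_if_warning` is hypothetical, and it assumes the result dict has the layout shown above and that "detectedSchema" is in the form `set_schema` accepts. It does not address the ClassCastException reported above for SQL Server datasets:

```python
def reload_schema_if_warning(dataset, result):
    """Apply the detected schema to `dataset` when the check reported a warning.

    `dataset` must expose set_schema(schema), as DSSDataset does.
    `result` is the dict obtained from the test_and_detect() future.
    Returns True if a schema was applied, False otherwise.
    """
    detection = (result.get('format') or {}).get('schemaDetection') or {}
    if 'warningLevel' not in detection:
        return False  # no warning reported, leave the schema untouched
    detected_schema = detection.get('detectedSchema')
    if detected_schema is None:
        return False  # warning without a detected schema, nothing to apply
    dataset.set_schema(detected_schema)
    return True


# Intended use inside DSS (not executed here):
#   dataset = client.get_project('PROJECT_KEY').get_dataset('DATASET_NAME')
#   future = dataset.test_and_detect()
#   future.wait_for_result()
#   reload_schema_if_warning(dataset, future.get_result())
```

Keeping the DSS calls outside the function makes the decision logic easy to check with a plain dict before running it against a real dataset.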

    Hope it helps,
