API to check managed SQL dataset schema consistency

Solved!
ocean_rhythm
Level 3

Looking for an API to do the above.

This can be done manually by going to a managed SQL dataset > settings > connection > test, or by going to dataset > settings > schema > check now.

Judging from the browser developer console, one of the following two private APIs appears to be called internally:

/dip/api/datasets/managed-sql/test/
/dip/api/datasets/test-schema-consistency
 
1 Solution
ocean_rhythm
Level 3
Author

The API actually exists, and I was able to use it successfully:

1. Instantiate the project flow with get_flow().

2. Call flow.start_tool(type="CHECK_CONSISTENCY") (docs).

3. Call tool.update(options) to get a future. Note that options are required; I used:

options = {
    "recheckAll": True,
    "datasets": {"consistencyWithData": True},
    "recipes": {"schemaConsistency": True, "otherExpensiveChecks": True},
}

4. Call future.wait_for_result(), which returns a dict of results.

5. Parse the results and voilà!
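
The steps above can be strung together roughly as follows. This is only a sketch: the helper name check_flow_consistency is made up for illustration, the project key in the usage comment is a placeholder, and the shape of the returned dict is not described in this thread, so the function simply hands it back unparsed.

```python
def check_flow_consistency(project, options=None):
    """Run the flow's CHECK_CONSISTENCY tool and return the raw result dict.

    `project` is expected to be a project handle exposing get_flow(),
    as used in the steps above. The helper name itself is hypothetical.
    """
    if options is None:
        # Options copied from the post above; they are required by update().
        options = {
            "recheckAll": True,
            "datasets": {"consistencyWithData": True},
            "recipes": {"schemaConsistency": True, "otherExpensiveChecks": True},
        }

    flow = project.get_flow()                          # step 1
    tool = flow.start_tool(type="CHECK_CONSISTENCY")   # step 2
    future = tool.update(options)                      # step 3
    return future.wait_for_result()                    # step 4


# Usage inside DSS (assumes the `dataiku` package is importable and that
# "MY_PROJECT" is replaced with a real project key):
#
#   import dataiku
#   project = dataiku.api_client().get_project("MY_PROJECT")
#   results = check_flow_consistency(project)
#   print(results.keys())   # step 5: inspect, then parse as needed
```

Taking the project handle as a parameter keeps the helper easy to call for many projects in a loop, which is the kind of automation asked about in this thread.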

10 Replies
Turribeach

I don't believe these are available in the API. How badly do you want them? There might be a way to execute them, but it will need some work and it won't be supported.

There is a Flow Actions > Check Consistency option in the Flow, which will run against all the datasets.

ocean_rhythm
Level 3
Author

@Turribeach yes, would like an API or automated method if at all possible.

We're building automation around validations, and manual clicks in the flow are likely to be missed when there are so many projects and developers.

Turribeach

So I did get the SQL test connection private API working, which is not available via the public API. It might be possible to get these ones working. But first you should raise it with Dataiku Support to confirm these are not available via the public API. Then raise the Product Idea on the community site. Then we can discuss how to take it forward. 

ocean_rhythm
Level 3
Author

Hi @Turribeach - I raised a Product Idea here https://community.dataiku.com/t5/Product-Ideas/Expose-public-API-to-test-schema-consistency-of-manag...

There might be some slight overlap with your idea to test connection https://community.dataiku.com/t5/Product-Ideas/Expose-Public-API-to-test-SQL-connections/idi-p/34644

Dataiku Support confirmed that the API doesn't exist today:

https://support.dataiku.com/support/tickets/56243


Let me know how to proceed.

Much appreciated!

WH

Turribeach

Yes, the Test Connection idea is totally related (you didn't vote for it though; just click on the Up arrow to vote). A few months ago I started trying to use this API (see this post) and eventually I got it working. I will post it tomorrow. But you will need to see if you can make it work for these APIs.

ocean_rhythm
Level 3
Author

Many thanks @Turribeach, looking forward to testing your method.

ocean_rhythm
Level 3
Author

@Turribeach wondering if you had a chance to post about using the API yet? 

Turribeach

Hi, apologies for the delay. I haven't forgotten, but I need to rewrite this at home, so it's taking a bit of time. I will post back here when done.

gauviv
Level 2

I used your solution:

tool = flow.start_tool(type='CHECK_CONSISTENCY')
options = {
    "recheckAll": False,
    "datasets": {"consistencyWithData": True},
    "recipes": {"schemaConsistency": False, "otherExpensiveChecks": False},
}
future = tool.update(options)
results = future.wait_for_result()

But I still got check results for "recipes", with their state shown as checked. I would like to avoid this behavior. Do you have any idea how?
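
One way to narrow this down is to look at where recipe entries actually appear in the returned dict before deciding how to filter them out. The layout of that dict is not documented in this thread, so the sketch below makes no assumption about it: find_key_paths is a hypothetical, generic helper that walks any nested dict/list structure, and the sample data is invented purely to show the call.

```python
def find_key_paths(node, needle, path=()):
    """Return the paths (tuples of keys and list indices) to every dict key
    whose name contains `needle`, case-insensitively."""
    matches = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = path + (key,)
            if needle.lower() in str(key).lower():
                matches.append(child_path)
            # Keep descending so nested occurrences are reported too.
            matches.extend(find_key_paths(value, needle, child_path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            matches.extend(find_key_paths(value, needle, path + (index,)))
    return matches


# Invented sample, NOT the real output format of the consistency check:
sample = {
    "datasets": {"orders": {"state": "checked"}},
    "recipes": {"compute_orders": {"state": "checked"}},
}
for found in find_key_paths(sample, "recipe"):
    print(" > ".join(str(part) for part in found))
```

Running the same call on the real results dict shows which branches mention recipes, which can then be dropped when parsing.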
