Python SQLExecutor2 fails

John_Kelly · ‎09-26-2018

I'm trying to use the SQLExecutor2 in Python and am unable to get it to work in any form. I have tried multiple Python environments with no luck. The code below produces an error that the table can't be found, even though it very clearly exists.


from dataiku.core.sql import SQLExecutor2

test_dataset = dataiku.Dataset("companies_for_sim_0")
test_query = """
SELECT
    qx_company_name_pidx
FROM ERM_MASTER_PORTFOLIO_companies_for_sim_0
WHERE qx_company_name_pidx = 222263"""

executor = SQLExecutor2(dataset="companies_for_sim_0")
#executor = SQLExecutor2(connection="internal_envelop_postgresql")
executor.query_to_df(query=test_query)

If I use


SQLExecutor2.exec_recipe_fragment(output_dataset=dataiku.Dataset("test"), query=test_query)

then I get the following error:


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-42-e1ed12e378ef> in <module>()
     11 #executor = SQLExecutor2(connection="internal_envelop_postgresql")
     12 #executor.query_to_df(query=test_query)
---> 13 SQLExecutor2.exec_recipe_fragment(output_dataset=dataiku.Dataset("test"), query=test_query)

~/dataiku-dss-5.0.1/python/dataiku/core/sql.py in exec_recipe_fragment(output_dataset, query, pre_queries, post_queries, overwrite_output_schema, drop_partitioned_on_schema_mismatch)
    163                                     overwrite_output_schema=True,
    164                                     drop_partitioned_on_schema_mismatch=False):
--> 165         spec = json.loads(os.environ["DKUFLOW_SPEC"])
    166         jek_void_call("sql/execute-partial-query-recipe",
    167             data={

~/dss_data/code-envs/python/alex-temp/lib/python3.5/os.py in __getitem__(self, key)
    723         except KeyError:
    724             # raise KeyError with the original key value
--> 725             raise KeyError(key) from None
    726         return self.decodevalue(value)
    727 

KeyError: 'DKUFLOW_SPEC'

Any ideas?

Thomas · ‎09-26-2018

exec_recipe_fragment is only running in Python recipes from the Flow - maybe are you trying to use it from a Notebook ?

View solution in original post

nicog · ‎09-26-2018

Let's imagine there's no issue with the connexions settings (otherwise, try also: conn = SQLExecutor2('cnxname')), is the output dataset named "test" exists ?

John_Kelly · ‎09-26-2018

Yes "test" exists and the existence of test doesn't matter in the first place for the executor.query_to_df(test_query) call. And yes I have also tried creating the executor with the connection name instead of the dataset name, as can be seen in the commented out line of code.

John_Kelly · ‎09-26-2018

Ok I got the query_to_df call to work by adding escaped " around the table name like so:


test_query = """
SELECT
    qx_company_name_pidx
FROM \"ERM_MASTER_PORTFOLIO_companies_for_sim_0\"
WHERE qx_company_name_pidx = 222263"""

I still get the same


KeyError: 'DKUFLOW_SPEC'

for SQLExecutor2.exec_recipe_fragment(output_dataset="test", query=test_query), though, which is the one I really need to use (the actual dataset is far too big to load into memory).

Thomas · ‎09-26-2018

You may be trying to use exec_recipe_fragment from a Jupyter notebook ? It is only supported in Python recipes though (in the Flow directly).

Thomas · ‎09-26-2018

exec_recipe_fragment is only running in Python recipes from the Flow - maybe are you trying to use it from a Notebook ?

John_Kelly · ‎09-27-2018

Yes, that was it. Thank you!

Sign up to take part

Python SQLExecutor2 fails

Python SQLExecutor2 fails

Labels