Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Added on September 26, 2018 1:01PM
Likes: 0
Replies: 6
I'm trying to use the SQLExecutor2 in Python and am unable to get it to work in any form. I have tried multiple Python environments with no luck. The code below produces an error that the table can't be found, even though it very clearly exists.
from dataiku.core.sql import SQLExecutor2
test_dataset = dataiku.Dataset("companies_for_sim_0")
test_query = """
SELECT
qx_company_name_pidx
FROM ERM_MASTER_PORTFOLIO_companies_for_sim_0
WHERE qx_company_name_pidx = 222263"""
executor = SQLExecutor2(dataset="companies_for_sim_0")
#executor = SQLExecutor2(connection="internal_envelop_postgresql")
executor.query_to_df(query=test_query)
If I use
SQLExecutor2.exec_recipe_fragment(output_dataset=dataiku.Dataset("test"), query=test_query)
then I get the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-42-e1ed12e378ef> in <module>()
11 #executor = SQLExecutor2(connection="internal_envelop_postgresql")
12 #executor.query_to_df(query=test_query)
---> 13 SQLExecutor2.exec_recipe_fragment(output_dataset=dataiku.Dataset("test"), query=test_query)
~/dataiku-dss-5.0.1/python/dataiku/core/sql.py in exec_recipe_fragment(output_dataset, query, pre_queries, post_queries, overwrite_output_schema, drop_partitioned_on_schema_mismatch)
163 overwrite_output_schema=True,
164 drop_partitioned_on_schema_mismatch=False):
--> 165 spec = json.loads(os.environ["DKUFLOW_SPEC"])
166 jek_void_call("sql/execute-partial-query-recipe",
167 data={
~/dss_data/code-envs/python/alex-temp/lib/python3.5/os.py in __getitem__(self, key)
723 except KeyError:
724 # raise KeyError with the original key value
--> 725 raise KeyError(key) from None
726 return self.decodevalue(value)
727
KeyError: 'DKUFLOW_SPEC'
Any ideas?
Ok I got the query_to_df call to work by adding escaped " around the table name like so:
test_query = """
SELECT
qx_company_name_pidx
FROM \"ERM_MASTER_PORTFOLIO_companies_for_sim_0\"
WHERE qx_company_name_pidx = 222263"""
I still get the same
KeyError: 'DKUFLOW_SPEC'
for SQLExecutor2.exec_recipe_fragment(output_dataset="test", query=test_query), though, which is the one I really need to use (the actual dataset is far too big to load into memory).