Using Dataiku
- Just the title. I've never seen this, but I've only been using Dataiku for a short time. Anyone else deal with this? Operating system used: Windows
Solution by LouisDHulst
Hi @ldj002e,
This can happen when your Prepare recipe has invalid steps. Your second-to-last recipe step seems to be causing the issue here.
- Hi, I would like to know if it is possible to write unit tests for Dataiku API code (in Python) using the API Designer, in order to check if it works according to the desired functionality before the…
Solution by Alexandru
Hi @afostor,
There is no specific way to do this in the API endpoint. Usually, testing is done using the test queries in the API Designer directly.
If you want to perform additional tests, you can write a unit test directly in the API endpoint code and only invoke it when the param perform_test is true:

import unittest
import traceback

# Insert initialization code here

class TestApiPyFunction(unittest.TestCase):
    def test_api_py_function(self):
        # Test case 1
        result = api_py_function(2, 3, 4)
        self.assertEqual(result, 2 + 3 * 4)  # Expected: 2 + 3 * 4 = 14

        # Test case 2 (intentional failure)
        result = api_py_function(0, 5, 10)
        self.assertEqual(result, 0 + 5 * 10 + 1)  # Expected: 0 + 5 * 10 + 1 = 51

        return "All tests passed successfully."


def api_py_function(param1, param2, param3, perform_test=False):
    result = param1 + param2 * param3
    if perform_test:
        try:
            # Run the test cases defined in the TestApiPyFunction class above
            test_result = TestApiPyFunction().test_api_py_function()
        except AssertionError:
            tb = traceback.format_exc()
            return result, tb
        return result, "All tests passed successfully."
    return result
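As a usage sketch of the snippet above: calling the function with perform_test=True runs the embedded tests and returns their outcome alongside the computed result. Given the intentional failure in test case 2, the outcome here is a traceback rather than the success message:

result, status = api_py_function(2, 3, 4, perform_test=True)
print(result)  # 14
print(status)  # assertion traceback from the intentionally failing test case 2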
- Hi, I am getting an error while doing the TESTING QUERIES on the API Designer for a Python function endpoint. Failed: Failed to run function : class 'Exception' : No DSS URL or API key found from any loc…
Solution by Turribeach
You should use the dataikuapi package, since you are going to call the Dataiku API from the API node, and that's considered outside DSS:
import dataikuapi

dataiku_url = 'https://dss_url/'
dataiku_api_key = 'API key'
external_client = dataikuapi.DSSClient(dataiku_url, dataiku_api_key)

# Only needed if you are using SSL and need to ignore the SSL cert validation
# external_client._session.verify = False

project_handle = external_client.get_project('your_project_key')
project_variables = project_handle.get_variables()
Also, once you are outside DSS, you need to set a project key before you can retrieve the project variables, since your code doesn't run inside a project the way recipes, jobs, and scenarios do in DSS. Finally, your incorrectly named method getcustom_variables() (the correct name is get_custom_variables()) is called get_variables() in the dataikuapi package.
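As a side note on get_variables(): in the dataikuapi package it returns the variable definitions as a dict keyed by 'standard' and 'local', rather than the resolved values that dataiku.get_custom_variables() gives you inside DSS. A minimal sketch, where the variable name is hypothetical:

# get_variables() returns the definitions, keyed by 'standard' and 'local'
variables = project_handle.get_variables()
my_value = variables['standard'].get('my_variable')  # 'my_variable' is hypothetical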
- Can you control which flow should be run each time? I have 3 flows that I want to run based on the input: if the input dataset is empty, it should be skipped and run the flow where the…
Last answer by Turribeach
Here is another approach using a metric, again without having to use any Python code.
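For readers who do want code, a rough Python approximation of the same metric check might look like this. The URL, API key, dataset name, and scenario IDs are all hypothetical, and the exact shape of the compute_metrics() result may vary by DSS version:

import dataikuapi

# Rough sketch: pick which scenario (flow branch) to run based on the
# record count of the input dataset. All names here are hypothetical.
client = dataikuapi.DSSClient('https://dss_url/', 'API key')
project = client.get_project('YOUR_PROJECT_KEY')

dataset = project.get_dataset('input_dataset')
computation = dataset.compute_metrics(metric_ids=['records:COUNT_RECORDS'])
# The result structure below is an assumption; inspect `computation` on your version
count = int(computation['result']['computed'][0]['value'])

scenario_id = 'RUN_EMPTY_BRANCH' if count == 0 else 'RUN_MAIN_BRANCH'
project.get_scenario(scenario_id).run_and_wait()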
- One of our users is getting this error when trying to open datasets in a project. Somehow when he refreshes the browser a couple of times, it works out, and sometimes it does not. The user belon…
Last answer by Alexandru
Hi @TheMLEngineer,
It's unclear why refreshing the browser a few times would make it work. The only way I can think of getting "Action forbidden on cluster" is if interactions with the dataset require interactive SQL with Spark, which is configured to use a cluster the user does not have permission on.
Please open a support ticket with the instance diagnostics, and let us know the affected username/datasets in the ticket, along with any relevant screenshots. Ideally, please generate the instance diagnostics right after the user reproduces the issue.
Thanks
- I am using the Dataiku Python API to create a code environment. I need to add a small bit of code to the Resource Initialization script, but it seems the `code_env.get_settings()` method offers no sup…
Solution by Sarina
Hi @MyNamesJames,
You can create a resources init script from the API like so:

import dataiku

client = dataiku.api_client()
code_env = client.get_code_env('PYTHON', '<CODE_ENV_NAME>')
settings = code_env.get_settings()
raw_settings = settings.get_raw()
raw_settings['resourcesInitScript'] = 'NEW SCRIPT CONTENTS'

# save the settings
settings.save()
Let us know if you have any questions about this.
Thank you,
Sarina
- Hi, I have a basic Python script that reads in a file and splits it (based on a supplier) into multiple CSV files, storing them in a Dataiku managed folder. I now want to convert this notebook…
Last answer by Turribeach
Please always post code using the code block (the </> icon in the toolbar), as otherwise the indentation is lost, and Python code cannot run without proper indentation.
There should be no reason why your code doesn't work in a recipe. Please post the error/issue/behavior that you see, as "unable to do so" doesn't really say much.
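Since the question is about moving a notebook that writes CSVs into a managed folder into a recipe, here is a minimal sketch of that pattern, assuming an input dataset and a managed folder as the recipe's output; the dataset, folder, and column names are all illustrative:

import dataiku

# Minimal sketch: split an input dataset by a column and write one CSV
# per group into a managed folder. All names are illustrative.
df = dataiku.Dataset('input_data').get_dataframe()
folder = dataiku.Folder('output_folder')  # managed folder name or id

for supplier, group in df.groupby('supplier'):
    # get_writer() streams a file into the folder, whatever its backend
    with folder.get_writer(f'{supplier}.csv') as writer:
        writer.write(group.to_csv(index=False).encode('utf-8'))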
- Hello, I have a requirement wherein I have to get the URL of the dashboard for a given Dataiku project through a Python recipe. How can this be done? Can someone please help me. Thanks, Varun
- Hi, I'm writing to the dataset as follows: dkuWriteDataset(model.df, "MasterDataADS"). How do we partition the data based on the column "Market"? Similarly, how to read a particular partition of the data…
Last answer by Alexandru
Hi @MNOP,
You can specify the partition:
https://doc.dataiku.com/dss/api/12/R/dataiku/reference/dkuWriteDataset.html

dkuWriteDataset(df, name, partition = "", schema = TRUE, convertLists = TRUE)
Note that if you are running as part of a recipe in the flow, it should inherit whatever partition is defined in the job running it; you only need to specify the partition in an R notebook, or if you are ignoring the flow and writing to other partitions.
- Is there a way to take an address (USA and Canada) and return what time zone that address is in?
Last answer by Turribeach
If you have latitude/longitude you can use this Python package to get the time zone:
https://pypi.org/project/timezonefinder/
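A minimal sketch with timezonefinder, which works offline; the coordinates below are just an example:

from timezonefinder import TimezoneFinder

# Look up the IANA time zone name for a latitude/longitude pair.
# The coordinates below (roughly Toronto) are only an example.
tf = TimezoneFinder()
tz_name = tf.timezone_at(lng=-79.38, lat=43.65)
print(tz_name)  # 'America/Toronto'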
If you don't, have a read at this page:
https://doc.dataiku.com/dss/latest/geographic/geocoding.html