Using Dataiku
- Hi, is there a way to check with the Python API whether a processor in a Prepare recipe is compatible (or not) with the "In-database" engine? Attached: the information I would like to get, but in the UI. Tha… Last answer by Sarina
Hi @PaulBavaz,
There isn't a way to check whether a specific prepare recipe processor is compatible with the "in-database" engine in the same way that get_engines_details() provides engine compatibility information at the recipe level. For an existing prepare recipe, get_selected_engine_details() returns a statusMessage field that contains a single error message if the prepare recipe has one:
recipe = project.get_recipe('recipe_name')
status = recipe.get_status()
status.get_selected_engine_details()['statusMessage']
However, if multiple processors within a prepare recipe are returning an error, there isn't a way to pull each processor and its corresponding error message. In the example above, note that multiple processors are "Not translatable to SQL", but only one message is returned.
So it might make the most sense to refer to the lists of in-database fully supported processors and partially supported processors, and even hardcode these if that works for your use case.
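As a rough illustration of the hardcoding idea, here is a minimal sketch. It assumes each prepare step is available as a dict with a "type" key (shaped like the "steps" list of a prepare recipe's payload); the function name is hypothetical, and the processor names in SQL_COMPATIBLE are placeholders rather than the real compatibility list, which would need to be copied from the documentation.

```python
# Placeholder set: fill it in from the documented list of processors
# supported by the in-database (SQL) engine. These names are illustrative.
SQL_COMPATIBLE = {"ColumnRenamer", "ColumnsSelector", "FilterOnValue"}


def find_untranslatable_steps(steps, supported=SQL_COMPATIBLE):
    """Return (position, type) for each step whose type is not in `supported`."""
    return [
        (position, step.get("type"))
        for position, step in enumerate(steps)
        if step.get("type") not in supported
    ]


# Hypothetical steps, shaped like the "steps" list of a prepare recipe payload.
example_steps = [
    {"type": "ColumnRenamer"},
    {"type": "PythonUDF"},
    {"type": "FilterOnValue"},
]
print(find_untranslatable_steps(example_steps))
```

Unlike statusMessage, this reports every step that is missing from the hardcoded set, at the cost of having to keep that set in sync with the documentation yourself.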
Let me know if you have any further questions about this.
Thanks,
Sarina
- Hi All, I have a Python package in the form of an .egg file and wanted to understand the steps to use it as a package in DSS. Thank you for your support. Last answer by Alexandru
Python .egg files are deprecated and have been replaced by the wheel format. Since DSS uses pip, which does not support installing the egg format, you won't be able to add egg packages directly from DSS.
You should be able to convert your .egg file to a wheel: https://wheel.readthedocs.io/en/stable/reference/wheel_convert.html
Once converted, you should be able to use pip to install it.
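For reference, the conversion step could be scripted along these lines. This is a sketch that assumes the wheel package is installed in the Python environment running it; the helper names and the example file name are hypothetical.

```python
import subprocess
import sys


def build_convert_command(egg_path, dest_dir="."):
    """Build the `wheel convert` command line for one .egg file."""
    return [sys.executable, "-m", "wheel", "convert", "--dest-dir", dest_dir, egg_path]


def convert_egg(egg_path, dest_dir="."):
    """Run the conversion; raises CalledProcessError if the wheel tool fails."""
    subprocess.run(build_convert_command(egg_path, dest_dir), check=True)


# Hypothetical file name; only the command is printed here, nothing is executed.
print(" ".join(build_convert_command("mypackage-1.0-py3.8.egg", dest_dir="dist")))
```

Calling convert_egg() on a real .egg file should leave a .whl file in the destination directory, which pip can then install.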
- Hi, I've got some issues trying to use selenium to parse a web page in a jupyter-notebook-recipe. Does someone already use selenium : https://towardsdatascience.com/web-scraping-using-selenium-python-8a60… Last answer by anpuke
Hi @mgirard,
The easiest way is to download chromedriver from Google and do the setup with your package manager. What's the operating system of your DSS instance?
Best regards,
Andreas
- Hello, I have many datasets from CSV files, and every month I have new files (new files replace previous files). I developed a Python program to check column names. I put a new file with a new column or… Last answer by Marlan
Hi @Hazou,
OK, I think I understand better what you are doing...
Check out this solution - the problem posted appears to be similar to yours: https://community.dataiku.com/t5/Using-Dataiku-DSS/Refresh-read-schema-on-dataset-via-API/m-p/8730/highlight/true#M4405
Does this help?
Marlan
- I'm trying to write a Python recipe that calls a function from an internal module, but I need to be able to pass the name of the input and output table for that recipe. Since the user can specify … Solution by
- Looks like an error was introduced in 5.0.1 (it worked in 5.0.0) that prevents dataiku.core.saved_model.Predictor.predict from working properly because it raises an error with Pandas ValueError: If us… Last answer by ricslator
The error message says that if you're passing only scalar values, you have to pass an index. pandas needs an index when creating a DataFrame from a dictionary, and with scalar values it cannot infer how many rows to build. What it is essentially asking for is a row label for the values in the dictionary. You can either set the index yourself, or wrap each value in a list so pandas can determine the index itself:
df = pd.DataFrame({'A': [a], 'B': [b]})
or use scalar values and pass an index:
pd.DataFrame({'A': a, 'B': b}, index=[0])
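To see the error and both fixes side by side, here is a small self-contained sketch; the values of a and b are arbitrary examples, not taken from the original question.

```python
import pandas as pd

a, b = 1, 2  # arbitrary example scalar values

# All-scalar values with no index: pandas cannot tell how many rows to build.
try:
    pd.DataFrame({"A": a, "B": b})
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Fix 1: wrap each scalar in a list, so the index is inferred.
df_lists = pd.DataFrame({"A": [a], "B": [b]})

# Fix 2: keep the scalars and pass an explicit index.
df_index = pd.DataFrame({"A": a, "B": b}, index=[0])

print(df_lists)
print(df_index)
```

Both fixes produce a one-row frame with columns A and B; which one reads better mostly depends on whether the values already arrive as lists.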
- Hello there ! I'm currently working on a plugin to add a new preparation processor that allows us to compute "half-life" weights in a prepare recipe. The weighting function requires two parameters: a …Solution by
- Hi, I need to set a project level variable from a python recipe. How can I do that? I don't want to use Scenario in this case. Thanks for the example, Tomas. Solution by
- I have been experiencing OutOfMemory errors when running a python recipe when it tried to load a json dataset from file system. The files are about 2.5GB on disk and the host has 64G memory and there … Last answer by quincybatten
In most cases, it might be an issue with:
- the delimiters in your data.
- the parser being confused by the headers/columns of the file.
The "error tokenizing data" message may arise when you're using a separator (e.g. a comma ',') as the delimiter and a row contains more separators than expected (more fields in the offending row than defined in the header). So you need to either remove the additional field or remove the extra separator if it's there by mistake. The better solution is to investigate the offending file and fix it manually so you don't need to skip the bad lines.
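One way to investigate the offending file is to compare each row's field count with the header's. The sketch below uses only the standard library; the function name and the sample data are made up for illustration.

```python
import csv
import io


def find_bad_rows(lines, delimiter=","):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)
    if header is None:  # empty input: nothing to check
        return []
    expected = len(header)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != expected
    ]


# Made-up sample: the third line has one field too many.
sample = io.StringIO("id,name\n1,alice\n2,bob,extra\n3,carol\n")
print(find_bad_rows(sample))
```

For a real file, pass an open file handle (opened with newline="") instead of the StringIO object; the reported line numbers then point at the rows to fix by hand.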