How to deal with "Processor FillEmptyWithComputedValue is not available"?
Hi,
When I do median imputation for missing values, DSS shows this message:
Processor FillEmptyWithComputedValue is not available in DSS engine
Does this mean I'll have to do the imputation in a notebook rather than directly in the Prepare recipe?
Thanks,
Parul.
(Topic title edited by moderator to be more descriptive. Original title "Using Dataiku")
Answers
Hi,
The Prepare recipe works almost entirely row by row, so it doesn't fit well with computing things like a median, which depend on many rows at once. You can either:
- run the Prepare recipe using Spark (or SQL if the input dataset is SQL)
- use a notebook or a Python recipe
Using only visual recipes in the flow, you can also retrieve the median with a Window recipe (to compute the quantiles), but it takes some contortions afterwards to get the median value back onto the rows.
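For the notebook / Python recipe route, here is a minimal sketch of what median imputation boils down to. It is plain Python and not tied to any DSS API; the function name `fill_empty_with_median`, the column name `age`, and the sample rows are all made up for illustration. It also shows why this is not a row-by-row operation: the median needs a first pass over the whole column before any row can be filled.

```python
from statistics import median


def fill_empty_with_median(rows, column):
    """Return a copy of `rows` where missing values (None) in `column`
    are replaced by the median of the non-missing values."""
    # Pass 1: the median depends on every row, so collect the column first.
    observed = [row[column] for row in rows if row.get(column) is not None]
    if not observed:
        # Nothing to compute a median from; return the rows unchanged.
        return [dict(row) for row in rows]
    fill_value = median(observed)

    # Pass 2: only now can each row be filled independently.
    return [
        {**row, column: fill_value if row.get(column) is None else row[column]}
        for row in rows
    ]


# Hypothetical sample data: one missing "age".
rows = [
    {"id": 1, "age": 30},
    {"id": 2, "age": None},
    {"id": 3, "age": 50},
    {"id": 4, "age": 40},
]
filled = fill_empty_with_median(rows, "age")
```

If you load the dataset as a pandas DataFrame in the recipe or notebook, the same idea is usually written as `df["age"] = df["age"].fillna(df["age"].median())`, again with `age` standing in for your own column.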