Use Custom UDFs on Visual Recipes

saulleon
Level 3
Hello Dataikers!

Since all visual recipes are based on SparkSQL, some "advanced" aggregations aren't available. In my case, I have three values in three columns, A, B, and C, and I just want to compute their median.

The problem is that the median function doesn't exist in my current Spark backend version, so I need a UDF to compute it. But since this is a visual recipe, I cannot inject my "advanced" UDF.

Do you know if there's a place where I can define UDFs so that Dataiku can read them and bind them in visual recipes? Something similar to Global Code.

Thank you in advance,

-Saul

Happy coding!
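
For the specific three-column case described above, a UDF may not be strictly necessary: the median of three values equals their sum minus the largest and the smallest, and Spark SQL has built-in GREATEST and LEAST functions. The sketch below checks that identity in plain Python and builds the matching SQL expression. The helper names median3 and median3_sql are hypothetical, and it is an assumption that the visual recipe in use accepts a custom SQL expression at all. Note also that with floating-point columns the subtraction can introduce small rounding differences, and a null in any column makes the whole expression null.

```python
def median3(a, b, c):
    """Median of three values: the sum minus the largest and the smallest."""
    return a + b + c - max(a, b, c) - min(a, b, c)


def median3_sql(a="A", b="B", c="C"):
    """Build the same identity as a Spark SQL expression (hypothetical helper).

    GREATEST and LEAST are Spark SQL built-ins; the column names default to
    the A, B, C columns mentioned in the question.
    """
    cols = f"{a}, {b}, {c}"
    return f"({a} + {b} + {c}) - GREATEST({cols}) - LEAST({cols})"


# Print the expression that could be pasted wherever a SQL expression is accepted.
print(median3_sql())
```

This only covers exactly three columns; for arrays of arbitrary length a UDF or a different approach is still needed.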

0 Kudos
2 Replies
ChrisWalter
Level 2

Hey Saul!

I get your struggle. You can define custom UDFs in Dataiku DSS under the "Code Libraries" section, and then you should be able to use them in your visual recipes. Happy coding indeed!

0 Kudos
saulleon
Level 3
Author

Hello Chris,

Thanks for your response. Could you please provide an example?

My goal is to compute the median over an array in a visual recipe. I already have this code in Code Libs:

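from pyspark.sql.functions import udf
from pyspark.sql.types import FloatType
import numpy as np

# Cast to a plain Python float: Spark may fail to serialize numpy.float64 results.
# Return None (SQL NULL) for a missing or empty array.
arrayMedianUDF = udf(lambda array: float(np.median(array)) if array else None, FloatType())
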
Now how can I call arrayMedianUDF on a visual recipe?

[Attached screenshot: recipe.png]

Thank you in advance,

Saul
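
For reference, here is a minimal sketch of how a median function like the one above could be made callable by name from Spark SQL. It assumes a PySpark code recipe (or any place where a SparkSession is available), and it is not confirmed anywhere in this thread that a visual recipe can see a UDF registered this way. The names array_median and register_array_median are hypothetical; spark.udf.register is the standard PySpark call for registering a Python function under a SQL name. The median logic is kept as a plain Python function so it can be tested without Spark.

```python
import statistics
from typing import Optional, Sequence


def array_median(values: Optional[Sequence[float]]) -> Optional[float]:
    """Median of a list of numbers, skipping None entries.

    Returns None (SQL NULL) when the input is missing, empty, or all None.
    The result is always a plain Python float, which Spark can serialize.
    """
    if not values:
        return None
    cleaned = [float(v) for v in values if v is not None]
    if not cleaned:
        return None
    return float(statistics.median(cleaned))


def register_array_median(spark, name: str = "array_median"):
    """Register array_median as a SQL-callable UDF on the given SparkSession.

    Hypothetical helper: pyspark is imported lazily so that the pure
    function above stays usable and testable without a Spark installation.
    """
    from pyspark.sql.types import DoubleType

    return spark.udf.register(name, array_median, DoubleType())
```

After calling register_array_median(spark) in a session, a query such as SELECT array_median(array(A, B, C)) FROM my_table should work through spark.sql in that same session (my_table is a placeholder). The registration only lives in that Spark session, so this does not by itself answer whether a visual recipe can pick it up.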

 



0 Kudos
