Saving custom prepare processor
Hello everyone,
I began using Dataiku a few days ago. I have a lot of "address" data, and I tried to use the Geocoder plugin to convert it into usable coordinates and geopoints.
As this plugin throws many Java errors on my installation (NullPointerException), I decided to create my own Prepare-recipe Python processor, which calls the Google Geocoding API.
So I have two questions:
- Is it possible to create multiple columns with a Python processor? I would like to add latitude and longitude, for example, but I cannot find how to do that.
- Is there a way to save a Python processor in order to reuse it in other recipes or other projects?
Thanks a lot for your answers.
Operating system used: AlmaLinux (AWS)
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @PeteGore,
The Python processor within a Prepare recipe can only write to one column.
You can compute both values, write them together into the processor's target column (for example as a single delimited string), and then split that column into separate columns with another processor.
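As a rough illustration of that workaround, here is a minimal sketch of a processor function that packs latitude and longitude into one cell. The `address` column name, the `geocode` helper, and its hard-coded lookup table are all hypothetical placeholders standing in for a real call to a geocoding service; the `process(row)` entry point assumes the row is passed as a dict-like object.

```python
# Hypothetical lookup table standing in for a real geocoding call.
_FAKE_GEOCODER = {
    "1600 Amphitheatre Parkway, Mountain View, CA": (37.422, -122.084),
}


def geocode(address):
    """Return (lat, lon) for an address, or None when it is unknown.

    Placeholder only: replace with a real geocoding request.
    """
    return _FAKE_GEOCODER.get(address)


def process(row):
    # "address" is an assumed column name; row is treated as a dict.
    coords = geocode(row.get("address", ""))
    if coords is None:
        # Leave the cell empty when the address cannot be geocoded.
        return ""
    lat, lon = coords
    # Pack both values into one cell, separated by a comma.
    return "{},{}".format(lat, lon)
```

A later step in the same Prepare recipe could then split the output column on the comma to obtain separate latitude and longitude columns.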
Alternatively, you can do the geocoding in a Python recipe instead, where you can write as many output columns as you need.
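A Python recipe version might look like the sketch below, which adds two columns to a pandas DataFrame. The `geocode` stub, the `address` column, and the dataset names in the comments are assumptions made for illustration, not part of the original question.

```python
import pandas as pd


def geocode(address):
    """Hypothetical stub: return (lat, lon), or (None, None) if unknown."""
    known = {
        "Eiffel Tower, Paris": (48.8584, 2.2945),
    }
    return known.get(address, (None, None))


def add_coordinates(df, address_col="address"):
    """Return a copy of df with 'latitude' and 'longitude' columns added."""
    coords = df[address_col].map(geocode)
    out = df.copy()
    out["latitude"] = [c[0] for c in coords]
    out["longitude"] = [c[1] for c in coords]
    return out


# Inside a DSS Python recipe, the DataFrame would typically be read from
# and written to datasets (dataset names here are made up):
#   import dataiku
#   df = dataiku.Dataset("addresses").get_dataframe()
#   dataiku.Dataset("addresses_geocoded").write_with_schema(add_coordinates(df))
```

Keeping the logic in a plain function such as `add_coordinates` makes it easy to test outside DSS and to move into shared code later.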
If you need to reuse Python code, you can put it in project libraries: https://doc.dataiku.com/dss/latest/python/reusing-code.html
Or package a Python recipe as a plugin:
https://doc.dataiku.com/dss/latest/python/reusing-code.html#packaging-code-as-plugins
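For the project-library option, the shared code could be a small module along these lines. This is only a sketch: the function names are hypothetical, and it assumes the Google Geocoding API's JSON endpoint and its usual response shape (`status`, then `results[0].geometry.location` with `lat` and `lng`). It builds the request URL and parses an already-decoded response, leaving the actual HTTP call to whichever client you prefer.

```python
from urllib.parse import urlencode

# Google Geocoding API JSON endpoint.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for one address (api_key is your own key)."""
    return GEOCODE_URL + "?" + urlencode({"address": address, "key": api_key})


def parse_lat_lon(payload):
    """Extract (lat, lon) from a decoded JSON response, or None on failure."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]
```

Once the module is saved in the project library, a recipe or processor should be able to import these functions by module name instead of copying the code around.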