ML model downloaded in PMML format doesn't work.
Hello dear experts, I downloaded a random forest model in PMML format from Dataiku. When I try to make a prediction, I get:
{'prediction': '0', 'proba_0': nan, 'proba_1': nan}
My Python script is attached as a screenshot.
The same code works well with other PMML models (not from Dataiku); a screenshot is attached. So I suspect the issue is with the model exported from Dataiku.
Please tell me how to fix it. Thank you.
Best Answer
-
Hello,
This is most probably a pypmml bug. I do get probabilities when I score your PMML file with the JPMML evaluator, which is the default scoring engine for PMML files.
Here is the command I used and its output:
java -Dfile.encoding=utf8 -cp /jpmml-evaluator/example-1.4.3.jar org.jpmml.evaluator.EvaluationExample --model /predict-flag.pmml --input /input.csv --output /out.csv --separator ";"
F1;F2;F3;F4;prediction;proba_0;proba_1
5614.1445;6.7;4.5;53.5;0;0.9714285714285714;0.02857142857142857
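If you want to check the evaluator's output file from Python, here is a minimal sketch using only the standard library. It assumes the output is a semicolon-separated file with proba_0 and proba_1 columns, as in the sample above; the helper names read_scored_rows and probabilities_are_valid are made up for illustration and are not part of any library.

```python
import csv
import io
import math


def read_scored_rows(text, delimiter=";"):
    """Parse evaluator output into a list of dicts (values stay as strings)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def probabilities_are_valid(row, proba_columns=("proba_0", "proba_1"), tol=1e-9):
    """Return True when every probability is finite and they sum to 1."""
    values = [float(row[name]) for name in proba_columns]
    if any(math.isnan(v) or math.isinf(v) for v in values):
        return False
    return math.isclose(sum(values), 1.0, abs_tol=tol)


# The header and row are the ones shown in the output above.
sample = (
    "F1;F2;F3;F4;prediction;proba_0;proba_1\n"
    "5614.1445;6.7;4.5;53.5;0;0.9714285714285714;0.02857142857142857\n"
)
rows = read_scored_rows(sample)
print(rows[0]["prediction"], probabilities_are_valid(rows[0]))
```

The same check returns False for a result like the one in the question, where both probabilities are nan, so it can serve as a quick guard before trusting a scoring engine.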
I would suggest using the JPMML evaluator to score PMML models. Their repository has some steps to get you started (see "Example applications").
Best,
Arnaud d'Esquerre
Answers
-
Hello,
Could you share the PMML model exported by Dataiku?
Best,
Arnaud
-
Arnaud d'Esquerre,
Thank you. Appreciate it! It does indeed work with the JPMML evaluator.
To stay in a Python environment, I used the jpmml-evaluator-python library; a snapshot of my code is attached.
As far as I can tell:
- Java must still be installed on your machine.
- There is a small difference between the outputs of the model in Dataiku DSS and those of the imported PMML model (I hope it won't be critical at the deployment stage).
Best regards,
Rushad Rakhimov
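One way to quantify the small difference between the DSS output and the PMML output mentioned above is to compare the two results within a tolerance. This is only a sketch: the function names and the tolerance are assumptions, and the DSS values below are invented placeholders, not real output from this thread.

```python
def max_probability_gap(dss_row, pmml_row, columns=("proba_0", "proba_1")):
    """Largest absolute difference between the two sets of probabilities."""
    return max(abs(float(dss_row[c]) - float(pmml_row[c])) for c in columns)


def outputs_agree(dss_row, pmml_row, tol=1e-6, columns=("proba_0", "proba_1")):
    """True when both engines predict the same label and the
    probabilities differ by at most tol."""
    same_label = str(dss_row["prediction"]) == str(pmml_row["prediction"])
    return same_label and max_probability_gap(dss_row, pmml_row, columns) <= tol


# PMML values come from the evaluator output earlier in the thread.
pmml_row = {"prediction": "0",
            "proba_0": 0.9714285714285714,
            "proba_1": 0.02857142857142857}
# HYPOTHETICAL DSS values, for illustration only.
dss_row = {"prediction": "0", "proba_0": 0.9714, "proba_1": 0.0286}

print(max_probability_gap(dss_row, pmml_row))
print(outputs_agree(dss_row, pmml_row, tol=1e-3))
```

Whether a given tolerance is acceptable depends on how the probabilities are used downstream; if only the predicted label matters, checking label agreement over a validation set may be enough.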