ML model downloaded in PMML format doesn't work.

RushR Registered Posts: 3 ✭✭✭

Hello dear experts, I downloaded a random forest model in PMML format from Dataiku. When I try to make a prediction, I get:

{'prediction': '0', 'proba_0': nan, 'proba_1': nan}

My Python script is attached as a screenshot.

The same code works well with other PMML models (not from Dataiku); a screenshot is attached. So I suspect the issue is with the model exported from Dataiku.

Please tell me how to fix it, thank you.

Best Answer

  • arnaudde Dataiker Posts: 52
    Answer ✓

    This is most likely a pypmml bug. I do get probabilities when I score your PMML file with the JPMML evaluator, which is the default scoring engine for PMML files.
    Here is the command I used and its output:

    java -Dfile.encoding=utf8 -cp /jpmml-evaluator/example-1.4.3.jar org.jpmml.evaluator.EvaluationExample --model /predict-flag.pmml --input /input.csv --output /out.csv --separator ";"

    F1;F2;F3;F4;prediction;proba_0;proba_1
    5614.1445;6.7;4.5;53.5;0;0.9714285714285714;0.02857142857142857

    I would suggest using the JPMML evaluator to score PMML models. Their repo has some steps to get you started (see "Example applications").
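
    If you want to stay in Python, one option is to call the evaluator from a script. The following is only a rough sketch, not an official Dataiku or JPMML recipe: the function names (build_jpmml_command, score_with_jpmml, read_scores, has_nan_proba) are made up for illustration, the paths are the ones from the command in this answer, and it assumes the output CSV contains proba_* columns separated by ";" as in the output shown here.

```python
import csv
import io
import math
import subprocess


def build_jpmml_command(jar_path, model_path, input_csv, output_csv, separator=";"):
    """Build the argument list for the JPMML example evaluator (same flags as the command in this answer)."""
    return [
        "java",
        "-Dfile.encoding=utf8",
        "-cp", jar_path,
        "org.jpmml.evaluator.EvaluationExample",
        "--model", model_path,
        "--input", input_csv,
        "--output", output_csv,
        "--separator", separator,
    ]


def read_scores(csv_text, separator=";"):
    """Parse the evaluator's CSV output; proba_* columns become floats, the rest stay strings."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=separator):
        rows.append(
            {k: float(v) if k.startswith("proba_") else v for k, v in row.items()}
        )
    return rows


def has_nan_proba(row):
    """Return True if any proba_* value in a prediction row is NaN."""
    return any(math.isnan(v) for k, v in row.items() if k.startswith("proba_"))


def score_with_jpmml(jar_path, model_path, input_csv, output_csv):
    """Run the evaluator (requires Java and the example jar) and return the parsed rows."""
    subprocess.run(
        build_jpmml_command(jar_path, model_path, input_csv, output_csv),
        check=True,  # raise if the Java process exits with an error
    )
    with open(output_csv, encoding="utf8") as f:
        return read_scores(f.read())


# Example call, using the paths from this answer (adjust to your setup):
# rows = score_with_jpmml("/jpmml-evaluator/example-1.4.3.jar",
#                         "/predict-flag.pmml", "/input.csv", "/out.csv")
```

    has_nan_proba can also be applied to the dictionary returned by pypmml, to detect the NaN case from the question before relying on the probabilities.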

    Arnaud d'Esquerre

