How to reapply rescaling when you want to predict your data

Solved!
UserBird
Dataiker
I created a Decision Tree model and I want to export it so I can use it outside of Dataiku.



I took the pickle file and loaded it into Python to continue using the model there.




import pickle

# Load the pickled model exported from Dataiku; the context manager closes the file.
with open('clf.pkl', 'rb') as f:
    loaded_model = pickle.load(f, encoding='latin1')


In the model settings, I used standard rescaling, which uses avgstd (average and standard deviation). Dataiku also exports this JSON file with details about the rescaling:




{
"shifts": [
4.2708957215287455,
5.582300530732055,
4.721780769116731,
6.309030531691733,
4.534705132386515,
50183.866161634876,
4.628957297141036,
5.931597829632046,
1.834355009673187,
21814.135528393213,
0.9999925875959688,
0.23165941222883746,
-0.11146363232269413
],
"columns": [
"col1",
"col2",
"col3",
"col4",
"col5",
"col6",
"col7",
"col8",
"col9",
"col10",
"col11",
"col12",
"col13"
],
"inv_scales": [
0.29041217789420476,
0.32026605114154144,
0.3398879256267485,
0.2539738260220278,
0.27817344479641604,
1.1217850173179438e-05,
0.3181917203503525,
0.2886476076886483,
0.37842451508835384,
2.329233011164756e-05,
0.21830904186227362,
2.003574563132119,
1.386943696546877
]
}


Let's say I have a new input with the original values (before rescaling). How can I use the information above to rescale all the features of this new input so I can predict on it?

Clément_Stenac
Hi,

For each column:

rescaled_feature = (input_feature - shift) * inv_scale
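
As a minimal sketch of applying this formula in Python: the snippet below assumes that the "columns", "shifts" and "inv_scales" lists in the exported JSON are aligned by index, and that the model expects its features in that same column order. Neither assumption is confirmed in this thread. Only the first three columns from the question are inlined to keep it short; `rescale_row` and `new_row` are hypothetical names and values used for illustration.

```python
import json

# First three entries of the exported rescaling file, inlined for illustration.
# In practice you would json.load() the file that Dataiku exported.
rescaling = json.loads("""
{
  "shifts": [4.2708957215287455, 5.582300530732055, 4.721780769116731],
  "columns": ["col1", "col2", "col3"],
  "inv_scales": [0.29041217789420476, 0.32026605114154144, 0.3398879256267485]
}
""")


def rescale_row(raw_row, rescaling):
    """Rescale one raw input as (x - shift) * inv_scale.

    Returns the features as a list ordered like rescaling["columns"].
    Assumes the three lists in the JSON are aligned by index.
    """
    return [
        (raw_row[col] - shift) * inv_scale
        for col, shift, inv_scale in zip(
            rescaling["columns"], rescaling["shifts"], rescaling["inv_scales"]
        )
    ]


# Hypothetical raw input (original values, before rescaling).
new_row = {"col1": 5.0, "col2": 5.582300530732055, "col3": 4.0}
rescaled = rescale_row(new_row, rescaling)

# With the unpickled model from the question, prediction would then look like:
# loaded_model.predict([rescaled])
```

A value equal to its column's shift rescales to zero, which is a quick sanity check that the right shift is being paired with the right column.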
