How to reapply rescaling when you want to predict your data

UserBird

I created a Decision Tree model and I want to export it so I can use it outside of Dataiku.



I took the pickle file and loaded it into Python to continue using the model there.




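import pickle

# Use a context manager so the file is closed after loading
with open('clf.pkl', 'rb') as f:
    # encoding='latin1' lets Python 3 read a pickle written by Python 2
    loaded_model = pickle.load(f, encoding='latin1')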


In the model settings, I used standard rescaling, which uses avgstd (average and standard deviation). Dataiku also exports this JSON file with the details of the rescaling:




{
"shifts": [
4.2708957215287455,
5.582300530732055,
4.721780769116731,
6.309030531691733,
4.534705132386515,
50183.866161634876,
4.628957297141036,
5.931597829632046,
1.834355009673187,
21814.135528393213,
0.9999925875959688,
0.23165941222883746,
-0.11146363232269413
],
"columns": [
"col1",
"col2",
"col3",
"col4",
"col5",
"col6",
"col7",
"col8",
"col9",
"col10",
"col11",
"col12",
"col13"
],
"inv_scales": [
0.29041217789420476,
0.32026605114154144,
0.3398879256267485,
0.2539738260220278,
0.27817344479641604,
1.1217850173179438e-05,
0.3181917203503525,
0.2886476076886483,
0.37842451508835384,
2.329233011164756e-05,
0.21830904186227362,
2.003574563132119,
1.386943696546877
]
}


Let's say I have a new input with the original values (before rescaling). How can I use the information above to rescale all the features of this new input so that I can pass it to the model and get a prediction?

Clément_Stenac
Hi,

For each column:

rescaled_feature = (input_feature - shift) * inv_scale
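
As a rough illustration of the formula above, here is a minimal Python sketch that applies it to every column listed in the exported JSON. The function names (load_rescaling, rescale_row), the example input values, and the idea that the model expects its features in the same order as "columns" are all assumptions made for this example, not something Dataiku documents here. Under avgstd rescaling the shift is presumably the column mean and inv_scale the reciprocal of the standard deviation, but the code only relies on the formula itself.

```python
import json
import numpy as np

def load_rescaling(path):
    """Read the exported rescaling JSON (keys: columns, shifts, inv_scales)."""
    with open(path) as f:
        return json.load(f)

def rescale_row(row, params):
    """Apply (value - shift) * inv_scale to each column listed in params.

    row is a dict mapping column name -> raw (unscaled) value.
    Returns a 1-D array ordered like params["columns"].
    """
    raw = np.array([row[c] for c in params["columns"]], dtype=float)
    shifts = np.asarray(params["shifts"], dtype=float)
    inv_scales = np.asarray(params["inv_scales"], dtype=float)
    return (raw - shifts) * inv_scales

# Two of the columns from the JSON in the question, inlined so the sketch
# runs without the file; in practice load_rescaling(<path to the JSON>) would be used.
example_params = {
    "columns": ["col1", "col6"],
    "shifts": [4.2708957215287455, 50183.866161634876],
    "inv_scales": [0.29041217789420476, 1.1217850173179438e-05],
}

# Hypothetical raw input values, made up for illustration only.
example_row = {"col1": 5.0, "col6": 60000.0}
scaled = rescale_row(example_row, example_params)

# With all 13 columns rescaled, the result could then be passed to the model,
# assuming it expects the features in this order:
# prediction = loaded_model.predict(scaled.reshape(1, -1))
```

It is worth checking that the order of "columns" matches the feature order the pickled model was trained on before trusting the predictions.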
