Training session loads all columns, not only the selected ones, causing an OutOfMemory exception

Hi,



I'm getting an OutOfMemoryException while running a training session. I can see in the logs that all the columns are normalized, even those that do not take part in the session.




[2018-06-09 20:00:45,946] [24072/MainThread] [INFO] [root] Reading with FIXED dtypes: {u'Flygbolag': 'str', u'Distance': <type 'numpy.float64'>, u'Adults': <type 'numpy.float64'>, u'HasLuggage': <type 'numpy.object_'>, u'Duration_Of_Stay_In_Days': <type 'numpy.float64'>, u'Paxes': <type 'numpy.float64'>, u'DaysToDeparture': <type 'numpy.float64'>}
[2018-06-09 20:00:46,462] [24072/MainThread] [INFO] [root] Loaded table
[2018-06-09 20:00:46,466] [24072/MainThread] [INFO] [root] Normalizing date : 0 2018-08-13
1 2018-02-05
2 2018-12-27


The column "date" is rejected, so it should be neither loaded nor normalized.



Am I missing a setting or configuration to avoid this behavior?

Dataiker
Hi,

You can remove the column before it reaches the ML part: go to the "Script" tab and click on the column > Remove (a usual preparation step).
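
If you ever load the data yourself in Python, the same idea (drop the column before the training code sees it) can be sketched with plain pandas. This is only an illustration, not what DSS does internally: the sample rows and the SELECTED list are made up, and only the column names come from the log above.

```python
import io

import pandas as pd

# Tiny stand-in for the real dataset; the values are invented for illustration.
RAW_CSV = io.StringIO(
    "date,Flygbolag,Distance,Adults\n"
    "2018-08-13,SK,1200.0,2\n"
    "2018-02-05,DY,850.5,1\n"
    "2018-12-27,SK,2300.0,4\n"
)

# Columns the model actually uses (hypothetical selection).
SELECTED = ["Flygbolag", "Distance", "Adults"]

# usecols tells pandas to keep only the listed columns, so the rejected
# "date" column never ends up in the resulting DataFrame.
df = pd.read_csv(RAW_CSV, usecols=SELECTED)

print(list(df.columns))
```

Whether this lowers memory use enough in your case depends on how wide the dropped columns are, so treat it as a starting point.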