
Training session loads all columns, not only the selected ones, causing an OutOfMemory exception

Gustavo_Brian
Level 2

Hi,



I'm getting an OutOfMemoryException while running a training session. I can see in the logs that all the columns are normalized, even those that do not participate in the session.




[2018-06-09 20:00:45,946] [24072/MainThread] [INFO] [root] Reading with FIXED dtypes: {u'Flygbolag': 'str', u'Distance': <type 'numpy.float64'>, u'Adults': <type 'numpy.float64'>, u'HasLuggage': <type 'numpy.object_'>, u'Duration_Of_Stay_In_Days': <type 'numpy.float64'>, u'Paxes': <type 'numpy.float64'>, u'DaysToDeparture': <type 'numpy.float64'>}
[2018-06-09 20:00:46,462] [24072/MainThread] [INFO] [root] Loaded table
[2018-06-09 20:00:46,466] [24072/MainThread] [INFO] [root] Normalizing date : 0 2018-08-13
1 2018-02-05
2 2018-12-27


The column "date" is rejected, so it should be neither loaded nor normalized.



Am I missing a setting or configuration to avoid this behavior?

Clément_Stenac
Dataiker
Hi,

You can remove the column before it reaches the ML part by using the "Script" tab and clicking on column > remove (the usual preparation steps).
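
For intuition only, here is a minimal pandas sketch of the same idea outside DSS: dropping a column before the data is loaded, so it never takes up memory. This is not how DSS implements the "Script" step. The sample data, the `load_selected` helper, and the chosen columns are assumptions made for illustration; only the column names are borrowed from the log above.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names mirror those in the log above.
CSV_TEXT = """date,Distance,Adults,Paxes
2018-08-13,1200.5,2,3
2018-02-05,830.0,1,1
2018-12-27,455.2,2,4
"""


def load_selected(csv_source, columns):
    """Read only the requested columns so the others are never materialized."""
    # usecols makes the parser skip every column that is not listed;
    # dtype fixes the type of each kept column up front.
    return pd.read_csv(
        csv_source,
        usecols=columns,
        dtype={name: "float64" for name in columns},
    )


# The rejected "date" column is simply left out of the selection.
selected = ["Distance", "Adults", "Paxes"]
df = load_selected(io.StringIO(CSV_TEXT), selected)
print(list(df.columns))
```

On a wide dataset, restricting the read this way keeps the memory footprint proportional to the selected columns rather than to the whole table.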