Scoring fails.

Solved!
mattmagic
Level 2

Hi,

From time to time, scoring with a prediction model fails. The only thing that seems to fix it is training a new model and deploying a completely different algorithm.



Here is my log:

[09:23:38] [INFO] [dku.utils] - 2017-09-19 09:23:38,188 INFO Reading with dtypes: {u'Domain': None, u'DescriptionProcessed': 'str'}
[09:23:38] [INFO] [dku.utils] - 2017-09-19 09:23:38,188 INFO Column 0 = Domain (dtype=None)
[09:23:38] [INFO] [dku.utils] - 2017-09-19 09:23:38,188 INFO Column 1 = Source (dtype=None)
[09:23:38] [INFO] [dku.utils] - 2017-09-19 09:23:38,188 INFO Column 2 = DescriptionProcessed (dtype=str)
[09:23:38] [INFO] [dku.utils] - 2017-09-19 09:23:38,283 INFO Starting dataframes iterator
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,688 INFO Got a dataframe : (100000, 3)
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,688 INFO Coercion done
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,688 INFO NORMALIZED: Domain -> object
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,688 INFO NORMALIZED: Source -> object
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,688 INFO NORMALIZED: DescriptionProcessed -> object
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,689 INFO Processing it
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,697 INFO Set MF index len 100000
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,697 DEBUG PROCESS WITH Step:MultipleImputeMissingFromInput
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,697 DEBUG MIMIFI: Imputing with map {}
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,698 DEBUG PROCESS WITH Step:FlushDFBuilder(num_flagonly)
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,698 DEBUG PROCESS WITH Step:MultipleImputeMissingFromInput
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,698 DEBUG MIMIFI: Imputing with map {}
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,698 DEBUG PROCESS WITH Step:FlushDFBuilder(cat_flagpresence)
[09:23:41] [INFO] [dku.utils] - 2017-09-19 09:23:41,698 DEBUG PROCESS WITH <class 'dataiku.doctor.preprocessing.dataframe_preprocessing.TextTFIDFVectorizerProcessor'> (DescriptionProcessed)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 DEBUG PROCESS WITH Step:FlushDFBuilder(interaction)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 DEBUG PROCESS WITH Step:DumpPipelineState
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 INFO ********* Pipieline state (Before feature selection)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 INFO input_df= (100000, 3)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 INFO current_mf=(100000, 26558)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 INFO PPR:
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,616 DEBUG PROCESS WITH Step:EmitCurrentMFAsResult
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO Set MF index len 100000
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 DEBUG PROCESS WITH Step:DumpPipelineState
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO ********* Pipieline state (At end)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO input_df= (100000, 3)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO current_mf=(0, 0)
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO PPR:
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO TRAIN = <class 'dataiku.doctor.multiframe.MultiFrame'> ((100000, 26558))
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO Predicting it
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,625 INFO Prepare to predict ...
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWritingService] - Init write session: tcenfKZdgi
[09:24:02] [DEBUG] [dku.jobs] - Command /tintercom/datasets/init-write-session processed in 14ms
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,679 INFO Initializing write data stream (tcenfKZdgi)
[09:24:02] [INFO] [dku.jobs] - Connects using API ticket
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,682 INFO Waiting for data to send ...
[09:24:02] [DEBUG] [dku.jobs] - Received command : /tintercom/datasets/wait-write-session
[09:24:02] [INFO] [dku.utils] - 2017-09-19 09:24:02,682 INFO Got end mark, ending send
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWriter] - Creating output writer
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWriter] - Initializing output writer
[09:24:02] [INFO] [dku.connections.sql.provider] - Connecting to jdbc:postgresql://dataiku.ct2brvwy8za8.us-east-1.rds.amazonaws.com:5432/Dataiku with props: {}
[09:24:02] [DEBUG] [dku.connections.sql.provider] - Driver version 9.0
[09:24:02] [INFO] [dku.connections.sql.provider] - Driver PostgreSQL Native Driver (JDBC 4.0) PostgreSQL 9.0 JDBC4 (build 801) (9.0)
[09:24:02] [INFO] [dku.connections.sql.provider] - Database PostgreSQL 9.5.4 (9.5) rowSize=1073741824 stmts=0
[09:24:02] [DEBUG] [dku.connections.sql.provider] - Set autocommit=false on conn=Dataiku_DB
[09:24:02] [INFO] [dku.sql.generic] - Dropping table
[09:24:02] [INFO] [dku.dataset.sql] - Executing statement:
[09:24:02] [INFO] [dku.dataset.sql] - DROP TABLE "CLUSTERREPORTCLUSTERINGNEW_cb_descriptions_consumerelectronics_classified"
[09:24:02] [INFO] [dku.dataset.sql] - Statement done
[09:24:02] [INFO] [dku.sql.generic] - Creating table
[09:24:02] [INFO] [dku.dataset.sql] - Executing statement:
[09:24:02] [INFO] [dku.dataset.sql] - CREATE TABLE "CLUSTERREPORTCLUSTERINGNEW_cb_descriptions_consumerelectronics_classified" (
"Domain" text,
"Source" text,
"DescriptionProcessed" text,
"proba_0.0" double precision,
"proba_1.0" double precision,
"prediction" double precision
)
[09:24:02] [INFO] [dku.dataset.sql] - Statement done
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWriter] - Done initializing output writer
[09:24:02] [INFO] [dku.output.sql.pglike] - Copy done, copied 0 records
[09:24:02] [INFO] [dku.connections.sql.provider] - Commit conn=Dataiku_DB
[09:24:02] [DEBUG] [dku.connections.sql.provider] - Close conn=Dataiku_DB
[09:24:02] [INFO] [dku.output.sql.pglike] - Transaction done, copied 0 records
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWritingService] - Pushed data to write session tcenfKZdgi : 0 rows
[09:24:02] [DEBUG] [dku.jobs] - Command /tintercom/datasets/push-data processed in 144ms
[09:24:02] [INFO] [com.dataiku.dip.dataflow.streaming.DatasetWritingService] - Finished write session: tcenfKZdgi
[09:24:02] [DEBUG] [dku.jobs] - Command /tintercom/datasets/wait-write-session processed in 146ms
[09:24:02] [INFO] [dku.utils] - 0 rows successfully written (tcenfKZdgi)
[09:24:02] [INFO] [dku.utils] - Traceback (most recent call last):
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dss/condaenv/lib/python2.7/runpy.py", line 174, in _run_module_as_main
[09:24:02] [INFO] [dku.utils] - "__main__", fname, loader, pkg_name)
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dss/condaenv/lib/python2.7/runpy.py", line 72, in _run_code
[09:24:02] [INFO] [dku.utils] - exec code in run_globals
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/prediction/reg_scoring_recipe.py", line 146, in <module>
[09:24:02] [INFO] [dku.utils] - json.load_from_filepath(sys.argv[7]))
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/prediction/reg_scoring_recipe.py", line 133, in main
[09:24:02] [INFO] [dku.utils] - for output_df in output_generator():
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/prediction/reg_scoring_recipe.py", line 78, in output_generator
[09:24:02] [INFO] [dku.utils] - output_probas=recipe_desc["outputProbabilities"])
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/prediction/classification_scoring.py", line 206, in binary_classification_predict
[09:24:02] [INFO] [dku.utils] - (pred_df, proba_df) = binary_classification_predict_ex(clf, modeling_params, target_map, threshold, transformed, output_probas)
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/prediction/classification_scoring.py", line 157, in binary_classification_predict_ex
[09:24:02] [INFO] [dku.utils] - features_X_df = features_X.as_dataframe()
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dataiku-dss-4.0.8/python/dataiku/doctor/multiframe.py", line 269, in as_dataframe
[09:24:02] [INFO] [dku.utils] - blkdf = pd.DataFrame(blk.matrix.toarray(), columns=blk.names)
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dss/condaenv/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 920, in toarray
[09:24:02] [INFO] [dku.utils] - return self.tocoo(copy=False).toarray(order=order, out=out)
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dss/condaenv/lib/python2.7/site-packages/scipy/sparse/coo.py", line 252, in toarray
[09:24:02] [INFO] [dku.utils] - B = self._process_toarray_args(order, out)
[09:24:02] [INFO] [dku.utils] - File "/home/dataiku/dss/condaenv/lib/python2.7/site-packages/scipy/sparse/base.py", line 1009, in _process_toarray_args
[09:24:02] [INFO] [dku.utils] - return np.zeros(self.shape, dtype=self.dtype, order=order)
[09:24:02] [INFO] [dku.utils] - MemoryError
[09:24:02] [INFO] [dku.flow.activity] - Run thread failed for activity score_CB_descriptions_final_without_labels_14_NP
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:311)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:231)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeModule(AbstractPythonRecipeRunner.java:47)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.runOriginalPython(PredictionScoringRecipeRunner.java:388)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.runWithOriginalEngine(PredictionScoringRecipeRunner.java:294)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.run(PredictionScoringRecipeRunner.java:220)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:353)
[09:24:03] [INFO] [dku.flow.activity] running score_CB_descriptions_final_without_labels_14_NP - activity is finished
[09:24:03] [ERROR] [dku.flow.activity] running score_CB_descriptions_final_without_labels_14_NP - Activity failed
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:311)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:231)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeModule(AbstractPythonRecipeRunner.java:47)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.runOriginalPython(PredictionScoringRecipeRunner.java:388)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.runWithOriginalEngine(PredictionScoringRecipeRunner.java:294)
at com.dataiku.dip.analysis.ml.prediction.flow.PredictionScoringRecipeRunner.run(PredictionScoringRecipeRunner.java:220)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:353)




I have wasted hours creating and re-creating models.

Has anyone had a similar experience?

2 Replies
cperdigou
Dataiker Alumni
Hello,

Your input data is very wide: 26k+ columns after preprocessing. Scoring is done in chunks of 100,000 rows (this cannot be changed), so in your case each chunk becomes a 100,000 x 26,558 matrix. Converting that to a dense array does not fit in memory (your traceback shows the failure inside scipy's toarray() call), which produces the MemoryError in your logs.
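
For scale, here is a back-of-the-envelope sketch in Python (not DSS code, just arithmetic) of the dense allocation that the toarray() call in the traceback attempts:

# Scoring chunk size and TF-IDF column count, both taken from the log.
rows, cols = 100_000, 26_558
dense_bytes = rows * cols * 8              # float64 = 8 bytes per cell
print(f"{dense_bytes / 2**30:.1f} GiB")    # ~19.8 GiB for one dense chunk

The sparse TF-IDF matrix itself is comparatively small, but toarray() must allocate the full dense array in one go, hence the MemoryError.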
mattmagic
Level 2
Author
Thank you! I reduced the maximum number of words kept by the TF-IDF vectorization, and that solved it.
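
For anyone hitting the same error, here is a minimal scikit-learn sketch of that idea (illustrative only; in DSS this is configured on the text feature's preprocessing settings rather than in code). Capping the vocabulary keeps the scored matrix narrow enough to densify:

from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["cheap phone charger", "bluetooth speaker sale", "phone case deal"]

# max_features keeps only the N highest-frequency terms, so a 100k-row
# scoring chunk densifies to 100k x 2,000 (~1.5 GiB) instead of 100k x 26k+.
vec = TfidfVectorizer(max_features=2_000)
X = vec.fit_transform(docs)    # sparse matrix with at most 2,000 columns
print(X.shape)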