[2022/11/30-16:07:04.986] [ActivityExecutor-29] [INFO] [dku] running compute_e2JfcT5d_NP - ---------------------------------------- [2022/11/30-16:07:04.986] [ActivityExecutor-29] [INFO] [dku] running compute_e2JfcT5d_NP - DSS startup: jek version:11.1.0 [2022/11/30-16:07:04.986] [ActivityExecutor-29] [INFO] [dku] running compute_e2JfcT5d_NP - DSS home: C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home [2022/11/30-16:07:04.986] [ActivityExecutor-29] [INFO] [dku] running compute_e2JfcT5d_NP - OS: Windows 10 10.0 amd64 - Java: Temurin 1.8.0_322 [2022/11/30-16:07:04.949] [ActivityExecutor-29] [INFO] [dku.flow.jobrunner] running compute_e2JfcT5d_NP - Allocated a slot for this activity! [2022/11/30-16:07:05.021] [ActivityExecutor-29] [INFO] [dku.flow.jobrunner] running compute_e2JfcT5d_NP - Run activity [2022/11/30-16:07:05.082] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_e2JfcT5d_NP - Executing default pre-activity lifecycle hook [2022/11/30-16:07:05.279] [ActivityExecutor-29] [INFO] [dku.managedfolders.handler] running compute_e2JfcT5d_NP - Create provider for DKU_EXAM_ADV_DESIGNER.e2JfcT5d with path DKU_EXAM_ADV_DESIGNER/e2JfcT5d [2022/11/30-16:07:05.335] [ActivityExecutor-29] [INFO] [dku.managedfolders.handler] running compute_e2JfcT5d_NP - Ensured folder for root of managed folder [2022/11/30-16:07:05.368] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_e2JfcT5d_NP - Checking if sources are ready [2022/11/30-16:07:05.403] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_e2JfcT5d_NP - Will check readiness of DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct p=NP [2022/11/30-16:07:05.450] [ActivityExecutor-29] [INFO] [dku.datasets.file] running compute_e2JfcT5d_NP - Building Filesystem handler config: 
{"connection":"filesystem_managed","path":"/DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2022/11/30-16:07:05.483] [ActivityExecutor-29] [DEBUG] [dku.datasets.fsbased] running compute_e2JfcT5d_NP - getReadiness: will enumerate partition [2022/11/30-16:07:05.518] [ActivityExecutor-29] [INFO] [dku.datasets.ftplike] running compute_e2JfcT5d_NP - Enumerating Filesystem dataset prefix= [2022/11/30-16:07:05.552] [ActivityExecutor-29] [DEBUG] [dku.datasets.fsbased] running compute_e2JfcT5d_NP - Building FS provider for dataset handler: DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct [2022/11/30-16:07:05.591] [ActivityExecutor-29] [DEBUG] [dku.datasets.fsbased] running compute_e2JfcT5d_NP - FS Provider built [2022/11/30-16:07:05.618] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_e2JfcT5d_NP - Enumerating local filesystem prefix=/ [2022/11/30-16:07:05.655] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_e2JfcT5d_NP - Enumeration done nb_paths=1 size=8624612 [2022/11/30-16:07:05.685] [ActivityExecutor-29] [DEBUG] [dku.datasets.fsbased] running compute_e2JfcT5d_NP - getReadiness: enumerated partition, found 1 paths, computing hash [2022/11/30-16:07:05.715] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_e2JfcT5d_NP - Checked source readiness DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct -> true [2022/11/30-16:07:05.747] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_e2JfcT5d_NP - Computing hashes to propagate BEFORE activity [2022/11/30-16:07:05.776] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_e2JfcT5d_NP - Recorded 1 hashes before activity run [2022/11/30-16:07:05.804] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_e2JfcT5d_NP - Building recipe runner of type [2022/11/30-16:07:05.849] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running 
compute_e2JfcT5d_NP - Recipe runner built, will use 1 thread(s) [2022/11/30-16:07:05.882] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_e2JfcT5d_NP - Starting execution thread: com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner@5a0a9076 [2022/11/30-16:07:05.914] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_e2JfcT5d_NP - Execution threads started, waiting for activity end [2022/11/30-16:07:05.918] [FRT-34-FlowRunnable] [INFO] [dku.flow.activity] act.compute_e2JfcT5d_NP - Run thread for activity compute_e2JfcT5d_NP starting [2022/11/30-16:07:06.019] [FRT-34-FlowRunnable] [INFO] [dku.flow.custompython] act.compute_e2JfcT5d_NP - Dumping Python script to C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\script.py [2022/11/30-16:07:06.080] [FRT-34-FlowRunnable] [INFO] [dku.flow.abstract.python] act.compute_e2JfcT5d_NP - Dumping Python script to C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\script.py [2022/11/30-16:07:06.146] [FRT-34-FlowRunnable] [INFO] [dku.datasets.file] act.compute_e2JfcT5d_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2022/11/30-16:07:06.178] [FRT-34-FlowRunnable] [DEBUG] [dku.datasets.fsbased] act.compute_e2JfcT5d_NP - Building FS provider for dataset handler: DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct [2022/11/30-16:07:06.206] [FRT-34-FlowRunnable] [DEBUG] [dku.datasets.fsbased] act.compute_e2JfcT5d_NP - FS Provider built [2022/11/30-16:07:06.234] [FRT-34-FlowRunnable] [INFO] [dku.datasets.file] 
act.compute_e2JfcT5d_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2022/11/30-16:07:06.266] [FRT-34-FlowRunnable] [DEBUG] [dku.datasets.fsbased] act.compute_e2JfcT5d_NP - Building FS provider for dataset handler: DKU_EXAM_ADV_DESIGNER.Online_Retail_Distinct [2022/11/30-16:07:06.303] [FRT-34-FlowRunnable] [DEBUG] [dku.datasets.fsbased] act.compute_e2JfcT5d_NP - FS Provider built [2022/11/30-16:07:06.399] [FRT-34-FlowRunnable] [INFO] [dku.code.projectLibs] act.compute_e2JfcT5d_NP - EXTERNAL LIBS FROM DKU_EXAM_ADV_DESIGNER is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [2022/11/30-16:07:06.439] [FRT-34-FlowRunnable] [INFO] [dku.code.projectLibs] act.compute_e2JfcT5d_NP - chunkFolder is C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\localconfig\projects\DKU_EXAM_ADV_DESIGNER\lib\R [2022/11/30-16:07:06.481] [FRT-34-FlowRunnable] [INFO] [dip.plugin.presets] act.compute_e2JfcT5d_NP - Checking project-level settings for overriden presets and additional presets [2022/11/30-16:07:06.534] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_e2JfcT5d_NP - Writing dku-exec-env for local execution in C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\remote-run-env-def.json [2022/11/30-16:07:06.603] [FRT-34-FlowRunnable] [INFO] [dku.code.envs.resolution] act.compute_e2JfcT5d_NP - Executing Python activity in env: plugin_nlp-visualization_managed [2022/11/30-16:07:06.657] [FRT-34-FlowRunnable] [INFO] [dku.flow.abstract.python] act.compute_e2JfcT5d_NP - Execute activity command: 
["C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\Scripts\\python.exe","-u","C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\python-exec-wrapper.py","C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\script.py"] [2022/11/30-16:07:06.718] [FRT-34-FlowRunnable] [INFO] [dku.security.process] act.compute_e2JfcT5d_NP - Starting process (regular) [2022/11/30-16:07:07.037] [FRT-34-FlowRunnable] [INFO] [dku.security.process] act.compute_e2JfcT5d_NP - Process started with pid=4060 [2022/11/30-16:07:07.082] [FRT-34-FlowRunnable] [INFO] [dku.processes.cgroups] act.compute_e2JfcT5d_NP - Will use cgroups [] [2022/11/30-16:07:07.117] [FRT-34-FlowRunnable] [INFO] [dku.processes.cgroups] act.compute_e2JfcT5d_NP - Applying rules to used cgroups: [] [2022/11/30-16:07:07.161] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,117 INFO -------------------- [2022/11/30-16:07:07.164] [FRT-34-FlowRunnable] [DEBUG] [dku.resourceusage] act.compute_e2JfcT5d_NP - Reporting start of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_EXAM_ADV_DESIGNER","jobId":"Build_word_cloud__NP__2022-11-30T08-06-56.287","activityId":"compute_e2JfcT5d_NP","activityType":"recipe","recipeType":"CustomCode_nlp-visualization-wordcloud","recipeName":"compute_e2JfcT5d"},"type":"LOCAL_PROCESS","id":"wV4PLxZm62SzGJNE","startTime":1669795627157,"localProcess":{"cpuCurrent":0.0}} [2022/11/30-16:07:07.196] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,117 INFO Dataiku Python entrypoint starting up [2022/11/30-16:07:07.244] [process-resource-monitor-4060-39] [DEBUG] 
[dku.resource] - Process stats for pid 4060: {"pid":4060,"commandName":"C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\Scripts\\python.exe","cpuCurrent":0.0,"vmRSSTotalMBS":0} [2022/11/30-16:07:07.279] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,117 INFO executable = C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\Scripts\python.exe [2022/11/30-16:07:07.353] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,117 INFO argv = ['C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\python-exec-wrapper.py', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\script.py'] [2022/11/30-16:07:07.395] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,118 INFO -------------------- [2022/11/30-16:07:07.433] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,118 INFO Looking for RemoteRunEnvDef in .\remote-run-env-def.json [2022/11/30-16:07:07.469] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,118 INFO Found RemoteRunEnvDef environment: .\remote-run-env-def.json [2022/11/30-16:07:07.512] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,128 INFO Running a DSS Python recipe locally, uinsetting env [2022/11/30-16:07:07.553] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,130 INFO Setup complete, ready to execute Python code [2022/11/30-16:07:07.594] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,130 INFO Sys path: 
['C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\lib\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\kits\\dataiku-dss-11.1.0-win\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\Scripts\\python36.zip', 'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36\\DLLs', 'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36\\lib', 'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\lib\\site-packages', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\localconfig\\projects\\DKU_EXAM_ADV_DESIGNER\\lib\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\plugins\\installed\\nlp-visualization\\python-lib'] [2022/11/30-16:07:07.635] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:07,130 INFO Script file: C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\script.py [2022/11/30-16:07:14.740] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:14,739 INFO Text column: description [2022/11/30-16:07:14.776] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:14,740 INFO Language: en [2022/11/30-16:07:14.803] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:14,740 
INFO Subcharts column: None [2022/11/30-16:07:21.478] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,477 INFO Read dataset of shape: (694384, 1) [2022/11/30-16:07:21.502] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Remove stopwords: False [2022/11/30-16:07:21.521] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Stopwords folder path: None [2022/11/30-16:07:21.537] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Fonts folder path: C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\resource\fonts [2022/11/30-16:07:21.557] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Remove punctuation: False [2022/11/30-16:07:21.578] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Case-insensitive: False [2022/11/30-16:07:21.597] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Max number of words: 100 [2022/11/30-16:07:21.617] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Using built-in DSS palette: 'Default' with colors: ['#1F77B4', '#FF7F0E', '#2CA02C', '#D62728', '#9467BD', '#8C564B', '#E377C2', '#7F7F7F'] [2022/11/30-16:07:21.645] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Preparing data... [2022/11/30-16:07:21.664] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,478 INFO Preparing data: Done in 0.00 seconds. [2022/11/30-16:07:21.686] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:21,550 INFO Tokenizing 694384 document(s) in language 'en'... [2022/11/30-16:07:22.137] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,136 INFO Loading tokenizer for language 'en'... 
[2022/11/30-16:07:22.379] [null-err-37] [INFO] [dku.utils] - [2022-11-30 16:07:22,378] [INFO] Created vocabulary
[2022/11/30-16:07:22.400] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,378 INFO Created vocabulary
[2022/11/30-16:07:22.421] [null-err-37] [INFO] [dku.utils] - [2022-11-30 16:07:22,379] [INFO] Finished initializing nlp object
[2022/11/30-16:07:22.448] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,379 INFO Finished initializing nlp object
[2022/11/30-16:07:22.623] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,623 INFO Loading tokenizer for language 'en': done in 0.49 seconds
[2022/11/30-16:07:22.647] [null-err-37] [INFO] [dku.utils] - *************** Recipe code failed **************
[2022/11/30-16:07:22.667] [null-err-37] [INFO] [dku.utils] - Begin Python stack
[2022/11/30-16:07:22.687] [null-err-37] [INFO] [dku.utils] - Traceback (most recent call last):
[2022/11/30-16:07:22.707] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\python-exec-wrapper.py", line 208, in <module>
[2022/11/30-16:07:22.727] [null-err-37] [INFO] [dku.utils] - exec(f.read())
[2022/11/30-16:07:22.747] [null-err-37] [INFO] [dku.utils] - File "<string>", line 33, in <module>
[2022/11/30-16:07:22.771] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 343, in tokenize_and_count
[2022/11/30-16:07:22.797] [null-err-37] [INFO] [dku.utils] - docs = self._tokenize_texts(df_prepared)
[2022/11/30-16:07:22.823] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 215, in _tokenize_texts
[2022/11/30-16:07:22.846] [null-err-37] [INFO] [dku.utils] - for text_list, language in zip(texts, languages)
[2022/11/30-16:07:22.871] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 215, in
[2022/11/30-16:07:22.891] [null-err-37] [INFO] [dku.utils] - for text_list, language in zip(texts, languages)
[2022/11/30-16:07:22.912] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\spacy_tokenizer.py", line 359, in tokenize_list
[2022/11/30-16:07:22.934] [null-err-37] [INFO] [dku.utils] - n_process=self.DEFAULT_NUM_PROCESS,
[2022/11/30-16:07:22.958] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\lib\site-packages\spacy\language.py", line 1485, in pipe
[2022/11/30-16:07:22.982] [null-err-37] [INFO] [dku.utils] - for doc in docs:
[2022/11/30-16:07:23.005] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\lib\site-packages\spacy\language.py", line 1521, in _multiprocessing_pipe
[2022/11/30-16:07:23.030] [null-err-37] [INFO] [dku.utils] - proc.start()
[2022/11/30-16:07:23.055] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
[2022/11/30-16:07:23.082] [null-err-37] [INFO] [dku.utils] - self._popen = self._Popen(self)
[2022/11/30-16:07:23.111] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
[2022/11/30-16:07:23.135] [null-err-37] [INFO] [dku.utils] - return _default_context.get_context().Process._Popen(process_obj)
[2022/11/30-16:07:23.159] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
[2022/11/30-16:07:23.184] [null-err-37] [INFO] [dku.utils] - return Popen(process_obj)
[2022/11/30-16:07:23.207] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
[2022/11/30-16:07:23.237] [null-err-37] [INFO] [dku.utils] - reduction.dump(process_obj, to_child)
[2022/11/30-16:07:23.265] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
[2022/11/30-16:07:23.291] [null-err-37] [INFO] [dku.utils] - ForkingPickler(file, protocol).dump(obj)
[2022/11/30-16:07:23.318] [null-err-37] [INFO] [dku.utils] - _pickle.PicklingError: Can't pickle <function <lambda> at 0x000001C5AFDC80D0>: attribute lookup <lambda> on spacy_tokenizer failed
[2022/11/30-16:07:23.346] [null-err-37] [INFO] [dku.utils] - End Python stack
[2022/11/30-16:07:23.371] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,655 INFO Check if spark is available
[2022/11/30-16:07:23.399] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,656 INFO Not stopping a spark context: No module named 'pyspark'
[2022/11/30-16:07:23.427] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO --------------------
[2022/11/30-16:07:23.454] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO Dataiku Python entrypoint starting up
[2022/11/30-16:07:23.480] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO executable = C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\Scripts\python.exe
[2022/11/30-16:07:23.508] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO argv = 
['C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\python-exec-wrapper.py', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd\\script.py'] [2022/11/30-16:07:23.535] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO -------------------- [2022/11/30-16:07:23.566] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO Looking for RemoteRunEnvDef in .\remote-run-env-def.json [2022/11/30-16:07:23.597] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,802 INFO Found RemoteRunEnvDef environment: .\remote-run-env-def.json [2022/11/30-16:07:23.625] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,803 INFO Running a DSS Python recipe locally, uinsetting env [2022/11/30-16:07:23.654] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,805 INFO Setup complete, ready to execute Python code [2022/11/30-16:07:23.683] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,805 INFO Sys path: ['C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\compute_e2JfcT5d_NP\\custom-python-recipe\\pyoutbXmtciHfpXbd', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\lib\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\kits\\dataiku-dss-11.1.0-win\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\Scripts\\python36.zip', 'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36\\DLLs', 'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36\\lib', 
'C:\\Users\\low.yun\\AppData\\Local\\Programs\\Python\\Python36', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\lib\\site-packages', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\localconfig\\projects\\DKU_EXAM_ADV_DESIGNER\\lib\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\plugins\\installed\\nlp-visualization\\python-lib', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\code-envs\\python\\plugin_nlp-visualization_managed\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\DKU_EXAM_ADV_DESIGNER\\Build_word_cloud__NP__2022-11-30T08-06-56.287\\localconfig\\projects\\DKU_EXAM_ADV_DESIGNER\\lib\\python', 'C:\\Users\\low.yun\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\plugins\\installed\\nlp-visualization\\python-lib'] [2022/11/30-16:07:23.710] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:22,805 INFO Script file: C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\script.py [2022/11/30-16:07:27.171] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:27,171 INFO Text column: description [2022/11/30-16:07:27.204] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:27,171 INFO Language: en [2022/11/30-16:07:27.230] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:27,171 INFO Subcharts column: None [2022/11/30-16:07:31.947] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,947 INFO Read dataset of shape: (694384, 1) [2022/11/30-16:07:31.969] [null-err-37] 
[INFO] [dku.utils] - 2022-11-30 16:07:31,947 INFO Remove stopwords: False [2022/11/30-16:07:31.989] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,947 INFO Stopwords folder path: None [2022/11/30-16:07:32.008] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,947 INFO Fonts folder path: C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\resource\fonts [2022/11/30-16:07:32.027] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,947 INFO Remove punctuation: False [2022/11/30-16:07:32.045] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,948 INFO Case-insensitive: False [2022/11/30-16:07:32.066] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,948 INFO Max number of words: 100 [2022/11/30-16:07:32.086] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,948 INFO Using built-in DSS palette: 'Default' with colors: ['#1F77B4', '#FF7F0E', '#2CA02C', '#D62728', '#9467BD', '#8C564B', '#E377C2', '#7F7F7F'] [2022/11/30-16:07:32.105] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,948 INFO Preparing data... [2022/11/30-16:07:32.131] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:31,948 INFO Preparing data: Done in 0.00 seconds. [2022/11/30-16:07:32.153] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:32,014 INFO Tokenizing 694384 document(s) in language 'en'... [2022/11/30-16:07:32.560] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:32,560 INFO Loading tokenizer for language 'en'... 
[2022/11/30-16:07:32.812] [null-err-37] [INFO] [dku.utils] - [2022-11-30 16:07:32,812] [INFO] Created vocabulary
[2022/11/30-16:07:32.837] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:32,812 INFO Created vocabulary
[2022/11/30-16:07:32.857] [null-err-37] [INFO] [dku.utils] - [2022-11-30 16:07:32,812] [INFO] Finished initializing nlp object
[2022/11/30-16:07:32.875] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:32,812 INFO Finished initializing nlp object
[2022/11/30-16:07:33.016] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:33,016 INFO Loading tokenizer for language 'en': done in 0.46 seconds
[2022/11/30-16:07:33.039] [null-err-37] [INFO] [dku.utils] - *************** Recipe code failed **************
[2022/11/30-16:07:33.058] [null-err-37] [INFO] [dku.utils] - Begin Python stack
[2022/11/30-16:07:33.078] [null-err-37] [INFO] [dku.utils] - Traceback (most recent call last):
[2022/11/30-16:07:33.096] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\DKU_EXAM_ADV_DESIGNER\Build_word_cloud__NP__2022-11-30T08-06-56.287\compute_e2JfcT5d_NP\custom-python-recipe\pyoutbXmtciHfpXbd\python-exec-wrapper.py", line 208, in <module>
[2022/11/30-16:07:33.116] [null-err-37] [INFO] [dku.utils] - exec(f.read())
[2022/11/30-16:07:33.133] [null-err-37] [INFO] [dku.utils] - File "<string>", line 33, in <module>
[2022/11/30-16:07:33.152] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 343, in tokenize_and_count
[2022/11/30-16:07:33.170] [null-err-37] [INFO] [dku.utils] - docs = self._tokenize_texts(df_prepared)
[2022/11/30-16:07:33.192] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 215, in _tokenize_texts
[2022/11/30-16:07:33.210] [null-err-37] [INFO] [dku.utils] - for text_list, language in zip(texts, languages)
[2022/11/30-16:07:33.232] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\wordcloud_visualizer.py", line 215, in
[2022/11/30-16:07:33.258] [null-err-37] [INFO] [dku.utils] - for text_list, language in zip(texts, languages)
[2022/11/30-16:07:33.280] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\plugins\installed\nlp-visualization\python-lib\spacy_tokenizer.py", line 359, in tokenize_list
[2022/11/30-16:07:33.302] [null-err-37] [INFO] [dku.utils] - n_process=self.DEFAULT_NUM_PROCESS,
[2022/11/30-16:07:33.323] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\lib\site-packages\spacy\language.py", line 1485, in pipe
[2022/11/30-16:07:33.347] [null-err-37] [INFO] [dku.utils] - for doc in docs:
[2022/11/30-16:07:33.365] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Dataiku\DataScienceStudio\dss_home\code-envs\python\plugin_nlp-visualization_managed\lib\site-packages\spacy\language.py", line 1521, in _multiprocessing_pipe
[2022/11/30-16:07:33.385] [null-err-37] [INFO] [dku.utils] - proc.start()
[2022/11/30-16:07:33.407] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
[2022/11/30-16:07:33.426] [null-err-37] [INFO] [dku.utils] - self._popen = self._Popen(self)
[2022/11/30-16:07:33.446] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
[2022/11/30-16:07:33.466] [null-err-37] [INFO] [dku.utils] - return _default_context.get_context().Process._Popen(process_obj)
[2022/11/30-16:07:33.486] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
[2022/11/30-16:07:33.507] [null-err-37] [INFO] [dku.utils] - return Popen(process_obj)
[2022/11/30-16:07:33.527] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
[2022/11/30-16:07:33.545] [null-err-37] [INFO] [dku.utils] - prep_data = spawn.get_preparation_data(process_obj._name)
[2022/11/30-16:07:33.566] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
[2022/11/30-16:07:33.585] [null-err-37] [INFO] [dku.utils] - _check_not_importing_main()
[2022/11/30-16:07:33.604] [null-err-37] [INFO] [dku.utils] - File "C:\Users\low.yun\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
[2022/11/30-16:07:33.626] [null-err-37] [INFO] [dku.utils] - is not going to be frozen to produce an executable.''')
[2022/11/30-16:07:33.646] [null-err-37] [INFO] [dku.utils] - RuntimeError:
[2022/11/30-16:07:33.668] [null-err-37] [INFO] [dku.utils] - An attempt has been made to start a new process before the
[2022/11/30-16:07:33.690] [null-err-37] [INFO] [dku.utils] - current process has finished its bootstrapping phase.
[2022/11/30-16:07:33.711] [null-err-37] [INFO] [dku.utils] - This probably means that you are not using fork to start your
[2022/11/30-16:07:33.732] [null-err-37] [INFO] [dku.utils] - child processes and you have forgotten to use the proper idiom
[2022/11/30-16:07:33.755] [null-err-37] [INFO] [dku.utils] - in the main module:
[2022/11/30-16:07:33.774] [null-err-37] [INFO] [dku.utils] - if __name__ == '__main__':
[2022/11/30-16:07:33.795] [null-err-37] [INFO] [dku.utils] - freeze_support()
[2022/11/30-16:07:33.817] [null-err-37] [INFO] [dku.utils] - ...
[2022/11/30-16:07:33.839] [null-err-37] [INFO] [dku.utils] - The "freeze_support()" line can be omitted if the program
[2022/11/30-16:07:33.861] [null-err-37] [INFO] [dku.utils] - is not going to be frozen to produce an executable.
[2022/11/30-16:07:33.882] [null-err-37] [INFO] [dku.utils] - End Python stack
[2022/11/30-16:07:33.907] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:33,034 INFO Check if spark is available
[2022/11/30-16:07:33.929] [null-err-37] [INFO] [dku.utils] - 2022-11-30 16:07:33,035 INFO Not stopping a spark context: No module named 'pyspark'
[2022/11/30-16:07:58.022] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@2109328522,createdInThread=ActivityExecutor-29-29,forChildThread=false
[2022/11/30-16:07:58.058] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@2000791546,createdInThread=ActivityExecutor-29-29,forChildThread=true
[2022/11/30-16:07:58.092] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@843261814,createdInThread=FRT-34-FlowRunnable-34,forChildThread=true
[2022/11/30-16:07:58.126] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@28139560,createdInThread=FRT-34-FlowRunnable-34,forChildThread=true
[2022/11/30-16:07:58.156] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@54210837,createdInThread=FRT-34-FlowRunnable-34,forChildThread=true
[2022/11/30-16:07:58.190] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@2034182399,createdInThread=FRT-34-FlowRunnable-34,forChildThread=true
[2022/11/30-16:07:58.221] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@1382864953,createdInThread=FRT-34-FlowRunnable-34,forChildThread=true
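
Both tracebacks in the log above come from spaCy's `_multiprocessing_pipe` starting a worker through `popen_spawn_win32`. On that path `multiprocessing` pickles the process object, and the first traceback ends in a `_pickle.PicklingError` for an object in `spacy_tokenizer`. The sketch below is not the plugin's code. It only illustrates, under the assumption that the unpicklable object is a callable with no importable name (a lambda, for instance), why such an object cannot be sent to a spawned worker. The helper name `is_picklable` is hypothetical.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj (hypothetical helper for illustration)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A callable that can be imported by name (here the built-in len) is
    # pickled by reference, so a spawned worker can look it up again.
    print("built-in len picklable:", is_picklable(len))

    # A lambda has no importable name, so pickle cannot serialize it.
    print("lambda picklable:", is_picklable(lambda text: text.lower()))
```

The `if __name__ == "__main__":` guard is the idiom that the `RuntimeError` text in the second traceback asks for. This sketch starts no processes, so the guard appears here only to show its shape.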