Oops: an unexpected error occurred

The Python process failed (exit code: 1). More info might be available in the logs.

type: com.dataiku.dip.exceptions.ProcessDiedException

[22:07:12] [INFO] [dku] running cluster_GroupCustomerNo_joined_NP - ---------------------------------------- [22:07:12] [INFO] [dku] running cluster_GroupCustomerNo_joined_NP - DSS startup: jek version:11.3.2 [22:07:12] [INFO] [dku] running cluster_GroupCustomerNo_joined_NP - DSS home: C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\dss_home [22:07:12] [INFO] [dku] running cluster_GroupCustomerNo_joined_NP - OS: Windows 10 10.0 amd64 - Java: Temurin 1.8.0_322 [22:07:12] [INFO] [dku.flow.jobrunner] running cluster_GroupCustomerNo_joined_NP - Allocated a slot for this activity! [22:07:12] [INFO] [dku.flow.jobrunner] running cluster_GroupCustomerNo_joined_NP - Run activity [22:07:12] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Executing default pre-activity lifecycle hook [22:07:12] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Checking if sources are ready [22:07:12] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Will check readiness of SHOPPING.GroupCustomerNo_joined p=NP [22:07:13] [INFO] [dku.datasets.file] running cluster_GroupCustomerNo_joined_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"SHOPPING/GroupCustomerNo_joined","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [22:07:13] [DEBUG] [dku.datasets.fsbased] running cluster_GroupCustomerNo_joined_NP - getReadiness: will enumerate partition [22:07:13] [INFO] [dku.datasets.ftplike] running cluster_GroupCustomerNo_joined_NP - Enumerating Filesystem dataset prefix= [22:07:13] [DEBUG] [dku.datasets.fsbased] running cluster_GroupCustomerNo_joined_NP - Building FS provider for dataset handler: 
SHOPPING.GroupCustomerNo_joined [22:07:13] [DEBUG] [dku.datasets.fsbased] running cluster_GroupCustomerNo_joined_NP - FS Provider built [22:07:13] [DEBUG] [dku.fs.local] running cluster_GroupCustomerNo_joined_NP - Enumerating local filesystem prefix=/ [22:07:13] [DEBUG] [dku.fs.local] running cluster_GroupCustomerNo_joined_NP - Enumeration done nb_paths=1 size=73840 [22:07:13] [DEBUG] [dku.datasets.fsbased] running cluster_GroupCustomerNo_joined_NP - getReadiness: enumerated partition, found 1 paths, computing hash [22:07:13] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Checked source readiness SHOPPING.GroupCustomerNo_joined -> true [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Computing hashes to propagate BEFORE activity [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Recorded 1 hashes before activity run [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Building recipe runner of type [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Recipe runner built, will use 1 thread(s) [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Starting execution thread: com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner@153813f4 [22:07:13] [DEBUG] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Execution threads started, waiting for activity end [22:07:13] [INFO] [dku.flow.activity] - Run thread for activity cluster_GroupCustomerNo_joined_NP starting [22:07:13] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"SHOPPING/GroupCustomerNo_joined","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [22:07:13] [INFO] [dku.datasets.ftplike] - Enumerating Filesystem dataset prefix= [22:07:13] [DEBUG] [dku.datasets.fsbased] - Building FS 
provider for dataset handler: SHOPPING.GroupCustomerNo_joined [22:07:13] [DEBUG] [dku.datasets.fsbased] - FS Provider built [22:07:13] [DEBUG] [dku.fs.local] - Enumerating local filesystem prefix=/ [22:07:13] [DEBUG] [dku.fs.local] - Enumeration done nb_paths=1 size=73840 [22:07:13] [INFO] [dku.input.push] - USTP: push selection.method=FULL records=100000 ratio=0.02 col=null [22:07:13] [INFO] [dku.format] - Extractor run: limit={"maxBytes":-1,"maxRecords":-1,"ordering":{"enabled":false,"rules":[]}} totalRecords=0 [22:07:13] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [22:07:13] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [22:07:13] [INFO] [dku.format] - Start compressed [GZIP] stream: C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\dss_home\managed_datasets\SHOPPING\GroupCustomerNo_joined\out-s0.csv.gz / totalRecsBefore=0 [22:07:13] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [22:07:13] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [22:07:13] [INFO] [dku.format] - after stream totalComp=73840 totalUncomp=193369 totalRec=4572 [22:07:13] [INFO] [dku.format] - Extractor run done, totalCompressed=73840 totalRecords=4572 [22:07:13] [INFO] [dku.recipes.clustering] - Run clustering in code env built-in (set at deploy-time) [22:07:13] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"SHOPPING/GroupCustomerNo_joined_clustered","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [22:07:13] [DEBUG] [dku.datasets.fsbased] - Building FS provider for dataset handler: SHOPPING.GroupCustomerNo_joined_clustered [22:07:13] [DEBUG] [dku.datasets.fsbased] - FS Provider built [22:07:13] [INFO] [dku.datasets.file] - Building Filesystem handler config: 
{"connection":"filesystem_managed","path":"SHOPPING/GroupCustomerNo_joined","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [22:07:13] [DEBUG] [dku.datasets.fsbased] - Building FS provider for dataset handler: SHOPPING.GroupCustomerNo_joined [22:07:13] [DEBUG] [dku.datasets.fsbased] - FS Provider built [22:07:13] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"SHOPPING/GroupCustomerNo_joined","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [22:07:13] [DEBUG] [dku.datasets.fsbased] - Building FS provider for dataset handler: SHOPPING.GroupCustomerNo_joined [22:07:13] [DEBUG] [dku.datasets.fsbased] - FS Provider built [22:07:13] [INFO] [dku.code.projectLibs] - EXTERNAL LIBS FROM SHOPPING is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [22:07:13] [INFO] [dku.code.projectLibs] - chunkFolder is C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\SHOPPING\Build_GroupCustomerNo_joined_clustered__NP__2023-04-03T15-07-06.346\localconfig\projects\SHOPPING\lib\R [22:07:13] [INFO] [dku.recipes.code.base] - Writing dku-exec-env for local execution in C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\dss_home\jobs\SHOPPING\Build_GroupCustomerNo_joined_clustered__NP__2023-04-03T15-07-06.346\cluster_GroupCustomerNo_joined_NP\clustering-recipe\specCM2afnLep9j0\remote-run-env-def.json [22:07:13] [INFO] [dku.code.envs.resolution] - Executing Python activity in builtin env [22:07:13] [INFO] [dku.flow.abstract.python] - Execute activity command: 
["C:\\Users\\nunna\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\pyenv\\Scripts\\python.exe","-u","-m","dataiku.doctor.clustering.reg_cluster_recipe","C:\\Users\\nunna\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\jobs\\SHOPPING\\Build_GroupCustomerNo_joined_clustered__NP__2023-04-03T15-07-06.346\\cluster_GroupCustomerNo_joined_NP\\clustering-recipe\\specCM2afnLep9j0","GroupCustomerNo_joined_clustered",""] [22:07:13] [INFO] [dku.security.process] - Starting process (regular) [22:07:13] [INFO] [dku.security.process] - Process started with pid=22236 [22:07:13] [INFO] [dku.processes.cgroups] - Will use cgroups [] [22:07:13] [INFO] [dku.processes.cgroups] - Applying rules to used cgroups: [] [22:07:13] [DEBUG] [dku.resourceusage] - Reporting start of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"SHOPPING","jobId":"Build_GroupCustomerNo_joined_clustered__NP__2023-04-03T15-07-06.346","activityId":"cluster_GroupCustomerNo_joined_NP","activityType":"recipe","recipeType":"clustering_cluster","recipeName":"cluster_GroupCustomerNo_joined"},"type":"LOCAL_PROCESS","id":"pmFNtoBEMqEjyQ4B","startTime":1680534433974,"localProcess":{"cpuCurrent":0.0}} [22:07:13] [DEBUG] [dku.resource] - Process stats for pid 22236: {"pid":22236,"commandName":"C:\\Users\\nunna\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\pyenv\\Scripts\\python.exe","cpuCurrent":0.0,"vmRSSTotalMBS":0} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO START - Loading source dataset [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Reading with dtypes: None [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for CustomerNo: None (schema_type=bigint feature_type=NUMERIC feature_role=REJECT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for Frequency: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype 
for TotalSpend: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for Recency: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for Age: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for AnnualIncome: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for Work_Experience: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Computed dtype for Family_Size: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,051 INFO Reading with FIXED dtypes: {'Frequency': , 'TotalSpend': , 'Recency': , 'Age': , 'AnnualIncome': , 'Work_Experience': , 'Family_Size': } [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Loaded table [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Loaded full df: shape=(4572,8) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Coercion done [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO END - Loading source dataset [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO START - Collecting preprocessing data [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Looking at TotalSpend... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Looking at Work_Experience... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Looking at Recency... 
(type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Looking at Family_Size... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,085 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Looking at Frequency... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Looking at CustomerNo... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Looking at Age... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Looking at AnnualIncome... (type=NUMERIC) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Checking series of type: float64 (isM8=False) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO END - Collecting preprocessing data [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO generating interactions [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO START - Preprocessing data [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 INFO Set MF index len 4572 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,090 DEBUG FIT/PROCESS WITH Step:CopyMultipleColumnsFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG FIT/PROCESS WITH Step:CopyMultipleColumnsFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG FIT/PROCESS WITH Step:EmitCurrentMFAsResult [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 INFO Set MF index len 4572 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG FIT/PROCESS WITH Step:DumpPipelineState [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG ********* 
Pipeline state (After create profiling) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG input_df= (4572, 8) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG current_mf=(0, 0) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG PPR: [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG PROFILING = ((4572, 7)) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG UNPROCESSED = ((4572, 8)) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,095 DEBUG MIMIFI: Imputing with map {'TotalSpend': 12806.3160148731, 'Work_Experience': 2.484689413823272, 'Recency': 124.91207349081365, 'Family_Size': 2.7801837270341205, 'Frequency': 3.9888451443569553, 'Age': 43.6010498687664, 'AnnualIncome': 85049.85061242344} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,098 DEBUG FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,098 DEBUG MIMIFI: Imputing with map {} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,098 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (TotalSpend) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,098 DEBUG Rescale TotalSpend (avg=12806.3160148731 std=52746.89025499263 shift=12806.3160148731 inv_scale=1.8958463620617853e-05) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescaled TotalSpend (avg=8.149163272833689e-16 std=1.0000000000000002) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (Work_Experience) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescale Work_Experience (avg=2.484689413823272 std=3.263526896032945 shift=2.484689413823272 inv_scale=0.30641696295366005) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescaled Work_Experience (avg=1.5614028977121077e-17 std=1.000000000000029) nulls=0 [22:07:16] [INFO] 
[dku.utils] - 2023-04-03 22:07:16,106 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (Recency) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescale Recency (avg=124.91207349081365 std=99.90898138876355 shift=124.91207349081365 inv_scale=0.01000911015305844) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescaled Recency (avg=-2.232830426821591e-16 std=0.9999999999999915) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (Family_Size) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,106 DEBUG Rescale Family_Size (avg=2.7801837270341205 std=1.5524156831275133 shift=2.7801837270341205 inv_scale=0.644157367687364) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG Rescaled Family_Size (avg=2.028123950496349e-16 std=0.9999999999999842) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (Frequency) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG Rescale Frequency (avg=3.9888451443569553 std=6.827089527140132 shift=3.9888451443569553 inv_scale=0.14647530196061453) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG Rescaled Frequency (avg=3.181085219289053e-16 std=0.9999999999999943) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (Age) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG Rescale Age (avg=43.6010498687664 std=16.747525604143387 shift=43.6010498687664 inv_scale=0.05971031325084806) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG Rescaled Age (avg=1.0232895506934405e-16 std=1.000000000000002) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,162 DEBUG FIT/PROCESS WITH Step:RescalingProcessor2 (AnnualIncome) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG Rescale AnnualIncome (avg=85049.85061242344 std=37663.31932936189 shift=85049.85061242344 
inv_scale=2.6551032086553548e-05) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG Rescaled AnnualIncome (avg=1.4613565534108032e-16 std=0.9999999999999997) nulls=0 [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:FlushDFBuilder(num_flagonly) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:FlushDFBuilder(datetime_cyclical) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG MIMIFI: Imputing with map {} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:FlushDFBuilder(cat_flagpresence) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG MIMIFI: Imputing with map {} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG MIMIFI: Imputing with map {} [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:FlushDFBuilder(interaction) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:DumpPipelineState [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG ********* Pipeline state (After std handling) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG input_df= (4572, 8) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG current_mf=(4572, 7) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG PPR: [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG PROFILING = ((4572, 7)) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG UNPROCESSED = ((4572, 8)) [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,170 DEBUG FIT/PROCESS WITH Step:OutlierDetection [22:07:16] [INFO] 
[dku.utils] - 2023-04-03 22:07:16,170 DEBUG Outliers detection: fitting PCA [22:07:16] [INFO] [dku.utils] - 2023-04-03 22:07:16,172 INFO Fitting on [[-0.21416838 -0.76135098 -0.10922015 ... -0.437792 1.03889676 [22:07:16] [INFO] [dku.utils] - 1.47894956] [22:07:16] [INFO] [dku.utils] - [-0.24231866 -0.45493402 -0.01913815 ... -0.437792 1.03889676 [22:07:16] [INFO] [dku.utils] - -0.8481953 ] [22:07:16] [INFO] [dku.utils] - [-0.13492257 -0.45493402 0.88168176 ... -0.437792 -0.27473013 [22:07:16] [INFO] [dku.utils] - 0.70719071] [22:07:16] [INFO] [dku.utils] - ... [22:07:16] [INFO] [dku.utils] - [ 0.634519 0.1578999 -1.01004006 ... 2.93113995 -0.99125389 [22:07:16] [INFO] [dku.utils] - -0.03822952] [22:07:16] [INFO] [dku.utils] - [ 0.05936888 -0.45493402 -1.08010383 ... 0.58753512 -0.21501982 [22:07:16] [INFO] [dku.utils] - 0.82048396] [22:07:17] [INFO] [dku.utils] - [ 0.14647032 -0.76135098 -0.99002184 ... 0.58753512 -1.0509642 [22:07:17] [INFO] [dku.utils] - 1.32630767]] (cols ['TotalSpend', 'Work_Experience', 'Recency', 'Family_Size', 'Frequency', 'Age', 'AnnualIncome']) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,172 DEBUG Outliers detection: done fitting PCA [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,172 DEBUG Outliers detection: performing cubic-root kmeans on df (4572, 6) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,955 DEBUG Outliers detection: done kmeans [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,955 DEBUG Outliers detection: selecting mini-clusters [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,955 DEBUG Outliers detection: done (2 mini-clusters are outliers) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,955 DEBUG Detected 2 outliers [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,955 DEBUG Remove some rows. 
Shape before: [22:07:17] [INFO] [dku.utils] - MultiFrame (1 blocks): [22:07:17] [INFO] [dku.utils] - Block NUM_IMPUTED_KEPT () -> (4572,7) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO MultiFrame, dropping rows: [ 58 114 1026 1353 2665 2944 3129 4232 4248 4270 4274 4322 4324 4331 [22:07:17] [INFO] [dku.utils] - 4346 4349 4395 4456 4507 4520 4539 4553] [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG Removed some rows. Shape after: [22:07:17] [INFO] [dku.utils] - MultiFrame (1 blocks): [22:07:17] [INFO] [dku.utils] - Block NUM_IMPUTED_KEPT () -> (4550,7) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG After outliers input_df=(4550, 8) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG FIT/PROCESS WITH Step:DumpPipelineState [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG ********* Pipeline state (After outliers) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG input_df= (4550, 8) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG current_mf=(4550, 7) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG PPR: [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG PROFILING = ((4572, 7)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG UNPROCESSED = ((4550, 8)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG OUTLIERS = ((4572, 1)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG FIT/PROCESS WITH Step:EmitCurrentMFAsResult [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO Set MF index len 4550 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG FIT/PROCESS WITH Step:AddReferenceInOutput [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG FIT/PROCESS WITH Step:DumpPipelineState [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG ********* Pipeline state (After PCA) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG input_df= (4550, 8) [22:07:17] [INFO] [dku.utils] - 
2023-04-03 22:07:16,971 DEBUG current_mf=(0, 0) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG PPR: [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG PROFILING = ((4572, 7)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG UNPROCESSED = ((4550, 8)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG OUTLIERS = ((4572, 1)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG TRAIN_PREPCA = ((4550, 7)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 DEBUG TRAIN = ((4550, 7)) [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO END - Preprocessing data [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO KMEANS k=5 n_jobs=2 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP TotalSpend=-0.21416837960042276 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP Work_Experience=-0.7613509840668369 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP Recency=-0.10922014556782281 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP Family_Size=0.141596271768548 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP Frequency=-0.43779199503320154 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP Age=1.038896762571842 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:16,971 INFO FP AnnualIncome=1.4789495556795442 [22:07:17] [INFO] [dku.utils] - 2023-04-03 22:07:17,520 ERROR exception calling callback for [22:07:17] [INFO] [dku.utils] - sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: [22:07:17] [INFO] [dku.utils] - ''' [22:07:17] [INFO] [dku.utils] - Traceback (most recent call last): [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\multiprocessing\queues.py", line 109, in get [22:07:17] [INFO] [dku.utils] - self._sem.release() [22:07:17] [INFO] [dku.utils] - OSError: [WinError 6] The handle is invalid [22:07:17] [INFO] [dku.utils] - 
During handling of the above exception, another exception occurred: [22:07:17] [INFO] [dku.utils] - Traceback (most recent call last): [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 391, in _process_worker [22:07:17] [INFO] [dku.utils] - call_item = call_queue.get(block=True, timeout=timeout) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\multiprocessing\queues.py", line 111, in get [22:07:17] [INFO] [dku.utils] - self._rlock.release() [22:07:17] [INFO] [dku.utils] - OSError: [WinError 6] The handle is invalid [22:07:17] [INFO] [dku.utils] - ''' [22:07:17] [INFO] [dku.utils] - The above exception was the direct cause of the following exception: [22:07:17] [INFO] [dku.utils] - Traceback (most recent call last): [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks [22:07:17] [INFO] [dku.utils] - callback(self) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 309, in __call__ [22:07:17] [INFO] [dku.utils] - self.parallel.dispatch_next() [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next [22:07:17] [INFO] [dku.utils] - if not self.dispatch_one_batch(self._original_iterator): [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch [22:07:17] 
[INFO] [dku.utils] - self._dispatch(tasks) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch [22:07:17] [INFO] [dku.utils] - job = self._backend.apply_async(batch, callback=cb) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async [22:07:17] [INFO] [dku.utils] - future = self._workers.submit(SafeFunction(func)) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit [22:07:17] [INFO] [dku.utils] - fn, *args, **kwargs) [22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit [22:07:17] [INFO] [dku.utils] - raise self._flags.broken [22:07:17] [INFO] [dku.utils] - sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable. [22:07:17] [INFO] [dku.utils] - ERROR: The process "7800" not found. 
[22:07:17] [INFO] [dku.utils] - sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback:
[22:07:17] [INFO] [dku.utils] - '''
[22:07:17] [INFO] [dku.utils] - Traceback (most recent call last):
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\multiprocessing\queues.py", line 109, in get
[22:07:17] [INFO] [dku.utils] - self._sem.release()
[22:07:17] [INFO] [dku.utils] - OSError: [WinError 6] The handle is invalid
[22:07:17] [INFO] [dku.utils] - During handling of the above exception, another exception occurred:
[22:07:17] [INFO] [dku.utils] - Traceback (most recent call last):
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 391, in _process_worker
[22:07:17] [INFO] [dku.utils] - call_item = call_queue.get(block=True, timeout=timeout)
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\multiprocessing\queues.py", line 111, in get
[22:07:17] [INFO] [dku.utils] - self._rlock.release()
[22:07:17] [INFO] [dku.utils] - OSError: [WinError 6] The handle is invalid
[22:07:17] [INFO] [dku.utils] - '''
[22:07:17] [INFO] [dku.utils] - The above exception was the direct cause of the following exception:
[22:07:17] [INFO] [dku.utils] - Traceback (most recent call last):
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\runpy.py", line 193, in _run_module_as_main
[22:07:17] [INFO] [dku.utils] - "__main__", mod_spec)
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\runpy.py", line 85, in _run_code
[22:07:17] [INFO] [dku.utils] - exec(code, run_globals)
[22:07:17] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\python\dataiku\doctor\clustering\reg_cluster_recipe.py", line 93, in <module>
[22:07:18] [INFO] [dku.utils] - main(sys.argv[1], sys.argv[2], keptInputColumns)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\python\dataiku\doctor\clustering\reg_cluster_recipe.py", line 55, in main
[22:07:18] [INFO] [dku.utils] - (clf, actual_params, cluster_labels, additional_columns) = clustering_fit(modeling_params, transformed_train)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\python\dataiku\doctor\clustering\clustering_fit.py", line 120, in clustering_fit
[22:07:18] [INFO] [dku.utils] - cluster_labels_arr = clf.fit_predict(train_np)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 1000, in fit_predict
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 1000, in fit_predict
[22:07:18] [INFO] [dku.utils] - return self.fit(X, sample_weight=sample_weight).labels_
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: return self.fit(X, sample_weight=sample_weight).labels_
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 974, in fit
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 974, in fit
[22:07:18] [INFO] [dku.utils] - return_n_iter=True)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: return_n_iter=True)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 401, in k_means
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\cluster\k_means_.py", line 401, in k_means
[22:07:18] [INFO] [dku.utils] - for seed in seeds)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: for seed in seeds)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 934, in __call__
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 934, in __call__
[22:07:18] [INFO] [dku.utils] - self.retrieve()
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: self.retrieve()
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
[22:07:18] [INFO] [dku.utils] - self._output.extend(job.get(timeout=self.timeout))
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: self._output.extend(job.get(timeout=self.timeout))
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
[22:07:18] [INFO] [dku.utils] - return future.result(timeout=timeout)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: return future.result(timeout=timeout)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\concurrent\futures\_base.py", line 435, in result
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\concurrent\futures\_base.py", line 435, in result
[22:07:18] [INFO] [dku.utils] - return self.__get_result()
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: return self.__get_result()
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\concurrent\futures\_base.py", line 384, in __get_result
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\concurrent\futures\_base.py", line 384, in __get_result
[22:07:18] [INFO] [dku.utils] - raise self._exception
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: raise self._exception
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
[22:07:18] [INFO] [dku.utils] - callback(self)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: callback(self)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 309, in __call__
[22:07:18] [INFO] [dku.utils] - self.parallel.dispatch_next()
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: self.parallel.dispatch_next()
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 731, in dispatch_next
[22:07:18] [INFO] [dku.utils] - if not self.dispatch_one_batch(self._original_iterator):
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: if not self.dispatch_one_batch(self._original_iterator):
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 759, in dispatch_one_batch
[22:07:18] [INFO] [dku.utils] - self._dispatch(tasks)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: self._dispatch(tasks)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\parallel.py", line 716, in _dispatch
[22:07:18] [INFO] [dku.utils] - job = self._backend.apply_async(batch, callback=cb)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: job = self._backend.apply_async(batch, callback=cb)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\_parallel_backends.py", line 510, in apply_async
[22:07:18] [INFO] [dku.utils] - future = self._workers.submit(SafeFunction(func))
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: future = self._workers.submit(SafeFunction(func))
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
[22:07:18] [INFO] [dku.utils] - fn, *args, **kwargs)
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: fn, *args, **kwargs)
[22:07:18] [INFO] [dku.utils] - File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: File "C:\Users\nunna\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-11.3.2-win\pythonwin.packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1022, in submit
[22:07:18] [INFO] [dku.utils] - raise self._flags.broken
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: raise self._flags.broken
[22:07:18] [INFO] [dku.utils] - sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
[22:07:18] [WARN] [dku.utils] - Writer already closed, cannot write: sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
[22:07:18] [DEBUG] [dku.resourceusage] - Reporting completion of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"SHOPPING","jobId":"Build_GroupCustomerNo_joined_clustered__NP__2023-04-03T15-07-06.346","activityId":"cluster_GroupCustomerNo_joined_NP","activityType":"recipe","recipeType":"clustering_cluster","recipeName":"cluster_GroupCustomerNo_joined"},"type":"LOCAL_PROCESS","id":"pmFNtoBEMqEjyQ4B","startTime":1680534433974,"localProcess":{"pid":22236,"commandName":"C:\\Users\\nunna\\AppData\\Local\\Dataiku\\DataScienceStudio\\dss_home\\pyenv\\Scripts\\python.exe","cpuCurrent":0.0,"vmRSSTotalMBS":0}}
[22:07:18] [INFO] [dip.exec.resultHandler] - Did not find a specific error from error files or logs, falling back on return code
[22:07:18] [INFO] [dku.flow.activity] - Run thread failed for activity cluster_GroupCustomerNo_joined_NP
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.common.CodeBasedThingHelper.throwSubprocessError(CodeBasedThingHelper.java:23)
at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:29)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:73)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeModule(AbstractPythonRecipeRunner.java:88)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner$1.run(ClusteringClusterRecipeRunner.java:121)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner.startRunner(ClusteringClusterRecipeRunner.java:278)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner.run(ClusteringClusterRecipeRunner.java:262)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:375)
[22:07:18] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - activity is finished
[22:07:18] [ERROR] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Activity failed
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.common.CodeBasedThingHelper.throwSubprocessError(CodeBasedThingHelper.java:23)
at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:29)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:73)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeModule(AbstractPythonRecipeRunner.java:88)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner$1.run(ClusteringClusterRecipeRunner.java:121)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner.startRunner(ClusteringClusterRecipeRunner.java:278)
at com.dataiku.dip.analysis.ml.clustering.flow.ClusteringClusterRecipeRunner.run(ClusteringClusterRecipeRunner.java:262)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:375)
[22:07:18] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Executing default post-activity lifecycle hook
[22:07:18] [INFO] [dku.flow.activity] running cluster_GroupCustomerNo_joined_NP - Done post-activity tasks