---------------------------------------- DSS startup: jek version:8.0.0 DSS home: /home/dataiku/dss OS: Linux 3.10.0-1127.13.1.el7.x86_64 amd64 - Java: Oracle Corporation 1.8.0_252 [2020/08/24-09:02:46.140] [qtp641449963-21] [INFO] [dku.jobs] - Connects using Shared secret [2020/08/24-09:02:46.141] [qtp641449963-21] [DEBUG] [dku.jobs] - Received command : /pintercom/start_session [2020/08/24-09:02:46.263] [qtp641449963-21] [INFO] [dku.jobs] - Start Session: {"jobProjectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","jobTicketSecret":"cSGxrf6RbgRu5D9u4QvmtPdT0EIS4pNeViKQ6V7JeNKspGGGwyXvPpKv8tWfk9QH","initiatorAuthContext":{"userGroupLevelPermissions":{"admin":true,"mayManageUDM":true,"mayCreateProjects":true,"mayCreateProjectsFromMacros":true,"mayCreateProjectsFromTemplates":true,"mayCreateProjectsFromDataikuApps":true,"mayWriteUnsafeCode":true,"mayWriteSafeCode":true,"mayCreateAuthenticatedConnections":true,"mayCreateCodeEnvs":true,"mayCreateClusters":true,"mayDevelopPlugins":true,"mayEditLibFolders":true,"mayManageCodeEnvs":true,"mayManageClusters":true,"mayViewIndexedHiveConnections":true,"mayCreatePublishedAPIServices":true,"mayWriteInRootProjectFolder":true,"mayCreateActiveWebContent":true,"canObtainAPITicketFromCookiesForGroupsRegex":""},"via":[],"authSource":"USER_FROM_UI","realUserLogin":"admin","userGroups":["administrators"],"userProfile":"DESIGNER"}} [2020/08/24-09:02:46.323] [qtp641449963-21] [INFO] [dip.tickets.fixed] - Creating service for single API ticket in auth context: admin [2020/08/24-09:02:46.477] [qtp641449963-21] [DEBUG] [dku.jobs] - Command /pintercom/start_session processed in 337ms [2020/08/24-09:02:46.594] [qtp641449963-22] [INFO] [dku.jobs] - Connects using Shared secret [2020/08/24-09:02:46.597] [qtp641449963-22] [DEBUG] [dku.jobs] - Received command : /pintercom/resolve_job [2020/08/24-09:02:46.597] [qtp641449963-22] [INFO] [dku.job.slave] - Starting to resolve job [2020/08/24-09:02:47.218] [qtp641449963-22] [INFO] [dku.job.slave] - { "type": "NON_RECURSIVE_FORCED_BUILD", "projectKey": "DKU_CHURN", "id": "Build_geo_encoded_2020-08-24T09-02-45.809", "name": "Build geo_encoded", "initiator": "admin", "triggeredFrom": "RECIPE", "recipe": "compute_geo_encoded", "initiationTimestamp": 1598259765809, "mailNotification": false, "outputs": [ { "type": "DATASET", "targetDatasetProjectKey": "DKU_CHURN", "targetDataset": "geo_encoded" } ], "refreshHiveMetastore": true } [2020/08/24-09:02:47.218] [qtp641449963-22] [INFO] [dku.job.slave] - Loading recipes [2020/08/24-09:02:47.284] [qtp641449963-22] [DEBUG] [dku.flow.recipes] - Building global graph [2020/08/24-09:02:48.475] [qtp641449963-22] [DEBUG] [dku.flow.recipes] - Done getting global graph t=1190ms [2020/08/24-09:02:48.476] [qtp641449963-22] [INFO] [dku.job.slave] - Recipes loaded, computing job details [2020/08/24-09:02:48.547] [qtp641449963-22] [INFO] [dku.flow.compute] - Job output type: DATASET [2020/08/24-09:02:48.621] [qtp641449963-22] [INFO] [dku.flow.compute] - computing job from target DKU_CHURN.geo_encoded partition NP [2020/08/24-09:02:48.659] [qtp641449963-22] [INFO] [dku.flow.compute] - Refreshing DKU_CHURN.geo_encoded/NP with recipe compute_geo_encoded [2020/08/24-09:02:48.725] [qtp641449963-22] [INFO] [dku.flow.compute.op] activity compute_geo_encoded_ - Compute output partitions for subgraph, with target : DKU_CHURN.geo_encoded and partition: NP [2020/08/24-09:02:48.773] [qtp641449963-22] [INFO] [dku.flow.compute.op] activity compute_geo_encoded_ - Target DKU_CHURN.geo_encoded is not partitioned [2020/08/24-09:02:48.774] [qtp641449963-22] [INFO] [dku.flow.compute.op] activity compute_geo_encoded_ - Add output partition for DKU_CHURN.geo_encoded : NP [2020/08/24-09:02:48.774] [qtp641449963-22] [INFO] [dku.flow.compute] activity compute_geo_encoded_ - computed output partitions: {DKU_CHURN.geo_encoded=} [2020/08/24-09:02:48.813] [qtp641449963-22] [INFO] [dku.flow.compute.ip] activity compute_geo_encoded_ - 0 dependency for DKU_CHURN.revenue_loss [2020/08/24-09:02:48.814] [qtp641449963-22] [INFO] [dku.flow.compute.ip] activity compute_geo_encoded_ - source DKU_CHURN.revenue_loss is not partitioned, including the global partition [2020/08/24-09:02:48.815] [qtp641449963-22] [INFO] [dku.flow.compute] activity compute_geo_encoded_ - computed input partitions : {DKU_CHURN.revenue_loss=[]} [2020/08/24-09:02:48.867] [qtp641449963-22] [INFO] [dku.flow.compute] - Before prune, job looks like: JOB ACTIVITY ROOT DEPS ACTIVITY compute_geo_encoded_NP SGID: compute_geo_encoded_NP SRC: DKU_CHURN.revenue_loss (partitions: NP) DST: DKU_CHURN.geo_encoded (partition: NP) [2020/08/24-09:02:48.869] [qtp641449963-22] [INFO] [dku.flow.compute] - After cleanup of implicit recipes, job looks like: JOB ACTIVITY ROOT DEPS ACTIVITY compute_geo_encoded_NP SGID: compute_geo_encoded_NP SRC: DKU_CHURN.revenue_loss (partitions: NP) DST: DKU_CHURN.geo_encoded (partition: NP) [2020/08/24-09:02:48.916] [qtp641449963-22] [INFO] [dku.flow.compute] - Job project: DKU_CHURN [2020/08/24-09:02:49.099] [qtp641449963-22] [INFO] [dku.flow.compute] - Doing job pruning according to settings [2020/08/24-09:02:49.209] [qtp641449963-22] [INFO] [dku.flow.compute] - not pruning activity, job type is NON_RECURSIVE_FORCED_BUILD [2020/08/24-09:02:49.213] [qtp641449963-22] [INFO] [dku.flow.compute] - After prune, job looks like that [2020/08/24-09:02:49.216] [qtp641449963-22] [INFO] [dku.flow.compute] - JOB ACTIVITY ROOT DEPS ACTIVITY compute_geo_encoded_NP SGID: compute_geo_encoded_NP SRC: DKU_CHURN.revenue_loss (partitions: NP) DST: DKU_CHURN.geo_encoded (partition: NP) [2020/08/24-09:02:49.217] [qtp641449963-22] [INFO] [dku.flow.compute] - Done building final job graph [2020/08/24-09:02:49.262] [qtp641449963-22] [INFO] [dku.job.slave] - Done computing job data, dumping graph [2020/08/24-09:02:49.265] [qtp641449963-22] [INFO] [dku.job.slave] - JOB ACTIVITY ROOT DEPS ACTIVITY compute_geo_encoded_NP SGID: compute_geo_encoded_NP SRC: DKU_CHURN.revenue_loss (partitions: NP) DST: DKU_CHURN.geo_encoded (partition: NP) [2020/08/24-09:02:49.311] [qtp641449963-22] [INFO] [dku.job.slave] - Job has the following sources: [{"projectKey":"DKU_CHURN","type":"DATASET","id":"revenue_loss"}] [2020/08/24-09:02:49.313] [qtp641449963-22] [INFO] [dku.job.slave] - Trying to obtain grant: datasetread::READ_DATA::DKU_CHURN.revenue_loss [2020/08/24-09:02:49.856] [qtp641449963-22] [DEBUG] [dku.jobs] - Command /pintercom/resolve_job processed in 3262ms [2020/08/24-09:02:49.942] [qtp641449963-23] [INFO] [dku.jobs] - Connects using Shared secret [2020/08/24-09:02:49.949] [qtp641449963-23] [DEBUG] [dku.jobs] - Received command : /pintercom/run_fully [2020/08/24-09:02:50.118] [qtp641449963-23] [INFO] [dku.job.slave] - Job run started in kernel with pid 1815 [2020/08/24-09:02:50.157] [qtp641449963-23] [INFO] [dku.flow.jobrunner] - Creating 5 activity runner thread(s). [2020/08/24-09:02:50.192] [qtp641449963-23] [INFO] [dku.flow.jobrunner] - Starting activity executor threads [2020/08/24-09:02:50.280] [qtp641449963-23] [INFO] [dku.flow.jobrunner] - Waiting for activity executor threads [2020/08/24-09:02:50.295] [ActivityExecutor-29] [INFO] [dku.flow.jobrunner] - About to run compute_geo_encoded_NP [2020/08/24-09:02:50.514] [ActivityExecutor-29] [INFO] [dku.flow.jobrunner] running compute_geo_encoded_NP - Allocated a slot for this activity! 327.716: [GC (Allocation Failure) [PSYoungGen: 82402K->8109K(102912K)] 90988K->22024K(136704K), 0.0948257 secs] [Times: user=0.06 sys=0.01, real=0.10 secs] [2020/08/24-09:02:50.624] [ActivityExecutor-29] [INFO] [dku.flow.jobrunner] running compute_geo_encoded_NP - Run activity [2020/08/24-09:02:50.681] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Executing default pre-activity lifecycle hook [2020/08/24-09:02:50.730] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Checking if sources are ready [2020/08/24-09:02:50.809] [ActivityExecutor-29] [DEBUG] [dku.dataset.hash] running compute_geo_encoded_NP - Readiness cache miss for dataset__admin__DKU_CHURN.revenue_loss__NP [2020/08/24-09:02:50.871] [ActivityExecutor-29] [INFO] [dku.datasets.file] running compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:50.875] [ActivityExecutor-29] [INFO] [dku.datasets.ftplike] running compute_geo_encoded_NP - Enumerating Filesystem dataset prefix= [2020/08/24-09:02:50.967] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumerating local filesystem prefix=/ [2020/08/24-09:02:50.987] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumeration done nb_paths=1 size=140763 [2020/08/24-09:02:51.004] [ActivityExecutor-29] [INFO] [dku.dataset.hash] running compute_geo_encoded_NP - Caching readiness for dataset__admin__DKU_CHURN.revenue_loss__NP s=READY h=laRxnaP0X8GeOCIRP+XEJA [2020/08/24-09:02:51.005] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Checked source readiness DKU_CHURN.revenue_loss -> true [2020/08/24-09:02:51.021] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Computing hashes to propagate BEFORE activity [2020/08/24-09:02:51.058] [ActivityExecutor-29] [DEBUG] [dku.dataset.hash] running compute_geo_encoded_NP - Readiness cache miss for dataset__admin__DKU_CHURN.revenue_loss__NP [2020/08/24-09:02:51.060] [ActivityExecutor-29] [INFO] [dku.datasets.file] running compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:51.060] [ActivityExecutor-29] [INFO] [dku.datasets.ftplike] running compute_geo_encoded_NP - Enumerating Filesystem dataset prefix= [2020/08/24-09:02:51.063] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumerating local filesystem prefix=/ [2020/08/24-09:02:51.064] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumeration done nb_paths=1 size=140763 [2020/08/24-09:02:51.065] [ActivityExecutor-29] [INFO] [dku.dataset.hash] running compute_geo_encoded_NP - Caching readiness for dataset__admin__DKU_CHURN.revenue_loss__NP s=READY h=laRxnaP0X8GeOCIRP+XEJA [2020/08/24-09:02:51.066] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Recorded 1 hashes before activity run [2020/08/24-09:02:51.104] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Building recipe runner of type [2020/08/24-09:02:51.176] [ActivityExecutor-29] [DEBUG] [dku.job.activity] running compute_geo_encoded_NP - Filling source sizes [2020/08/24-09:02:51.180] [ActivityExecutor-29] [INFO] [dku.datasets.file] running compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:51.181] [ActivityExecutor-29] [INFO] [dku.datasets.ftplike] running compute_geo_encoded_NP - Enumerating Filesystem dataset prefix= [2020/08/24-09:02:51.209] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumerating local filesystem prefix=/ [2020/08/24-09:02:51.210] [ActivityExecutor-29] [DEBUG] [dku.fs.local] running compute_geo_encoded_NP - Enumeration done nb_paths=1 size=140763 [2020/08/24-09:02:51.211] [ActivityExecutor-29] [DEBUG] [dku.job.activity] running compute_geo_encoded_NP - Done filling source sizes [2020/08/24-09:02:51.212] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Recipe runner built, will use 1 thread(s) [2020/08/24-09:02:51.213] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Starting execution thread: com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner@5dfcdab8 [2020/08/24-09:02:51.241] [ActivityExecutor-29] [DEBUG] [dku.flow.activity] running compute_geo_encoded_NP - Execution threads started, waiting for activity end [2020/08/24-09:02:51.246] [FRT-34-FlowRunnable] [INFO] [dku.flow.activity] act.compute_geo_encoded_NP - Run thread for activity compute_geo_encoded_NP starting [2020/08/24-09:02:51.249] [FRT-34-FlowRunnable] [INFO] [dku.flow.custompython] act.compute_geo_encoded_NP - Dumping Python script to /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/script.py [2020/08/24-09:02:51.324] [FRT-34-FlowRunnable] [INFO] [dku.flow.abstract.python] act.compute_geo_encoded_NP - Dumping Python script to /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/script.py [2020/08/24-09:02:51.344] [FRT-34-FlowRunnable] [INFO] [dku.datasets.file] act.compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"DKU_CHURN/geo_encoded","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:51.352] [FRT-34-FlowRunnable] [INFO] [dku.datasets.file] act.compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:51.355] [FRT-34-FlowRunnable] [INFO] [dku.datasets.file] act.compute_geo_encoded_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:51.437] [FRT-34-FlowRunnable] [WARN] [dku.code.projectLibs] act.compute_geo_encoded_NP - External libraries file not found: /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/localconfig/projects/DKU_CHURN/lib/external-libraries.json [2020/08/24-09:02:51.439] [FRT-34-FlowRunnable] [INFO] [dku.code.projectLibs] act.compute_geo_encoded_NP - EXTERNAL LIBS FROM DKU_CHURN is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [2020/08/24-09:02:51.441] [FRT-34-FlowRunnable] [INFO] [dku.code.projectLibs] act.compute_geo_encoded_NP - chunkFolder is /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/localconfig/projects/DKU_CHURN/lib/R [2020/08/24-09:02:51.451] [FRT-34-FlowRunnable] [INFO] [dip.plugin.presets] act.compute_geo_encoded_NP - Checking project-level settings for overriden presets and additional presets [2020/08/24-09:02:51.481] [FRT-34-FlowRunnable] [INFO] [xxx] act.compute_geo_encoded_NP - RSRC PATH: ["/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/localconfig/projects/DKU_CHURN/lib/R"] [2020/08/24-09:02:51.484] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Writing dku-exec-env for local execution in /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/remote-run-env-def.json [2020/08/24-09:02:51.550] [FRT-34-FlowRunnable] [INFO] [dku.code.envs.resolution] act.compute_geo_encoded_NP - Executing Python activity in env: plugin_geocoder_managed [2020/08/24-09:02:51.622] [FRT-34-FlowRunnable] [INFO] [dku.flow.abstract.python] act.compute_geo_encoded_NP - Execute activity command: ["/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python","-u","/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/python-exec-wrapper.py","/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/script.py"] [2020/08/24-09:02:51.623] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Run command insecurely, from user dataiku [2020/08/24-09:02:51.625] [FRT-34-FlowRunnable] [INFO] [dku.security.process] act.compute_geo_encoded_NP - Starting process (regular) [2020/08/24-09:02:51.682] [FRT-34-FlowRunnable] [INFO] [dku.security.process] act.compute_geo_encoded_NP - Process started with pid=2037 [2020/08/24-09:02:51.686] [FRT-34-FlowRunnable] [INFO] [dku.processes.cgroups] act.compute_geo_encoded_NP - Will use cgroups [] [2020/08/24-09:02:51.687] [FRT-34-FlowRunnable] [INFO] [dku.processes.cgroups] act.compute_geo_encoded_NP - Applying rules to used cgroups: [] [2020/08/24-09:02:51.720] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Process reads from nothing [2020/08/24-09:02:51.725] [FRT-34-FlowRunnable] [INFO] [dku.resourceusage] act.compute_geo_encoded_NP - Reporting start of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","activityId":"compute_geo_encoded_NP","activityType":"recipe","recipeType":"CustomCode_forward_geocoding","recipeName":"compute_geo_encoded"},"type":"LOCAL_PROCESS","id":"QMODTbTlBL6a8l0R","startTime":1598259771721,"localProcess":{"cpuCurrent":0.0}} [2020/08/24-09:02:51.727] [FRT-34-FlowRunnable] [INFO] [dku.usage.computeresource.jek] act.compute_geo_encoded_NP - Reporting start of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","activityId":"compute_geo_encoded_NP","activityType":"recipe","recipeType":"CustomCode_forward_geocoding","recipeName":"compute_geo_encoded"},"type":"LOCAL_PROCESS","id":"QMODTbTlBL6a8l0R","startTime":1598259771721,"localProcess":{"cpuCurrent":0.0}} [2020/08/24-09:02:51.828] [process-resource-monitor-2037-39] [INFO] [dku.usage.computeresource.jek] - Reporting update of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","activityId":"compute_geo_encoded_NP","activityType":"recipe","recipeType":"CustomCode_forward_geocoding","recipeName":"compute_geo_encoded"},"type":"LOCAL_PROCESS","id":"QMODTbTlBL6a8l0R","startTime":1598259771721,"localProcess":{"pid":2037,"commandName":"/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python","cpuUserTimeMS":10,"cpuSystemTimeMS":0,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":10,"cpuCurrent":0.0,"vmSizeMB":18,"vmRSSMB":3,"vmHWMMB":3,"vmRSSAnonMB":1,"vmDataMB":1,"vmSizePeakMB":18,"vmRSSPeakMB":3,"vmRSSTotalMBS":0,"majorFaults":0,"childrenMajorFaults":0}} [2020/08/24-09:02:51.846] [process-resource-monitor-2037-39] [DEBUG] [dku.resource] - Process stats for pid 2037: {"pid":2037,"commandName":"/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python","cpuUserTimeMS":10,"cpuSystemTimeMS":0,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":10,"cpuCurrent":0.0,"vmSizeMB":18,"vmRSSMB":3,"vmHWMMB":3,"vmRSSAnonMB":1,"vmDataMB":1,"vmSizePeakMB":18,"vmRSSPeakMB":3,"vmRSSTotalMBS":0,"majorFaults":0,"childrenMajorFaults":0} [2020/08/24-09:02:52.066] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,064 INFO -------------------- [2020/08/24-09:02:52.070] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,064 INFO Dataiku Python entrypoint starting up [2020/08/24-09:02:52.071] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,064 INFO executable = /home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python [2020/08/24-09:02:52.072] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,065 INFO argv = ['/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/python-exec-wrapper.py', '/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/script.py'] [2020/08/24-09:02:52.072] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,065 INFO -------------------- [2020/08/24-09:02:52.073] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,065 INFO Looking for RemoteRunEnvDef in ./remote-run-env-def.json [2020/08/24-09:02:52.074] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,065 INFO Found RemoteRunEnvDef environment: ./remote-run-env-def.json [2020/08/24-09:02:52.102] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,099 INFO Running a DSS Python recipe locally, uinsetting env [2020/08/24-09:02:52.103] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,101 INFO Setup complete, ready to execute Python code [2020/08/24-09:02:52.104] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,101 INFO Sys path: ['/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP', '/home/dataiku/dss/lib/python', '/home/dataiku/dataiku-dss-8.0.0/python', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python27.zip', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python2.7', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python2.7/plat-linux2', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python2.7/lib-tk', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python2.7/lib-old', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7', '/usr/lib/python2.7', '/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/lib/python2.7/site-packages', u'/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/localconfig/projects/DKU_CHURN/lib/python', u'/home/dataiku/dss/plugins/installed/geocoder/python-lib'] [2020/08/24-09:02:52.104] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:02:52,101 INFO Script file: /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/script.py [2020/08/24-09:02:55.286] [Thread-13] [INFO] [dku.job.slave] - Sending status update [2020/08/24-09:02:55.328] [Thread-13] [INFO] [dku.job.slave] - Status update sent [2020/08/24-09:02:55.959] [qtp641449963-24] [INFO] [dku.jobs] - Connects using API ticket [2020/08/24-09:02:55.959] [qtp641449963-24] [DEBUG] [dku.jobs] - Received command : /tintercom/datasets/get-schema [2020/08/24-09:02:55.981] [qtp641449963-24] [INFO] [dku.job.slave] - Trying to obtain grant: datasetread::READ_SCHEMA::DKU_CHURN.revenue_loss [2020/08/24-09:02:56.040] [qtp641449963-24] [INFO] [dku.job.slave] - Reading schema of dataset DKU_CHURN.revenue_loss [2020/08/24-09:02:56.042] [qtp641449963-24] [DEBUG] [dku.jobs] - Command /tintercom/datasets/get-schema processed in 83ms [2020/08/24-09:02:56.060] [qtp641449963-20] [INFO] [dku.jobs] - Connects using API ticket [2020/08/24-09:02:56.060] [qtp641449963-20] [DEBUG] [dku.jobs] - Received command : /tintercom/datasets/read-data [2020/08/24-09:02:56.066] [qtp641449963-20] [INFO] [dku.job.slave] - Using cached grant: datasetread::READ_DATA::DKU_CHURN.revenue_loss [2020/08/24-09:02:56.115] [qtp641449963-20] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:56.119] [qtp641449963-20] [INFO] [dku.datasets.ftplike] - Enumerating Filesystem dataset prefix= [2020/08/24-09:02:56.142] [qtp641449963-20] [DEBUG] [dku.fs.local] - Enumerating local filesystem prefix=/ [2020/08/24-09:02:56.150] [qtp641449963-20] [DEBUG] [dku.fs.local] - Enumeration done nb_paths=1 size=140763 [2020/08/24-09:02:56.151] [qtp641449963-20] [INFO] [dku.input.push] - USTP: push selection.method=FULL records=-1 ratio=0.02 col=null [2020/08/24-09:02:56.159] [qtp641449963-20] [INFO] [dku.format] - Extractor run: limit={"maxBytes":-1,"maxRecords":-1,"ordering":{"enabled":false,"rules":[]}} totalRecords=0 [2020/08/24-09:02:56.184] [qtp641449963-20] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:56.186] [qtp641449963-20] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:56.186] [qtp641449963-20] [INFO] [dku.format] - Start compressed [GZIP] stream: /home/dataiku/dss/managed_datasets/DKU_CHURN.revenue_loss/out-s0.csv.gz / totalRecsBefore=0 [2020/08/24-09:02:56.186] [qtp641449963-20] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:56.187] [qtp641449963-20] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:58.799] [qtp641449963-20] [INFO] [dku.format] - after stream totalComp=140763 totalUncomp=388560 totalRec=1809 [2020/08/24-09:02:58.800] [qtp641449963-20] [INFO] [dku.format] - Extractor run done, totalCompressed=140763 totalRecords=1809 [2020/08/24-09:02:58.813] [qtp641449963-20] [DEBUG] [dku.jobs] - Command /tintercom/datasets/read-data processed in 2753ms [2020/08/24-09:02:59.010] [qtp641449963-22] [INFO] [dku.jobs] - Connects using API ticket [2020/08/24-09:02:59.010] [qtp641449963-22] [DEBUG] [dku.jobs] - Received command : /tintercom/datasets/read-data [2020/08/24-09:02:59.017] [qtp641449963-22] [INFO] [dku.job.slave] - Using cached grant: datasetread::READ_DATA::DKU_CHURN.revenue_loss [2020/08/24-09:02:59.022] [qtp641449963-22] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"/DKU_CHURN.revenue_loss","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2020/08/24-09:02:59.022] [qtp641449963-22] [INFO] [dku.datasets.ftplike] - Enumerating Filesystem dataset prefix= [2020/08/24-09:02:59.024] [qtp641449963-22] [DEBUG] [dku.fs.local] - Enumerating local filesystem prefix=/ [2020/08/24-09:02:59.024] [qtp641449963-22] [DEBUG] [dku.fs.local] - Enumeration done nb_paths=1 size=140763 [2020/08/24-09:02:59.025] [qtp641449963-22] [INFO] [dku.input.push] - USTP: push selection.method=FULL records=-1 ratio=0.02 col=null [2020/08/24-09:02:59.041] [qtp641449963-22] [INFO] [dku.format] - Extractor run: limit={"maxBytes":-1,"maxRecords":-1,"ordering":{"enabled":false,"rules":[]}} totalRecords=0 [2020/08/24-09:02:59.047] [qtp641449963-22] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:59.047] [qtp641449963-22] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:59.048] [qtp641449963-22] [INFO] [dku.format] - Start compressed [GZIP] stream: /home/dataiku/dss/managed_datasets/DKU_CHURN.revenue_loss/out-s0.csv.gz / totalRecsBefore=0 [2020/08/24-09:02:59.048] [qtp641449963-22] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** [2020/08/24-09:02:59.049] [qtp641449963-22] [INFO] [dku] - getCompression filename=**out-s0.csv.gz** 336.356: [GC (Metadata GC Threshold) [PSYoungGen: 86852K->6810K(105472K)] 100767K->20725K(139264K), 0.1443888 secs] [Times: user=0.05 sys=0.01, real=0.15 secs] 336.501: [Full GC (Metadata GC Threshold) [PSYoungGen: 6810K->0K(105472K)] [ParOldGen: 13915K->17059K(56320K)] 20725K->17059K(161792K), [Metaspace: 34854K->34854K(1081344K)], 0.5499523 secs] [Times: user=0.28 sys=0.00, real=0.55 secs] [2020/08/24-09:03:00.334] [Thread-13] [INFO] [dku.job.slave] - Sending status update [2020/08/24-09:03:00.368] [Thread-13] [INFO] [dku.job.slave] - Status update sent [2020/08/24-09:03:00.402] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:03:00,400 INFO Starting dataframes iterator [2020/08/24-09:03:00.593] [qtp641449963-22] [INFO] [dku.format] - after stream totalComp=140763 totalUncomp=388560 totalRec=1809 [2020/08/24-09:03:00.594] [qtp641449963-22] [INFO] [dku.format] - Extractor run done, totalCompressed=140763 totalRecords=1809 [2020/08/24-09:03:00.595] [qtp641449963-22] [DEBUG] [dku.jobs] - Command /tintercom/datasets/read-data processed in 1585ms [2020/08/24-09:03:00.711] [null-err-37] [INFO] [dku.utils] - *************** Recipe code failed ************** [2020/08/24-09:03:00.712] [null-err-37] [INFO] [dku.utils] - Begin Python stack [2020/08/24-09:03:00.713] [null-err-37] [INFO] [dku.utils] - Traceback (most recent call last): [2020/08/24-09:03:00.714] [null-err-37] [INFO] [dku.utils] - File "/home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/python-exec-wrapper.py", line 206, in [2020/08/24-09:03:00.714] [null-err-37] [INFO] [dku.utils] - exec(f.read()) [2020/08/24-09:03:00.715] [null-err-37] [INFO] [dku.utils] - File "", line 172, in [2020/08/24-09:03:00.716] [null-err-37] [INFO] [dku.utils] - File "", line 154, in main [2020/08/24-09:03:00.717] [null-err-37] [INFO] [dku.utils] - IndexError: list index out of range [2020/08/24-09:03:00.717] [null-err-37] [INFO] [dku.utils] - End Python stack [2020/08/24-09:03:00.718] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:03:00,702 INFO Check if spark is available [2020/08/24-09:03:00.718] [null-err-37] [INFO] [dku.utils] - 2020-08-24 09:03:00,703 INFO Not stopping a spark context: No module named pyspark.context [2020/08/24-09:03:01.065] [FRT-34-FlowRunnable] [WARN] [dku.resource] act.compute_geo_encoded_NP - stat file for pid 2037 does not exist. Process died? [2020/08/24-09:03:01.066] [FRT-34-FlowRunnable] [INFO] [dku.resourceusage] act.compute_geo_encoded_NP - Reporting completion of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","activityId":"compute_geo_encoded_NP","activityType":"recipe","recipeType":"CustomCode_forward_geocoding","recipeName":"compute_geo_encoded"},"type":"LOCAL_PROCESS","id":"QMODTbTlBL6a8l0R","startTime":1598259771721,"localProcess":{"pid":2037,"commandName":"/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python","cpuUserTimeMS":2280,"cpuSystemTimeMS":380,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":2660,"cpuCurrent":0.5267342476644802,"vmSizeMB":412,"vmRSSMB":55,"vmHWMMB":55,"vmRSSAnonMB":44,"vmDataMB":42,"vmSizePeakMB":412,"vmRSSPeakMB":55,"vmRSSTotalMBS":275,"majorFaults":53,"childrenMajorFaults":0}} [2020/08/24-09:03:01.068] [FRT-34-FlowRunnable] [INFO] [dku.usage.computeresource.jek] act.compute_geo_encoded_NP - Reporting completion of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_CHURN","jobId":"Build_geo_encoded_2020-08-24T09-02-45.809","activityId":"compute_geo_encoded_NP","activityType":"recipe","recipeType":"CustomCode_forward_geocoding","recipeName":"compute_geo_encoded"},"type":"LOCAL_PROCESS","id":"QMODTbTlBL6a8l0R","startTime":1598259771721,"endTime":1598259781067,"localProcess":{"pid":2037,"commandName":"/home/dataiku/dss/code-envs/python/plugin_geocoder_managed/bin/python","cpuUserTimeMS":2280,"cpuSystemTimeMS":380,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":2660,"cpuCurrent":0.5267342476644802,"vmSizeMB":412,"vmRSSMB":55,"vmHWMMB":55,"vmRSSAnonMB":44,"vmDataMB":42,"vmSizePeakMB":412,"vmRSSPeakMB":55,"vmRSSTotalMBS":275,"majorFaults":53,"childrenMajorFaults":0}} [2020/08/24-09:03:01.069] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Error file found, trying to throw it: /home/dataiku/dss/jobs/DKU_CHURN/Build_geo_encoded_2020-08-24T09-02-45.809/compute_geo_encoded_NP/custom-python-recipe/pyoutENPq611rFqhP/error.json [2020/08/24-09:03:01.071] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Raw error is{"errorType":"\u003ctype \u0027exceptions.IndexError\u0027\u003e","message":"list index out of range","detailedMessage":"At line 172: \u003ctype \u0027exceptions.IndexError\u0027\u003e: list index out of range","stackTrace":[]} [2020/08/24-09:03:01.072] [FRT-34-FlowRunnable] [INFO] [dku.recipes.code.base] act.compute_geo_encoded_NP - Now err: {"errorType":"\u003ctype \u0027exceptions.IndexError\u0027\u003e","message":"Error in Python process: list index out of range","detailedMessage":"Error in Python process: At line 172: \u003ctype \u0027exceptions.IndexError\u0027\u003e: list index out of range","stackTrace":[]} [2020/08/24-09:03:01.121] [FRT-34-FlowRunnable] [INFO] [dku.flow.activity] act.compute_geo_encoded_NP - Run thread failed for activity compute_geo_encoded_NP com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 172: : list index out of range at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221) at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186) at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103) at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48) at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71) at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374) [2020/08/24-09:03:01.125] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - activity is finished [2020/08/24-09:03:01.138] [ActivityExecutor-29] [ERROR] [dku.flow.activity] running compute_geo_encoded_NP - Activity failed com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 172: : list index out of range at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:221) at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:186) at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:103) at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48) at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71) at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374) [2020/08/24-09:03:01.138] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Executing default post-activity lifecycle hook [2020/08/24-09:03:01.142] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Removing samples for DKU_CHURN.geo_encoded [2020/08/24-09:03:01.162] [ActivityExecutor-29] [INFO] [dku.flow.activity] running compute_geo_encoded_NP - Done post-activity tasks [2020/08/24-09:03:01.384] [ActivityExecutor-31] [DEBUG] [dku.flow.jobrunner] - No activity is running, and none is available --> End of ActivityExecutorThread [2020/08/24-09:03:01.385] [ActivityExecutor-32] [DEBUG] [dku.flow.jobrunner] - No activity is running, and none is available --> End of ActivityExecutorThread [2020/08/24-09:03:01.386] [ActivityExecutor-30] [DEBUG] [dku.flow.jobrunner] - No activity is running, and none is available --> End of ActivityExecutorThread [2020/08/24-09:03:01.386] [ActivityExecutor-33] [DEBUG] [dku.flow.jobrunner] - No activity is running, and none is available --> End of ActivityExecutorThread [2020/08/24-09:03:01.545] [ActivityExecutor-29] [DEBUG] [dku.flow.jobrunner] - No activity is running, and none is available --> End of ActivityExecutorThread [2020/08/24-09:03:01.546] [qtp641449963-23] [INFO] [dku.flow.jobrunner] - All activity executor threads joined, end of job [2020/08/24-09:03:01.547] [qtp641449963-23] [INFO] [dku.flow.jobrunner] - JOB IS COMPLETE [2020/08/24-09:03:01.554] [qtp641449963-23] [DEBUG] [dku.jobs] - Command /pintercom/run_fully processed in 11612ms [2020/08/24-09:03:01.640] [qtp641449963-19] [INFO] [dku.jobs] - Connects using Shared secret [2020/08/24-09:03:01.641] [qtp641449963-19] [DEBUG] [dku.jobs] - Received command : /pintercom/stop_session [2020/08/24-09:03:01.646] [qtp641449963-19] [DEBUG] [dku.jobs] - Command /pintercom/stop_session processed in 6ms