"Argument list too long" error that occurs regardless of the recipe type

UserBird Dataiker, Alpha Tester Posts: 535 Dataiker

I have a Filesystem datasource that contains thousands of folders, and each folder contains a list of comma-separated files. Each file in a directory has a different schema, and a file-name pattern is used to create partitioned data sources using the following format:


This creates a datasource from all the files whose names start with KEY. That part works as expected. My problem is that I can't run any recipe against that data source. I tried Python, shell, and sync recipes, and all of them failed with the same error:

at java.lang.ProcessBuilder.start(
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$
Caused by: error=7, Argument list too long
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(
at java.lang.ProcessImpl.start(
at java.lang.ProcessBuilder.start(
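
For reference, `error=7` in the trace is the errno E2BIG ("Argument list too long"), which `execve` returns when the combined size of the arguments and environment passed to a new process exceeds the kernel limit. The sketch below is not part of the recipe; it assumes a Linux or other POSIX host and only prints that limit and how much of it the current environment already uses. `arg_budget` is a hypothetical helper name, and whether DSS passes the file or partition list through arguments or environment variables is an assumption, not something the trace confirms.

```python
import errno
import os


def arg_budget(env=None):
    """Return (limit, used) in bytes for the exec argument/environment area.

    limit: the system's ARG_MAX as reported by sysconf.
    used:  approximate size of the environment block, counting each
           "KEY=VALUE" entry plus its terminating NUL byte.
    """
    env = os.environ if env is None else env
    limit = os.sysconf("SC_ARG_MAX")
    used = sum(
        len(os.fsencode(key)) + len(os.fsencode(value)) + 2  # '=' and NUL
        for key, value in env.items()
    )
    return limit, used


limit, used = arg_budget()
print("errno 7 is", errno.errorcode[7], "->", os.strerror(errno.E2BIG))
print("ARG_MAX (bytes):", limit)
print("environment size (bytes):", used)
```

Running this on the DSS host, as the user that runs DSS, shows how much room is left before process creation fails with the same error.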

My current recipe is in Python, and the code is:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Recipe inputs
events_CSV = dataiku.Dataset("KEY_CSV")
events_CSV_df = events_CSV.get_dataframe()

# Recipe outputs
events_ORC = dataiku.Dataset("KEY_ORC")


The job fails before printing "Here".

These are the DSS instance settings:

{u'dipInstanceId': u'8bu1n1os-203c299d56c99ef078a53a1a81b6ea23-c60f6bab8e57ecd615a8ec240207f819', u'features': {u'TWITTER': {}, u'HADOOP': {}, u'HIVE': {}, u'PIG': {}, u'R': {}, u'SPARK': {}}, u'devInstance': False, u'distribVersion': u'7.3', u'debug': False, u'version': {u'product_commitid': u'', u'conf_version': u'16', u'product_version': u'4.0.5'}, u'distrib': u'redhat'}

