Shell script teachable churn prediction

Registered Posts: 28 ✭✭✭✭✭

Hello, I am trying to replicate the churn prediction case from the teachable.dataiku website, and I get the following error:


[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils] - /home/dataiku/dss/pyenv/lib/python2.7/site-packages/unidecode/__init__.py:46: RuntimeWarning: Argument <type 'str'> is not an unicode object. Passing an encoded string will likely have unexpected results.
[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils] - _warn_if_not_unicode(string)
[2017/04/26-19:02:26.340] [Exec-38] [INFO] [dku.utils] - Traceback (most recent call last):
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - File "/home/dataiku/dss/lib/python/vw_transformer.py", line 99, in <module>
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - sys.stdout.write(vw_record + "\n")
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - IOError: [Errno 32] Broken pipe
[2017/04/26-19:02:26.457] [Thread-23] [ERROR] [dku.flow.shell] - Error while sending input to script
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
at java.io.BufferedWriter.write(BufferedWriter.java:230)
at java.io.Writer.write(Writer.java:157)
at java.io.Writer.append(Writer.java:227)
at com.dataiku.dip.output.CSVOutputFormatter.appendExcelStyle(CSVOutputFormatter.java:109)
at com.dataiku.dip.output.CSVOutputFormatter.appendFieldToLine(CSVOutputFormatter.java:198)
at com.dataiku.dip.output.CSVOutputFormatter.format(CSVOutputFormatter.java:183)
at com.dataiku.dip.output.StringOutputFormatter.format(StringOutputFormatter.java:33)
at com.dataiku.dip.output.OutputStreamOutputWriter.emitRow(OutputStreamOutputWriter.java:32)
at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:366)
at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:161)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:135)
at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:176)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:226)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:64)
at com.dataiku.dip.recipes.code.shell.ShellScriptRecipeRunner$PipeInThread.run(ShellScriptRecipeRunner.java:220)
[2017/04/26-19:02:26.459] [Thread-23] [INFO] [dku.flow.shell] - Closing the script input
[2017/04/26-19:02:26.463] [FRT-35-FlowRunnable] [INFO] [dku.flow.activity] - Run thread failed for activity compute_PG83dheF_NP
com.dataiku.dip.exceptions.ProcessDiedException: The shell process failed (exit code: 127). More info might be available in the logs.

The job seems to reach my Python code, but the code does not send anything back. I might be wrong.

Any help will be greatly appreciated.


Answers

  • Dataiker Alumni Posts: 19 ✭✭✭✭✭
    Hi, did you set anything in the "Pipe in" or "Pipe out" dropdown menus? Both need to be set to "--nothing--", as the Python script takes care of reading the input data directly.
  • Registered Posts: 28 ✭✭✭✭✭
    Yes, I have tried every combination, with and without something in Pipe in and Pipe out, and I always get the same error 127. I put a database in Pipe in and changed the value so that it does not go through the Python script, and received the same error. Here is the log:
    [2017/04/26-21:13:03.509] [Exec-37] [INFO] [dku.utils] - State Account_Length Area_Code Phone Intl_Plan VMail_Plan VMail_Message Day_Mins Day_Calls Day_Charge Eve_Mins Eve_Calls Eve_Charge Night_Mins Night_Calls Night_Charge Intl_Mins Intl_Calls Intl_Charge CustServ_Calls Churn splitter
    [2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils] - ^
    [2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils] - SyntaxError: invalid syntax
    [2017/04/26-21:13:03.544] [Exec-37] [INFO] [dku.utils] - /home/dataiku/dss/jobs/CHURNPREDICTION/Build_model_vw_2017-04-26T21-13-01.235/compute_PG83dheF_NP/shelljUDVJ4ANXQzQ/script.sh: line 36: --dataset=train: command not found
    [2017/04/26-21:13:03.546] [Thread-23] [ERROR] [dku.flow.shell] - Error while sending input to script
    java.io.IOException: Broken pipe
  • Dataiker Alumni Posts: 19 ✭✭✭✭✭
    OK.
    Also, could you please run "vw --version" in a terminal on the server hosting DSS and post the output?
  • Registered Posts: 28 ✭✭✭✭✭
    Okay, I am not sure I understood. I searched for the command and did not find it, so I checked my versions, and they are the latest of version 8. I also tried to run my Python code directly, but I do not have the dataiku package in my Python folder, so I can only use it through Dataiku, not in the terminal.
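
For context on the "vw --version" request: exit code 127 is the status a POSIX shell conventionally returns when it cannot find a command, which is consistent with the "command not found" line in the second log. The snippet below is a minimal sketch for checking this from Python 3 on the server hosting DSS. The helper names `find_command` and `shell_exit_code` are hypothetical and are not part of the tutorial or of DSS, and the PATH seen in a terminal may differ from the one the DSS job uses.

```python
import shutil
import subprocess


def find_command(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def shell_exit_code(command_line):
    """Run `command_line` through sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command_line],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # "vw" is the command the reply above asks to run.
    path = find_command("vw")
    if path is None:
        print("vw is not on PATH; a shell script calling it would exit with 127")
    else:
        print("vw found at", path)
        print("exit status of 'vw --version':", shell_exit_code("vw --version"))
```

If `find_command("vw")` returns None for the account that runs DSS, the shell recipe cannot start the command it needs, whatever the Pipe in and Pipe out settings are.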
