Shell script teachable churn prediction

larispardo
Level 3

Hello, I am trying to replicate the churn prediction case from the teachable.dataiku website, and I receive the following error:




[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils] - /home/dataiku/dss/pyenv/lib/python2.7/site-packages/unidecode/__init__.py:46: RuntimeWarning: Argument <type 'str'> is not an unicode object. Passing an encoded string will likely have unexpected results.
[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils] - _warn_if_not_unicode(string)
[2017/04/26-19:02:26.340] [Exec-38] [INFO] [dku.utils] - Traceback (most recent call last):
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - File "/home/dataiku/dss/lib/python/vw_transformer.py", line 99, in <module>
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - sys.stdout.write(vw_record + "\n")
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils] - IOError: [Errno 32] Broken pipe
[2017/04/26-19:02:26.457] [Thread-23] [ERROR] [dku.flow.shell] - Error while sending input to script
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
at java.io.BufferedWriter.write(BufferedWriter.java:230)
at java.io.Writer.write(Writer.java:157)
at java.io.Writer.append(Writer.java:227)
at com.dataiku.dip.output.CSVOutputFormatter.appendExcelStyle(CSVOutputFormatter.java:109)
at com.dataiku.dip.output.CSVOutputFormatter.appendFieldToLine(CSVOutputFormatter.java:198)
at com.dataiku.dip.output.CSVOutputFormatter.format(CSVOutputFormatter.java:183)
at com.dataiku.dip.output.StringOutputFormatter.format(StringOutputFormatter.java:33)
at com.dataiku.dip.output.OutputStreamOutputWriter.emitRow(OutputStreamOutputWriter.java:32)
at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:366)
at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:161)
at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:135)
at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:176)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:226)
at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:64)
at com.dataiku.dip.recipes.code.shell.ShellScriptRecipeRunner$PipeInThread.run(ShellScriptRecipeRunner.java:220)
[2017/04/26-19:02:26.459] [Thread-23] [INFO] [dku.flow.shell] - Closing the script input
[2017/04/26-19:02:26.463] [FRT-35-FlowRunnable] [INFO] [dku.flow.activity] - Run thread failed for activity compute_PG83dheF_NP
com.dataiku.dip.exceptions.ProcessDiedException: The shell process failed (exit code: 127). More info might be available in the logs.


The job seems to enter my Python code, but the code does not seem to send the data back. I might be wrong.



Any help will be greatly appreciated.

4 Replies
Thomas
Dataiker
Hi - did you set something in the "Pipe in" or "Pipe out" dropdown menus? Both need to be set to "--nothing--", as the Python script takes care of reading the input data directly.
larispardo
Level 3
Author
Yes, I have tried every combination, with and without something set in "Pipe in" and "Pipe out", and I always receive the same error 127. I also put a database in "Pipe in" and changed the value so that it does not go through the Python script, and received the same error. Here is the log:
[2017/04/26-21:13:03.509] [Exec-37] [INFO] [dku.utils] - State Account_Length Area_Code Phone Intl_Plan VMail_Plan VMail_Message Day_Mins Day_Calls Day_Charge Eve_Mins Eve_Calls Eve_Charge Night_Mins Night_Calls Night_Charge Intl_Mins Intl_Calls Intl_Charge CustServ_Calls Churn splitter
[2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils] - ^
[2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils] - SyntaxError: invalid syntax
[2017/04/26-21:13:03.544] [Exec-37] [INFO] [dku.utils] - /home/dataiku/dss/jobs/CHURNPREDICTION/Build_model_vw_2017-04-26T21-13-01.235/compute_PG83dheF_NP/shelljUDVJ4ANXQzQ/script.sh: line 36: --dataset=train: command not found
[2017/04/26-21:13:03.546] [Thread-23] [ERROR] [dku.flow.shell] - Error while sending input to script
java.io.IOException: Broken pipe
Thomas
Dataiker
Ok
Also, could you please run "vw --version" in a terminal on the server hosting DSS, and share the output?
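If running the command by hand is awkward, the same check can be sketched in Python. Exit code 127 is what a POSIX shell conventionally returns when a command is not found, so the sketch below only tests whether an executable named "vw" is on the PATH and, if so, prints what "vw --version" reports. The helper name check_command is made up for illustration, and the sketch assumes Python 3.7+ run on the DSS server, outside DSS itself (no dataiku package is needed).

```python
import shutil
import subprocess


def check_command(name):
    """Return (path, version_output) for an executable on PATH.

    Returns (None, None) when the executable cannot be found, which is
    the situation a shell reports with exit code 127.
    """
    path = shutil.which(name)
    if path is None:
        return None, None
    # Some tools print their version on stderr, so fall back to it.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    path, version = check_command("vw")
    if path is None:
        print("vw was not found on the PATH")
    else:
        print("vw found at", path)
        print("reported version:", version)
```

Note that the PATH seen by this script may differ from the one seen by the DSS process, so it should be run as the same user that runs DSS.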
larispardo
Level 3
Author
Okay, I am not sure I understood. I searched for the command and did not find it, so I checked my versions, and they are the latest of version 8. I also tried to run my Python code, but I do not have the dataiku package in my Python folder, so I can only use it through Dataiku, not in the terminal.