Unterminated quoted field at the end of the file
After our recent update to Dataiku 12, our data has been behaving oddly in Dataiku.
We encountered this problem: https://doc.dataiku.com/dss/12/troubleshooting/errors/ERR_DATASET_CSV_UNTERMINATED_QUOTE.html
Before we changed the quoting style from 'Excel' (the default) to 'Unix', the job wouldn't finish at all.
Now it finishes, but I seem to lose 95% of the original rows for some reason. The Prepare recipe contains only a "Remove 14 columns" step, which shouldn't touch the rows at all. My coworker sees the same results with the same fix (from the docs), which she applied to a filtering recipe after hitting this issue of Spark aborting the job.
We have been working with this data for a long time and we never alter it (it's clean data when we receive it, bought from another company) except for filtering or removing columns. We never encountered any problems with it before the update.
Answers
tgb417
I don't have a good answer for you. However, this does sound like something that the Dataiku support team can help with.
I invite you to consider opening a support ticket on this issue. This link should get you to a support form.
https://support.dataiku.com/support/home
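In the meantime, a quick check outside DSS might help narrow things down. The sketch below is only an illustration using Python's standard csv module, not Dataiku's own parser, so DSS may behave differently. It shows how a single unbalanced quote can make a quote-aware parser swallow every following line into one field, which is one possible explanation for a large drop in row count. The sample data and the helper name count_records are made up for the example.

```python
import csv
import io


def count_records(text, **reader_kwargs):
    """Count the records Python's csv module parses from `text` (hypothetical helper)."""
    return sum(1 for _ in csv.reader(io.StringIO(text, newline=""), **reader_kwargs))


# Made-up sample: the row with id 2 contains a stray, unbalanced double quote.
sample = 'id,name\n1,alpha\n2,"beta\n3,gamma\n4,delta\n'

# Physical lines in the file (header included).
physical_lines = sample.count("\n")

# Quote-aware parsing: the stray quote opens a field that runs to the end of
# the data, so the remaining lines are absorbed into a single record.
quoted = count_records(sample)

# Quotes treated as ordinary characters: one record per physical line.
unquoted = count_records(sample, quoting=csv.QUOTE_NONE)

print(physical_lines, quoted, unquoted)
```

If a comparison like this on an extract of your file shows far fewer quote-aware records than physical lines, locating the first line where the two counts diverge would be useful detail to include in the support ticket.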
Hope this helps.