How can I write a Parquet file to MinIO?
Hi,
I am using Dataiku version 12.5.2. I am trying to write a simple CSV file to MinIO as a Parquet file. For this, I have successfully installed the Hadoop integration with the following steps:
$ cd DATADIR
$ ./bin/dss stop
$ ./bin/dssadmin install-hadoop-integration -standalone generic-hadoop3 -standaloneArchive /data/dataiku-dss-hadoop-standalone-libs-generic-hadoop3-12.5.2.tar.gz
$ ./bin/dss start
However, when I try to write a CSV to MinIO as a Parquet file using a Sync recipe, I get the following error:
java.lang.NoSuchMethodError: 'void org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(java.util.concurrent.ExecutorService, int, boolean)'
How can I overcome this problem?
Regards,
Oguz
Operating system used: Linux Ubuntu 22.04
Best Answer
Hi,
We solved the problem with Dataiku support. The cause was a conflict with the Phoenix JARs, so moving them to another directory fixed it.
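For anyone hitting a similar NoSuchMethodError, here is a rough sketch of how one might look for JARs that carry their own copy of the class named in the error. It is not the procedure Dataiku support used, only a heuristic: it relies on ZIP archives storing entry names uncompressed, so a fixed-string search over the raw JAR bytes can spot the class path, including shaded copies nested under another package prefix. The function name find_class_jars and the directory you pass to it are my own placeholders.

```shell
#!/bin/sh
# Hypothetical helper: list the JAR files under a directory that contain
# an entry for the given fully qualified class name.
# Heuristic only: ZIP entry names are stored uncompressed, so a plain
# fixed-string search over the raw bytes of each JAR can find them.
find_class_jars() {
    search_dir=$1
    class_name=$2
    # Turn a.b.C into the archive entry path a/b/C.class
    entry=$(printf '%s' "$class_name" | tr '.' '/').class
    # grep -l prints only the names of the files that match
    find "$search_dir" -type f -name '*.jar' \
        -exec grep -l -F -- "$entry" {} + | sort
}
```

It could be called as: find_class_jars /path/to/jars org.apache.hadoop.util.SemaphoredDelegatingExecutor, where /path/to/jars is whichever directory you suspect. If more than one JAR is listed, they may be shipping different versions of the same class, which is the kind of conflict described above.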
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @oguzselvi,
Could you please open a support ticket with the instance diagnostics so we can check whether the standalone Hadoop integration ran successfully?
Thanks