Setup & Configuration

Sort by:

1 - 10 of 24

Conversion to Parquet fails in Hadoop HDFS
$ hadoop version Hadoop 3.1.2 Source code repository https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a Compiled by sunilg on 2019-01-29T01:39Z Compiled with protoc 2.5.0…
Answered ✓
Hadoop
Started by Benoni
Most recent by Azhar
Jul 23, 2024
0
5
Solution by Clément_Stenac
Hi,

Dataiku does not support "home made" Hadoop distributions.

You may have some success by editing the "bin/env-hadoop.sh" file, locating the "DKU_HIVE_CP" line, and adding at the end (within the quotes):

:$DKUINSTALLDIR/lib/ivy/parquet-run/*

Then restart DSS
Reply to Discussion
No connection defined to upload files/jars
I am trying to execute a PySpark recipe on a remote AWS EMR Spark cluster and I am getting: Your Spark settings don't define a temporary storage for yarn-cluster modein act.compute_prepdataset1_NP: No…
Answered ✓
Hadoop
Spark
AWS
Started by J. Daniel
Most recent by J. Daniel
Jun 7, 2024
0
2
Solution by J. Daniel
All set! I had the wrong jars.

DSS is working perfectly with AWS EMR v6.15.0 and the very latest v7.1.0
What an amazing product! I can not say enough good things about the people that developed and continue to develop it.
Reply to Discussion
HDFS - Force Parquet as default settings for recipe output
Greetings ! I'm currently on a platform with Dataiku 11.3.1 and writing datasets on HDFS. IT requires all dataset to be written in Parquet, but the default setting is on CSV (Hive) and it can generate…
Answered ✓
Hadoop
lex
Ignition
Started by Charly
Most recent by Charly
May 21, 2024
0
2
Solution by
Reply to Discussion
Enabling parquet format in Dataiku DSS
Hi Currently when we write into Dataiku file system we only csv and avro format. How can I enable parque format in Dataiku DSS running on linux platform on EC2 instance. I need steps for that. Also we…
Answered ✓
Linux
Hadoop
Started by Ankur30
Most recent by somepunter
Apr 14, 2023
1
3
Solution by
Reply to Discussion
Permission Denied Installing Standalone Hadoop Integration
I am trying to install the standalone hadoop integration for Dataiku. My Dataiku instance is hosted on a linux server and when I follow the directions for standalone installation here (Setting up Hado…
Answered ✓
Hadoop
Ignition
Started by ryanraasch
Most recent by ryanraasch
Mar 21, 2023
0
3
Solution by
Reply to Discussion
How to add a file to the Resources directory so that it is accessible at runtime
How can I quickly update the code environment, upload a zipped certificate file to the resources directory, and then make the certificate file accessible during runtime? I upload the file, modify the …
Question
Backup
Hadoop
Started by btsshirt
Most recent by clayms
Mar 8, 2023
0
5
Last answer by
Reply to Discussion
Accessing Spark web UI
Hello, I am a beginner in Spark and I am trying to setup Spark on our Kubernetes cluster. The cluster is now working and I can run Spark jobs; however, I want to access Spark web UI to inspect how my …
Answered ✓
Hadoop
Resource control
Spark
Started by ObadaJab
Most recent by Omar
May 13, 2021
1
1
Solution by
Reply to Discussion
NoClassDefFoundError when reading a parquet file
I have setup an HDFS connection to access a Google Cloud Storage bucket on which I have parquet files. After adding GoogleHadoopFileSystem to the hadoop configuration I can access the bucket and files…
Question
Hadoop
Spark
Started by phildav
Most recent by tstaig
Nov 24, 2020
0
3
Last answer by
Reply to Discussion
Spark and HDFS Integration
Hi All, Thanks for great support. I would like to install DSS in different server where Spark and HDFS are not exist. is it possible to integrate DSS with remote server where spark and hdfs are exist …
Question
Hadoop
Spark
Virtual machine
Started by Bader
Most recent by Bader
Jun 16, 2020
0
2
Last answer by
Reply to Discussion
Spark on Kubernetes
In Install guide, it talks about Spark on Hadoop-YARN. Does DataIKU support spark on kubernetes cluster? if yes, then is hadoop install required? . Please link any doc or article in this matter if ava…
Question
Hadoop
Spark
Kubernetes
Started by sjacob
Most recent by Omar
Mar 12, 2019
0
1
Last answer by
Reply to Discussion

1 - 10 of 241

Trending Discussions

Custom script for Container execution to push the base images
Answered
1
How to connect to shared drive to extract files dynamically from there
Answered
1
Dataiku Free Edition "Network Error"
Answered
1

Leaderboard

Member	Points
Turribeach	3689
tgb417	2513
Ignacio_Toledo	1082

Setup & Configuration

Top Tags

Trending Discussions

Leaderboard