Parquet datasets no longer viewable

Solved!
ryanraasch
Level 2

I have Parquet datasets as part of my flow, but I can no longer view them. When I try to view them, I get either a "root path does not exist" error or "org/apache/hadoop/conf/Configuration, caused by: ClassNotFoundException: org.apache.hadoop.conf.Configuration". Nothing has changed except that we upgraded our Dataiku instance from version 10 to 11. Does something need to be installed when upgrading to version 11 in order to use the Parquet format? If so, could you provide instructions on how to do that?

Thanks in advance!


Operating system used: Linux



1 Solution
SarinaS
Dataiker

Hi @ryanraasch,

I think you are running into the issue described here: https://doc.dataiku.com/dss/latest/troubleshooting/problems/no-class-def-found.html 

Note that the Hadoop integration must be re-run after each DSS upgrade, so it most likely has not been run since you upgraded from version 10 to 11. Re-running your Hadoop integration should resolve the issue for you.
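
As a rough illustration of what re-running the integration can look like, here is a minimal Python sketch that builds the stop / re-integrate / start sequence and prints it. It assumes the `dss` and `dssadmin` scripts in the `bin` folder of the data directory and the `install-hadoop-integration` subcommand, as I understand them from the DSS documentation; please confirm against the page linked above. The data directory `/data/dss` and both function names are hypothetical, and a setup that uses standalone Hadoop libraries may need extra arguments that are not shown here.

```python
import subprocess
from pathlib import PurePosixPath


def hadoop_reintegration_commands(data_dir):
    """Build the commands to re-run the Hadoop integration.

    Assumption: the DSS data directory contains bin/dss and bin/dssadmin,
    and dssadmin accepts the install-hadoop-integration subcommand.
    """
    bin_dir = PurePosixPath(data_dir) / "bin"
    return [
        [str(bin_dir / "dss"), "stop"],                              # stop DSS first
        [str(bin_dir / "dssadmin"), "install-hadoop-integration"],  # re-run the integration
        [str(bin_dir / "dss"), "start"],                             # start DSS again
    ]


def run_reintegration(data_dir, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in hadoop_reintegration_commands(data_dir):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence if any step fails
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical data directory; replace with your own.
    run_reintegration("/data/dss", dry_run=True)
```

With `dry_run=True` the script only prints the commands, so you can review them before running anything as the DSS service user.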

Thanks,
Sarina 
