DSS type geopoint is not supported by Parquet writer

EllaHuang
Level 1

In the Fuzzy Join Recipe tutorial, when I ran the Fuzzy Join recipe, it failed to build the output dataset and returned the error message 'DSS type geopoint is not supported by Parquet writer'.

 

 

1 Reply
Redouxne
Level 2

Hi Ella Huang,

The error message 'DSS type geopoint is not supported by Parquet writer' most likely means that the recipe's output dataset is stored in Parquet format and contains a column of the geopoint type, which the writer cannot handle.

In Dataiku, a geopoint column contains the coordinates of a single point, expressed in WKT (Well-Known Text) format. The error indicates that the Parquet writer in Dataiku does not support this data type.

Parquet is a column-oriented file format from the Hadoop ecosystem, often used for efficient storage and processing of large datasets. It offers advantages such as block-based compression and the ability to push down filtering predicates, but when written from Dataiku DSS it appears not to accept certain column types, including geopoint.

To work around this issue, you could convert the geopoint data into a form that Parquet can store. Dataiku provides processors in the Prepare recipe to convert between a geopoint column and latitude/longitude columns.

If you convert the geopoint data into separate latitude and longitude columns, you may be able to avoid this limitation and run the Fuzzy Join recipe without the error.
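To illustrate what that conversion amounts to, here is a minimal Python sketch that splits a WKT point string into latitude and longitude values. It is plain Python rather than a Dataiku API call, and the function name `split_geopoint`, the sample rows, and the column names are assumptions made for the example. It relies on the WKT convention that a point is written as `POINT(x y)`, with x as longitude and y as latitude.

```python
import re
from typing import Optional, Tuple

# One decimal number, optionally signed, optionally with an exponent.
_NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# WKT point: "POINT(<x> <y>)", where x is longitude and y is latitude.
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(" + _NUMBER + r")\s+(" + _NUMBER + r")\s*\)\s*$",
    re.IGNORECASE,
)


def split_geopoint(wkt: Optional[str]) -> Tuple[Optional[float], Optional[float]]:
    """Return (latitude, longitude) for a WKT point, or (None, None) if it cannot be parsed."""
    if not wkt:
        return (None, None)
    match = _WKT_POINT.match(wkt)
    if match is None:
        return (None, None)
    longitude = float(match.group(1))
    latitude = float(match.group(2))
    return (latitude, longitude)


# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"name": "Paris", "geopoint": "POINT(2.3522 48.8566)"},
    {"name": "unknown", "geopoint": ""},
]

# Replace the geopoint column with two plain numeric columns.
converted = []
for row in rows:
    lat, lon = split_geopoint(row["geopoint"])
    converted.append({"name": row["name"], "latitude": lat, "longitude": lon})
```

Two plain numeric columns like these are ordinary doubles, which is why this form should be safe to write to Parquet. Inside DSS itself, the Prepare recipe processors mentioned above do the same job without any code.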

Here to help,

Redouane
