Google Sheets Plugin import bug

Eldiias
Level 1

Hi there!

I encountered a bug (or possibly a hardcoded limitation) while using the Google Sheets plugin. I have the following flow:

  1. A dataset is transformed and stored in GSheets.
  2. Another dataset is transformed, then the dataset from GSheets is loaded and appended to it. The resulting dataset is stored as a Dataiku dataset.

Sounds simple, but it didn't work. After a careful review of the flow, I found the following:

  1. The first dataset has long column names. There is no limitation on column name length, so that is not a problem.
  2. The dataset in GSheets is OK; the column names are correct.
  3. While importing the dataset from GSheets, the column names are cropped. So, instead of total_transport_cost_per_ton, I get total_transport_cost_per.

I believe the issue is related to the plugin, though I haven't found any similar report yet.

I could certainly just rename the columns (since I have the same ones in the second dataset), but I would love to retrieve the column names correctly.
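For reference, the renaming workaround mentioned above could look roughly like the sketch below. It assumes the full column names are available from the second dataset and that each cropped name is a prefix of exactly one of them; the function name `restore_column_names` and the sample column `region` are made up for illustration.

```python
def restore_column_names(truncated_names, full_names):
    """Map each cropped name to the single full name it is a prefix of."""
    mapping = {}
    for short in truncated_names:
        if short in full_names:
            # Name was not cropped at all: keep it unchanged.
            mapping[short] = short
            continue
        candidates = [full for full in full_names if full.startswith(short)]
        # Rename only when the match is unambiguous; otherwise keep the cropped name.
        mapping[short] = candidates[0] if len(candidates) == 1 else short
    return mapping


# Hypothetical column lists: reference names from the second dataset,
# imported names as they come back from the GSheets dataset.
reference = ["total_transport_cost_per_ton", "region"]
imported = ["total_transport_cost_per", "region"]
renames = restore_column_names(imported, reference)
```

With pandas, the resulting dictionary can be passed to `df.rename(columns=renames)`. Ambiguous prefixes are deliberately left untouched so that nothing is renamed to the wrong column silently.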

HarizoR
Developer Advocate

Hi Eldiias,

After reviewing the plugin's code, it appears that your issue is indeed due to a hardcoded limit of 25 characters on the slugified column names (see here: https://github.com/dataiku/dataiku-contrib/blob/master/googlesheets/python-connectors/googlesheets-s...). In your case, the easiest workaround is to switch the plugin to development mode and edit that value manually.
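To illustrate the effect, here is a minimal sketch of how a 25-character cap could produce the cropped name from the original post. This is not the plugin's actual code: the function `truncate_like_plugin` is hypothetical, and stripping the trailing separator after the cut is an assumption that happens to match the observed 24-character result.

```python
MAX_LEN = 25  # the hardcoded limit mentioned above


def truncate_like_plugin(name, max_len=MAX_LEN):
    """Hypothetical stand-in for the plugin's truncation step."""
    # Assumption: the name is cut at max_len characters, then any
    # trailing "_" left over from the cut is dropped.
    return name[:max_len].rstrip("_")


print(truncate_like_plugin("total_transport_cost_per_ton"))
# prints: total_transport_cost_per
```

Cutting at 25 characters leaves `total_transport_cost_per_`, and removing the dangling underscore gives the name seen after import.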

Hope this helps!

Best,

Harizo

tgb417

@HarizoR 

Is there any likelihood that this will be updated to allow longer column names? I have also run into this limitation.

I work at a small non-profit with limited software development skills. We would prefer not to turn a Dataiku plugin into a local version that we would then have to keep supporting. What is the likelihood that this will be enhanced to accept much longer column names, say at least as long as a PostgreSQL server accepts, which is 63 characters by default?

--Tom
Eldiias
Level 1
Author

Hey Tom!

I would suggest using a different flow:

  1. Store the credentials of your Google user account in a Dataiku folder.
  2. Import and export data to and from GSheets using a Python recipe. This is more robust, and if you need to update the credentials, you only need to do it once in the folder instead of in every single GSheets dataset.
  3. In Python, you can use the gspread library. Its API is quite simple and makes for a smooth workflow.
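The steps above could be sketched roughly as follows. This is only an outline under several assumptions: the credentials are a service-account JSON key (the post only says "credentials", so this is a guess), the folder, file, spreadsheet and worksheet names are placeholders, and the helper names are made up. The `dataiku` and `gspread` imports sit inside the functions so the pure-Python part can be tried without either package installed.

```python
import json


def rows_to_records(values):
    """Turn a list of rows (header row first) into a list of dicts."""
    if not values:
        return []
    header, *rows = values
    return [dict(zip(header, row)) for row in rows]


def load_credentials_from_folder(folder_name, file_name):
    """Read a JSON key file from a Dataiku managed folder (placeholder names)."""
    import dataiku  # only available inside a Dataiku DSS environment

    with dataiku.Folder(folder_name).get_download_stream(file_name) as stream:
        return json.loads(stream.read())


def open_worksheet(credentials, spreadsheet_key, worksheet_name):
    """Authenticate with gspread and return one worksheet."""
    import gspread  # third-party: pip install gspread

    # Assumption: `credentials` is a service-account key loaded as a dict.
    client = gspread.service_account_from_dict(credentials)
    return client.open_by_key(spreadsheet_key).worksheet(worksheet_name)


def read_sheet(worksheet):
    """Read all cells; the header row is used as-is for the record keys."""
    return rows_to_records(worksheet.get_all_values())
```

In a Python recipe these would be called in sequence, for example `read_sheet(open_worksheet(load_credentials_from_folder(...), ...))`, and the records written to the output dataset. Because the header row is read as plain cell values, the long column names should come through without being cropped.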

Best!

Eldiias
