In Pandas, tables with hierarchical indexes are rendered gracefully, enabling users to quickly visualize their datasets in detail and perform mass transformations on data deep within hierarchies. In Dataiku, hierarchical data is currently displayed as a string. These strings expand on hover, but they are not formatted neatly and are often truncated.
Dataiku already has some fairly powerful processors for manipulating hierarchical data, but it lacks the visualization capability that would make these processors quickly useful for exploration and preparation.
I think Dataiku would improve tremendously if it incorporated some way to visualize hierarchical data. Pandas has a couple of styles for this:
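For readers who have not seen the Pandas behavior, here is a minimal sketch of what I mean. The data, level names, and variable names are made up for illustration. It builds a small table with hierarchical row and column labels, prints it the way Pandas renders it, and then reaches into the hierarchy with `xs` and `groupby`.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("EU", "FR"), ("EU", "DE"), ("NA", "US")],
    names=["region", "country"],
)
columns = pd.MultiIndex.from_product(
    [["sales", "returns"], ["q1", "q2"]],
    names=["metric", "quarter"],
)
df = pd.DataFrame(
    [[10, 12, 1, 2], [20, 18, 3, 1], [30, 35, 4, 5]],
    index=index,
    columns=columns,
)

# Pandas prints the nested row and column labels as grouped headers.
print(df)

# Select one inner level across every outer group (all "q1" columns).
q1_only = df.xs("q1", axis=1, level="quarter")

# Aggregate over an outer row level without flattening anything first.
by_region = df.groupby(level="region").sum()
print(by_region)
```

The point is less the specific calls and more that the nested labels stay visible and addressable the whole time.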
There are some other styles for displaying this type of data as well:
Even simple rendering of JSON objects on hover would be a big improvement. It would be even better to have deeper rendering that brings all the advantages of Dataiku column metadata to nested fields, especially where every field in a JSON object or array matches a strict schema. It would then be much easier to apply prepare recipe processors to hierarchical data without flattening the schema.
A common pattern in prepare recipes is to flatten one layer of hierarchical data, process it, then reformat it as JSON so that another JSON column's data can be processed without creating a product join. I think this would be less laborious if nested fields were recognized as part of a table's schema, letting users quickly apply processors multiple layers deep in objects while keeping the table's original structure.
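To make the flatten, process, re-nest round trip concrete, here is a rough Pandas analogue. This is not Dataiku code and does not use Dataiku processors; the column names and sample values are hypothetical. It only shows the amount of ceremony needed to clean one nested field while keeping the table's shape.

```python
import json

import pandas as pd

# Hypothetical table: one plain column and one column holding JSON strings.
df = pd.DataFrame({
    "order_id": [1, 2],
    "customer": [
        '{"name": " alice ", "address": {"city": "paris"}}',
        '{"name": "bob", "address": {"city": "berlin"}}',
    ],
})

# Step 1: flatten one layer. max_level=0 keeps deeper objects
# (here "address") as nested dicts instead of expanding them.
parsed = df["customer"].apply(json.loads)
flat = pd.json_normalize(parsed.tolist(), max_level=0)

# Step 2: process the field that is now an ordinary column.
flat["name"] = flat["name"].str.strip().str.title()

# Step 3: re-serialize so the table goes back to its original two columns.
df["customer"] = flat.apply(lambda row: json.dumps(row.to_dict()), axis=1)

print(df)
```

If nested fields were part of the schema, steps 1 and 3 would disappear and only the actual cleaning step would remain.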