Added on July 7, 2023 3:24AM
This dataset is pulled from the internal stats commits data, which is produced by Dataiku, but I'm finding it hard to understand the last 5 columns. Does anyone know what they mean, or can you link me to a reference I can use?
Operating system used: Linux
I think you have misunderstood this functionality. It's not meant to track data changes (row inserts, deletes, etc.) but metadata changes, that is, changes to the definitions of your flow objects. Have a look at the Version Control menu in your project, where you can see the changes and review how they are persisted.
The last 5 columns show the changes the user made in the commit. Most of the Dataiku objects you see in the GUI are stored as XML or JSON files, so these columns indicate whether objects have been added or changed.
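To make the five columns concrete, here is a minimal sketch of how per-commit counts like these could be derived from two snapshots of a project's files. It assumes the counts are ordinary line-based diff statistics, as a Git-style version control system would report them; the file paths, the JSON layout, and the helper names (`line_stats`, `commit_stats`, `dataset_config`) are made up for illustration and are not Dataiku's actual storage format or API.

```python
import difflib
import json


def line_stats(old_text, new_text):
    """Count added and removed lines in a line-based diff of two texts."""
    added = removed = 0
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            removed += 1
    return added, removed


def commit_stats(before, after):
    """Compare two {path: file content} snapshots and tally the five counters."""
    stats = {"added_files": 0, "removed_files": 0, "changed_files": 0,
             "added_lines": 0, "removed_lines": 0}
    for path in sorted(set(before) | set(after)):
        old, new = before.get(path), after.get(path)
        if old == new:
            continue  # untouched file, contributes nothing
        if old is None:
            stats["added_files"] += 1
        elif new is None:
            stats["removed_files"] += 1
        else:
            stats["changed_files"] += 1
        added, removed = line_stats(old or "", new or "")
        stats["added_lines"] += added
        stats["removed_lines"] += removed
    return stats


def dataset_config(column_names):
    """Hypothetical dataset definition, serialized as pretty-printed JSON."""
    schema = {"columns": [{"name": n, "type": "string"} for n in column_names]}
    return json.dumps({"type": "UploadedFiles", "schema": schema}, indent=2)


# Hypothetical snapshots: three columns are renamed and one new file appears.
before = {"datasets/sales.json": dataset_config(["col_0", "col_1", "col_2"])}
after = {
    "datasets/sales.json": dataset_config(["id", "region", "amount"]),
    "recipes/prepare.json": json.dumps({"type": "prepare"}, indent=2),
}

stats = commit_stats(before, after)
print(stats)
# {'added_files': 1, 'removed_files': 0, 'changed_files': 1,
#  'added_lines': 6, 'removed_lines': 3}
```

Under this reading, the line counts refer to lines in the definition files, not rows in the data. Every line of a newly created file counts as an added line, so a commit that creates several definition files can report hundreds of added lines even when the dataset itself has only a handful of rows.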
Some of the columns are pretty straightforward, for example:
added files: files that were added to the project
removed files: files that were removed from the project
changed files: existing files in the project that were changed (I might be wrong on this one)
What I don't understand is the other two, namely added lines and removed lines. I uploaded a dataset that contains only 4 rows of data and changed nothing but the column names. However, if you look at the added lines column in the picture, it says 343 lines were added. This is the part I have trouble understanding: why does it report 343 added lines when I only changed the column names?
I would be glad if anyone could help clarify this.
Thank you! Just the answer I was looking for. I didn't know about the version control.