Generating a Project Identifier for Versioning Training Data
Is there an easy way to identify which version of a Project (maybe by git hash?) was used to retrain a particular model?

The use case I'm considering is as follows. I would like to version my training data so that each time the training data in the flow is updated and the model is retrained (either manually or through a scenario), another scenario (probably a Python one) runs and creates a backup of that training dataset in my RDBMS (or on S3) that I can link back to the project as it was at that point in time. I was thinking of using a git hash combined with the date as the identifier. Has anyone done something like this before?
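
To make the git-hash-plus-date idea concrete, here is a minimal sketch of how such an identifier could be built. It assumes the project directory is a git checkout and that the `git` command is available on the machine running the scenario. The function names `current_git_hash` and `make_snapshot_id` and the identifier format are hypothetical, chosen only for illustration; they are not part of any platform API.

```python
import subprocess
from datetime import datetime, timezone


def current_git_hash(repo_dir="."):
    """Return the full commit hash of HEAD for the git checkout at repo_dir.

    Assumption: repo_dir is a git working copy and `git` is on the PATH.
    """
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout.strip()


def make_snapshot_id(git_hash, when=None, short_len=8):
    """Combine a UTC timestamp and a shortened commit hash into one identifier.

    Hypothetical format: <YYYYMMDD>T<HHMMSS>Z_<first short_len hash characters>.
    A naive datetime passed as `when` is interpreted as local time.
    """
    if when is None:
        when = datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}_{git_hash[:short_len]}"
```

The backup scenario could then call `make_snapshot_id(current_git_hash(project_dir))`, where `project_dir` is a placeholder for the checkout path, and use the result as a table suffix or S3 key prefix for the copied dataset. Putting the timestamp first keeps snapshots sortable by time, while the hash links each one back to the exact project commit.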