Improve ML Experiment Tracking by allowing updating of Session descriptions

It's great to be able to track one's experiments when developing ML models. However, the key description for each experiment is not editable. It'd be really helpful if one could edit it so that each experiment can be accurately described.

More specifically, the request is to allow editing of the description assigned to Sessions on the Model Results page (i.e., the model results shown under Models within a visual analysis).

Currently sessions may be named in one of two ways. If a name is specified at the Training Models popup that appears after clicking the TRAIN button, that name is used as the description for the resulting session. Otherwise the session is named numerically. 

In either case, once the name/description is assigned to the session it can't be changed.

However, it would be really helpful to be able to update the description of each session. Being able to accurately describe sessions (experiments) is important because most of the time a session represents a test of something that applies to all models within the experiment. Examples: a new feature, more data, a different split, different feature handling, weighting by class or not, and so forth. The only changes that aren't session-level are changes to model parameters.

Ideally one would describe the experiment (session) accurately at the Training Models popup. However, in my experience this is difficult to do consistently. And even then, it would sometimes be helpful to update the description after the fact, once one realizes the importance of the experiment and wants to make sure it is clearly described.

Marlan

5 Comments
MaximeA
Dataiker

Hello @Marlan 

Thank you for submitting this idea. I understand your point, and I think it makes sense in the context of experimenting with multiple ML model setups. I'll mark it as Need Info for now, because I would like to have your opinion on something.

Indeed sessions are important because all models in the same session share a setup, but in the end it's a specific model that we are interested in (and might eventually want to deploy to the flow).

When you access the Model Report, you can edit the model's name & description on the Summary page. For instance here, I am quite happy with this specific model I trained (custom optimization metric - good results that I want to keep and compare later), so I renamed everything accordingly: 

image.png

Then when I go back to the Models tab, it appears as is (top left, "RF - Optimized ..."):

image (1).png

 

What do you think? Is that something that could solve your problem? 

Thanks again, and take care

Maxime

MaximeA
Dataiker
 
Status changed to: In the Backlog
 
CoreyS
Dataiker Alumni
 
Status changed to: Gathering Input
 

Hi @MaximeA,

Thank you for your attention to this request!

Yes, I'm aware that I can change the label and description for individual models. And it is true that at the end of the process one is most interested in the model that will be deployed.

However, this misses the process that one goes through to arrive at the model to be deployed. This typically involves multiple runs that explore different combinations of factors. It is really helpful to be clear about one's experiments both during that process and even more importantly later when revisiting the project to retrain the model because of some change or just a drift in input data.

It's also the case that some experiments aren't useful and aren't worth keeping. At least that's my experience. Given that, I don't spend a lot of time trying to fully document each run initially. I'd much rather do this after the fact once I understand what I have learned from it. 

Another factor here is how the UI displays information. The session label is all upper case and in bold and provides room for a full description. The model label is in a light font and is cut off at about 10 characters. I'd much rather be able to use the more obvious and longer session label to describe the experiment. It just provides a better user experience.

Not being able to change the session label also just seems odd. Pretty much everything else in DSS that has a name or description can be changed.  Why not this?

Just using the model description seems like a workaround. Experiment tracking is one of the key features of DSS. So why make the ability to update the label of an experiment a workaround?

Speaking of workarounds, my current approach is to rerun the experiments I want to keep, this time with the desired label, so that they have clear descriptions. This works OK, especially if training times aren't too long.

Hope this is helpful.

Marlan

MichaelG
Community Manager
 
Status changed to: Gathering Input