
Exporting a model to a Jupyter notebook doesn't preserve the feature


I opened a model within an Analysis and exported it to a Jupyter notebook.



 



The model has one text feature that uses TF/IDF vectorization: 



The model in the notebook is using TruncatedSVD/HashingVectorizer. This is the 'default' option in the model design page, i.e. the option gets selected when a text feature is added to a model: 



But I changed that default option to TF/IDF vectorization, as evident from the second image, and then trained the model.



I can modify the notebook and use tf-idf as designed. 
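
For illustration, the manual change might look roughly like the sketch below. It assumes the exported notebook builds its text features with scikit-learn's `HashingVectorizer` followed by `TruncatedSVD`; the toy `texts` list and all parameter values are hypothetical stand-ins, not what DSS actually generates.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import HashingVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the text feature column (hypothetical data).
texts = [
    "the model was exported to a notebook",
    "the notebook uses a hashing vectorizer",
    "tf idf was selected in the design page",
    "the design page default is hashing plus svd",
]

# Assumed shape of the exported code: hashing followed by SVD.
# n_features and n_components are illustrative values only.
hashing_svd = make_pipeline(
    HashingVectorizer(n_features=2**10),
    TruncatedSVD(n_components=2, random_state=0),
)
X_hashed = hashing_svd.fit_transform(texts)

# Manual replacement: TF-IDF, matching the option chosen in the design page.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(texts)

print("hashing + SVD features:", X_hashed.shape)
print("TF-IDF features:", X_tfidf.shape)
```

The TF-IDF matrix has one column per vocabulary term, so its width depends on the training text, whereas the hashing + SVD output has a fixed, small number of columns.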



But the question is whether it is possible to export a model the way it is designed?

3 Replies
Dataiker

Hi,



Exporting a model through notebook generation only produces a "similar" model (documentation here: https://doc.dataiku.com/dss/latest/machine-learning/models-export.html#export-to-jupyter-notebook ). It is not possible to export the exact same model, because the actual code can be much more complex than what fits in a human-editable notebook. The idea is to provide a good enough starting point that data scientists can build on.



Regards,

Joachim Zentici
Dataiku

Author
Thank you for the quick response. In that case, maybe it would make sense to rename this option from "Export Model" to "Create a similar model"?
The word "Export" is misleading. If I export an Airbus A380, I am expected to deliver an Airbus A380, not an Airbus A340, A350, etc.
Also, I don't think I'd agree with the statement "It is not possible to export the exact same model as the actual code might be much more complex than something that can fit in a human-editable notebook." I believe the opposite is true: one can do much more, and has more flexibility, using a notebook than working with a predefined set of options in Visual Recipes. After all, all the options of Visual Recipes were originally created by a human in a human-editable notebook (or some equivalent thereof), weren't they? 🙂
Dataiker
Most of the resulting code is generated, so even if the source was indeed written by a human, the output is unfortunately not human-readable.