Exporting a model to a Jupyter notebook doesn't preserve the feature

davidmakovoz

I have a model open within an Analysis, and I exported it to a Jupyter notebook.

The model has one text feature that uses TF/IDF vectorization:

The model in the notebook, however, uses TruncatedSVD/HashingVectorizer. This is the 'default' option on the model design page, i.e. the option that gets selected when a text feature is added to a model:

But I changed that default option to TF/IDF vectorization, as is evident from the second image, and then trained the model.

I can modify the notebook and use tf-idf as designed.
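
For reference, here is a minimal sketch of that manual change. It assumes the exported notebook relies on scikit-learn, which the class names TruncatedSVD and HashingVectorizer suggest. The sample texts, variable names, and all parameter values are hypothetical and are not what DSS actually generates:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import HashingVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical sample standing in for the text column of the training data.
texts = [
    "the model was trained on short reviews",
    "export the model to a notebook",
    "tf-idf weights rare terms more heavily",
    "hashing maps terms to a fixed number of buckets",
]

# Roughly what the exported notebook does: hash the terms, then reduce with SVD.
# n_features and n_components are illustrative values, not the DSS settings.
hashing_svd = make_pipeline(
    HashingVectorizer(n_features=2**10),
    TruncatedSVD(n_components=2, random_state=0),
)
X_default = hashing_svd.fit_transform(texts)

# Replacement matching the design: plain TF-IDF vectorization.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(texts)

print(X_default.shape)  # dense array: one row per text, one column per SVD component
print(X_tfidf.shape)    # sparse matrix: one row per text, one column per vocabulary term
```

The downstream estimator would then be fitted on `X_tfidf` instead of `X_default`; any TF-IDF settings chosen in the design page (n-grams, minimum document frequency, and so on) would still have to be copied over by hand.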

But the question is: is it possible to export a model exactly the way it is designed?

Best Answer

Answers

  • davidmakovoz
    Thank you for the quick response. In that case, maybe it would make sense to rename this option from "Export Model" to "Create a similar model"?
    The term "Export" is misleading. If I export an Airbus A380, I am expected to deliver an Airbus A380, not an Airbus A340, A350, etc.
    Also, I don't think I agree with the statement "It is not possible to export the exact same model as the actual code might be much more complex than something that can fit in a human-editable notebook." I believe the opposite is true: one can do much more, and with more flexibility, in a notebook than with the predefined set of options in Visual Recipes. After all, all the options of Visual Recipes were originally created by a human in a human-editable notebook (or some equivalent thereof), weren't they? :)
  • cperdigou
    Most of the resulting code is generated, so even if the source was indeed written by a human, the output is unfortunately not human-readable.