Sentence Embedding - Machine/Deep Learning Model

vinhdiesal Registered Posts: 11 ✭✭✭✭

Using a Python 2.7 code environment, I was able to use the macro's pre-trained word embeddings and the plugin to create sentence embeddings for a column in my dataset that holds a corpus of text data. The sentence embeddings are written to a separate column. What I'm trying to figure out is how I can use DSS to classify the text in that column using the sentence embeddings. I would also like to map back to the actual words once the clustering is complete, so I can do text analysis on the clusters.

What I tried was running the available K-means algorithm on the sentence embedding column to create clusters, but I often got list index errors while the model was being fit.
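One possible cause of index errors during fit, stated here as an assumption and not something confirmed in this thread, is rows whose embedding is empty, malformed, or of a different length than the rest. A minimal sketch of a pre-check, assuming the embedding column stores each vector as a JSON array string (the function name `check_embeddings` and the sample rows are hypothetical):

```python
import json

def check_embeddings(rows):
    """Split embedding strings into usable vectors and suspicious row indices.

    `rows` is a list of cell values, each assumed to be a JSON array string
    such as "[0.1, 0.2, 0.3]". This storage format is an assumption.
    Returns (vectors, bad_rows), where vectors is a list of (row_index, vector).
    """
    vectors, bad_rows = [], []
    expected_dim = None  # dimension of the first valid vector seen
    for i, raw in enumerate(rows):
        try:
            vec = json.loads(raw) if raw else None
        except (TypeError, ValueError):
            vec = None  # not a string, or not valid JSON
        is_numeric_list = (
            isinstance(vec, list)
            and len(vec) > 0
            and all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vec)
        )
        if is_numeric_list and expected_dim is None:
            expected_dim = len(vec)
        if is_numeric_list and len(vec) == expected_dim:
            vectors.append((i, vec))
        else:
            bad_rows.append(i)  # empty, malformed, or wrong length
    return vectors, bad_rows

# Hypothetical sample: rows 1, 2, 3 and 5 would be flagged.
sample = ["[0.1, 0.2, 0.3]", "", "[0.4, 0.5]", None, "[1, 2, 3]", "not a vector"]
vectors, bad_rows = check_embeddings(sample)
```

Filtering out or re-embedding the flagged rows before clustering would at least rule this cause out.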

Can you advise on how to use the output of the sentence embeddings plugin with the existing deep/machine learning algorithms?

Thanks,

Answers

  • Alex_Combessie
    Alex_Combessie Alpha Tester, Dataiker Alumni Posts: 539 ✭✭✭✭✭✭✭✭✭

    Hi,

    In order to use the embedding column for Machine Learning / Deep Learning models, you can choose the "Vector" feature handling, as shown below:

    Screenshot 2020-05-07 at 01.56.00.png

    Hope it helps,

    Alex

  • vinhdiesal
    vinhdiesal Registered Posts: 11 ✭✭✭✭

    Thanks Alex for that information, it really helps.

    Right now I have the cluster number and cluster ID in a separate column. How do I convert the vectors back to words, so that each cluster is described by a string of words instead?

    For example, instead of the cluster ID, I want it to display the topics.

    Thanks,

    Vinh

  • Alex_Combessie
    Alex_Combessie Alpha Tester, Dataiker Alumni Posts: 539 ✭✭✭✭✭✭✭✭✭

    Hi,

    The plugin outputs vectors at the sentence level, so there's no direct way to map them back to words.

    If your use case is about understanding clusters of documents, I would suggest using the 'Topic modeling' predefined notebook: https://doc.dataiku.com/dss/latest/notebooks/predefined-notebooks.html

    Best regards,

    Alex
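Since the vectors cannot be inverted, one workaround is to describe each cluster using the original text of its rows. The sketch below is not the 'Topic modeling' notebook mentioned above and is not a DSS API; it is a simple word-frequency heuristic, and the function name, stopword list, and sample data are all hypothetical. It assumes the original text and the cluster ID are available side by side for every row:

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; a real list would be longer).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "my", "was", "it", "on"}

def top_terms_per_cluster(texts, cluster_ids, n_terms=3):
    """Return {cluster_id: [words]} with the most distinctive words per cluster."""
    cluster_counts = defaultdict(Counter)  # word counts inside each cluster
    overall = Counter()                    # word counts across all rows
    for text, cid in zip(texts, cluster_ids):
        words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
        cluster_counts[cid].update(words)
        overall.update(words)

    labels = {}
    for cid, counts in cluster_counts.items():
        # Score favors words that are frequent in this cluster and rare elsewhere:
        # count_in_cluster * (count_in_cluster / count_overall). Ties break alphabetically.
        def sort_key(w):
            return (-counts[w] * counts[w] / float(overall[w]), w)
        labels[cid] = sorted(counts, key=sort_key)[:n_terms]
    return labels

# Hypothetical example data.
texts = [
    "refund for my late delivery",
    "delivery was late again",
    "login password reset failed",
    "cannot reset my password",
]
cluster_ids = [0, 0, 1, 1]
labels = top_terms_per_cluster(texts, cluster_ids, n_terms=2)
# labels == {0: ['delivery', 'late'], 1: ['password', 'reset']}
```

Joining each list into a string (for example `", ".join(labels[0])`) gives a readable label that could be displayed in place of the cluster ID.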
