Removing newlines / flattening multilines in a cell

manssari
manssari Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 7 Partner

Dear Dataiku Community,

I have some data that I need to push into Power BI from Dataiku.
However, some cells of that data contain multiple lines (the data was imported from an .xlsx file),
like: " BHGE OCTOPUS9-5/8" X 3-1/2"
When pushed into Power BI, this creates a new, otherwise blank row in which only the column that contained that cell is populated, with the data that came after the line break.
When searching the Dataiku community, I saw that someone had posted a similar question,
and the answer was to find & replace "\n" in a Prepare recipe step.
I tried both that and replacing the symbol that was displayed, "¶".
Neither worked.
In the end I went with Python code to solve this issue:
------------
# Replace every newline character in string cells with a space
This_df = This_df.replace('\n', ' ', regex=True)

-------------
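A slightly more defensive variant of the one-liner above, as a minimal sketch: cells that originate in Excel may carry `\r\n` or a lone `\r` as well as `\n`, so this collapses any run of line-break characters into a single space. The helper name `flatten_cell` and the sample string are hypothetical and only for illustration.

```python
import re

# Matches one or more consecutive line-break characters (\r\n, \n, or \r).
_LINE_BREAKS = re.compile(r"[\r\n]+")


def flatten_cell(value):
    """Replace each run of line breaks in a string with one space.

    Non-string values (numbers, None, ...) are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    return _LINE_BREAKS.sub(" ", value)


# Illustrative value only, not taken from the real dataset.
sample = "first line\r\nsecond line"
flattened = flatten_cell(sample)
print(flattened)
```

With pandas, this could be applied to a text column with something like `This_df[col].map(flatten_cell)`, where `col` stands for whichever column holds the multiline cells.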

I can't help but wonder: is there another way to do this in Dataiku DSS that I am missing?
Thank you,


Operating system used: Centos 7.9



Best Answer

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    Answer ✓

    Hi,

    When using find/replace, \n only matches a line break if you set the “Match mode” to “Regular expression”. Otherwise it only matches the literal text “\n” (a backslash followed by the letter n).

    raw.png

    prepared.png
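The difference between the two match modes can be illustrated outside DSS. The following is a plain-Python analogy, not how the Prepare recipe is implemented: a literal search for the two characters `\` and `n` finds nothing in a cell that holds a real line break, while the same text used as a regular expression does match it. The variable names are made up for the example.

```python
import re

cell = "line one\nline two"  # contains one real newline character
typed = r"\n"                # what gets typed in the find box: backslash + n

# Literal matching: the two-character text "\n" does not occur in the cell,
# so the value comes back unchanged.
literal_result = cell.replace(typed, " ")

# Regex matching: the pattern \n is interpreted as "a newline character",
# so the line break is replaced with a space.
regex_result = re.sub(typed, " ", cell)

print(repr(literal_result))
print(repr(regex_result))
```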

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron

    ¶ is a visual representation of a newline character, which is usually invisible. Try copying the newline character from a multi-line text and pasting it into your find and replace formula.

  • manssari
    manssari Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 7 Partner

    Hi Turribeach,
    Thanks for your answer.
    True, when copying the cell contents it is invisible and just translates into a new line. However, the symbol does show in the Dataiku cell contents and can be copied on its own:
    temp_dss_newline_issue.PNG
    I did try pasting it into find & replace; that did not work either, just like \n.

  • manssari
    manssari Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 7 Partner

    Hi Adrien,
    Thank you so much for your answer, it works perfectly now :D!
