Hello, just wanted to know if this data transformation is possible out of the box.
Before:
text    label
abc     ['A', 'B']
def     C

After:
text    label
abc     ['A', 'B']
def     ['C']
Hi @RohitRanga,
I'm not sure I fully understand the transformation you are looking for. Are you looking to convert C to an array?
We do have the following processor, which does what you are looking for:
https://doc.dataiku.com/dss/latest/preparation/processors/tokenizer.html
Let me know if that helps.
@AlexT Thanks for the response! Let me clarify my question:
I have a classification dataset with a label/class column. This column has either a list of strings or a single string. I want to make it uniform by converting those single string rows into a list with one string. Is this clear now?
Hi @RohitRanga,
So you only need to handle the single-string case, e.g. convert C to ['C'], and do nothing if the value already starts with [.
The first thing that comes to mind is a formula:
if(startsWith(new_column, "['"),new_column, concat("['",new_column,"']"))
Let me know if that works for you.
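For anyone who would rather do this in a Python recipe, or just wants the logic spelled out, here is a minimal sketch of the same check. It assumes the labels arrive as plain strings, as in the example above; `normalize_label` is a hypothetical helper name, not a Dataiku function.

```python
def normalize_label(value: str) -> str:
    """Wrap a bare label in list syntax; leave list-like values untouched."""
    # Same test as the formula: a value that already starts with "['"
    # is assumed to be a list and is returned unchanged.
    if value.startswith("['"):
        return value
    # Otherwise wrap the single string so it looks like a one-item list.
    return "['" + value + "']"


# Sample values taken from the question (assumed to be stored as strings).
labels = ["['A', 'B']", "C"]
normalized = [normalize_label(v) for v in labels]
print(normalized)
```

Like the formula, this only produces text that looks like a list. If real list values are needed downstream, the strings would still have to be parsed.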
Wow, I was not aware that we could do this. Thanks a lot @AlexT !