Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi all,
I have a column with both numbers and text and I'd like to extract the values to a new column. Everything works fine, except for large numbers. The original file is an Excel file with around 400k rows and only a handful of values are large. For this task I use the prepare recipe -> โExtract a numberโ. The number in Excel is listed as (for example) 1.7099041E12. The prepare recipe makes it ~1,7 instead of ~1.700.000.000.000. What recipe do I need to use or how can I fix this?
Regards,
Qris
Hi @Qris ,
My approach would be to look for a suitable separator between textvalue and numeric value, and then split that column on that separator. A space for example. To find that possible separator i would temporarily filter out everything below a certain amount of characters. Tip : resulting columns from splitting can be limited with the "truncate" option. Another approach might be to filter on values with the structure number(s)-E-number(s) with some regex, and extract those to a new column.
Edit : you mentioned a low volume of problematic values. it might be helpful to split the dataset in a part which works out fine and a part with those problematic values, for example using that number(s)-E-number(s) format. Analyzing a smaller record-set is faster and avoids the pitfall of not adjusting the sample size when analyzing.
Jurre