Filtering rows in output dataset doesn't work

PARTEEK

I am trying to apply the following filter on the Configure Sample settings page:

charged_phone_number == 31643000430

However, it doesn't return anything. When I download the dataset and apply the same filter locally, the column does contain that value.

My dataset is stored in S3. Is it possible that filtering records doesn't work on S3 datasets?

What is happening?


Answers

  • tgb417

    @pkansal

    Just a quick thought: check your data type. You might want to force the phone number column to string in the schema. I've seen similar things happen with US postal codes that start with 0. DSS will sometimes try to be overly helpful and convert things that look like numbers to numbers, and then matching does not work.

    Or the phone number is already stored as a string and you are comparing it against a number. You might also try putting quotes around the number in your comparison.

    Just a few thoughts.
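To illustrate the type-mismatch point above, here is a minimal sketch in plain Python. It is not DSS's filter engine, and DSS may coerce types differently; the sample rows and the `filter_equal` helper are hypothetical. It only shows why a string column compared against a numeric literal can match nothing, and how leading zeros disappear when an identifier is parsed as a number.

```python
# Hypothetical sample rows; only the column name comes from the question.
rows_as_int = [{"charged_phone_number": 31643000430}]
rows_as_str = [{"charged_phone_number": "31643000430"}]


def filter_equal(rows, column, wanted):
    """Keep rows whose value in `column` equals `wanted` (strict equality, no coercion)."""
    return [row for row in rows if row[column] == wanted]


# Number column compared with a number: matches.
int_vs_int = filter_equal(rows_as_int, "charged_phone_number", 31643000430)
# String column compared with a number: no match, the types differ.
str_vs_int = filter_equal(rows_as_str, "charged_phone_number", 31643000430)
# String column compared with a quoted value: matches.
str_vs_str = filter_equal(rows_as_str, "charged_phone_number", "31643000430")

print(len(int_vs_int), len(str_vs_int), len(str_vs_str))

# Leading zeros are lost once an identifier is parsed as a number.
postal_code = "02134"
parsed_postal_code = int(postal_code)
print(parsed_postal_code)
```

If the quoted comparison is the one that matches in your case, the column is most likely being read as a string.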

  • PARTEEK

    No, it is stored as an integer only. Even if it is an integer, it should work, right?

  • tgb417

    @pkansal,

    I've seen times where I've had to play around with the schema to get certain things working. Sometimes with these very large integers I've seen values converted to scientific notation, and other such anomalies.

    The other thing to do is to reach out to the Dataiku Support team. They really are great at what they do and can help you with the very specific challenges you are having.

    --Tom
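As a rough illustration of the scientific-notation concern, the plain-Python sketch below formats the phone number with `%g` (six significant digits) and parses it back. Whether DSS does anything like this at any stage is an assumption; the point is only that a value that has passed through such a representation no longer equals the original.

```python
phone = 31643000430

# Render with 6 significant digits, as a short scientific-notation string.
as_text = "%g" % phone
print(as_text)

# Parsing the shortened text back loses the low-order digits.
round_tripped = int(float(as_text))
print(round_tripped, round_tripped == phone)

# A 64-bit float on its own still holds this value exactly (it is below 2**53).
exact_as_float = float(phone) == phone
print(exact_as_float)
```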

  • Jurre

    Not necessarily, @pkansal: not all ints are equal. The number you are filtering on is outside the standard int32 range (max: 2,147,483,647). bigint would be the datatype of choice if you insist on storing it as a number. Personally, I only use a numeric type when the values have a numerical meaning and use; telephone numbers don't get summed up, they are just identifiers.

    I'm not sure that's the underlying issue here, but I concur with @tgb417 that it's worth checking.
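The range argument can be checked with a few lines of plain Python. This is a sketch only: whether DSS or the S3 file format actually truncates the value is an assumption, and the `wrapped` value merely simulates what a 32-bit slot would hold if the number were forced into it.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1   # int range
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1   # bigint range

phone = 31643000430

fits_int32 = INT32_MIN <= phone <= INT32_MAX
fits_int64 = INT64_MIN <= phone <= INT64_MAX
print(fits_int32, fits_int64)

# Simulated two's-complement wrap-around into 32 bits (illustration only).
wrapped = (phone + 2**31) % 2**32 - 2**31
print(wrapped, wrapped == phone)
```

A wrapped value like this would never match the original number in an equality filter, which is one reason to store the column as bigint or, for identifiers, as a string.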

  • Berilio

    Hi @pkansal, I agree with @Jurre, but maybe you can try: toNumber(charged_phone_number) == 31643000430
