Store live data as recorded data

sasidharp
sasidharp Registered Posts: 27 ✭✭✭✭

I receive live data through a SQL connection at 15-minute intervals, covering the past hour.

time.PNG

I want to store a week of data in my HDFS dataset. What is the best way to do that?

Every 15 minutes, one row disappears and a new row gets added. Please suggest the best approach.

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron

    @sasidharp

    One somewhat brute-force method would be to use scenarios.
    If you are using a fully licensed edition of DSS, you should have access to the scenario feature. A scenario can poll your SQL dataset every 15 minutes, or slightly more frequently, and compare it against what you already have in HDFS to detect when new data has arrived. If there is new data in SQL, you import it; otherwise you do nothing.

    Just a thought.

  • sasidharp

    I want to append the row with the new timestamp to my HDFS dataset. Every 15 minutes, one new row is added and one row is deleted. I just need the newly added row to be appended in the flow.

  • tgb417

    @sasidharp

    If your recipe can produce just the new results you need, then you can see whether the "Append instead of overwrite" option in the recipe's Input / Output section works for you.

    2020-11-03_16-18-56.jpg

    I'm not using HDFS, so I don't know whether this option is available to you with that dataset type.

    I hope this helps a bit. If not, and you can say or show a little more about what you are trying to do, I or someone else might be able to help you further.

    Does anyone know if partitions will help with this process?

  • sasidharp

    Dear @tgb417

    HDFS datasets don't come with an Append option.

  • tgb417

    @sasidharp

    Here is an earlier discussion thread about a similar issue.

    https://community.dataiku.com/t5/Using-Dataiku-DSS/how-can-select-the-append-mode-in-a-dataset/m-p/3367

    As you have discovered, it says that HDFS does not have an append function. It does suggest partitioning.

    I don't think you have said whether you are using the community edition or a paid edition of DSS. If you are on a paid edition, you will have the opportunity to use partitioning. Apparently, partitioning does work with HDFS.

    However, I'm not sure you will want to create a new HDFS partition for a single row of data. If that's the case, could you save to a temporary dataset and create partitions daily?

    In case you have not come across it, here is the learning module on partitioning.

    https://academy.dataiku.com/partitioned-models-open

    I'm not going to be of much additional help.

    @DvMg, where did you end up going with your append issue in the thread linked above? Can you be of any help to @sasidharp?

    @tomas, I see you were giving @DvMg some help. Can you be of any help to @sasidharp?
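
For anyone landing on this thread later, here is a minimal sketch of the two ideas discussed above: pick out only the rows that are newer than what is already stored, then bucket them by calendar day so each bucket could go into one daily partition. It is plain Python and does not call any Dataiku API. The names `select_new_rows`, `group_by_day`, the `ts` and `value` columns, and the timestamp format are all assumptions made for illustration. In DSS, the `window` and `stored` rows would come from the SQL and HDFS datasets in a recipe run by the scenario.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format, not taken from the thread


def parse_ts(text):
    """Parse a timestamp string into a datetime (hypothetical helper)."""
    return datetime.strptime(text, TS_FORMAT)


def select_new_rows(window_rows, stored_rows, ts_key="ts"):
    """Return rows from the rolling SQL window that are newer than anything stored.

    Comparing against the latest stored timestamp also recovers rows from a
    missed poll, as long as they are still inside the one-hour window.
    """
    if stored_rows:
        latest = max(row[ts_key] for row in stored_rows)
        fresh = [row for row in window_rows if row[ts_key] > latest]
    else:
        fresh = list(window_rows)
    return sorted(fresh, key=lambda row: row[ts_key])


def group_by_day(rows, ts_key="ts"):
    """Bucket rows by calendar day, one bucket per (assumed) daily partition."""
    partitions = {}
    for row in rows:
        day = row[ts_key].strftime("%Y-%m-%d")
        partitions.setdefault(day, []).append(row)
    return partitions


# Rows already recorded (hypothetical sample data).
stored = [
    {"ts": parse_ts("2020-11-03 23:00:00"), "value": 9},
    {"ts": parse_ts("2020-11-03 23:15:00"), "value": 10},
    {"ts": parse_ts("2020-11-03 23:30:00"), "value": 11},
    {"ts": parse_ts("2020-11-03 23:45:00"), "value": 12},
]

# Current one-hour window: the 23:00 row has dropped out, 00:00 has arrived.
window = [
    {"ts": parse_ts("2020-11-03 23:15:00"), "value": 10},
    {"ts": parse_ts("2020-11-03 23:30:00"), "value": 11},
    {"ts": parse_ts("2020-11-03 23:45:00"), "value": 12},
    {"ts": parse_ts("2020-11-04 00:00:00"), "value": 13},
]

new_rows = select_new_rows(window, stored)
partitions = group_by_day(new_rows)
```

With daily buckets like these, keeping one week of history would amount to retaining the seven most recent days, though how that maps onto HDFS partitions in a given DSS setup is something to verify against your own flow.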
