Push to editable recipe

UserBird
UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
Hello, could you give an example of how to use the "Push to editable" recipe? It seems similar to the Group or Window recipes. What exactly is it used for?

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Hi,

    "Push to editable" copies a regular dataset to an editable dataset while keeping the changes you have made in the editable dataset.

    The first time you run a Push to editable recipe, it simply copies the whole content of the regular dataset to the editable dataset. If you then make changes in the editable dataset and rerun the recipe, it copies over all data that is new or changed in the original dataset but preserves every modification you made in the editable dataset.

    To determine which rows are "new" and which were "modified in the editable dataset", you need to select one or more columns that together form a row identifier.

    The main use case for a Push to editable recipe is making corrections to a dataset. For example, you have an input dataset of product categories in a database that contains some errors, and for some reason you can't get them fixed in the source data. You use a Push to editable recipe, fix the erroneous entries in the editable dataset, and base the rest of the flow on the editable dataset.

    Note that since editable datasets are limited to 100K rows, so are push to editable recipes.
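The behaviour described above can be illustrated with a toy model in plain Python. This is only a sketch and does not use the Dataiku API: the function names, the `edited` flag, and the rule that rows missing from the source are dropped are all assumptions made for illustration.

```python
# Toy model only: plain Python, not the Dataiku API.
# Assumption: the editable dataset remembers, per identifier, whether a row was hand-edited.

def push_to_editable(source_rows, editable, key_columns):
    """Return the editable content after a (re)run of the hypothetical recipe.

    source_rows: list of dicts coming from the regular dataset.
    editable:    dict mapping key tuple -> {"row": dict, "edited": bool}.
    """
    result = {}
    for row in source_rows:
        key = tuple(row[c] for c in key_columns)
        previous = editable.get(key)
        if previous is not None and previous["edited"]:
            result[key] = previous                      # keep the manual edit
        else:
            result[key] = {"row": dict(row), "edited": False}  # take source data
    return result


def edit(editable, key, column, value):
    """Simulate a manual correction in the editable dataset."""
    editable[key]["row"][column] = value
    editable[key]["edited"] = True


source = [{"id": 1, "category": "Toyz"}, {"id": 2, "category": "Books"}]

editable = push_to_editable(source, {}, ["id"])   # first run: plain copy
edit(editable, (1,), "category", "Toys")          # fix an error by hand

source.append({"id": 3, "category": "Games"})     # new data arrives in the source
editable = push_to_editable(source, editable, ["id"])  # rerun

categories = {key[0]: entry["row"]["category"] for key, entry in editable.items()}
print(categories)
```

In this model the rerun brings in the new row with id 3 while the correction on id 1 survives, because rows are matched on the identifier column rather than on their values.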
  • MJLEE
    MJLEE Registered Posts: 6 ✭✭✭✭
    Okay, thanks! I didn't know that I could edit a dataset when using the "Push to editable" recipe.

    Please confirm if my understanding is right or wrong.
    Let’s suppose the original dataset is as follows.

    ID | values
    _____________
    23 | yellow
    18 | red
    92 | blue
    81 | red
    30 | reed

    There is a typo, “reed”, that I want to correct. With the “Push to editable” recipe, I correct the value from “reed” to “red”. Then, if I rerun the recipe, the value of ID 30 is preserved as “red”. Right?

    If that is right, what happens if new data with ID=24, value=“reed” is imported and I rerun the recipe? Is it changed to “red” automatically? And what if new data with ID=30, value=“green” is imported?
  • cperdigou
    cperdigou Alpha Tester, Dataiker Alumni Posts: 115 ✭✭✭✭✭✭✭
    To correct every "reed" automatically, you should use a Prepare recipe with a step that replaces "reed" with "red" in the "values" column. ID 24 and ID 30 will then both be changed from "reed" to "red". If ID 30 changes to "green" in the source, the step does nothing to ID 30 and it stays "green".
    --> The Prepare recipe replaces values based on the "values" column.

    With the Push to editable recipe, ID 24 will stay "reed" and ID 30 will be "red". If ID 30 changes to "green" in the source, it will still be "red" in the editable dataset, because your edit is preserved.
    --> The Push to editable recipe replaces values based on the ID column.
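The difference between the two approaches can be sketched on the example table from the question. This is plain Python for illustration, not Dataiku code; `prepare_replace` and `push_with_edits` are hypothetical helpers that mimic a value-based replacement and an ID-based override respectively.

```python
# Illustration only: plain Python, not the Dataiku API.
source = {23: "yellow", 18: "red", 92: "blue", 81: "red", 30: "reed"}


def prepare_replace(rows, old, new):
    """Value-based: every row whose value equals `old` is rewritten."""
    return {row_id: (new if value == old else value) for row_id, value in rows.items()}


def push_with_edits(rows, edits):
    """ID-based: a row that was edited by hand keeps the edited value."""
    return {row_id: edits.get(row_id, value) for row_id, value in rows.items()}


edits = {30: "red"}  # the manual correction made in the editable dataset

# Case A: a new row ID=24 with value "reed" arrives in the source.
case_a = {**source, 24: "reed"}
prepared_a = prepare_replace(case_a, "reed", "red")
pushed_a = push_with_edits(case_a, edits)
print(prepared_a[24], prepared_a[30])  # both corrected by value
print(pushed_a[24], pushed_a[30])      # only the edited ID is corrected

# Case B: the source value for ID=30 changes to "green".
case_b = {**source, 30: "green"}
prepared_b = prepare_replace(case_b, "reed", "red")
pushed_b = push_with_edits(case_b, edits)
print(prepared_b[30])  # the source value passes through
print(pushed_b[30])    # the manual edit still wins
```

The sketch makes the trade-off visible: a value-based rule fixes future occurrences of the same typo, while an ID-based edit sticks to one row even when its source value later changes.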
  • Marty
    Marty Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 11 Partner

    Is there a way to "reset" the dataset so the old edits are NOT kept? My situation is one where I have a dataset that gets produced on a monthly basis after some significant transformations are completed, and sometimes I need to make a few slight manual edits to it before it's finalized. However, the following month I want to clear out all those edits because they probably aren't pertinent anymore. The problem is, a lot of the data is still going to be the same (so any edits from last month would be kept based on the example in the discussion above). I just want to keep the original data on a monthly basis, not the edits. Anyone have any ideas on how to do that? Is there a way to "reset" the edits? Thanks!

  • Marty
    Marty Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 11 Partner

    I think I may have a workaround. If I create a blank dataset (with just the column headers but no values in the rows) and push it to editable, it clears everything out. Then I can do all my transformations and push the transformed dataset to editable and the changes I made before are not preserved. Going to test this out a bit more, but would love to hear if anyone has thoughts on this approach. Thanks!
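As a rough illustration of why this workaround might behave as observed, here is a toy model in plain Python (not the Dataiku API). It assumes that a row absent from the pushed dataset is dropped from the editable dataset together with its "edited" status; that assumption comes only from the observation above and is not confirmed behaviour.

```python
# Toy model only: plain Python, not the Dataiku API.
# The editable dataset maps id -> (value, edited_flag).

def push(source, editable):
    """Hypothetical rerun: keep hand-edited rows, otherwise take the source value.

    Rows missing from `source` are dropped, along with their edited flag (assumption).
    """
    return {
        row_id: editable[row_id] if row_id in editable and editable[row_id][1] else (value, False)
        for row_id, value in source.items()
    }


january = {1: "A", 2: "B"}
editable = push(january, {})
editable[2] = ("B-fixed", True)        # manual edit made in January

february = {1: "A", 2: "B", 3: "C"}

kept = push(february, editable)        # plain rerun: the January edit survives
cleared = push({}, editable)           # push an empty dataset with the same columns
fresh = push(february, cleared)        # then push the real February data

print(kept[2])
print(cleared)
print(fresh[2])
```

Under these assumptions, the intermediate empty push wipes both the rows and the record of past edits, so the following push starts from the source data again.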
