How do I fetch the date from a filename and add it as a column in the file using DSS?

UserBird
UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
For example, if the filename is Test_20170101 and the file has 3 columns (test1, test2 and test3), I want the file to have 4 columns: test1, test2, test3 and a new column date with the value 20170101. How do I do this in DSS?

Answers

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker

    Unfortunately, there is no built-in feature to do this; you'd have to use a Python recipe.

    If you do this kind of thing often, you may want to look into partitioning.

  • UserBird
    UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
    Is this still the case? I am trying to get the source's latest partition date using the following Python code:

    import dataiku
    file_date = dataiku.dku_flow_variables["DKU_SRC_LAST_DATE"]

    I was wondering whether there is a built-in feature for this now, a year and a half later.
  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    There is still no built-in feature to do that.
  • rmnvncnt
    rmnvncnt Registered Posts: 41 ✭✭✭✭✭
    Any update on this topic? Partitioning doesn't work in my case (since "Missing partitions as empty" is still not supported for discrete + time partitioning). I could create a new connector for each file in the folder, but this doesn't scale at all. Being able to add the source file name to the dataset as a column would solve this problem.
  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    I'm not sure I fully understand your use case. I suggest you contact your Customer Success Manager and provide them with the details of what you need for a more tailored recommendation.
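
For readers looking for the Python recipe route mentioned in the first answer, here is a minimal sketch of the core step, not an official DSS feature. It assumes the date is an 8-digit YYYYMMDD run in the filename (as in Test_20170101) and that the file is a CSV with a header row. The names extract_date and add_date_column are hypothetical helpers, and how the recipe gets hold of the filename and the file contents (for example from a folder of files) is left out on purpose.

```python
import csv
import io
import re

# Assumption: the date is the first run of 8 digits (YYYYMMDD), as in "Test_20170101".
DATE_PATTERN = re.compile(r"(\d{8})")


def extract_date(filename):
    """Return the first 8-digit run in the filename, or None if there is none."""
    match = DATE_PATTERN.search(filename)
    return match.group(1) if match else None


def add_date_column(filename, csv_text, column_name="date"):
    """Append a column holding the filename's date to every row of a CSV string."""
    file_date = extract_date(filename)
    if file_date is None:
        raise ValueError(f"no YYYYMMDD date found in {filename!r}")

    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [column_name]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column_name] = file_date  # same value on every row of this file
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample standing in for the contents of Test_20170101.csv
sample = "test1,test2,test3\na,b,c\n"
result = add_date_column("Test_20170101.csv", sample)
print(result)
```

Inside a real recipe you would apply the same idea per file and write the combined rows to the output dataset. The date is kept as a string here; parsing it into a proper date type is a separate choice.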