Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi
I am a beginner. I want to use the prepare recipe to filter the "Data Period" Column. My expectation is to filter only the current month data. Here i am expecting the output to show only the 3rd and 4th row. Please enlighten me on how to achieve this.
Name | Country | Order ID | Data Period | Reporting Period |
Jacob | US | 341234 | 30-Sep-2021 | 30-Oct-2021 |
John | UK | 123413 | 30-Oct-2021 | 30-Nov-2021 |
Matt | IN | 234234 | 30-Nov-2021 | 30-Dec-2021 |
Scott | IN | 453423 | 30-Nov-2021 | 30-Dec-2021 Operating system used: Windows |
If you want to do this kind of manually, I think you can just simply use "Filter rows/cells with formula" for example - but this will require to change current month every time this month change.
If you want to automate this process you could use "max date" column if you have one, or you can create it yourself, for example starting with groupby or maybe creating index column that will help you identify what is the current month (having max_date or fixed index where for example most up do date month will get 0 for example, will allow you again to filter /remove rows that are not matching with max date)
Hope this helps,
Mateusz
Hi Mateusz,
Thanks for the response. May you please help to elaborate how to create max_date column to identify the current month. Appreciate if you could share some screenshots of possible for better understanding. My apologies if i am asking a laymen question.
Thanks
Welcome to the Dataiku community.
To add to what @emate has said. Here are a few more thought that might help with your question.
First dates can be hard. Iāve often been known to say that dates are the āBane of my existenceā as an analyst.
It sounds like you want to filter your data on a date. The first thing that I would do is make sure that the date column you want to filter on is āparsedā to a format that Dataiku dss can use to do date calculations against. The format of dates you are showing in your example are not in that format. You do this format change with date parsing visual step.
Once the dates are parsed there are lots of things you can do to filter. In this case I might create a formula to build a new column that indicates if the date is in the current month or not. And then add a step to filter on this new column. Iād then as a next step likely delete the column.
Iād also look at the date functions to see if there is a function that will do this in one step. (Iām not at a workstation at the moment to check.)
Here is some basic information on using dates with dss
https://doc.dataiku.com/dss/latest/preparation/processors/filter-on-date.html
Let us know how you are getting on with this. And welcome to the Dataiku community.
https://knowledge.dataiku.com/latest/courses/resources/excel-to-dss/dates.html