Converting week and year to datetime stamp using Python function in prepare

Options
vho
vho Registered Posts: 9 ✭✭✭✭
edited July 16 in Using Dataiku

Hello,

I have a dataset with Year (ex 2019) and Week (ex 37) columns. I would like to convert this to a date timestamp using Monday as the day, so the output would look like '2019-09-09T00:00:00.000Z'

I have two questions 1) how do I import the modules needed and 2) Is this the correct python code?

from datetime import datetime
from isoweek import Week
def process(row):
    year = "year"
    week = "week"
    datetime = Week(year, week).monday()
    return datetime

OR

from datetime import datetime
def process(row):
    year = "year"
    week = "week"
    datetime = datetime.strptime('{} {} 1'.format(year, week), '%Y-%m-%dT%H:%M:%S.000Z')
    return datetime

Thanks!

Vanessa

Best Answer

  • Jean-Yves
    Jean-Yves Dataiker Posts: 14 Dataiker
    edited July 17 Answer ✓
    Options

    Hi,

    You can run the following code (datetime is a core package of python)

    from datetime import datetime
    def process(row):
        year = row["year"]
        week = row["week"]
        date = "{}-{}-1".format(year, week)
        dt = datetime.strptime(date, "%Y-%W-%w")
        return dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + 'Z'
    

    This is assuming that your year column is named "year" and your week column is named "week".
    I hope this helps!

    Jean-Yves

Answers

  • vho
    vho Registered Posts: 9 ✭✭✭✭
    Options

    This works great, thanks a lot for your help

  • vho
    vho Registered Posts: 9 ✭✭✭✭
    Options

    Hi @Jean-Yves
    I noticed that there is an issue with last week and first week using this formula. For example 2019 Week 1 should begin on 2018-12-31, instead it starts at 2019-01-17. This results in a 2 week gap. Do you know of a workaround for this?

    Screenshot 2020-10-27 at 15.18.58.png

  • Jean-Yves
    Jean-Yves Dataiker Posts: 14 Dataiker
    edited July 17
    Options

    Hi,

    Thanks for bringing this up - it is an interesting case. After reading the datetime documentation, it turns out that the %W directive does not respect the ISO definition of week numbering.

    To fix that issue, new directives were introduced in later python versions. Unfortunately, those directives cannot be used in the python step of a prepare recipe.

    To get the correct date, you can use the following code, (bits of which I copied from stack overflow) :

    import datetime
    def iso_year_start(iso_year):
        "The gregorian calendar date of the first day of the given ISO year"
        fourth_jan = datetime.date(iso_year, 1, 4)
        delta = datetime.timedelta(fourth_jan.isoweekday()-1)
        return fourth_jan - delta 
    
    def iso_to_gregorian(iso_year, iso_week, iso_day):
        "Gregorian calendar date for the given ISO year, week and day"
        year_start = iso_year_start(iso_year)
        return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)   
    
    def process(row):    
        year = int(row["year"])
        week = int(row["week"])
        dt = iso_to_gregorian(year, week, 1)
        return dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + 'Z'



    I hope this helps!

    Best,

    Jean-Yves

Setup Info
    Tags
      Help me…