
Not able to build missing partition

Amarnath
Level 1

Hi Everyone,

I'm having trouble building the missing partitions of a folder-based dataset. Please see the attached image for the partition pattern. After partitioning, some partitions are missing. We use the code below to build them, but the missing partitions are not created.

 

from dea_common.hp_scenario import build_dataset
from dataiku.scenario import Scenario
import dataiku
from datetime import timedelta, date, datetime

# generate all partitions that should be built (one per month, from start to end)
def dates_range(start, end):
    total_months = lambda dt: dt.month + 12 * dt.year
    mlist = []
    for tot_m in range(total_months(start) - 1, total_months(end)):
        y, m = divmod(tot_m, 12)
        mlist.append(datetime(y, m + 1, 1).strftime("%Y-%m-%d"))
    return mlist

def get_missing_partition(dataset_name, date1, date2):
    l = []
    # get all currently existing partitions from a dataset of the flow
    dataset = dataiku.Dataset(dataset_name)
    partitions = dataset.list_partitions()
    print("Existing partitions:")
    print(partitions)

    # generate all partitions that should exist (one per month between date1 and date2)
    all_dates = dates_range(datetime.strptime(date1, '%Y-%m-%d').date(), datetime.strptime(date2, '%Y-%m-%d').date())
    print("Partitions that should exist:")
    print(all_dates)

    # find the missing partitions
    for partition in all_dates:
        if partition not in partitions:
            print("%s : missing partition" % partition)
            l.append(partition)

    return l

emea_partitions = get_missing_partition('Campaign_1','2021-7-15','2021-10-5')
#us_partitions = get_missing_partition('gcw_us_sell_to_pos_joined','2014-11-01','2021-08-09')

 

if len(emea_partitions) != 0:
    scenario = Scenario()
    scenario.build_dataset("Campaign_1", partitions=",".join(emea_partitions))
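One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the script compares first-of-month strings in `%Y-%m-%d` form against the output of `list_partitions()`. If the dataset's time dimension is monthly, the identifiers would normally look like `2021-08`, and none of the generated strings would match. Below is a minimal sketch of the same comparison with the identifier format as a parameter. The names `expected_partitions` and `missing_partitions` and the sample `existing` list are hypothetical; the list stands in for `dataset.list_partitions()` so the snippet runs outside DSS.

```python
from datetime import date


def expected_partitions(start, end, fmt="%Y-%m"):
    """Return one identifier per calendar month from start to end, inclusive."""
    first = start.year * 12 + (start.month - 1)
    last = end.year * 12 + (end.month - 1)
    ids = []
    for index in range(first, last + 1):
        year, month0 = divmod(index, 12)
        ids.append(date(year, month0 + 1, 1).strftime(fmt))
    return ids


def missing_partitions(expected, existing):
    """Return the expected identifiers that are absent from existing, in order."""
    existing_set = set(existing)
    return [p for p in expected if p not in existing_set]


# Made-up stand-in for dataset.list_partitions() (assumption, not real data)
existing = ["2021-07", "2021-09"]
expected = expected_partitions(date(2021, 7, 15), date(2021, 10, 5))
print(missing_partitions(expected, existing))
```

Printing one existing identifier next to one generated identifier in the real job should show quickly whether the two formats agree.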

 

1 Reply
AlexT
Dataiker

Hi,

To troubleshoot the missing partition build, could you try manually building one of the missing partitions and check the underlying job logs? There is likely an issue with the underlying data.

From the failed job page, go to Actions > Download job diagnosis, then send us the resulting file via a support ticket or Live Chat.

https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html
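If you prefer to isolate the failing partition from code rather than from the UI, here is a minimal sketch of building the partitions one at a time and recording which ones fail. The helper `build_one_by_one` and the `build_fn` callable are hypothetical names, and `fake_build` only simulates a failure so the snippet runs outside DSS.

```python
def build_one_by_one(partitions, build_fn):
    """Call build_fn once per partition and collect failures instead of stopping."""
    failures = {}
    for partition in partitions:
        try:
            build_fn(partition)
        except Exception as exc:
            # Keep the error message so the failing partition can be inspected later
            failures[partition] = str(exc)
    return failures


def fake_build(partition):
    # Hypothetical stand-in for a real build call; fails for one partition only
    if partition == "2021-08-01":
        raise RuntimeError("simulated build failure")


failures = build_one_by_one(["2021-07-01", "2021-08-01", "2021-09-01"], fake_build)
print(failures)
```

In a DSS scenario, `build_fn` could be something like `lambda p: Scenario().build_dataset("Campaign_1", partitions=p)`, assuming a failed build raises an exception there. The job log of each partition listed in `failures` is then the place to look.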

 
