I have a table that is partitioned by date, how can I access the partition date in a pyspark recipe
I tried the following code, but it does not recognize actual_date:

fct_pm_card.select("application_id", "product") \
    .filter(col('actual_date') <= end_date)
Hi @torbiks,

How is your dataset partitioned? Are you using Spark partitions, e.g. repartition()? See https://doc.dataiku.com/dss/latest/spark/datasets.html

DSS partitions and Spark partitions are different things. You can't reference a DSS partition in Spark directly, since only that partition's data is available when a DSS partitioned Spark job runs. If you need the partitioning value, you can add it back as a column using the "Enrich with record context" prepare recipe processor: https://doc.dataiku.com/dss/latest/preparation/processors/enrich-with-record-context.html

Then you would be able to use that column in PySpark.