How to find the partition ID in dataset settings
Hello,
We activated the 14-day trial. I set up a partition for a remote file connection:
Then, in Input/Output, I am asked for the partition ID. I tried entering "date" and saving, but the field is blank again when I reload the page.
Thanks!
Answers
-
EDIT: The question is actually about partitioned Remote files. The syntax is as follows:
* In the URL, you add patterns like ${DKU_DST_date}
* In partition id, you enter a partition identifier. In the URL, it's the same syntax as the "partitioning variables substitution" that you would use in recipes.
See: http://doc.dataiku.com/dss/latest/connecting/remote.html#defining-partitioned-remote-datasets
How partition identifiers are created: http://doc.dataiku.com/dss/latest/partitions/identifiers.html
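To make the substitution idea concrete, here is a minimal Python sketch of what "pattern in the URL, identifier in partition id" amounts to. It is only an illustration, not DSS code: the `expand_url` helper and the `sftp://example.com/...` path are hypothetical, and only the `${DKU_DST_date}` placeholder comes from the answer above.

```python
import re

def expand_url(pattern: str, variables: dict) -> str:
    """Replace each ${NAME} placeholder in pattern with variables[NAME]."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value for placeholder {name}")
        return variables[name]
    return re.sub(r"\$\{(\w+)\}", substitute, pattern)

# Hypothetical URL pattern; the host and path are made up for illustration.
url_pattern = "sftp://example.com/exports/data_${DKU_DST_date}.csv"

# The partition identifier entered for the "date" dimension.
expanded = expand_url(url_pattern, {"DKU_DST_date": "2015-10-21"})
print(expanded)
```

So the placeholder lives in the URL, and the partition identifier is the concrete value that gets substituted for it.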
Previous (unrelated) answer:
I'm assuming that you're asking about the partition which is requested when you try to build a partitioned dataset from a recipe.
What you must enter is the identifier of the partition to build.
Assuming that you have a recipe that makes A -> B, and A and B are both partitioned by day, and you have an "equals" dependency, then you would want for example "2015-10-21" as partition id.
Meaning that: you ask DSS to build the partition "day=2015-10-21" of the output dataset B, from which DSS will infer that it needs to read the "day=2015-10-21" partition of A.
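As a rough sketch of the "ask for the output, DSS finds the input" idea with an "equals" dependency (the function below is hypothetical and only mimics the reasoning, it is not how DSS is implemented):

```python
def input_partitions(output_partition: str, dependency: str = "equals") -> list:
    """Return the input partitions needed to build output_partition.

    Only the "equals" dependency from the example above is modelled here.
    """
    if dependency == "equals":
        # Same partition identifier on the input side.
        return [output_partition]
    raise NotImplementedError(f"dependency {dependency!r} is not modelled in this sketch")

# You request partition 2015-10-21 of B; the sketch infers what to read from A.
requested = "2015-10-21"
needed = input_partitions(requested)
print(f"build B[{requested}] from A{needed}")
```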
You might find this useful:
* http://doc.dataiku.com/dss/latest/concepts/index.html#partition-level-dependencies and the concept that you ask for output and then DSS finds the input
* How partition identifiers are formed: http://doc.dataiku.com/dss/latest/partitions/identifiers.html
-
Actually, I am trying to set up a daily scheduled file import from an SFTP server to get the current day's CSV into a dataset.
Can I put something like "CURRENT_DAY"? Is it normal that "Download sources" does nothing no matter what I try? (2015-10-21, date, day=2015-10-21...)
-
Partial answer: CURRENT_DAY would be used in administration → scheduler, where you can ask a recipe to run each day and build the partition of today. When manually launching a job, you need to specify the partition to build explicitly, for example 2015-10-21.
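To illustrate the difference between the keyword and an explicit identifier, here is a small Python sketch. The `resolve_partition` helper is hypothetical (it is not a DSS function); it just assumes that CURRENT_DAY stands for today's date in the same YYYY-MM-DD form as the manual example.

```python
from datetime import date

def resolve_partition(spec: str, today: date) -> str:
    """Turn a partition spec into a concrete day identifier.

    "CURRENT_DAY" is assumed to mean today's date as YYYY-MM-DD;
    anything else is treated as an explicit identifier and returned as-is.
    """
    if spec == "CURRENT_DAY":
        return today.isoformat()
    return spec

# Scheduled run: the keyword is resolved at run time.
resolved = resolve_partition("CURRENT_DAY", date(2015, 10, 21))

# Manual run: the identifier is given explicitly.
literal = resolve_partition("2015-10-21", date(2015, 10, 21))
print(resolved, literal)
```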
-
Ah, sorry about that, I hadn't understood. What you want is ${DKU_DST_date}, i.e. the same syntax as the "partitioning variables substitution" that you would use in recipes.
See: http://doc.dataiku.com/dss/latest/connecting/remote.html#defining-partitioned-remote-datasets