caused by: DataStoreIOException: Path does not exist in the dataset: '/*/'

saraa1
Level 2
Hello all, 

I have a dataset that is partitioned by a descriptive column.

I want to apply a Python recipe to each partition,

but I get this error:

Path does not exist: Error while connecting to dataset NLP_SARA.Gph1_2_data (partition *)

caused by: DataStoreIOException: Path does not exist in the dataset: '/*/'

 

Can someone please help?

Thank you very much.

AlexT
Dataiker

Hi,

The wildcard "*" is not supported. To run something across all partitions, you need to explicitly list the discrete partitions to build.

https://doc.dataiku.com/dss/latest/partitions/identifiers.html#ranges-specifications

So it would be something like Part1,Part2,Part3... instead of "*"
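
As a rough illustration of turning a list of partition names into one explicit specification, here is a minimal sketch in plain Python. The helper name build_partition_spec is hypothetical and not part of the Dataiku API, and the comma separator is an assumption taken from the Scenario snippet later in this reply.

```python
def build_partition_spec(partitions):
    """Join explicit partition identifiers into one comma-separated string.

    Hypothetical helper, not part of the Dataiku API. The comma separator
    is an assumption based on the Scenario example in this reply.
    """
    cleaned = []
    for name in partitions:
        name = str(name).strip()
        if not name:
            continue  # skip empty identifiers
        if "*" in name:
            # Wildcards are rejected, so fail early with a clear message
            raise ValueError("wildcards are not supported: %r" % name)
        if name not in cleaned:
            cleaned.append(name)  # keep first occurrence, preserve order
    if not cleaned:
        raise ValueError("no partitions to build")
    return ",".join(cleaned)


spec = build_partition_spec(["Part1", "Part2", "Part3"])
print(spec)
```

The resulting string could then be passed wherever a partition list is expected, in place of "*".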

If you want to build all partitions initially, you can do so from a Scenario, for example.

To generate the list of all partitions, you can run the following in a notebook:

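import dataiku

# List every existing partition of the dataset
dataset = dataiku.Dataset("my_dataset_name")
partitions = dataset.list_partitions()

# Join the identifiers with commas, as in the Scenario example
partitions_str = ','.join(partitions)
print(partitions_str)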

To actually build all partitions, you can use a custom Python step in a Scenario:

from dataiku.scenario import Scenario
import dataiku

scenario = Scenario()

dataset = dataiku.Dataset("split_input_dataset")
partitions = dataset.list_partitions() # get all partitions from input datasets
partitions_str = ','.join(partitions) # concatenate them

# Build the output dataset for every listed partition
scenario.build_dataset("split_output_dataset", partitions=partitions_str)