
reading file after partitioning


Hello,



I have a filesystem organized this way:



/folder/YEAR/MONTH/DDHH



I tried to partition at the DDHH level, with one folder per partition. Since this is not a 'regular' structure (such as %Y/%M/%DD/.*), I defined the partitioning pattern as %Y/%M/%{dimension_2}/.*, which produces 718 partitions of one JSON file each.



After this operation, I get an error when reading the file from one specific partition:



Error in pull background thread, aborting push

org.codehaus.jackson.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens



I checked the file: when I load it as a single, non-partitioned dataset, it reads without any problem.



Any suggestions?



Many thanks in advance!

Dataiker
Several partitioning dimensions can be combined in a single path component of the partitioning pattern, for example "/%Y/%M/%D%H/.*". The important point is that the pattern must end with "/.*" so that it catches all files in the folder defined by the rest of the pattern.
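
To illustrate how a pattern such as "/%Y/%M/%D%H/.*" lines up with the /folder/YEAR/MONTH/DDHH layout, here is a rough Python sketch that turns the pattern into a regular expression and extracts the dimensions from a file path. This is not how DSS implements it internally; the names `TOKEN_REGEX`, `pattern_to_regex` and `dimensions_for` are made up for the example, and it assumes 4-digit years and 2-digit months, days and hours.

```python
import re

# Assumed widths for each time token (illustration only, not DSS internals).
TOKEN_REGEX = {
    "%Y": r"(?P<year>\d{4})",
    "%M": r"(?P<month>\d{2})",
    "%D": r"(?P<day>\d{2})",
    "%H": r"(?P<hour>\d{2})",
}


def pattern_to_regex(pattern):
    """Translate a partitioning-style pattern into a compiled regex."""
    # The pattern has to end with "/.*" so every file in the folder is caught.
    if not pattern.endswith("/.*"):
        raise ValueError('pattern must end with "/.*"')
    prefix = pattern[: -len("/.*")]
    # Split on the tokens while keeping them, then translate piece by piece.
    pieces = re.split(r"(%[YMDH])", prefix)
    body = "".join(TOKEN_REGEX.get(p, re.escape(p)) for p in pieces)
    return re.compile(body + r"/(?P<file>.*)")


def dimensions_for(pattern, path):
    """Return the matched dimensions for a path, or None if it does not match."""
    match = pattern_to_regex(pattern).fullmatch(path)
    return match.groupdict() if match else None


# Hypothetical path, relative to the dataset root, for the 16th at 08h.
dims = dimensions_for("/%Y/%M/%D%H/.*", "/2019/07/1608/data.json")
print(dims)
```

With this sketch, the DDHH folder is split into separate day and hour values, and a pattern without the trailing "/.*" is rejected outright.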