reading file after partitioning
Cecile
Registered Posts: 2 ✭✭✭✭
Hello,
I have a filesystem organized this way:
/folder/YEAR/MONTH/DDHH
I tried to partition at the DDHH level, with one folder per partition. Since this is not a 'regular' structure (such as %Y/%M/%DD/.*), I defined the partitioning as %Y/%M/%{dimension_2}/.*, which yields 718 partitions of one JSON file each.
After this operation, I get an error when reading a file from a specific partition:
Error in pull background thread, aborting push
org.codehaus.jackson.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
I checked the file: when I load it as a single dataset (without partitioning), I can read it without any problem.
Any suggestions?
Many thanks in advance!
Answers
It is possible to combine several partitioning dimensions in a single path component of the partitioning pattern, for example "/%Y/%M/%D%H/.*". Note that the pattern must end with "/.*" so that it catches all files in the folder defined by the rest of the pattern.
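To see how such a combined pattern lines up with the /folder/YEAR/MONTH/DDHH layout, here is a minimal Python sketch that translates the pattern into a regular expression and tests paths against it. It is only an illustration, not how the product itself resolves partitions: the helper names `pattern_to_regex` and `partition_of`, the two-digit widths for month, day and hour, and the `YYYY-MM-DD-HH` identifier format are all assumptions made for the example.

```python
import re

# Hypothetical mapping from time placeholders to regex groups
# (digit widths are assumed for this illustration).
PLACEHOLDERS = {
    "%Y": r"(?P<year>\d{4})",
    "%M": r"(?P<month>\d{2})",
    "%D": r"(?P<day>\d{2})",
    "%H": r"(?P<hour>\d{2})",
}


def pattern_to_regex(pattern):
    """Translate a partitioning-style pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        token = pattern[i:i + 2]
        if token in PLACEHOLDERS:
            # Time placeholder: replace it with its named group.
            parts.append(PLACEHOLDERS[token])
            i += 2
        elif token == ".*":
            # Trailing wildcard: matches every file in the folder.
            parts.append(".*")
            i += 2
        else:
            # Any other character is matched literally.
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


def partition_of(path, pattern="/%Y/%M/%D%H/.*"):
    """Return an illustrative partition id for path, or None if it does not match."""
    match = pattern_to_regex(pattern).match(path)
    if match is None:
        return None
    return "{year}-{month}-{day}-{hour}".format(**match.groupdict())


if __name__ == "__main__":
    print(partition_of("/2016/03/0714/data.json"))
    # Without the trailing "/.*" the file inside the folder is not matched.
    print(partition_of("/2016/03/0714/data.json", pattern="/%Y/%M/%D%H"))
```

In this sketch, a path under a DDHH folder matches only when the pattern keeps its trailing "/.*"; dropping it leaves the file name unmatched, which mirrors the requirement described above.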