Reading a file after partitioning

Cecile Registered Posts: 2 ✭✭✭✭

Hello,

I have a filesystem organized this way:

/folder/YEAR/MONTH/DDHH

I tried to partition at the DDHH level, with one folder per partition. Since this is not a 'regular' structure (such as %Y/%M/%DD/.*), I defined the partitioning pattern as %Y/%M/%{dimension_2}/.*, which produces 718 partitions of one JSON file each.

After this operation, I get an error when reading a file from a specific partition:

Error in pull background thread, aborting push

org.codehaus.jackson.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens

I checked the file: when I load it as a single dataset (without partitioning), I can read it without any problem.

Any suggestions?

Many thanks in advance!

Answers

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    You can combine several partitioning dimensions in a single path component of the partitioning pattern, for example "/%Y/%M/%D%H/.*". The important point is that the pattern must end with "/.*" so that it matches all files in the folder defined by the rest of the pattern.
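
To make the effect of such a pattern more concrete, here is a rough Python sketch of how a pattern like "/%Y/%M/%D%H/.*" could be matched against file paths. This is not how DSS implements it; the helper names (`pattern_to_regex`, `partition_of`), the token-to-regex table and the dash-joined identifier are all assumptions made for illustration only.

```python
import re

# Hypothetical token table: each time token is assumed to be a fixed-width number.
TOKEN_REGEX = {
    "%Y": r"(?P<year>\d{4})",
    "%M": r"(?P<month>\d{2})",
    "%D": r"(?P<day>\d{2})",
    "%H": r"(?P<hour>\d{2})",
}

def pattern_to_regex(pattern):
    """Translate a pattern such as '/%Y/%M/%D%H/.*' into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        token = pattern[i:i + 2]
        if token in TOKEN_REGEX:
            out.append(TOKEN_REGEX[token])   # time token -> named group
            i += 2
        elif token == ".*":
            out.append(".*")                 # trailing wildcard: any file name
            i += 2
        else:
            out.append(re.escape(pattern[i]))  # literal character such as '/'
            i += 1
    return re.compile("^" + "".join(out) + "$")

def partition_of(path, pattern="/%Y/%M/%D%H/.*"):
    """Return an illustrative dash-joined partition id, or None if no match."""
    match = pattern_to_regex(pattern).match(path)
    if match is None:
        return None
    parts = match.groupdict()
    keys = ("year", "month", "day", "hour")
    return "-".join(parts[k] for k in keys if k in parts)

print(partition_of("/2019/03/1508/data.json"))
# Without the trailing "/.*" the file inside the folder is not matched at all.
print(partition_of("/2019/03/1508/data.json", pattern="/%Y/%M/%D%H"))
```

In this sketch, day and hour are read from the same path component, and dropping the trailing "/.*" means the file inside the DDHH folder no longer matches, which is why the ending matters.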