Importing Splunk data

Gabriel Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 Partner

Hi everyone

I am trying to import Splunk data using the official plugin.

The problem is that the plugin seems to assume all my Splunk data is ASCII-encoded, when in reality it is UTF-8. The plugin has no option that lets me specify the encoding.

Since Splunk data is usually voluminous, I can't really solve the issue on the Splunk side, because reindexing the data is associated with significant costs.

This is the error message I get when I try to import data:

...
2023-03-21 15:44:56,091 INFO SplunkIndexConnector:Connected to Splunk

2023-03-21 15:44:56,093 INFO Processing task: read_rows

2023-03-21 15:44:57,899 ERROR Connector send fail, storing exception Traceback (most recent call last):
File "/.../dataiku-dss-11.0.2/python/dataiku/connector/server.py", line 110, in serve
read_rows(connector, schema, partitioning, partition_id, limit, output)
File "/.../dataiku-dss-11.0.2/python/dataiku/connector/server.py", line 32, in read_rows
for row in connector.generate_rows(schema, partitioning, partition_id, limit):
File "/tmp/tmp_folder_nSFiBdol/dku_code.py", line 91, in generate_rows

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 173000: ordinal not in range(128)
2023-03-21 15:44:57,901 INFO Processing task: finish_read_session


Best Answer

  • AlexB
    AlexB Dataiker Posts: 67 Dataiker
    edited July 17 Answer ✓

    Hi!

    Could you try to convert this plugin into a dev plugin, then edit line 91 of the file named `python-connectors/splunk_import-index/connector.py`, from this:

                for sample in content.decode().split("\n"):

    into this:

                for sample in content.decode("utf-8").split("\n"):

    and see if that solves the issue?
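
To illustrate why naming the codec matters, here is a minimal Python 3 sketch. It is not the plugin's code: the sample bytes and the `split_samples` helper are hypothetical stand-ins for the `content` chunk and the loop on line 91. The traceback suggests that `content.decode()` fell back to ASCII in that environment (an assumption based on the `'ascii' codec` message), so the sketch passes `"ascii"` explicitly to reproduce the failure.

```python
# Hypothetical sample standing in for a chunk of Splunk output (not taken from the plugin).
# "ü" is encoded in UTF-8 as the two bytes 0xc3 0xbc, so the chunk contains byte 0xc3.
content = "host=zürich status=ok\nhost=genève status=ok".encode("utf-8")


def split_samples(raw, encoding="utf-8"):
    """Decode a raw byte chunk with an explicit codec and split it into lines."""
    return raw.decode(encoding).split("\n")


# Decoding as ASCII fails on the first non-ASCII byte, as in the traceback above.
try:
    split_samples(content, encoding="ascii")
    ascii_failed = False
except UnicodeDecodeError as exc:
    ascii_failed = True
    print("ascii decode failed:", exc)

# Decoding as UTF-8 recovers the original text.
samples = split_samples(content)
print(samples)
```

Passing the codec explicitly makes the behaviour independent of whatever default the Python environment running the plugin happens to use.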

Answers

  • Gabriel
    Gabriel Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 Partner

    Hey Alex!

    First of all, thanks for your reply and patience. I was held up by a couple other tasks for a while :S

    I just tested your solution and it seems to work. However, I got a couple of other errors, but I think they are related to something else. I will come back to you with a more thorough reply!

    Thanks for taking the time to sift through the code and providing a solution. I greatly appreciate your effort!

    Regards,

    Gabriel

  • Gabriel
    Gabriel Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 Partner

    All right, I tested your solution and it works perfectly.

    Thanks again and have a great day!
