
Importing Splunk data

Solved!
Schiggy
Level 2

Hi everyone

I am trying to import Splunk data using the official plugin.

The problem is that the plugin seems to assume all my Splunk data is ASCII-encoded, when in reality it is UTF-8. The plugin has no option that lets me specify the encoding.

Since Splunk data is usually voluminous, I can't really solve the issue on the Splunk side, because reindexing the data would be very costly.

Here is the error message I get when I try to import data:

 

...
2023-03-21 15:44:56,091 INFO SplunkIndexConnector:Connected to Splunk

2023-03-21 15:44:56,093 INFO Processing task: read_rows

2023-03-21 15:44:57,899 ERROR Connector send fail, storing exception Traceback (most recent call last):
File "/.../dataiku-dss-11.0.2/python/dataiku/connector/server.py", line 110, in serve
read_rows(connector, schema, partitioning, partition_id, limit, output)
File "/.../dataiku-dss-11.0.2/python/dataiku/connector/server.py", line 32, in read_rows
for row in connector.generate_rows(schema, partitioning, partition_id, limit):
File "/tmp/tmp_folder_nSFiBdol/dku_code.py", line 91, in generate_rows

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 173000: ordinal not in range(128)
2023-03-21 15:44:57,901 INFO Processing task: finish_read_session
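
For readers hitting the same traceback: byte 0xc3 is the first byte of a two-byte UTF-8 sequence (for example "é" is 0xC3 0xA9), so the ASCII codec rejects it. The "'ascii' codec" wording also suggests the connector code runs under Python 2, where `decode()` with no argument defaults to ASCII (in Python 3 the default is UTF-8). A minimal sketch that reproduces the message outside the plugin, with the sample string being an illustrative assumption and not actual Splunk data:

```python
# Illustrative sample only, not real Splunk data.
raw = "café".encode("utf-8")  # b'caf\xc3\xa9'

try:
    # Explicit "ascii" mimics what a bare decode() does under Python 2.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the encoding the data was actually written in succeeds.
decoded = raw.decode("utf-8")

print(message)
print(decoded)
```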

 

 

 

 

1 Solution
AlexB
Dataiker

Hi!

Could you try to convert this plugin into a dev plugin, then edit line 91 of the file named `python-connectors/splunk_import-index/connector.py`, from this:

            for sample in content.decode().split("\n"):

into this:

            for sample in content.decode("utf-8").split("\n"):

and see if it solves the issue?
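
One caveat worth keeping in mind, as a sketch and not a description of the plugin's actual code: if `content` is read from the Splunk response in fixed-size chunks (an assumption, I have not checked how the connector reads it), a multi-byte UTF-8 character can be cut in half at a chunk boundary, and `content.decode("utf-8")` would then fail on that chunk. An incremental decoder from the standard library handles this. The helper name `iter_lines` and the sample chunks are hypothetical:

```python
import codecs


def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded lines from an iterable of byte chunks.

    Hypothetical helper: it buffers incomplete multi-byte sequences
    between chunks instead of decoding each chunk in isolation.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    pending = ""
    for chunk in chunks:
        pending += decoder.decode(chunk)
        # Everything before the last "\n" is complete; keep the rest.
        *lines, pending = pending.split("\n")
        for line in lines:
            yield line
    # Flush the decoder; raises if the stream ends mid-character.
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending


# The "é" (0xC3 0xA9) is deliberately split across the two chunks.
sample_chunks = [b"caf\xc3", b"\xa9\nna\xc3\xafve\n"]
result = list(iter_lines(sample_chunks))
print(result)
```

If the data may also contain bytes that are not valid UTF-8, passing `errors="replace"` instead of `"strict"` substitutes U+FFFD for them rather than raising.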


3 Replies

Schiggy
Level 2
Author

Hey Alex!

First of all, thanks for your reply and patience. I was held up by a couple other tasks for a while :S

I just tested your solution and it seems to work. However, I got a couple other errors but I think they are related to something else. I will come back to you with a more thorough reply!

Thanks for taking the time to sift through the code and providing a solution. I greatly appreciate your effort! 😃

 

Regards,

Gabriel 

Schiggy
Level 2
Author

All right, I tested your solution and it works perfectly.

 

Thanks again and have a great day! 😎
