Hi Dataiku Community, happy New Year 🙂
I am new here and this is my first question, so here it goes:
I would like to know how to extract data from JPG images. Specifically, what kind of node and/or recipe should I use for a task like this?
I have multiple screenshots, and I need to extract the data from these images and then validate it against an underlying XLS file.
I would appreciate your comments and suggestions.
Operating system used: Windows
Hey @anthonyfergon28, happy new year!
You can use a Python recipe on the design node.
For example, your script might look something like this:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd
from IPython.display import Image
# Read recipe inputs: the managed folder that holds the screenshots
images_for_retraining = dataiku.Folder("YOUR_FOLDER_NAME")
images_for_retraining_info = images_for_retraining.get_info()
paths = images_for_retraining.list_paths_in_partition()
# Display a single image (renders only in a notebook; file_path() needs a folder stored on the local filesystem)
Image(filename=images_for_retraining.file_path(paths[0]))
### Do your processing here and build a pandas DataFrame with the results ###
# Placeholder so the script runs end to end: one row per image path
your_dataframe = pd.DataFrame({"path": paths})
# Write the output dataset
output_ds = dataiku.Dataset("YOUR_OUTPUT_DATASET")
output_ds.write_with_schema(your_dataframe)
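The "processing" step is left open above, so here is one possible way to fill it in for the screenshot use case: run OCR on each image, parse the text, and compare it with the reference values. This is only a sketch under several assumptions that are not part of the original answer: that `pytesseract` and `Pillow` are installed in the recipe's code environment, that the Tesseract binary is available on the server, and that each screenshot contains simple "Label: value" lines. The helper names `ocr_image_bytes`, `parse_key_value_lines` and `find_mismatches` are made up for illustration.

```python
import re


def ocr_image_bytes(image_bytes, lang="eng"):
    """Run Tesseract OCR on raw image bytes and return the recognized text.

    Assumption: `pytesseract` and `Pillow` are installed in the code
    environment and the Tesseract binary is available on the server.
    """
    import io
    import pytesseract  # imported lazily: optional third-party dependency
    from PIL import Image

    return pytesseract.image_to_string(Image.open(io.BytesIO(image_bytes)), lang=lang)


def parse_key_value_lines(text):
    """Turn OCR text made of 'Label: value' lines into a dict.

    Assumption: this layout is hypothetical; adapt the pattern to your screenshots.
    Lines without a colon are ignored.
    """
    record = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$", line)
        if match:
            record[match.group(1).lower()] = match.group(2)
    return record


def find_mismatches(extracted, reference):
    """Return {field: (extracted_value, reference_value)} for fields that differ."""
    return {
        field: (extracted.get(field), expected)
        for field, expected in reference.items()
        if extracted.get(field) != expected
    }


# Possible use inside the recipe (not executed here):
# folder = dataiku.Folder("YOUR_FOLDER_NAME")
# for path in folder.list_paths_in_partition():
#     with folder.get_download_stream(path) as stream:
#         record = parse_key_value_lines(ocr_image_bytes(stream.read()))
```

The reference values could come from the XLS file once it is uploaded as a dataset and read with `dataiku.Dataset(...).get_dataframe()`. OCR output is rarely perfect, so you may want to normalize whitespace and number formats before comparing.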
Thank you, Muennighoff, I appreciate your kind response. Sincere apologies for the delayed reply. I am going to test it today. Kind regards.