Extract data from jpg images

Registered Posts: 11

Hi Dataiku Community, happy New Year!

I am new here and this is my first question, so here it goes:

I would like to know how I can extract data from JPG images. What kind of node and/or recipe should I use for a task like this?

Basically, I have multiple screenshots, and I need to extract the data from these images and then validate it against an underlying xls file.

I would appreciate your comments and suggestions.


Operating system used: Windows

Answers

  • Dataiker, Registered Posts: 3 Dataiker
    edited July 2024

    Hey @anthonyfergon28, happy new year!

    You can use a Python recipe on the design node.

    For example, your script might look something like this:

    # -*- coding: utf-8 -*-
    import dataiku
    import pandas as pd
    from IPython.display import Image
    
    # Read recipe inputs: the managed folder that holds the screenshots
    images_folder = dataiku.Folder("YOUR_FOLDER_NAME")
    
    # List the path of every file in the folder
    paths = images_folder.list_paths_in_partition()
    
    # Display a single image (only useful in a notebook;
    # file_path() needs a folder stored on the local filesystem)
    Image(filename=images_folder.file_path(paths[0]))
    
    ### Do your processing here ###
    # Placeholder: replace this with the data you extract from each image
    your_dataframe = pd.DataFrame({"path": paths})
    
    # Write the output dataset from the dataframe you created
    output_ds = dataiku.Dataset("YOUR_OUTPUT_DATASET")
    output_ds.write_with_schema(your_dataframe)

  • Registered Posts: 11

    Thank you Muennighoff, I appreciate your kind response. Sincere apologies for the delayed response. I am going to test it today. Kind regards.
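
One possible way to fill in the "Do your processing here" step from the answer above is optical character recognition (OCR). The sketch below is not part of the original answer. It assumes the open-source pytesseract and Pillow packages (plus the Tesseract binary) are installed in the recipe's code environment, and that each screenshot contains lines shaped like `key: value`. The helper names `ocr_image_bytes`, `parse_key_values`, `find_mismatches` and `extract_folder`, and the `key: value` layout itself, are hypothetical and would need adapting to the real screenshots.

```python
import io


def ocr_image_bytes(image_bytes):
    """Run OCR on raw image bytes (assumes pytesseract and Pillow are installed)."""
    # Imported lazily so the parsing and validation helpers work without OCR packages.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(io.BytesIO(image_bytes)))


def parse_key_values(text):
    """Turn OCR text made of 'key: value' lines into a dict (hypothetical layout)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Keep only lines that contain a colon and a non-empty key
        if sep and key.strip():
            record[key.strip().lower()] = value.strip()
    return record


def find_mismatches(extracted, reference):
    """Return {field: (extracted_value, reference_value)} for fields that differ."""
    return {
        field: (extracted.get(field), expected)
        for field, expected in reference.items()
        if extracted.get(field) != expected
    }


def extract_folder(folder_name):
    """OCR every file in a Dataiku managed folder and return one dict per image."""
    import dataiku  # only available inside DSS

    folder = dataiku.Folder(folder_name)
    rows = []
    for path in folder.list_paths_in_partition():
        # Reading through a stream avoids depending on a local filesystem path
        with folder.get_download_stream(path) as stream:
            text = ocr_image_bytes(stream.read())
        rows.append({"path": path, **parse_key_values(text)})
    return rows
```

Inside the recipe, `pd.DataFrame(extract_folder("YOUR_FOLDER_NAME"))` would give a dataframe to write to the output dataset. The reference values for `find_mismatches` could come from the xls file, for instance loaded with `pandas.read_excel`, assuming a suitable Excel reader is installed in the code environment.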
