Using the deep learning image classification plugin on images in subfolders of a managed folder?

longhowlam Partner, Registered Posts: 24 Partner
edited July 16 in Using Dataiku

Hi All,

There is a nice Python package, google-images-download, that downloads images for one or more search terms.

In a Python code recipe I have:


import dataiku
from google_images_download import google_images_download

# Downloader object from the google-images-download package
response = google_images_download.googleimagesdownload()

# Local filesystem path of the managed folder
handle = dataiku.Folder("peugeots")
path = handle.get_path()

# Download arguments; the images land in subfolders of output_directory
arguments = {
    "keywords": "peugeot 206,peugeot 306",
    "limit": 20,
    "print_urls": True,
    "output_directory": path,
}

response.download(arguments)

This works, but the images end up in separate subfolders of the managed folder. When I then try to apply the deep learning plugin for image classification to an input folder that contains subfolders, it does not work.

Regards,

Longhow

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Hi,

    This is not currently supported. After the download call in your recipe, you could add Python code that moves every image up to the top level of the managed folder.
  • longhowlam
    longhowlam Partner, Registered Posts: 24 Partner
    OK, thanks for the tip.

    The reason to keep them separate, though, is that these subfolders already form the two (or more) categories on which to retrain a pretrained network.
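
One way to reconcile the two points above (flatten the folder, but keep the categories) might look like the sketch below. It is only an illustration, not something the plugin provides: the function name flatten_keep_labels and the "<subfolder>__<file>" naming scheme are made up for this example, and it assumes the managed folder lives on the local filesystem so that the path from get_path() can be used with os and shutil.

```python
import os
import shutil


def flatten_keep_labels(root):
    """Move every file in root's immediate subfolders up into root.

    Each moved file is renamed to "<subfolder>__<original name>" (spaces in
    the subfolder name become underscores) so the category survives the
    flattening. Returns a list of (new_file_name, label) tuples.
    Hypothetical helper, not part of dataiku or the deep learning plugin.
    """
    rows = []
    # The listing is taken once, before any file is moved into root.
    for label in sorted(os.listdir(root)):
        subdir = os.path.join(root, label)
        if not os.path.isdir(subdir):
            continue  # files already at the top level are left alone
        for name in sorted(os.listdir(subdir)):
            src = os.path.join(subdir, name)
            if not os.path.isfile(src):
                continue  # deeper nesting is ignored in this sketch
            new_name = "{}__{}".format(label.replace(" ", "_"), name)
            shutil.move(src, os.path.join(root, new_name))
            rows.append((new_name, label))
        if not os.listdir(subdir):
            os.rmdir(subdir)  # drop the subfolder once it is empty
    return rows
```

In the recipe this could be called right after response.download(arguments), for example rows = flatten_keep_labels(path). The returned (file name, label) pairs could then be written to a dataset if the retraining step needs a table of labels; whether that matches what the plugin expects is an assumption to check against its documentation.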