Using the deep learning classification on images in a subfolder of a managed folder?

longhowlam Partner, Registered Posts: 24 Partner

Hi All,

There is a nice Python package, google-images-download, that downloads images for one or more search terms.

In a Python code recipe I have:

    import dataiku, os.path

    from google_images_download import google_images_download
    response = google_images_download.googleimagesdownload()

    handle = dataiku.Folder("peugeots")
    path = handle.get_path()

    arguments = {
        "keywords": "peugeot 206,peugeot 306",
        "limit": 20, "print_urls": True,
        "output_directory": path
    }  # creating list of arguments

    paths = response.download(arguments)  # run the download with these arguments

This works, but the images are split into subfolders of the managed folder. If I then apply the deep learning plugin for image classification to an input folder that contains subfolders, it does not work.




  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer Posts: 753 Dataiker

    This is not currently supported. After the download loop in your downloading recipe, you could add Python code that flattens everything to the top level of the directory.
  • longhowlam
    longhowlam Partner, Registered Posts: 24 Partner
    OK, thanks for the tip.

    The reason to keep them separate, though, is that these subfolders already form the two (or more) categories used to retrain a pretrained network.
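
A minimal sketch of the flattening step suggested above, written so that the category information from the subfolders is not lost. It uses only the Python standard library. The function name `flatten_folder`, the `__` separator, and the idea of returning (file name, label) pairs are assumptions made for illustration, not part of the plugin or of DSS. Each file is moved to the top level with its subfolder name prefixed to the file name, and the subfolder name is also returned as the label.

```python
import os
import shutil


def flatten_folder(root, sep="__"):
    """Move every file found in subfolders of `root` to the top level of `root`.

    Hypothetical helper: each moved file is renamed to
    "<subfolder><sep><original name>" to avoid name clashes, and the
    first-level subfolder name is kept as the label.
    Returns a sorted list of (new_file_name, label) pairs.
    """
    root = os.path.abspath(root)
    moved = []
    # Walk bottom-up so that emptied subfolders can be removed as we go.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if os.path.abspath(dirpath) == root:
            continue  # files already at the top level are left alone
        rel_parts = os.path.relpath(dirpath, root).split(os.sep)
        label = rel_parts[0]  # first-level subfolder, e.g. "peugeot 206"
        for name in sorted(filenames):
            new_name = sep.join(rel_parts + [name])
            target = os.path.join(root, new_name)
            if os.path.exists(target):
                # Refuse to overwrite an existing file.
                raise FileExistsError(target)
            shutil.move(os.path.join(dirpath, name), target)
            moved.append((new_name, label))
        # Remove the subfolder once nothing is left in it.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(moved)
```

In the recipe above, this could be called as `flatten_folder(path)` once the download has finished, assuming the managed folder lives on the local filesystem so that `get_path()` returns a usable directory. The returned pairs could then be written to a dataset as a file name and label table, if the retraining step accepts labels in that form.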