Using the deep learning classification on images in subfolders of a managed folder?

longhowlam Partner, Registered Posts: 24 Partner

Hi All,

There is a nice Python package, google-images-download, that downloads images for one or more search terms.

In a Python code recipe I have:


    import dataiku, os.path

    from google_images_download import google_images_download
    response = google_images_download.googleimagesdownload()

    handle = dataiku.Folder("peugeots")
    path = handle.get_path()

    arguments = {
        "keywords": "peugeot 206,peugeot 306",
        "limit": 20, "print_urls": True,
        "output_directory": path
    }  # creating list of arguments

    response.download(arguments)

This works, but the images end up in subfolders of the managed folder, one per search term. When I then apply the deep learning plugin for image classification to an input folder that contains subfolders, it does not work.

Regards,

Longhow

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer Posts: 753 Dataiker
    Hi,

    This is not currently supported. After the download call in your recipe, you could add Python code that moves every file up to the top level of the managed folder, so that no subfolders remain.
  • longhowlam
    longhowlam Partner, Registered Posts: 24 Partner
    OK, thanks for the tip.

    The reason to keep them separated, though, is that these subfolders already form the two (or more) categories on which to retrain a pretrained network.
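
For reference, here is a minimal sketch of the flattening step suggested above. It also tries to keep the category information by prefixing each file with its subfolder name and returning a filename-to-category mapping. The function name `flatten_folder` and the `__` separator are hypothetical choices for illustration, not part of DSS or the plugin, and the sketch assumes the prefixed filenames do not collide with files already at the top level.

```python
import os
import shutil


def flatten_folder(root):
    """Move every file found in subfolders of `root` up into `root`.

    Each moved file is renamed to "<subfolder>__<filename>" (hypothetical
    naming scheme) so the category is not lost. Returns a dict that maps
    each new filename to the subfolder (category) it came from.
    """
    labels = {}
    # Walk bottom-up so that each subfolder is empty by the time we remove it.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # files already at the top level stay where they are
        # Use the subfolder path relative to root as the category label.
        label = os.path.relpath(dirpath, root).replace(os.sep, "_")
        for name in filenames:
            new_name = "{}__{}".format(label, name)
            shutil.move(os.path.join(dirpath, name),
                        os.path.join(root, new_name))
            labels[new_name] = label
        os.rmdir(dirpath)  # the subfolder is empty now
    return labels
```

In the recipe above, this could be called as `labels = flatten_folder(path)` right after `response.download(arguments)`. The returned mapping could then be written to a dataset to serve as the label column for retraining, though whether the plugin accepts labels in that form is an assumption to check against its documentation.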