Top 13 Machine Learning Image Classification Datasets

July 16, 2021

When building any kind of image classification model, diverse image classification datasets are critical. Without them, object recognition, computer vision, and scene recognition models are unlikely to produce reliable output. That’s why we at iMerit have compiled this list of the top 13 image classification datasets we’ve used to help our clients achieve their image classification goals.

Image Classification Datasets for Medicine

  1. TensorFlow patch_camelyon Medical Images – Available through the TensorFlow website, this image classification dataset contains over 327,000 color images. Each is a 96 x 96 pixel patch taken from histopathological scans of lymph node sections and labeled for the presence of metastatic tissue.
  2. Recursion Cellular Image Classification – This data comes from the Recursion 2019 challenge, a biological microscopy data competition in which participants developed models to identify replicates.
  3. Blood Cell Images – This image classification dataset contains 12,500 augmented images of blood cells from patient blood samples, with approximately 3,000 images for each of 4 cell types, grouped into 4 folders by cell type.
  4. ChestX-ray8 – This medical imaging dataset features 108,948 frontal-view X-ray images of 32,717 unique patients collected between 1992 and 2015.
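
Datasets that group images into one folder per class, as the Blood Cell Images entry above describes, can be turned into labeled samples with a few lines of Python. The following is a minimal sketch, not the loader shipped with any of these datasets: the function name `index_image_folders`, the accepted file suffixes, and the example path in the comment are assumptions for illustration.

```python
from pathlib import Path

# Assumed image file types; adjust to match the dataset you downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_image_folders(root):
    """Return sorted (path, label) pairs, using each subfolder name as the label."""
    samples = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_dir.name))
    return samples


# Hypothetical usage (the path is a placeholder, not the dataset's real layout):
# samples = index_image_folders("blood_cells/train")
# print(len(samples), "images found")
```

Sorting the folders and files keeps the sample order stable between runs, which helps when comparing experiments.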

Image Classification Datasets for Agriculture and Scene

  1. Indoor Scenes Images – This MIT image classification dataset was designed to aid indoor scene recognition and features 15,000+ JPEG images of indoor locations and scenery. The images are divided into 67 categories, and the number of images varies by category, with a minimum of 100 images in each.
  2. Images for Weather Recognition – Suited to multi-class weather recognition projects, this dataset features 1,125 images divided into four categories: sunrise, cloudy, rainy, and sunshine.
  3. Intel Image Classification – Initially created and compiled by Intel, this dataset contains 25,000 images divided into six categories: forest, mountain, sea, glacier, buildings, and street. The images are organized into separate folders for training, testing, and prediction, each containing thousands of images.
  4. TensorFlow Sun397 Image Classification Dataset – This TensorFlow image classification dataset features over 108,000 images from the Scene Understanding (SUN) benchmark, divided into 397 categories with a minimum of 100 images each.
  5. CoastSat Image Classification Dataset – This open-source image classification dataset was initially used for shoreline mapping. It includes a variety of aerial images taken by satellites along with label metadata.
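
Several of the datasets above promise a minimum number of images per category (at least 100 for both Indoor Scenes and Sun397). After downloading or subsetting a dataset, it can be worth confirming that this still holds. The sketch below assumes you already have one label per image in a list; the function name `categories_below_minimum` is hypothetical.

```python
from collections import Counter


def categories_below_minimum(labels, minimum=100):
    """Return {category: count} for categories with fewer than `minimum` images."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < minimum}


# Hypothetical usage with made-up labels:
# labels = ["kitchen"] * 120 + ["bakery"] * 40
# print(categories_below_minimum(labels))
```

Note that a category with no images at all never appears in `labels`, so it will not be reported; compare against the expected category list separately if that matters.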

Other Image Classification Datasets

  1. Image Classification: People & Food – Provided in CSV format, this image classification dataset features a large number of images of people eating food. Each image has been annotated and classified by human annotators according to gender and age.
  2. Images of Crack in Concrete for Classification – This image classification dataset contains over 40,000 images of concrete, each 227 x 227 pixels. Half of the images show cracked concrete, and the other half show concrete without cracks.
  3. Architectural Heritage Elements – This image classification dataset was created to train models that classify images of architecture, with a special emphasis on cultural heritage. The dataset features over 10,000 images across 10 categories: altar, apse, bell tower, column, dome (inner), dome (outer), flying buttress, gargoyle, stained glass, and vault.
  4. Fruits 360 – This dataset features 90,483 images of fruits and vegetables across 131 classes. The training set contains 67,692 images (one fruit or vegetable per image), and the test set contains 22,688 images.
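
Fruits 360 ships with separate training and test sets, but a dataset described only by its class balance, such as the concrete crack images, may need to be split by hand. A stratified split keeps each class's share the same in both parts. This is a minimal sketch under stated assumptions: the function name `stratified_split`, the default test fraction of 0.25, and the fixed seed are illustrative choices, not part of any dataset listed here.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.25, seed=0):
    """Split (item, label) pairs so each label keeps roughly the same share in both parts."""
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical usage with (path, label) pairs:
# train, test = stratified_split(samples, test_fraction=0.25)
```

Splitting within each label, rather than shuffling the whole list once, is what preserves the half-and-half balance in both the training and test portions.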