5 Million Faces – Top 14 Free Image Datasets for Facial Recognition

July 21, 2021

Facial recognition is a leading branch of computer vision that boasts a variety of practical applications across personal device security, criminal justice, and even augmented reality. If you’re working on a computer vision project, you may require a diverse set of images in varying lighting and weather conditions. Each of the faces may also need to express different emotions.

That’s why we at iMerit have compiled this list of face datasets, which includes annotated video frames with facial keypoints, fake faces paired with real ones, and more.

Top 14 Free Image Datasets for Facial Recognition

  • Face Detection in Images with Bounding Boxes: This deceptively simple dataset is especially useful thanks to its 500+ images containing 1,100+ faces that have already been tagged and annotated using bounding boxes.
  • CelebA Dataset: This dataset from MMLAB was developed for non-commercial research purposes. It contains 200,000+ celebrity images.
  • Flickr Faces: This dataset features 70,000 high-quality PNG images at 1024×1024 resolution with considerable variation in age, race, ethnicity, background, and more.
  • Face Images with Marked Landmark Points: This free image dataset for facial recognition contains 7,049 images, each marked with up to 15 keypoints. The number of keypoints varies from image to image, and all keypoint data is included in a CSV file.
  • Real and Fake Face Detection: Compiled to train facial recognition models to better distinguish between real faces and fake ones, this image dataset contains 1,000+ real faces and another 900 fake faces that vary in how difficult they are to recognize.
  • Labeled Faces in the Wild: This database of face photos was initially designed to help build an understanding of the challenges around unconstrained face recognition. It contains over 13,000 images of close to 6,000 people.
  • Google Facial Expression Comparison: Straight from Google AI, the Google Facial Expression Comparison dataset contains 156,000 facial images. The images are grouped into triplets, and each triplet is paired with human annotations that specify which two of its three faces are most similar in terms of facial expression. Each triplet was annotated by six or more human annotators.
  • Tufts Face Database: Commonly touted as the most comprehensive face dataset, the Tufts Face Database contains 10,000+ images of 112 male and female participants between 4 and 70 years old from more than 15 countries. It covers a wide breadth of image modalities including visible, near-infrared, thermal, computerized sketch, LYTRO, recorded video, and 3D images.
  • Simpsons Faces: D’oh! Taken from seasons 25 to 28 of the longest-running American animated series, this dataset features almost 10,000 cropped images of Simpsons character faces.
  • UMDFaces: The largest dataset on this list by a long shot, UMDFaces contains 367,000+ face annotations across 8,200 unique subjects. The dataset also includes 3.7 million video frames of over 3,100 subjects, all annotated with facial keypoints. Please keep in mind that this dataset was compiled for non-commercial purposes only.
  • Wider Face: Containing over 10,000 images of both individuals and groups, this image dataset is organized by scene, including traffic, parades, meetings, parties, and more.
  • UTKFace: Fantastic for anyone needing a sample that contains people of all ages, the UTKFace dataset includes 20,000 face images that have already been annotated with age, ethnicity, and gender.
  • Yale Face Database: Containing 165 images of 15 unique subjects (11 images per subject) under different lighting conditions, the Yale Face Database is a commonly cited face recognition dataset. Each subject is shown with a range of facial expressions.
  • YouTube with Facial Keypoints: Totalling 155,560 still frames, this dataset is composed largely of videos of celebrities that were taken by the general public and posted on YouTube. Each video has been cropped to focus on the celebrity, with the face annotated for keypoints in every frame of every video.
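
Several of the datasets above ship their keypoint annotations as a CSV file in which some images have fewer keypoints than others. As a rough illustration of how such a file might be read, here is a minimal Python sketch. The column names (`image_id`, `left_eye_x`, and so on) and the sample values are hypothetical and are not taken from any of the datasets listed; check each dataset's own documentation for its actual layout.

```python
import csv
import io

# Hypothetical sample data: column names and values are invented for
# illustration and do not come from any dataset in the list above.
SAMPLE_CSV = """image_id,left_eye_x,left_eye_y,right_eye_x,right_eye_y,nose_x,nose_y
0,66.0,39.0,30.2,36.4,44.4,57.1
1,64.3,35.0,29.9,33.4,,
2,65.1,34.9,,,45.2,56.0
"""


def parse_keypoints(csv_text):
    """Return {image_id: {keypoint_name: (x, y)}}, skipping missing keypoints."""
    result = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        image_id = row.pop("image_id")
        points = {}
        for column, x_text in row.items():
            # Each keypoint is stored as a pair of "<name>_x" / "<name>_y" columns.
            if not column.endswith("_x"):
                continue
            name = column[:-2]
            y_text = row.get(name + "_y", "")
            # An empty cell means the keypoint was not annotated for this image.
            if x_text and y_text:
                points[name] = (float(x_text), float(y_text))
        result[image_id] = points
    return result


keypoints = parse_keypoints(SAMPLE_CSV)
for image_id, points in keypoints.items():
    print(image_id, len(points), sorted(points))
```

Skipping empty cells instead of filling them with zeros keeps unannotated keypoints from being mistaken for points at the image origin, which matters if the coordinates are later used as training targets.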