AI Image Detector: Can You Use Image Classification to Spot the Fakes?

We build an AI image detector to see how accurately an image classification function can spot AI-generated images. Learn more and build a classifier of your own.

Chris Shuptrine

Sep 2023

For this article, we hired Becca Miller, a freelance software developer and technical writer, to build an AI image detector using Nyckel image classification. Becca details her experience below and shares how you can build the image classifier yourself.

Thanks to recent advancements in artificial intelligence, we’ve seen remarkable quality improvements in AI-generated images. Tools like DALL-E, Midjourney, and Stable Diffusion continue to impress us with each new product release. However, these improvements have also led to growing concerns about identifying authentic and trustworthy images (e.g., deepfakes on social media). With surges in AI-generated content, we now encounter synthetic images created by image generators that are difficult to distinguish from real photos.

In this article, I explore the process of building an AI image detection tool using Nyckel image classification. I share a step-by-step overview of how I created the image classifier and reflect on my experience working with Nyckel’s product.

Looking for a way to detect if a specific image is AI-generated? Upload your image to Nyckel’s pretrained AI-Generated Image Identifier.

CIFAKE: Real and AI-Generated Synthetic Images

Building an image classification function starts with identifying a high-quality dataset. To train the AI image detection classifier, I used images from the publicly available CIFAKE dataset. This dataset contains 60,000 synthetically generated images and 60,000 real images, split into 100,000 training images and 20,000 test images.

The real images in the dataset were collected from the publicly available CIFAR-10 dataset. The synthetic images were generated with a latent diffusion model (Stable Diffusion v1.4) to produce a synthetic equivalent of CIFAR-10. This dataset is free for public use, so if you want to follow along with this tutorial by building your own classifier, you can get started by downloading the images and creating a free Nyckel account.

5 Steps to Detect Fake Images with Nyckel

Here’s a glimpse into the high-level steps I took to craft the AI image detection tool:

1. Create a new function

After signing up for a Nyckel account, I started by creating a new function from Nyckel’s dashboard. This takes mere seconds; I simply had to specify that the function takes an image as its input and that its output type is “classify.” In other words, I’m creating an image classification function.

2. Import images

After creating the function, I started importing images. Nyckel lets you bulk-upload and bulk-label images. Import batch sizes are limited to 1,000 images, so I used only a subset of the CIFAKE training dataset. I selected 1,000 real images to import and labeled them “real,” and then selected 1,000 synthetic images to import and labeled them “synthetic.” The platform also allows you to import images unlabeled and add the labels manually later.
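Picking the 1,000-image subsets by hand is tedious, so the selection step above could be scripted. The following is a minimal sketch, not part of Nyckel's tooling: the helper name `sample_subset` and the folder names in the commented usage are assumptions about how the downloaded dataset is laid out on disk, so adjust them to match your copy.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def sample_subset(source_dir, dest_dir, count=1000, seed=0):
    """Copy a reproducible random sample of `count` images into dest_dir."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    # Sort first so the same seed always selects the same files.
    images = sorted(
        p for p in source.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if len(images) < count:
        raise ValueError(f"{source} has only {len(images)} images, need {count}")
    chosen = random.Random(seed).sample(images, count)
    dest.mkdir(parents=True, exist_ok=True)
    for image in chosen:
        shutil.copy2(image, dest / image.name)
    return chosen


# Example usage (hypothetical paths; uncomment after downloading the dataset):
# sample_subset("cifake/train/REAL", "subset/real")
# sample_subset("cifake/train/FAKE", "subset/synthetic")
```

Each output folder then holds one batch that fits within the 1,000-image import limit and can be bulk-labeled in a single step.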

3. Train the AI model

Once the images were imported and labeled, Nyckel launched the training process immediately. Within seconds, the model was making predictions based on the training data. With this near-instant feedback, I could quickly see whether the AI image detector was correctly classifying images as real or synthetic, as well as the model’s certainty (confidence score) about its predictions.

Nyckel made the training process simple by handling the model fine-tuning behind the scenes. While that limits your ability to manually adjust training parameters, it makes the platform much more accessible for people who are new to machine learning. Nyckel tries out various training parameters and techniques to find the combination that works best for your data. Even if you do have some experience in ML, this automated hyperparameter sweep speeds up the process and saves you from having to select optimal values yourself.

4. Review model outputs

Nyckel provided a variety of sorting and filtering options that I could use to assess the AI model’s performance on individual examples. I could sort image classifications by the recency of the image import, the recency of the annotation, and the model’s confidence in its prediction. I could also filter by the function’s output (e.g., real vs. synthetic, or whether the prediction agrees or disagrees with the label), as well as by the label type (real, synthetic, or unlabeled). These sorting and filtering options were useful for identifying examples where the model struggled to classify an image.
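To make the review logic concrete, here is a small sketch of the same two ideas outside the dashboard: filtering for predictions that disagree with the label, and sorting by confidence to surface the least certain examples. The `Prediction` record and the sample values are made up for illustration; they are not Nyckel's data format or real model output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    image: str
    label: Optional[str]  # human annotation, or None if unlabeled
    predicted: str        # model output
    confidence: float     # model certainty, between 0 and 1


def disagreements(predictions: List[Prediction]) -> List[Prediction]:
    """Return labeled examples where the model's output differs from the label."""
    return [
        p for p in predictions
        if p.label is not None and p.predicted != p.label
    ]


def least_confident(predictions: List[Prediction], limit: int = 10) -> List[Prediction]:
    """Return up to `limit` predictions, lowest confidence first."""
    return sorted(predictions, key=lambda p: p.confidence)[:limit]


# Made-up records for illustration only.
sample = [
    Prediction("a.png", "real", "real", 0.98),
    Prediction("b.png", "synthetic", "real", 0.61),
    Prediction("c.png", None, "synthetic", 0.55),
    Prediction("d.png", "synthetic", "synthetic", 0.91),
]

print([p.image for p in disagreements(sample)])
print([p.image for p in least_confident(sample, limit=2)])
```

Disagreements and low-confidence predictions are usually the most informative examples to inspect, since they point to mislabeled images or to cases the model finds ambiguous.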

5. Invoke for new inputs

With the model trained, I could now invoke it on new inputs. Nyckel’s invoke tab allowed me to upload new images directly from the user interface, and the model would classify them as real or synthetic. The invoke tab only allowed me to assess one image at a time, but Nyckel also provided an API to invoke the model, complete with example requests:


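```python
import requests

# Replace the placeholders with your function's invoke URL, token, and image path.
url = 'INSERT YOUR URL HERE'
headers = {
    'Authorization': 'Bearer ' + 'INSERT YOUR BEARER TOKEN HERE'
}

# Send the image as multipart form data under the "data" field.
with open('INSERT FILE NAME', 'rb') as f:
    result = requests.post(url, headers=headers, files={'data': f})

print(result.text)
```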

You can learn more about the API via the API documentation.
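Since the invoke tab handles one image at a time, a loop around the example request is a natural way to classify a whole folder. The sketch below assumes the same URL, Bearer token, and `data` form field as the example request; the helper names `send_image` and `classify_images` are hypothetical, and the raw response text is returned unparsed because the response format is not described here.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Union

PathLike = Union[str, Path]


def send_image(url: str, token: str, path: PathLike) -> str:
    """POST one image to the invoke URL and return the raw response text."""
    import requests  # third-party; imported here so the module loads without it

    headers = {"Authorization": "Bearer " + token}
    with open(path, "rb") as f:
        response = requests.post(url, headers=headers, files={"data": f}, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of silently continuing
    return response.text


def classify_images(
    paths: Iterable[PathLike],
    url: str,
    token: str,
    send: Callable[[str, str, PathLike], str] = send_image,
) -> Dict[str, str]:
    """Invoke the function once per image and collect responses by file path."""
    results = {}
    for path in paths:
        results[str(path)] = send(url, token, path)
    return results


# Example usage (placeholders, as in the request above):
# images = sorted(Path("new_images").glob("*.png"))
# results = classify_images(images, "INSERT YOUR URL HERE", "INSERT YOUR BEARER TOKEN HERE")
```

Passing the sender in as a parameter keeps the loop easy to test without network access and leaves room to add retries or rate limiting later.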

The AI Image Detector’s Performance

Nyckel’s web interface did not provide a way to assess the model’s performance on a held-out validation set, but the AI image detector showed promising results in cross-validation. Cross-validation repeatedly resamples the data so that a different portion is held out for testing on each iteration while the rest is used for training, which gives a good idea of how performance will generalize to an independent dataset. It is also data-efficient: every labeled example is used for both training and testing across iterations, so no images have to be permanently set aside, which helps when training data is limited.
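For readers new to the idea, here is a minimal sketch of how k-fold cross-validation partitions a dataset. It illustrates the general technique only; the exact scheme Nyckel uses internally is not described in this article, and the function name `k_fold_splits` and the choice of five folds are assumptions for the example.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one currently held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold


# With 10 samples and 5 folds, each iteration trains on 8 and tests on 2.
for train, test in k_fold_splits(10, k=5):
    print(len(train), len(test))
```

Every sample lands in the test fold exactly once, so the combined test results cover the whole dataset even though no example is ever evaluated by a model that saw it during training.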

In cross-validation, the model correctly identified 92.4% of AI-generated images as synthetic and 92.3% of real images as authentic. That’s great performance for a model only trained on 2,000 images!
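Per-class figures like these are the fraction of each class's images that the model labeled correctly. The sketch below shows that calculation on a tiny made-up set of labels and predictions; the values are illustrative only and are not the results reported above.

```python
from collections import Counter


def per_class_accuracy(labels, predictions):
    """Return, for each class, the fraction of its examples predicted correctly."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    totals = Counter(labels)
    correct = Counter(
        label for label, pred in zip(labels, predictions) if label == pred
    )
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Made-up example: 4 real and 4 synthetic images, one mistake in each class.
labels = ["real"] * 4 + ["synthetic"] * 4
predictions = [
    "real", "real", "real", "synthetic",
    "synthetic", "synthetic", "real", "synthetic",
]

print(per_class_accuracy(labels, predictions))
```

Reporting the two classes separately matters for a detector like this one, because a single overall accuracy could hide a model that is much better at recognizing real images than synthetic ones, or the reverse.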

Spot AI-Generated Images Without ML Expertise

The process of building our AI detection tool was simple and fast, taking less than 15 minutes. Most of my time was spent selecting and organizing the subset of images I would use to train the model, since I couldn’t train the model on the full dataset.

Although the web interface doesn’t include certain features that machine learning experts might expect to find (like setting training parameters), Nyckel makes image classification accessible to non-experts through its user-friendly interface, real-time feedback, and easy navigation. It enables even those new to computer vision to train an algorithm on their own dataset in a matter of minutes.

Interested in building an image classifier yourself, or using Nyckel for AI detection? Sign up for a free account and reach out to the Nyckel team at any time with any questions.