You also know that there are many best practices for designing thumbnails and that the image must always represent the content found within your video. What you may not know is that in addition to being recognizable to humans, your thumbnails must also pass muster with Google’s AI, which helps decide when to suggest your video to a viewer.
What Will Google’s AI See?
This is where Google Vision enters the picture. The publicly available testing page offers a preview of how Google will read your image and which faces, objects, text, and more it associates with it.
To get started, design your thumbnail image and upload it to the Google Vision page, which looks like this:
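If you would rather script this check than use the web demo, the same analysis is available programmatically through the Cloud Vision REST API’s `images:annotate` endpoint. Here is a minimal sketch of building the request body; the feature type names come from Google’s public API reference, and the image bytes are a placeholder rather than a real thumbnail.

```python
import base64
import json

def build_annotate_request(image_bytes):
    """Build the JSON body for Vision's images:annotate endpoint,
    requesting the same feature tabs the demo page displays."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [{
            "image": {"content": encoded},
            "features": [
                {"type": "FACE_DETECTION"},
                {"type": "OBJECT_LOCALIZATION"},
                {"type": "LABEL_DETECTION"},
                {"type": "WEB_DETECTION"},
                {"type": "TEXT_DETECTION"},
                {"type": "IMAGE_PROPERTIES"},
                {"type": "SAFE_SEARCH_DETECTION"},
            ],
        }]
    }

# Placeholder bytes stand in for your thumbnail file.
body = build_annotate_request(b"fake-thumbnail-bytes")
print(json.dumps(body)[:80])
```

You would POST this body to the annotate endpoint with your own API key; the web demo does the equivalent behind the scenes.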
Reading The Results
The results can be viewed across several different tabs.
What can each result tab tell you? Let’s view an example and check out the results tab by tab.
The image below is a thumbnail of a recent PewDiePie video titled “Minecraft just became 10X better – Part 32” (I even took a screengrab with a partial carousel button showing – Let’s see how Google handles that!)
When we put this image into Google Vision, we see the results of Google’s AI analysis, beginning with “Faces.”
Faces AI Results
Here the AI reads the facial expression as Joy with a 91% confidence rating. If this were your thumbnail, it would have passed this “test” on expression, as that is the emotion PewDiePie was going for. However, the AI was not perfect: it missed that he was wearing headwear in the image. If that were a focal point in your thumbnail (maybe a sponsor), you might want to go back and select an image where his headphones are more prominent. Unless…
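In the raw API response, this tab corresponds to the `faceAnnotations` field, where each face carries likelihood ratings for expressions and attributes like headwear. A quick sketch of pulling out the expression you were aiming for; the field names match Vision’s public reference, while the sample values are made up to mirror the PewDiePie example.

```python
# Hypothetical response fragment shaped like Vision's faceAnnotations
# output; values are illustrative, not real API output.
sample = {
    "faceAnnotations": [{
        "joyLikelihood": "VERY_LIKELY",
        "headwearLikelihood": "VERY_UNLIKELY",  # the miss called out above
        "detectionConfidence": 0.91,
    }]
}

def expression_check(response, wanted="joyLikelihood"):
    """Return the likelihood Vision assigned to the target expression
    for each detected face."""
    return [face.get(wanted, "UNKNOWN")
            for face in response.get("faceAnnotations", [])]

print(expression_check(sample))  # ['VERY_LIKELY']
```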
Objects AI Results
Google picks up the headphones in the object portion of the analysis right away. It also correctly picked out a person as an object, but PewDiePie has confused Google into classifying his Minecraft character as a toy by using a LEGO Minecraft skin. To be honest, that is an easy error to make, and although the AI did not recognize Minecraft, you could argue the LEGO classification makes the result more accurate.
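In the API response, this tab maps to `localizedObjectAnnotations`, a list of named objects with confidence scores. A small sketch of filtering that list; the field names come from the Vision reference, and the sample scores are invented to match this example.

```python
# Hypothetical localizedObjectAnnotations fragment; scores are illustrative.
sample = {
    "localizedObjectAnnotations": [
        {"name": "Headphones", "score": 0.93},
        {"name": "Person", "score": 0.89},
        {"name": "Toy", "score": 0.71},  # the LEGO-skin misread
    ]
}

def detected_objects(response, min_score=0.5):
    """List (name, score) pairs for objects above a confidence cutoff."""
    return [(o["name"], o["score"])
            for o in response.get("localizedObjectAnnotations", [])
            if o["score"] >= min_score]

print(detected_objects(sample, min_score=0.9))  # [('Headphones', 0.93)]
```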
Labels AI Results
As far as “Labels” go, Google has correctly assigned five labels to the image – all around the game and gaming in general. You will want to make sure these are accurately assigned to your thumbnail and adjust any elements that are inaccurate.
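This tab corresponds to the `labelAnnotations` field in the API response. One practical way to use it is to check the returned labels against the keywords you want your thumbnail associated with; a sketch below, with hypothetical label values standing in for a real response.

```python
# Hypothetical labelAnnotations fragment; descriptions are illustrative.
sample = {
    "labelAnnotations": [
        {"description": "Games", "score": 0.92},
        {"description": "Pc game", "score": 0.88},
        {"description": "Screenshot", "score": 0.85},
    ]
}

def missing_labels(response, expected):
    """Which of the labels you were hoping for did Vision NOT assign?"""
    got = {l["description"].lower()
           for l in response.get("labelAnnotations", [])}
    return sorted(label for label in expected if label.lower() not in got)

print(missing_labels(sample, ["Games", "Minecraft"]))  # ['Minecraft']
```

If a keyword you care about comes back in the missing list, that is your cue to adjust the thumbnail.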
Web AI Results
“Web Entities” assigns associated digital destinations and intellectual property it recognizes in your image. In this case, Minecraft is coming through the strongest.
This also shows you “Pages with Matched Images” – a list of URLs with the same image. This will be useful to check after your video is published to see where the thumbnail image has been shared or reposted.
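Both of these readouts live under the `webDetection` field of the API response: entities in `webEntities` and reposts in `pagesWithMatchingImages`. A sketch of summarizing them; the structure follows the Vision reference, and the entity and URL values are placeholders.

```python
# Hypothetical webDetection fragment; keys match the Vision reference,
# the entity and URL values are placeholders.
sample = {
    "webDetection": {
        "webEntities": [
            {"description": "Minecraft", "score": 1.2},
            {"score": 0.3},  # entities sometimes arrive without a description
        ],
        "pagesWithMatchingImages": [
            {"url": "https://example.com/repost-of-thumbnail"},
        ],
    }
}

def web_summary(response):
    """Return (named entities, pages carrying the same image)."""
    web = response.get("webDetection", {})
    entities = [e["description"] for e in web.get("webEntities", [])
                if "description" in e]
    pages = [p["url"] for p in web.get("pagesWithMatchingImages", [])]
    return entities, pages

entities, pages = web_summary(sample)
print(entities, pages)
```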
Text AI Results
Text is minimal in this example, but you can see it accurately reads the video run time captured in the screenshot. In a more text-heavy example…
The text is broken up into two blocks. For the most part, this is a very accurate feature, but you’ll want to double-check it in case there is an element causing Google to misread your text.
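In the API response, this tab maps to `textAnnotations`. Per the Vision documentation, the first element holds the full detected text and the later elements break it into individual words; a sketch of retrieving the full block, with a made-up sample standing in for a real response.

```python
# Hypothetical textAnnotations fragment. The first element carries the
# complete detected text; the rest are per-word entries.
sample = {
    "textAnnotations": [
        {"description": "10X BETTER\n22:39"},
        {"description": "10X"},
        {"description": "BETTER"},
        {"description": "22:39"},
    ]
}

def full_text(response):
    """Pull the complete text Vision read out of the image."""
    annotations = response.get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""

print(full_text(sample).splitlines())  # ['10X BETTER', '22:39']
```

Comparing this string against the text you actually put on the thumbnail is a quick way to catch the misreads mentioned above.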
Properties AI Results
The properties reader is something to look at if you are adhering to a very specific color spectrum across your Library or would like to preview how your image will look cropped at different aspect ratios. This is one to check, but not a critical component.
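If you do care about color consistency, the API exposes this tab as `imagePropertiesAnnotation`, whose dominant colors arrive as RGB components. A sketch of converting them to the hex codes you would compare against your brand palette; the structure follows the Vision reference and the RGB values are invented.

```python
# Hypothetical imagePropertiesAnnotation fragment; RGB values are made up.
sample = {
    "imagePropertiesAnnotation": {
        "dominantColors": {
            "colors": [
                {"color": {"red": 34, "green": 139, "blue": 34},
                 "pixelFraction": 0.41},
                {"color": {"red": 240, "green": 230, "blue": 140},
                 "pixelFraction": 0.18},
            ]
        }
    }
}

def dominant_hex_colors(response):
    """Convert Vision's dominant colors into familiar hex codes."""
    colors = (response.get("imagePropertiesAnnotation", {})
              .get("dominantColors", {}).get("colors", []))
    return ["#{:02x}{:02x}{:02x}".format(
                int(c["color"].get("red", 0)),
                int(c["color"].get("green", 0)),
                int(c["color"].get("blue", 0)))
            for c in colors]

print(dominant_hex_colors(sample))  # ['#228b22', '#f0e68c']
```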
Safe Search AI Results
Safe Search, on the other hand, is a critical component to check. If your content is rated Likely for Adult, Racy, or Violence your video may be buried by the YouTube discovery algorithm. Make sure this reading is reflective of your content and rethink anything that rates high on questionable attributes.
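In the API response, this tab is the `safeSearchAnnotation` field, where each category gets a likelihood string. A sketch of flagging anything rated Likely or above; the category names and likelihood values come from the Vision reference, and the sample ratings are hypothetical.

```python
# Likelihood strings that should make you rethink the thumbnail.
RISKY = {"LIKELY", "VERY_LIKELY"}

def safe_search_flags(response):
    """Return every Safe Search category rated Likely or Very Likely."""
    ratings = response.get("safeSearchAnnotation", {})
    return sorted(cat for cat in ("adult", "racy", "violence", "medical")
                  if ratings.get(cat) in RISKY)

# Hypothetical safeSearchAnnotation fragment for illustration.
sample = {"safeSearchAnnotation": {
    "adult": "VERY_UNLIKELY",
    "racy": "LIKELY",
    "violence": "UNLIKELY",
}}
print(safe_search_flags(sample))  # ['racy']
```

An empty list back from this check is what you want before you hit upload.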
Google Video AI Technology
Similar to Google Vision, Google has released Google Video Intelligence. This gives us a look into how it reads and classifies the videos themselves with AI. Unfortunately, at the time of publishing this article, you cannot upload your own video; you can only choose from their samples to review the thinking behind the technology. Let’s take a quick look.
The Video Google AI reports on three things:
Labels AI Video Results
Similarly to the “Labels” tab in the Google Vision AI, this component assigns labels to “entities” it detects throughout the video. Here the top two entities are “dinosaur” and “vehicle.” Not pictured above are bicycles in the opening shot of the video – which it is also classifying as a “Land Vehicle.” What is interesting here is that the AI correctly identifies the elements in the video but fails to put them together into the overall point. This video is a short walk-around of a park at a Google complex; it is about neither bicycles nor dinosaurs. To see it for yourself, go here and select a sample video from the drop-down.
This reminds us to put visual cues to the content directly into the body of the video for Google to pick up on. If you are talking about a subject, say “Coffee Beans,” make sure to actually show coffee beans in the body of your video.
Shots AI Video Results
The shots component of the video AI attempts to break down the different shots in your video and assigns “Labels” to each one. You can see the shot list in the screenshot above where it says “Shot 3 of 4.” That bar is a timeline, with breaks representing the run time of each shot.
Again, this feature only supports the idea that visual representation of what your video is about is important throughout the entire run-time of the video.
Explicit Content AI Video Results
The final component of the Video AI is Explicit Content. This analyzes your video frame by frame to assess the likelihood that it contains pornography. This should be an easy one to avoid, but it is always good to make sure you are not falsely flagged, as a high rating here will kill your video’s discoverability.
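Like Vision, Video Intelligence also has a public REST API (`videos:annotate`), and the three demo reports above correspond to three feature names in its request body. A minimal sketch of building that request; the feature names come from Google’s public API reference, and the storage bucket URI is a placeholder.

```python
import json

def build_video_request(gcs_uri):
    """Build the JSON body for Video Intelligence's videos:annotate
    endpoint, asking for the three reports the demo shows."""
    return {
        "inputUri": gcs_uri,  # videos are read from a Cloud Storage bucket
        "features": [
            "LABEL_DETECTION",
            "SHOT_CHANGE_DETECTION",
            "EXPLICIT_CONTENT_DETECTION",
        ],
    }

# Placeholder bucket path standing in for your own uploaded video.
body = build_video_request("gs://my-bucket/my-video.mp4")
print(json.dumps(body))
```

So while the demo page only offers sample videos, the underlying API can analyze your own footage if you upload it to a bucket first.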
Final Thought: Why Does This Matter? How is it Useful?
Google has allowed us a sneak peek into their AI here and although it is not a direct evaluation tool for your video or thumbnail, it is a tremendously useful guide or double-check measure to make sure you are setting up your video for success.
Designing your content so that Google’s AI will recognize what you are conveying in your thumbnail and in your video will only make it easier for that video to be suggested to viewers searching for or watching content on that topic.
For your videos, you can learn the thinking behind the AI and incorporate it at the production stage. For thumbnails, the application is more practical: test each thumbnail before you upload a video to make sure it will be classified exactly how you want it to be.
Have you tried testing your Thumbnails in Google Vision AI?
Share your experience in the comments!