Getting image descriptions right

As a blind person, I depend on meaningful image descriptions. But what does "meaningful" mean in this context? What does the perfect verbal description look like? And what actually belongs where?

The basics

Before we get lost in the details of image description, we need to clarify what actually goes where. There are four places where a description can go:

  • the alternative text
  • the caption
  • the actual article text and
  • an extra page with an image description

I will start with images that illustrate articles.

The alternative text

The alternative text should contain only what can be seen at the very first glance. The description should be around 80 characters long. Details can be left out.

The caption

The caption can include details, for example the name and location of the person in the photo, or what kind of plant is shown in the picture. In other words, anything that doesn't jump right out at you.

In the body text

Placing specific aspects in the body text is worthwhile for very complex images. For example, the peculiarities of a painting that would take up too much room in the caption can be highlighted there. Moreover, when the additional information is part of the running text, blind people are not the only ones who benefit from the descriptions.

Extra page for image descriptions

A dedicated page for image descriptions is relatively rare. I personally would probably not click on it, unless it is a picture whose detailed description is particularly important to me.

How do I describe it correctly?

The description in the alternative text can be kept concise. The important thing here is to avoid interpretation wherever possible: describe only what can actually be seen in the picture. Let's take the picture of a cat as an example. The color of the cat's fur or conspicuous features such as particularly large ears fit into the alternative text. After all, that's what catches the eye.

The breed of the cat or its name is better off in the caption. That is information that doesn't come directly from the image, and it also adds value for visitors who can see the image.

If even more information about the image needs to be conveyed, it is best placed in the body text. This could be features you would like to point out in particular, such as a particular painting technique or a special use of color. In short, everything you want to draw the reader's attention to.

This results in the following structure:

  • In the alternative text, what directly catches the eye
  • In the caption, what represents additional information
  • In the body text, even more detailed information
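
The split between alternative text and caption can be sketched in code. The following Python snippet is only a minimal illustration under stated assumptions: the helper names `build_figure` and `alt_text_is_concise`, the file name and the cat details are made up for this example, and the 80-character limit is the rough guideline mentioned earlier, not a hard rule.

```python
from html import escape

# Rough guideline from the article: about 80 characters for the alternative text.
ALT_TEXT_GUIDELINE = 80


def build_figure(src: str, alt: str, caption: str) -> str:
    """Return an HTML figure: first-glance description in alt, extras in the caption.

    Hypothetical helper for illustration only.
    """
    return (
        "<figure>"
        f'<img src="{escape(src)}" alt="{escape(alt)}">'
        f"<figcaption>{escape(caption)}</figcaption>"
        "</figure>"
    )


def alt_text_is_concise(alt: str, limit: int = ALT_TEXT_GUIDELINE) -> bool:
    """True if the alt text is non-empty and within the rough length guideline."""
    return 0 < len(alt.strip()) <= limit


# Invented example data: what catches the eye goes into alt,
# name and breed go into the caption.
figure_html = build_figure(
    src="cat.jpg",
    alt="Grey cat with very large ears sitting on a windowsill",
    caption="Mia, a two-year-old Oriental Shorthair",
)
```

Anything more detailed than the caption would then simply be written into the article text around the figure.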

Magnifying glass, hamburger menu and co.

A modern website can hardly do without so-called functional graphics, such as the icon for the hamburger menu or the usual magnifying glass that starts a search. How are these icons described in a meaningful way? The answer: the function that is executed is used as the alternative text. So the hamburger menu is not called "hamburger menu" but "menu" or "open menu". The magnifying glass is labeled accordingly with "Search". In short, for these symbols the alternative text states their concrete function.
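
As a small sketch of this rule, the following Python snippet labels icons by what they do rather than by what they show. The icon file names, the `ICON_FUNCTIONS` mapping and the `icon_button` helper are assumptions made for this example; a real site would use its own markup and naming.

```python
from html import escape

# Hypothetical mapping from icon file to the function it triggers.
ICON_FUNCTIONS = {
    "hamburger.svg": "Open menu",
    "magnifier.svg": "Search",
}


def icon_button(icon_file: str) -> str:
    """Return a button whose icon is labelled by its function, not its appearance."""
    label = ICON_FUNCTIONS[icon_file]
    return (
        '<button type="button">'
        f'<img src="{escape(icon_file)}" alt="{escape(label)}">'
        "</button>"
    )
```

With this sketch, `icon_button("hamburger.svg")` produces a button that is announced by its alternative text "Open menu".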


Purely decorative elements should not get an alternative text. But why not, when web designers and site builders have put so much thought into layout and illustration? As a blind person on the Internet, I'm only interested in content. This includes article images and functional graphics. Background or decorative images, on the other hand, are of no substantial use to me.

Having every graphic that "only" looks pretty described would also simply take too long and disrupt the flow of reading.
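
In HTML, "no alternative text" is usually expressed as an empty `alt` attribute rather than a missing one, because some screen readers otherwise fall back to reading out the file name. The following Python sketch shows this; the helper name `decorative_image` and the file name in the usage note are assumptions for illustration.

```python
from html import escape


def decorative_image(src: str) -> str:
    """Return an img tag with an empty alt attribute for a purely decorative image."""
    # alt="" marks the image as one that assistive technology can skip.
    return f'<img src="{escape(src)}" alt="">'
```

For example, `decorative_image("divider.png")` yields an image tag that carries no description at all.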

Images on social media

Images on social media should also be described. The description here can be a bit more detailed. As a rule of thumb, the text should describe what the image is meant to convey. The tone may also be somewhat less formal than for a picture that accompanies a technical article.

Screenshots of text are a particular challenge here, because they contain much more text than fits into an image description. In such cases, the core message can be used as the image description. This is not a perfect solution, but unfortunately it's as close as you can get at the moment. Knowing the core message makes it easier for blind users to decide whether to take the extra step and scan the image with text recognition software, which then allows the full text of the screenshot to be read.

Personal remarks

I write as someone who is personally affected: in order to perceive images, I depend on them being accompanied by descriptions.

I have already experienced almost everything when it comes to image descriptions, from the carelessly dashed-off single word to the extremely detailed description that gave me a convincing idea of why the image was chosen for the article. But there are also image descriptions that are just plain wrong. For example, while looking up directions, I came across an image that was described to me as "arriving by plane." When I showed this page to our team, there was great laughter. After everyone had regained their composure, I learned that it showed a public bus.

The key points in brief

  • Graphics used for layout get no image description and no alternative text.
  • Functional graphics (hamburger menu and co.) get as alternative text what they do, not what they depict.
  • For images in articles, the alternative text only reflects what can be seen in the image at first glance.
  • The caption provides additional information that is not apparent from the image at first glance.
  • Complex image descriptions that go into more detail should be written as continuous text.
  • Image and image description must fit together.

DeepL is a deep learning company that develops AI systems for languages. The company, based in Cologne, Germany, was founded in 2009 as Linguee, and introduced the first internet search engine for translations. Linguee has answered over 10 billion queries from more than 1 billion users.

Dennis Westphal

Dennis is an IT consultant at the Company for the Development of Things. His field is accessibility. Helpfully, Dennis has been blind since birth. He creates his screencasts with open source software.