Don’t Overwrite Visual Labels With `aria-label`

The aria-label property can be extremely useful when used correctly. When used incorrectly, it can wreak havoc on speech-input user experiences. It's an easy mistake to make and luckily it's pretty easy to fix!

What is the aria-label property?

As defined by the WAI-ARIA 1.2 specification, the aria-label property "defines a string value that labels the current element". It has the same purpose as the aria-labelledby property, which references another element that labels the current element. Both properties exist to provide an Accessible Name for the current element.

Related to these properties is the aria-describedby property, which references another element that describes the current element. The purpose of this property is to provide an Accessible Description for the current element. The Accessible Name should be short and concise, and the Accessible Description should be more verbose and complement the Accessible Name.
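
As a rough illustration of how these properties divide the work, here is a minimal sketch of a newsletter form. The ids, label text, and hint copy are invented for this example and are not taken from any real site.

```html
<!-- aria-labelledby: the form's Accessible Name comes from the heading's text -->
<h2 id="newsletter-heading">Newsletter</h2>
<form aria-labelledby="newsletter-heading">
  <!-- The input's Accessible Name comes from its <label>: short and concise -->
  <label for="email">Email address</label>
  <!-- aria-describedby: the longer hint becomes the Accessible Description -->
  <input id="email" type="email" aria-describedby="email-hint">
  <p id="email-hint">We send one message per month and never share your address.</p>
  <!-- No ARIA needed: the button's Accessible Name is its text content -->
  <button type="submit">Subscribe</button>
</form>
```

Every name in this sketch matches text that is visible on the page, so nothing here overrides a visual label.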

The Accessible Name is part of what screen readers use to announce elements to a user when they come into focus. A <button> with no aria-label and plain text between its opening and closing tags would have an Accessible Name that matches its plain text. That same <button> can have its Accessible Name overridden with something completely different using the aria-label property.

Use the next two example code snippets to compare how the same button sounds with and without an aria-label.

Example 1: <button> with no aria-label

The code:

<button>OK</button>


Example 2: <button> with an aria-label

The code:

<button aria-label="Confirm selection">OK</button>


Why does aria-label need to match the visible label?

First of all, an aria-label must include the text of its element's visible label because a Level A WCAG Success Criterion says so. SC 2.5.3: Label in Name says:

For user interface components with labels that include text or images of text, the name contains the text that is presented visually.

This means that the Accessible Name computed for an element should contain the text of the element's visual label. One best practice for this is to have the Accessible Name start with the visual label's text. So if a link contains the text "Learn more" and the author of that link wants to include more information about the link's purpose in the aria-label, a good practice is to start the aria-label with "Learn more".
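
Here is a minimal sketch of that practice. The href value and the "about pricing" wording are assumptions made up for illustration.

```html
<!-- Avoid: the Accessible Name does not contain the visible text "Learn more" -->
<a href="/pricing" aria-label="Pricing details">Learn more</a>

<!-- Better: the Accessible Name starts with the visible text -->
<a href="/pricing" aria-label="Learn more about pricing">Learn more</a>
```

The second link still gives screen reader users the extra context, and its name contains the words a speech-input user sees on screen.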

If you're like me and my ADHD brain, you might need a little more reasoning than "the rules say so" to motivate you to follow the rules. A lot of people don't like being told what to do, and that's fine!

So why do we need to follow Success Criterion 2.5.3? It partially has to do with speech-recognition technology. As it turns out, the Accessible Name for elements is how speech-input users can interact with elements in a web page.

Referencing our two previous examples:

  • A speech-input user can activate the button from Example 1 by saying "Click OK". This is because the Accessible Name for that button is "OK".
  • A speech-input user cannot activate the button in Example 2 in the same way. This is because the Accessible Name for that button is "Confirm selection". If a speech-input user wanted to activate this button, they would have to say "Click confirm selection".

Speech-input users who rely on visual labels to come up with voice commands have no way of knowing, before speaking a command, whether the Accessible Name of an element differs from its visual label. They have to speak a command and find out whether it works. If a command doesn't work, they may retry a couple of times. And if the command never works because the Accessible Name doesn't match the visual label, their energy has been wasted and they're also stuck.
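
The button from Example 2 could be reworked in a couple of ways so that a command based on the visible text works. This is only a sketch; which option fits depends on the surrounding design.

```html
<!-- Option A: drop the override so the Accessible Name is the visible text "OK" -->
<button>OK</button>

<!-- Option B: if "Confirm selection" is the better name, show it visually too -->
<button>Confirm selection</button>
```

With either option, the words the user sees are the words they can speak.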

What happens when aria-label doesn't match the visible label?

As I wrote this article, I started to learn about Voice Control on Mac. I wanted to see if I could find a real-world example of speech-input navigation being broken by mismatched visual labels and Accessible Names. I found a couple of examples right on the Twitter home page.

Unfortunately, more than half of the main navigation links have accessible names that don't match their visual labels. Even in the cases where a link's aria-label began with the text in the visual label, I was not successful in activating those links with speech-input. I made countless attempts.

  • Home: The aria-label for this link is dynamic. If there are unread tweets in a user's feed, then the aria-label reads "Home (New unread Tweets)". The "Click Home" command only activates this link if there are no unread tweets.
  • Explore: The aria-label for this link is "Search and explore". Saying "Click Explore" does not activate this link.
  • Notifications: The aria-label for this link is dynamic. If a user has 2 notifications for example, the aria-label is "Notifications (2 unread notifications)". The "Click Notifications" command only activates this link if a user has 0 unread notifications.
  • Messages: The aria-label for this link is dynamic. If a user has unread messages from one other user, for example, the aria-label is "Direct Messages (1 unread conversation)". If there are no unread messages, the aria-label is "Direct Messages". The "Click messages" command does not ever work.
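
For comparison, here is one possible way to expose an unread count without changing the link's Accessible Name. This is not how Twitter builds its navigation. The href, id, and count text are hypothetical, and the pattern has not been verified against Voice Control or any other speech-input tool.

```html
<!-- The link's Accessible Name stays "Notifications" no matter how the count changes -->
<a href="/notifications" aria-describedby="notifications-count">Notifications</a>

<!-- The count is visible text, and is also exposed as the link's Accessible Description -->
<span id="notifications-count">2 unread notifications</span>
```

Because the name is computed from the link's text content, it always matches the visual label, and the dynamic part lives in the description instead.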

Here's a quick demo of me trying to activate the "Messages" link that shows on the home page of Twitter:

Are there other users that benefit from aria-label matching the visible label?

This article primarily focuses on the sighted speech-input user's experience, but other users are affected by this issue as well. Think of a sighted screen reader user who comes across the <button> element in Example 2. Their screen reader presents it as "Confirm selection, button", but the text in the button says "OK".

This can be a very confusing and stressful experience for the user. How can they be certain the issue is with the website and not their own action? We're imposing an extra cognitive load on the user to process the mismatched audio and visual experiences. There is also extra cognitive load required to first memorize the behavior, and then recall it the next time they come across the element.

It's not necessary or kind to do this to users. It's energy-draining, and energy is an extremely limited resource for all kinds of people who rely on accessible interfaces.

How do I fix mismatched aria-labels and visual labels?

Now that you know mismatched aria-labels and visual labels are harmful for speech-input users, how do you go about preventing that harm? First, I'd recommend reading "Playing with state" by Sarah Higley and "Be Careful with Dynamic Accessible Names" by Adrian Roselli.

Second, think about whether or not you need to provide an aria-label for the element that differs from its visual label. Some cases that come to mind:

  • Are there duplicate visual labels?
    • Your markup can generally be improved by ensuring all visual labels are unique. This is beneficial for screen reader navigation.
    • The speech recognition technology probably has a way to help the user choose which element they want. With Voice Control on Mac, a number appears next to the elements with matching Accessible Names after the command is spoken. Then you're able to say the number of the element you want to activate.
  • Is the content of the element plain text? Omit the aria-label property. The computed accessible name for the element will be its text content.
  • Is the element a toggle button? Reference the "Playing with state" article I mentioned and also check out the "Toggle Buttons" writeup from Inclusive Components. The aria-pressed property better suits your needs than a dynamic aria-label.
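
For the toggle button case, here is a minimal sketch that uses aria-pressed instead of a changing aria-label. The id and the "Mute notifications" label are invented for this example.

```html
<!-- The visible text, and therefore the Accessible Name, never changes -->
<button type="button" id="mute-toggle" aria-pressed="false">Mute notifications</button>

<script>
  // Flip aria-pressed on each click; the state carries the change, not the name.
  const toggle = document.getElementById("mute-toggle");
  toggle.addEventListener("click", () => {
    const pressed = toggle.getAttribute("aria-pressed") === "true";
    toggle.setAttribute("aria-pressed", String(!pressed));
  });
</script>
```

The Accessible Name stays "Mute notifications" in both states, so it keeps matching the visible label while assistive technology reports whether the button is pressed.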

After you've spent time thinking thoroughly about your aria-labels and visual labels, make the updates and test them with your users. It's hard for us to know if something is usable if it doesn't intersect with our own needs. The only experiences we can truly be experts of are our own. Don't miss the step where you get your information from the source of truth: your users with expertise in their lived experiences.
