Using rems for saner typography

Using relative units is essential to responsive design, and considering the infinite scenarios in which web content can be viewed, it is almost always necessary to be responsive by default. I’m finding that many teams still aren’t using relative units, instead relying on magic pixel numbers everywhere, leading to very fragile layouts. The em is a magical unit in CSS that helps us achieve all our dreams of flexible layouts.

So of course, I’m here to tell you why we shouldn’t really be using ems. But first — a quick history lesson!

What is an em really?

The em unit actually has origins in metal type. The height of a sort, which is the metal block representing a single letter, would be equal to the specified point size. The em, in turn, is equivalent to the currently specified point size. For example, if the type were set at 16pt, then 1em in this context would equal 16pt.

Now, imagine we have the letter H represented on a metal sort. The H itself is not 1em tall; if all letters were exactly 1em tall we’d have a very hard time reading words, since the baseline of each letter would be completely different. So actually, 1em represents the maximum height a particular letter could be. There are no restrictions on the width of the letter. All this explains why 16pt in one typeface could legitimately look bigger or smaller than 16pt in a different typeface; the em simply helps to establish a canvas on which the letter can be drawn, without enforcing a physical height or width.

These Ms are both 16pt, but the different typefaces cause them to appear different sizes. The box illustrates a canvas that is 1em × 1em. Notice that the baseline for each letter is different.

This also illustrates the relationship between the em and the point; the analogous relationship for the web is between the em and the pixel. The pixel is an absolute* unit of measure, while the em is a relative unit of measure that depends on the current context.

* Technically, the CSS pixel is an angular measurement. How much physical space a pixel takes up on your screen will depend on the screen’s pixel density and the distance at which it is meant to be viewed.

Calculating em values

By default, browsers set the font-size of the html element to 16px, although the user can change this. We want to give our users the freedom to increase or decrease their default font-size to support accessibility in a wide range of use cases, so we shouldn’t lean too heavily on pixel-perfect values.

Since the em is dependent on its context, if you choose to use ems, it is important to understand how this value gets calculated, as it will have repercussions for descendant elements.

result (em) = target font size (px)
            ÷ font size of containing element (px) 

For example, if I want all p elements to appear as 18px, but the font-size defined at the html element is 10px, then I would want to set the font-size for my p elements to 1.8em.

Sounds simple enough, right? But imagine we’ve set article elements to have a font-size of 1.8em. What size would p elements inside those article elements be? In ems, it’s still 1.8em. But to find out the size in pixels, we use the formula in reverse: 1.8em × 18px (the font size of the containing element, which is article) equals 32.4px.
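The arithmetic above can be written out as a small stylesheet. This is only a sketch of the scenario described, using the 10px root size from the example; the computed values in the comments assume nothing else between html and these elements sets a font-size.

```css
/* Root size taken from the example above; 10px keeps the arithmetic simple. */
html {
  font-size: 10px;
}

/* At the top level: 1.8em × 10px (inherited from html) = 18px */
p {
  font-size: 1.8em;
}

/* 1.8em × 10px = 18px */
article {
  font-size: 1.8em;
}

/* No extra rule is needed to trigger the problem.
   A p inside an article still matches the p rule above,
   but its em now resolves against the article:
   1.8em × 18px = 32.4px */
```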

The cascading effects of ems

Now we’ve found ourselves in a situation in which some p elements have a font-size much larger than we expected. This has happened because the context of the p elements inside the article elements is different from that of the other p elements. If we had set the font-size to be less than 1em, we would find our p elements shrinking instead. The same issue would hold true if we decided to define our font-size in percentages instead of ems.

While many people advocate the use of ems, and believe that this cascading behavior is desired, in practice it can make it very difficult to manage the way you expect your copy to appear in different parts of your application. This is especially true of larger applications with many nested modules written by many different people. We could get around this by resetting the font-size of p elements within article elements, but this adds bloat and complexity to our style rules that we shouldn’t need in the first place.
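To make the cost of that workaround concrete, here is a rough sketch of what the resets might look like, continuing the 10px root and 1.8em values from the earlier example. The aside rule is a hypothetical second context added purely for illustration.

```css
/* Reset for the article context: 1em × 18px (article) = 18px */
article p {
  font-size: 1em;
}

/* Hypothetical second context: 0.8em × 10px = 8px */
aside {
  font-size: 0.8em;
}

/* To land on 18px again we need 18 ÷ 8 = 2.25em */
aside p {
  font-size: 2.25em;
}
```

Each new nesting context needs its own correction, and every correction has to be recalculated if an ancestor's font-size changes.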

Enter the rem

The rem, or the root em, can really save us here from the cascading problem. The rem gives us all the benefits of using ems without any of the drawbacks, because the context for a rem is always the font size defined at the root of the HTML document, rather than the font size of the element’s parent.
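As a minimal sketch, here is the earlier scenario rewritten with rems, again assuming a 10px root size purely to keep the numbers easy to follow.

```css
html {
  font-size: 10px;
}

/* 1.8rem × 10px (root) = 18px */
article {
  font-size: 1.8rem;
}

/* 1.8rem × 10px (root) = 18px, whether or not the p sits inside an article */
p {
  font-size: 1.8rem;
}
```

No reset rules are needed, because nesting no longer changes what the unit resolves against.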

The rem has been around for a while and is supported by all major browsers, though it is still underused today. (If you need to support IE8, it is a good idea to provide pixel fallbacks.) It is sometimes erroneously called a relative em, which doesn’t really make sense, as ems are already a relative unit of measure.
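One common way to provide such a fallback is to declare the pixel value first and the rem value second. The sketch below assumes the 10px root size used in the earlier examples, so that 1.8rem and 18px agree.

```css
p {
  font-size: 18px;   /* fallback: browsers without rem support, such as IE8, keep this */
  font-size: 1.8rem; /* browsers that understand rem use this later declaration */
}
```

A browser that doesn’t recognize the rem unit discards the second declaration as invalid and keeps the first.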

I advocate for using rems 100% of the time when defining font sizes, because it allows me to make changes to my styles with confidence, and to not have to think too hard about the code I’m reading. With ems, I can’t confidently say that an element will behave as I expect, because I may be inheriting my font-size context from ancestor elements. If my markup changes, I will have to recalculate the new em value to ensure it’s still using the correct context. Using rems helps me to maintain my sanity and save my brainpower for worrying about other, more important things.

Margins and padding

Now, which makes more sense to use with margins and padding: ems or rems? For padding, I suggest ems. This is probably the only time it ever makes sense to use ems. Remember, for properties other than font-size, the context of an em is the element’s own font size, so if you change the font size of an element, the padding scales along with it, which is usually what you want. It’s safe to do this without fear of this style rule leaking into other places, since padding is internal to the element. If you were to specify padding in rems, the padding would not grow and shrink with the font-size, and that scaling is important for comfortable reading.
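Here is a small sketch of that idea. The class names are hypothetical, and the pixel values in the comments assume a 10px root size.

```css
.button {
  font-size: 1.6rem;  /* 1.6 × 10px = 16px */
  padding: 0.5em 1em; /* resolves against this element's own font-size: 8px 16px */
}

/* Meant to be applied together with .button */
.button--large {
  font-size: 2.4rem;  /* 2.4 × 10px = 24px */
  /* padding is untouched, yet 0.5em 1em now computes to 12px 24px */
}
```

Only the font-size changes between the two variants; the padding follows along without a second declaration.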

Margins are less straightforward. In many cases, I would actually argue that pixels are the most suitable choice; generally you want your margins to stay fixed, such as when you have two buttons sitting next to each other. You certainly don’t need the gap between buttons to be responsive, although you might stack the buttons in smaller viewports. In the situations in which you don’t want fixed margins (such as in a grid system), rems would be the best choice. I can’t think of a scenario in which you would want your margins to scale with the font size, so you really should avoid ems here.
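A rough sketch of both cases follows. The selectors are hypothetical, and the specific values are placeholders rather than recommendations.

```css
/* Fixed gap between adjacent buttons: stays 8px regardless of any font size */
.button + .button {
  margin-left: 8px;
}

/* Grid spacing that tracks the root font size but ignores local font-size changes */
.grid__column {
  margin-bottom: 2rem;
}
```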

You can play with the CodePen below to understand how changing the font size will affect the padding and margins of elements:

But what about typographic rhythm?

I have seen others argue for using ems for type, and rems for margins/padding, because supposedly the cascade helps maintain consistent vertical rhythm across different elements for free. This is just not accurate. Vertical rhythm is less about the ability to scale your type up and down, and more about providing a structure that helps guide your reader through your content. It involves some careful thought and design to implement beyond just using ems everywhere. But more on that for another time!