Engineering CSS: applying method to the madness

For the umpteenth time, stop with the “HTML and CSS are not programming languages” nonsense.

It’s pure gatekeeping and accomplishes nothing. — @laurieontech

CSS is real code, and a very meaningful part of the experience we deliver to users, so we should care about it quite a lot. It takes real engineering discipline to manage the complexity of large CSS codebases.

The challenges of CSS “at scale” have been documented at length by many folks for many years now. How we engineer our CSS has to do with recognizing what those challenges are, and optimizing for systematic, modifiable, and proactive code.

Understanding the Cascade

The Cascade is where the C in CSS comes from. It’s an algorithm that defines how to combine property values originating from different sources, and how values propagate to an element’s descendants. These values may come from:

  • User agent (the browser)
  • Author (you! and other third party sources)
  • User (the reader/consumer)
  • Animations
  • Transitions

If multiple sources each provide a value for the same property, the most important source (according to The Cascade) will win the conflict. If no source provides a value for a given property, The Cascade fills in the gap by determining whether the property should inherit its value from an ancestor or fall back to its initial value.
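Here’s a small illustrative sketch of both behaviors (the selectors and values are made up, and the user agent rules are approximations):

/* user agent (browser default, approximately) */
p {
    display: block;
    margin: 1em 0;
}

/* author (our stylesheet): the author origin beats the user agent default */
p {
    margin: 0;
}

/* inheritance and initial values fill the gaps */
article {
    color: green;
}
/* No source declares color for <p>, so a <p> inside <article> inherits green.
   A non-inherited property like border-style falls back to its initial value, none. */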

Authors want deterministic CSS that provides a consistent user experience for their application, while users want the ability to customize at will in order to meet their accessibility needs. The Cascade is the attempt to balance those needs. The goal is noble, but the resulting polymorphic inheritance makes learning and writing CSS a real challenge.

For some light bedtime reading, have a crack at the W3C spec for CSS Cascading and Inheritance. You can also check out Amelia Wattenberger’s excellent breakdown of the algorithm.

Learning CSS is hard

The sheer number of properties, and the interactions between them, require a lot of memorization. IntelliSense support for CSS is relatively new in editors like Visual Studio Code, and it only helps as far as hinting possible values for a given property. You still have to know which properties to use in combination and how they’ll work together, and that intuition is hard to develop for CSS. The relevant specifications can be difficult to digest at times, as their primary intended audience is user agents, not authors.
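As a small example of why that intuition is hard to build: a declaration can be perfectly valid on its own yet have no effect until another property changes the formatting context (the class name here is made up):

/* width has no effect on a non-replaced inline element */
span.badge {
    width: 200px;           /* silently ignored while the element is display: inline */
}

/* it only takes effect once the display type changes */
span.badge {
    display: inline-block;
    width: 200px;
}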

Of course, there are a lot of amazing resources available that can help people overcome this challenge. Rachel Andrew’s article on how to learn CSS is a particularly useful guide on how to move beyond rote memorization and understand the underlying models and principles.

Still, many underestimate the effort it takes to learn CSS, as there’s nothing else quite like it! Browser developer tools remain the primary learning mechanism for many people, and they generally only scratch the surface of all there is to learn.

It’s impossible to understand what style rules are applied without running the application. Because The Cascade executes at runtime, static code analysis is infeasible. Writing tests against the existence of a class on an HTML element is not valuable, as there’s no guarantee what will actually be visible or what effect the rules will actually have.

As the codebase grows, it gets increasingly difficult to know why some style rule is being applied or overridden in a specific context, or whether it’s even being used at all. When running the application, making sense of what actually got applied versus what was intended can involve some trial and error. Resolution of style rules may be non-deterministic, especially if stylesheets are requested asynchronously. Support for identifying and removing unused CSS is still in its early days (there are some tools that sort of enable this, but in an ideal world, an IDE could identify it at the time of authorship).

Writing CSS is also harder than it looks

CSS is a global namespace. There’s no native way to scope style rules other than by increasing the specificity. Increased specificity tends to close the door on extensibility.

Overriding is easy, effective reuse is hard. We might create overly broad rules in order to reuse them more easily, but this may lead to unintended consequences. Over-reliance on overrides can increase the cost of change as our codebase grows.
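For example (the selectors are made up): a broad rule invites reuse, a high-specificity override wins a local battle, and together they block a later, legitimate variant:

/* an overly broad rule, created so it can be reused everywhere */
.btn {
    color: white;
    background: blue;
}

/* a context-specific override ratchets up the specificity... */
#sidebar form .btn {
    background: gray;
}

/* ...so this later variant loses inside the sidebar, despite coming last */
.btn--danger {
    background: red;
}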

Desired behaviors or complex layout needs are often way ahead of platform-supported capabilities. CSS has had to leap from supporting simple documents to fully-fledged, interactive, immersive user interfaces. While CSS has been catching up in recent years, it is still often seen as “behind the times” with respect to tooling and modern web application needs — especially as compared to the rise of Javascript tooling (in particular the usage of transpilation and polyfills, allowing authors to write latest-edition Javascript without waiting for native platform support). There are sometimes limitations that we have to overcome or work around.

For some more context on why CSS is the way that it is: Bert Bos, one of the co-creators of the original CSS spec, wrote a history spanning from 1994 to 2016. Amy Dickens' helpful and empathetic primer on web standards is useful for understanding how web technologies evolve.

A value system for CSS

In attempting to produce deterministic code, we end up fighting with ourselves most of the time. Every technique that’s been developed in relation to “scaling CSS” in modern applications is predicated on overcoming some of the effects of The Cascade.

Jacob Thornton gave a talk in 2015 that runs through some of the techniques folks have adapted for dealing with CSS. Some of the ones he mentions: resets (e.g. Normalize.css), BEM, OOCSS, SMACSS, Atomic CSS, CSS-in-JS.

While the pain of maintaining large CSS codebases is real, engineering our CSS (or any part of the stack, for that matter) isn’t simply about making a bunch of hard decisions around which Shiny New Thing to use. How might we apply a holistic set of principles to manage this complexity?

My former colleague Pete and I found it helpful to think in terms of a value system, which we came up with a few years ago when we were running an internal workshop at ThoughtWorks. We wanted to define what “good” looks like for CSS:

  • Understandable and systematic
  • Modifiable with confidence
  • Proactive, not reactive

Each value builds on the one before it: without understandable CSS, we can’t modify with confidence; if we can’t modify with confidence, each new style rule becomes a reaction to the constraints of the current system. I’ve found this framing so useful as a teaching tool that I’ve continued to use it with all my teams.

Below I’ll highlight a few predominant CSS engineering practices through the lens of these values. I won’t dwell too long on specific implementations or technology choices, as they should be evaluated separately, but I’ll provide some concrete examples to illustrate the point.

Understandable and systematic

It doesn’t take very long for CSS to reach the “ball of mud” stage, especially if our styles are coming from different sources (e.g. frameworks, one-off third-party sources, CSS files, style tags, inline styles, etc.). We should strive for CSS that can be reasoned about.

Below is an example of some “chaotic” CSS. How can we possibly know what is supposed to be the desired behavior? Did we forget to remove the old code, or is there some reason why that’s not possible? What decisions or constraints got us to this place?

/* main.css */
.derp {
    color: black !important;
}

/* vendor.css */
p {
    color: blue;
}
<!-- main.html -->
<p class="derp" style="color:red;">Hello</p>

<style>
    .derp {    
        color: hotpink;
    }
</style>

Practice: style encapsulation

Style encapsulation addresses the issue of the global namespace by giving our code structure and more predictable behavior.

Style Encapsulation

In a component-based architecture, each component should be described by a single source of style rules (e.g., one corresponding CSS file).

The styled output for a given HTML element should be described by one CSS class that is globally unique.

Style encapsulation is a natural extension of a component-based architecture, but a specific UI architecture isn’t necessary in order to achieve it. Some ways we might do this include the Shadow DOM, iframes, BEM, OOCSS, CSS Modules, and CSS-in-JS libraries (such as JSS, Styled Components, or Emotion).

Using CSS modules to achieve style encapsulation might result in a transformation like this:

/* MyComponent.module.css */
.derp {
    color: blue;
}

/* compiled output */
.MyComponent_derp_34jh45kpw {
    color: blue;
}

The hash you see in the class selector above is not known at the time of authorship, so it would be difficult to override that style rule by using the same class name in a CSS declaration elsewhere.
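By contrast, a convention like BEM reaches the same goal purely through naming, with nothing enforcing the uniqueness (the names below are made up):

/* Block, Element, Modifier: uniqueness by convention, not enforcement */
.search-form {
    display: flex;
}

.search-form__input {
    flex: 1;
}

.search-form__submit {
    padding: 0.5rem 1rem;
}

.search-form__submit--disabled {
    opacity: 0.5;
}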

Some strategies (like BEM) require more diligence from the authors and can be hard to enforce. Others provide a greater degree of isolation, but usually at some cost: as an example, choosing to create components via the Shadow DOM API could hurt the accessibility of our application. This is why it’s important to evaluate each technique in the context of the problem we’re trying to solve.

The goal isn’t necessarily 100% total isolation (which is basically impossible anyway thanks to The Cascade) — it’s to make behavior observed in the browser traceable to a single place in the source code. This is far easier to debug, maintain, and understand.

Practice: style linting

A linter can tell us about invalid properties and values, set some restrictions on specificity, limit certain features, and give feedback on code stylistic issues. This can help our codebase look more consistent and make it easier to interpret.
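For instance, a linter such as stylelint can flag problems like the ones below (the rule names in the comments come from stylelint’s built-in rule set; which rules to enable is a team decision):

/* things a linter could flag */
.card {
    colour: blue;              /* property-no-unknown: misspelled property */
    color: #33;                /* color-no-invalid-hex: not a valid hex color */
    z-index: 9999 !important;  /* declaration-no-important, if enabled */
}

#main-content .card {          /* selector-max-id, if specificity limits are configured */
    padding: 0;
}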

It’s too bad there’s no linter that can help detect code smells like magic numbers and brute-forcing. Sometimes limitations in CSS itself have forced developers to take weird measures (just check out the history of the clearfix for a classic example). CSS is particularly susceptible to the circulation of kludgy solutions, and this can seriously hinder people’s ability to learn CSS and develop intuition for it.
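To make the clearfix example concrete: a parent whose children are all floated collapses to zero height, so for years authors forced it to contain its floats with generated content. One common form of the hack looks like this (modern layout methods such as flexbox and grid make it largely unnecessary):

/* the "clearfix" hack: force a parent to contain its floated children */
.clearfix::after {
    content: "";
    display: table;
    clear: both;
}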

At this point I have to give a shout out to the Firefox team for some recent updates to their developer tools, including incredibly handy tooltips to explain why some CSS isn’t doing anything, and crucially, providing links to learn more. This is a great feature that will go a long way to help people learn CSS.

Modifiable with Confidence

Can we add new things easily? Can we make changes to old code without breaking existing behaviors or producing unintended side effects? This goes hand-in-hand with our ability to reason about our CSS.

Because it is impossible to know at the time of authorship what effects the CSS will produce, we can end up with a lot of dead or duplicate code that everyone is afraid to touch. Refactoring CSS is inherently a risky business. If we are able to understand why those rules exist, then we will feel more empowered to make changes to them.

Here’s an example of some unsafe CSS:

/* somefile.css */
.derp {
    color: hotpink;
}
<!-- main.html -->
<p class="derp">
    Hello
</p>

<!-- other-page.html -->
<h1 class="derp">
    I Am Hero
</h1>

The .derp class is not safe to change as we would be affecting the behavior of multiple elements at once. It is difficult and tedious to detect changes manually across multiple components or multiple pages.
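One safer direction, sketched here with made-up names, is to give each usage its own encapsulated class so that either can change independently (at the cost of some duplication, which practices like preprocessing, discussed below, help address):

/* intro.css */
.intro-text {
    color: hotpink;
}

/* hero.css */
.hero-title {
    color: hotpink;
}

<!-- main.html -->
<p class="intro-text">Hello</p>

<!-- other-page.html -->
<h1 class="hero-title">I Am Hero</h1>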

Practice: automated visual regression testing

The only way to have any confidence that the CSS is doing what we expect is by running it in a browser. Inspecting the class name or the style properties in a Javascript unit test provides minimal value.

Automated visual regression testing allows us to take screenshots of components in our application and compare them against reference screenshots checked in to source control. Some example tools are Applitools and BackstopJS.

What about snapshot tests? Jest’s snapshot testing should not be confused with visual regression testing: it compares serialized markup, not rendered pixels, so it can’t tell us whether anything actually looks different. They do not serve the same purpose.

We wouldn’t refactor any other big ball of code without a test harness in place, so it follows that visual regression testing is the only way we can safely refactor CSS and be sure we haven’t unintentionally removed or altered behavior in a completely separate part of the application. A great tool would allow us to target multiple viewports and user agents for better cross-platform coverage.

Proactive, not reactive

If we’ve lost the ability to reason about our code, we’re no longer able to modify it confidently, and thus further changes become reactions to the constraints of the existing system. Reactionary code usually increases in specificity, until we finally just throw down !important as our white flag.

If we have to override a style, it usually means we made some kind of assumption and applied the rule too soon. We value the ability to make proactive decisions about the code we write, rather than reacting defensively to code that already exists and creating unnecessary workarounds.

Here’s an (admittedly silly) example of some reactive CSS:

/* vendor.css */
.derp {
    color: blue;
}

/* main.css */
#omg-must-override > p.derp {
    color: black !important;
}

Sometimes it’s impossible to avoid using !important in a reactionary way, especially when dealing with clashes from vendor CSS. But doing so makes it extremely difficult to override our own rule later if we need different behavior for the same element in a different context.
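When we do have to override vendor CSS, a more proactive version raises specificity only as far as needed and skips !important, leaving room to vary the behavior by context later (the selectors here are illustrative):

/* vendor.css (unchanged) */
.derp {
    color: blue;
}

/* main.css */
.article-body .derp {
    color: black;       /* beats the vendor rule on specificity alone */
}

.article-body--inverted .derp {
    color: white;       /* a different context can still take over */
}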

Practice: preprocessing

CSS on its own doesn’t currently allow for any control flow or complex functions. The only mechanism we have for reusing rule sets is by applying the same class to multiple elements. Preprocessing gives us the language features we need to eliminate the trade-off between reusability and predictability of behavior, and allows us to make proactive design decisions about our CSS architecture.

I put preprocessors (like Less and Sass) and CSS-in-JS in the same “preprocessing” bucket as there is some overlap in the problems they help to solve. (You could even use them together.)

What about PostCSS? PostCSS is not quite like preprocessors or CSS-in-JS solutions. It’s rather in a category of its own: previously referred to as a “post-processor”, it’s probably best described as similar to what Babel does for Javascript. PostCSS has enabled an ecosystem of plugins that allow styles to be transformed with Javascript.

PostCSS is a specific technology choice, and a means to an end. You could implement a lot of the practices I mention in this post by hooking PostCSS into your application’s build tool (e.g. webpack), but it certainly isn’t the only way. Hence, I haven’t called it out as part of a broader engineering strategy.

Here’s a brief example how we could use Sass to reuse core styling logic for buttons:

// _buttons.scss
@mixin button($color, $backgroundColor) {
    color: $color;
    background-color: $backgroundColor;
    padding: 1rem;
    border: none;
}

// CreateThingy.scss
.create-thingy__button {
    @include button(white, blue);
    width: 200px;
}

// DeleteThingy.scss
.delete-thingy__button {
    @include button(white, hotpink);
    width: 100px;
}
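For reference, the compiled output would look roughly like this: each class receives its own copy of the shared declarations, so reuse happens at authoring time without coupling the two components to a common class at runtime.

/* compiled output (approximate) */
.create-thingy__button {
    color: white;
    background-color: blue;
    padding: 1rem;
    border: none;
    width: 200px;
}

.delete-thingy__button {
    color: white;
    background-color: hotpink;
    padding: 1rem;
    border: none;
    width: 100px;
}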

CSS-in-JS has certainly been getting a lot of attention in the last few years while tools have been developing and gaining maturity. It’s a very attractive option for teams, as there are often multiple benefits beyond enabling effective reuse, including but not necessarily limited to: dead code elimination (which, as discussed earlier, may not entirely work as expected), critical CSS extraction, and baked-in style encapsulation. There are plenty of framework-agnostic alternatives, and you also don’t need to learn a special new syntax like you would for a preprocessor — it’s just plain Javascript. However, if your tool of choice must calculate CSS rule sets at runtime, you may be incurring performance costs.

Practice: design systems

Design systems have emerged as a way to effectively reuse visual and interaction patterns at scale. Preprocessing answers: how do we reuse this style in a single codebase? Design systems answer: how do we reuse this style across multiple codebases?

The design system itself is a product in its own right: the result of designers and developers pairing together. It generally contains guiding principles, reusable design assets (such as icons or Sketch symbols), reusable code components, and possibly stylesheets or style utility functions independent of the reusable components. This not only helps wireframing, prototyping, and development go a lot faster, it also helps organizations that manage a large suite of products retain a consistent experience across those products.
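As a small, hypothetical illustration, a design system might publish shared design tokens as a stylesheet of custom properties that each codebase consumes, so products agree on the underlying primitives even when their component implementations differ (the token names are made up):

/* design-system/tokens.css: shared across every codebase */
:root {
    --color-brand: #0052cc;
    --color-danger: hotpink;
    --space-md: 1rem;
}

/* a product consuming the tokens */
.create-thingy__button {
    background-color: var(--color-brand);
    padding: var(--space-md);
}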

Creating centralized tools like design systems and UI component explorers (via tools like Storybook) is a key part of movements like DesignOps. I won’t go deep into the ethos of DesignOps here, but it’s clear that if we really want to scale our CSS, it takes more than engineering practice alone: it requires proactive, cross-disciplinary collaboration.

The future of stylesheets

In case you’re interested in reading more about specific trends, tools, and techniques being used in the wild, the 2019 State of CSS survey is a good place to start. CSS-in-JS will likely continue to be trialed by more teams over the next few years. It’s existed as a concept for about 5 years now, but it’s only in the last couple of years that a greater variety of tools are being developed and are beginning to mature.

Though progress has been slow over the past few years, work continues on the proposals for the CSS Houdini APIs: low-level Javascript APIs that expose parts of the browser’s rendering engine. These would allow developers to effectively create their own CSS features that run as fast as native CSS, without having to wait for browser adoption. When that eventually happens, I’m sure an explosion of libraries creating CSS extensions will appear on the scene. Some are already starting to appear for the Paint API.

Another space to watch is the growing set of capabilities and tooling for cross-platform stylesheets. Many organizations need to support multiple platforms, and with that comes increased friction in trying to keep feature parity and a consistent experience across all of them. Today there are a number of cross-platform frameworks that let teams “write once, run anywhere” (Xamarin, Ionic, PhoneGap, React Native, Flutter, Electron, et al.), but these may not be the right choice for every team for any number of reasons.

What would be really interesting is if a single design system could house design resources, components, and stylesheets that you could use in your application, regardless of your target platforms and your technology choices. Facebook’s Yoga seems to be an early (but still immature) entrant to this category, and it’s clear that the folks working on Houdini are interested in capturing lessons learned from tools like Yoga. I think it’s a long way off, but it’s certainly an interesting problem to keep an eye on.

Adapting to the future

I’ve said it before, but it’s worth repeating: a set of values gives us a guiding philosophy for what matters most, and motivates us to adopt new practices or abandon ones that aren’t working. Values also help us communicate better with each other, through both face-to-face conversations and the code we write.

I’ve found enormous benefit in using these values to onboard other developers to the way we engineer our CSS: everyone can nod their head and agree that we want understandable, systematic, modifiable, and proactive CSS. The mechanics of how to achieve that are entirely up to each team, and will continue to evolve as the specs and the tooling around CSS evolve.