How do you manage non-functional requirements?

If you are in the software industry, you’re already familiar with the concept of non-functional requirements (NFRs). Still, organizations fail to deliver because of them.

It is very unlikely that an application fails to do what it was conceived to do. A trading application can trade and a spreadsheet application can perform calculations, but there are aspects that cannot be captured as “features” that may compromise the effectiveness of the software.

How many times have you seen software that is good on paper, yet is slow, does not run on your favorite Operating System (OS), or does not handle the load?

There are many distinct types of non-functional requirements. Some of them are constraints (like the OS or cloud vendor) and are typically captured during the inception of a project. They become general directions that pervade the project’s guidelines, so they aren’t usually a problem.

It doesn’t mean that they are immutable and cannot change (“Hey, remember I told you that the customer is on Azure? Sorry, they switched to Google Cloud last week”), but they are clear in everybody’s mind.

Subtler types of NFRs are related to:

  • Performance
  • Scalability
  • Resiliency
  • Security

In many cases, they are not even addressed during the collection of the requirements. If the organization does discuss them, it’s typically in very general terms.

In the best case, there are indications like “must be fast” or “must be resilient.”

NFRs are like any other requirement: they can be stated precisely, they are specific, and they have their own acceptance criteria.

It is widely accepted that non-functional requirements are “attributes” of the software and that they cannot be expressed as stories.

I’m not sure how the notion got traction, but it’s far from the truth and this characterization hides a flaw in the development process.

The Product Owner bias

In an agile organization, the Product Owner is the person responsible for the product. To simplify, this is the person who decides what the product should do and sets the priorities surrounding the product. A Product Owner knows the users and owns the user stories.

As development is nowadays driven by user stories, it is only natural that NFRs are perceived as “optional” or an afterthought, something that can be adjusted along the way.

So, this is what happens:

  • Product Owners are often not interested in NFRs, although they affect the users
  • The implementation of the NFRs is not tracked and measured (no story or story-points, hence the time you spend on them is not visible)
  • The specification for the NFRs is not driven by businesspeople (the product owners), hence they are not guaranteed to fit the requirement or be what the user wants.
    Did you ever hear “let’s do benchmarking” at the end of the project?

When you are in this situation, it’s predictable that you will address some of the NFRs as “quality issues” and handle them at the end of the development process—when it’s too late. NFRs can be a stronger driver for the architecture than the functional details, so they should be addressed at the beginning rather than at the end.

For example, you can always change the way you calculate prices, but an application that is not designed to scale horizontally will never do so.

So “must be fast” is not enough. It needs to be specific:

  • What operation needs to be fast and how fast?
  • How long do we have to perform a given operation?
  • What is the acceptable range and how many outliers can I afford?
  • How do I measure and track it?
  • What should I do if it takes longer?
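To show what “specific” can look like, here is a minimal Python sketch that turns the questions above into a checkable acceptance criterion. Everything in it is an assumption made for illustration: the `LatencyRequirement` name, the nearest-rank percentile method, the 500 ms threshold, and the sample measurements.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyRequirement:
    """A hypothetical, business-agreed latency criterion for one operation."""
    operation: str       # what operation needs to be fast
    threshold_ms: float  # how fast it needs to be
    percentile: int      # share of requests that must meet the threshold


def percentile_value(samples, percentile):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def meets(requirement, samples_ms):
    """True when the measured percentile is within the agreed threshold."""
    measured = percentile_value(samples_ms, requirement.percentile)
    return measured <= requirement.threshold_ms


# Illustrative numbers only: 95% of credit checks within 500 ms,
# which leaves room for one outlier in every twenty requests.
credit_check = LatencyRequirement("credit_check", threshold_ms=500, percentile=95)
measured = [120, 180, 210, 250, 300, 310, 330, 350, 400, 420,
            430, 440, 450, 460, 470, 480, 490, 495, 499, 2500]
print(meets(credit_check, measured))  # True
```

The single 2,500 ms outlier does not fail the criterion here, because the requirement says how many outliers the business can afford. Asking for the 100th percentile instead would make the same data fail.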

Latency and throughput are very much business concerns and are extremely specific. A latency-critical operation must be quantifiable, and its target value is business-driven. What is the acceptable response time for a credit check? This is obviously a business consideration.

The same goes for throughput. How many trades per second do I have to deal with? In case of a burst of X trades (where X is information that is very business-specific), what is the acceptable time to deal with the peak?
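The burst question comes down to simple arithmetic once the business provides the numbers. The sketch below is a simplified model, not a sizing tool: it assumes a constant processing capacity and a constant steady flow, and the function name and all figures are invented for illustration.

```python
def burst_drain_seconds(burst_size, capacity_per_s, steady_rate_per_s=0.0):
    """Seconds needed to clear a backlog of burst_size trades.

    Only the capacity left over after serving the steady flow can be
    spent on the backlog (a simplifying assumption).
    """
    spare = capacity_per_s - steady_rate_per_s
    if spare <= 0:
        raise ValueError("no spare capacity: the backlog never drains")
    return burst_size / spare


# Illustrative numbers: a burst of 10,000 trades, a system that handles
# 2,500 trades/s, and a steady flow of 500 trades/s during the burst.
print(burst_drain_seconds(10_000, 2_500, 500))  # 5.0
```

Whether five seconds is acceptable is exactly the kind of answer that has to come from the business.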

Resiliency is also a concern that is tightly related to business. There is a level that is purely technical and the development team should be able to handle it, but being resilient is about handling failures. What to do in case of failures is a business decision, rather than a technical one.

If the Pricer in a trading application fails to provide a quote in time, what do you do? Retry? Raise a warning to the user and abort the trade? Wait longer and flag the operation as “late”?

You can see that this does not look like an attribute. This is a full-fledged user story, with a user who wants to trade, an event that triggers an error situation, and an expected result.
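One way to make such a story concrete is to write the business decision down as an explicit policy that the code merely executes. The following Python sketch is hypothetical: the names, the action strings, and the choice to abort once the retry budget is spent are my assumptions, not a description of any real Pricer.

```python
from enum import Enum


class OnLateQuote(Enum):
    """The options for a late quote; picking one is a business decision."""
    RETRY = "retry"
    ABORT_WITH_WARNING = "abort_with_warning"
    WAIT_AND_FLAG_LATE = "wait_and_flag_late"


def handle_quote(elapsed_ms, deadline_ms, policy, attempt=1, max_attempts=3):
    """Return the action to take for one quote request."""
    if elapsed_ms <= deadline_ms:
        return "accept_quote"
    if policy is OnLateQuote.RETRY:
        # Assumption: abort and warn once the retry budget is spent.
        return "retry" if attempt < max_attempts else "abort_and_warn_user"
    if policy is OnLateQuote.ABORT_WITH_WARNING:
        return "abort_and_warn_user"
    return "accept_quote_flagged_late"


print(handle_quote(150, 100, OnLateQuote.RETRY))               # retry
print(handle_quote(150, 100, OnLateQuote.WAIT_AND_FLAG_LATE))  # accept_quote_flagged_late
```

Each branch maps to an expected result that the Product Owner can read, challenge, and sign off.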

The idea that NFRs are not user stories and that the Product Owner is not interested begins to sound less obvious. The entire point is that non-functional requirements impact the users as much as the functional requirements, so there is no reason to consider them differently.

What do we do then?

How do you solve the problem? We learn how to express non-functional requirements as user stories, and we educate development teams and product owners to collaborate and address them.

I will give a few guidelines on how to address some performance and resiliency requirements.

First, the discussion must include representatives of both the business (such as a Product Owner) and the development team (such as an Architecture Owner).

As for resiliency, the trick is to deal with failures. The architecture of a system, especially when using microservices, implicitly defines a value stream where there is an input (a service that receives an event or a request), a sequence of steps performed by the different microservices, and finally the output.

Once you have drawn a diagram, it is easy to identify all the operations and that’s your list of potential failures. Just write the list and ask the businessperson, “what do you expect to happen if this step does not work?”

This is a great exercise because it automatically produces acceptance criteria for failures, educates the product owners to think in terms of failure management (and they get a sense of how the system works), and the development team can learn about what the users do for a living.
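The exercise can be tracked with something as small as a list and a dictionary. In this sketch the value stream, its step names, and the recorded answers are all invented for illustration. The point is only that every step without an agreed failure behavior is an open question for the business.

```python
# Hypothetical value stream for a trade; the step names are invented.
value_stream = ["receive order", "check credit", "request quote", "book trade"]

# Answers collected from the businessperson so far (illustrative).
expected_on_failure = {
    "check credit": "reject the order and tell the user why",
    "request quote": "retry once, then abort the trade with a warning",
}


def unanswered(steps, answers):
    """Steps that still have no agreed behavior in case of failure."""
    return [step for step in steps if step not in answers]


def failure_questions(steps):
    """One question per step, phrased for the businessperson."""
    return [f"What do you expect to happen if '{step}' does not work?"
            for step in steps]


for question in failure_questions(unanswered(value_stream, expected_on_failure)):
    print(question)
```

Each recorded answer is a candidate acceptance criterion for a failure story.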

Performance dimensions are trickier and require more interaction, and probably some research.

The first step is to identify the performance dimensions that are relevant for each functional story (latency, throughput, and resource utilization are the most common ones).

Once the dimensions are clear, the acceptance criteria must be identified. In some cases, it is useful to understand where the competition stands and provide thresholds to match the requirement (to make the customer happy) and to beat the competition.

For example, imagine discussing a trade entry feature, where the user manually feeds trades using a form.

The performance dimensions associated with the story can be:

  • Responsiveness: when the user commits the trades, how long do I expect to wait before the trade is in the system and I see my position updated?
  • Load: how many concurrent users do I expect? Scaling up the number of users must not affect the other dimensions (like responsiveness). In other words, within the range of expected users plus a margin, latency must be a constant.
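The load criterion above can be phrased as a check on load-test results. This is a minimal sketch under stated assumptions: “constant” is read as “within a tolerance of the lightest-load measurement”, the 10% tolerance is arbitrary, and the function name and measurements are invented.

```python
def latency_is_flat(p95_ms_by_users, max_users, tolerance_pct=10):
    """Check that latency stays flat up to max_users concurrent users.

    max_users is the expected number of users plus the agreed margin.
    Measurements taken beyond max_users are ignored.
    """
    in_range = {users: ms for users, ms in p95_ms_by_users.items()
                if users <= max_users}
    if not in_range:
        raise ValueError("no measurements inside the expected range")
    baseline = in_range[min(in_range)]  # latency at the lightest load
    return all(ms * 100 <= baseline * (100 + tolerance_pct)
               for ms in in_range.values())


# Illustrative load-test results: concurrent users -> 95th percentile in ms.
measurements = {10: 200, 50: 205, 100: 210, 120: 215, 200: 400}

# 100 expected users plus a 20% margin gives 120.
print(latency_is_flat(measurements, max_users=120))  # True
print(latency_is_flat(measurements, max_users=200))  # False
```

The degradation at 200 users is not a failure of the story, because it sits outside the range the business asked for. It does tell the Ops team where the current sizing stops working.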

The initial story where the trade entry feature is described must then be enriched or split to capture the performance requirements.

Notice that the discussion may require some investigation and some iterations with the Product Owner.

Take the responsiveness requirement, for example: is it 100 ms, 1 s, or 1 ms? The developers may have to mock up the three different cases and let the Product Owner try them before deciding which one is best. Why not go for 1 ms? Because the faster we want it, the more expensive and riskier the development.

The same goes for the load. It entails knowing how the customer operates and has a direct impact on the scalability model of the application, so it will also require a new story to enable the Ops team (which is a perfectly legitimate user) to size the system.

There are interesting side effects of seeing NFRs as stories. One is that the discussion leads to analyzing several “what-ifs.” When the Product Owner understands that there will be outliers, it is only natural to draft stories for them. “In case the response time of the trade entry screen is higher than 2 seconds, a banner should inform the user that the operation is taking longer than usual and to wait for the banner to close.”
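A story like this one translates almost word for word into a testable rule. The sketch below is hypothetical (the function name and the banner text are my own) and only covers the decision of what to show, not the user interface itself.

```python
def banner_state(elapsed_s, completed, threshold_s=2.0):
    """Return the banner text to show, or None when no banner is needed."""
    if completed:
        return None  # the banner closes as soon as the operation finishes
    if elapsed_s > threshold_s:
        return "This operation is taking longer than usual. Please wait."
    return None


print(banner_state(1.5, completed=False))  # None
print(banner_state(2.5, completed=False))  # banner text
print(banner_state(3.0, completed=True))   # None
```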

This kind of feature does not change the nature of the software, but certainly improves the user experience, or at least the perception of quality.

Another positive side effect is that we can have stories to measure and monitor failures, so the development team and the business can discuss what KPIs (key performance indicators) should be collected to quantify and track the exceptions. This will clearly improve observability and eventually reduce the cost of ownership.

If you produce software, make sure your development process is transparent about non-functional requirements. If you use software, make sure your vendor or provider handles NFRs correctly. When they do, they will be happy to discuss the details and show you everything you need to see.
