The choice between CVSS and FMEA is a choice between estimating a vulnerability and a risk.
First things first, the title of this blog is a misnomer: while both methods share the goal of improving safety and security, CVSS and FMEA are not alternatives for the same purpose but two distinct methods that measure different things. That said, CVSS and FMEA are complementary methods.
The Common Vulnerability Scoring System (CVSS) is designed to measure the severity of a vulnerability and should not be used to assess risk. In fact, the CVSS 3.1 documentation specifically states:
“The CVSS Base Score represents only the intrinsic characteristics of a vulnerability which are constant over time and across user environments. […] More appropriately, a comprehensive risk assessment system should be employed that considers more factors than simply the CVSS Base Score. Such systems typically also consider factors outside the scope of CVSS such as exposure and threat.” (CVSS 3.1 User Guide)
This is where the Failure Modes and Effects Analysis (FMEA) comes in. Granted, there are alternative methods for identifying threats and assessing risks, but FMEA has been used since the 1950s to study problems that might arise from malfunctions in systems. The method was originally developed for the military, but it has since been widely adopted across many industries, perhaps most notably by the automotive industry.
Common Vulnerability Scoring System
“The Common Vulnerability Scoring System (CVSS) provides a way to capture the principal characteristics of a vulnerability and produce a numerical score reflecting its severity. The numerical score can then be translated into a qualitative representation (such as low, medium, high, and critical) to help organizations properly assess and prioritize their vulnerability management processes.” (First.org CVSS)
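The translation from a numerical score to a qualitative representation can be sketched in a few lines of Python. The thresholds follow the qualitative severity rating scale of the CVSS 3.1 specification; the function name `severity_rating` is only an illustrative choice.

```python
def severity_rating(score: float) -> str:
    """Map a CVSS 3.x score (0.0 to 10.0) to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Print the rating for a handful of sample scores.
for sample in (0.0, 3.9, 6.1, 8.9, 9.8):
    print(sample, severity_rating(sample))
```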
CVSS 3.1 is composed of three groups of metrics: (1) The Base Metric Group, (2) Temporal Metric Group, and (3) Environmental Metric Group.
The Base Metrics produce a score ranging from 0 to 10 representing the intrinsic characteristics of a vulnerability that are constant over time and across user environments. They are divided into Exploitability metrics and Impact metrics. The Exploitability metrics reflect the ease and technical means by which the vulnerability can be exploited; these are the characteristics of the thing that is vulnerable. The Impact metrics, on the other hand, reflect the direct consequence of a successful exploit; these are the effects experienced by the thing that suffers the impact.
When scoring base metrics, it is assumed that the attacker has advanced knowledge of the weaknesses of the target system, including general configuration and default defence mechanisms. After all, it is better to be prepared for the worst situation instead of placing one’s bet on the attacker being incompetent and uninformed.
The Base Metrics score is modified by the Temporal Metrics, which capture factors that may change over time but not across user environments. For example, consider a zero-day vulnerability: initially the vulnerability exists but has not yet been discovered, so at best it can be considered theoretical, if it is considered at all. Eventually the vulnerability is discovered and there are rumours of existing exploits, or perhaps there is a proof-of-concept for the exploit, but the vulnerability is neither widely known nor widely exploited. As time goes by, the vulnerability becomes widely known, and this again changes the vulnerability estimate. The remediation goes through a similar cycle: initially there is no fix, but over time there may be workarounds and patches until (hopefully) the vulnerability is properly fixed.
The Temporal Metrics are further modified by the Environmental Metrics. These are factors, or rather modifiers, that are unique to a specific user environment. For example, a server uses a software component that has a known vulnerability with a high Confidentiality Impact in the Base Metrics, but since this particular user environment does not contain any confidential user data, the vulnerability can be considered less serious. Another user environment might place a very high importance on data integrity (e.g. a banking system), so even minor vulnerabilities must be taken more seriously.
When estimating CVSS vulnerability scores it is recommended to use the First.org CVSS Calculator.
Failure Modes and Effects Analysis
“The objective of an FMEA is to look for all of the ways a process or product can fail. A product failure occurs when the product does not function as it should or when it malfunctions in some way. Even the simplest products have many opportunities for failure.” (The Basics of FMEA, 2nd Edition, by McDermott, Mikulak & Beauregard)
As a method, FMEA is fairly straightforward. Begin by identifying the product’s constituent Components and Functions. For each component there are almost certainly one or more ways it might fail, and each of these Failure Modes comes with one or more potential effects. The severity, i.e. the seriousness, of each effect is estimated as a value between 1 and 10, from least severe to most severe.
Each failure has one or more potential causes or failure mechanisms. These causes are evaluated in terms of how likely the failure event is to occur, i.e. the probability that the component’s failure mode is triggered. The occurrence is estimated as a value between 1 and 10, from almost never happening to almost certain to happen.
Knowing the potential causes for a failure mode, it should be possible to identify what (if any) controls there are to prevent the failure mode from happening as well as to identify the current means to detect the failure when it happens. The detection of failure is estimated as a value between 1 and 10, from plainly obvious to nearly impossible to detect.
A Risk Priority Number (RPN) is calculated for each failure mode:
RPN = severity x occurrence x detection.
RPN is a subjective value so it cannot be used to compare FMEAs. It is also a reflection of the composition of the group of people who worked on the given FMEA. However, by calculating RPN for each failure mode it is possible to assess and prioritise risks, refine product requirements and come up with means to either eliminate risks or at least to mitigate the effects. These mitigating actions are the Recommended Actions.
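As a minimal sketch, the RPN calculation can be written as a small Python function. The name `risk_priority_number` and the range check are illustrative assumptions; the only thing taken from the method itself is the product of three ratings on a 1 to 10 scale, which gives an RPN between 1 and 1000.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Return severity x occurrence x detection, each rated from 1 to 10."""
    ratings = {"severity": severity, "occurrence": occurrence, "detection": detection}
    for name, value in ratings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


print(risk_priority_number(1, 1, 1))     # lowest possible RPN
print(risk_priority_number(10, 10, 10))  # highest possible RPN
```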
FMEA should be performed at various points of a project. Each round looks at what has been done to address the failure modes and then recalculates the RPN as an indicator of the residual risk. If the residual risk remains unacceptably high, more actions should be taken until the risk is either eliminated or brought down to an acceptable level: limit the severity of the failure mode’s effects, decrease or eliminate the occurrence, and/or improve the detectability.
Example Case: Fire Extinguisher
Failure Modes and Effects Analysis can be performed multiple times during the product’s development life cycle. It can be done to take a critical look at the concept of a fire extinguisher, which refines the product design. FMEA could be performed for each product design review and/or when a prototype is being tested. A final FMEA could be performed at the end of the project to estimate the final, residual risk for a commercial product.
The Common Vulnerability Scoring System, on the other hand, is usually best applied when there is an actual construct such as a prototype or commercial product; while it is possible to look at a product concept or design and identify potential vulnerabilities, much depends on the actual implementation. A poor implementation will introduce exploitable bugs into the product regardless of the design’s level of quality.
So, let us consider a fire extinguisher. It has multiple components, including
- a hose (usually made of rubber or fabric)
- a container (usually made of aluminium or steel)
- fire suppressing substance (under pressure inside the container)
- a pressure gauge (some models might not have a gauge at all, which is also a feature)
- a handle
- and so on…
Consider the hose. It is possible that there is an obstruction inside the hose that either severely limits or completely prevents the product from being used. It is also possible that the hose’s material deteriorates over time so that it breaks off when pressure is applied, but at least it should not completely prevent the product from being used although the performance would likely be somewhat limited. The container’s material or manufacturing process could be such that the internal pressure eventually causes a seam to leak. The pressure gauge might be incorrectly calibrated, or the gauge hand could be stuck, but either case would have only minimal effect on the product’s intended use.
Severity of an obstructed hose (9/10): an obstruction would limit or completely prevent the flow through the hose, which would be a severe effect for a failure mode as it prevents the product from being used – the severity is emphasised when considering when and how the product is intended to be used.
Occurrence of an obstructed hose (2/10): having an obstruction in a hose is something that almost never happens, but there is no absolute guarantee that it would never happen under any circumstances.
Side note: since gas cools as it expands, it is theoretically possible for a CO2 extinguisher to freeze in continuous use. While the freezing gas could create an obstruction, it is a separate failure mode.
Detection of an obstructed hose (7/10): it is not likely that an obstructed hose would be detected before an attempt to use the fire extinguisher, and even then the initial assessment might be that the container is empty.
The Risk Priority Number for an obstructed hose is
9 x 2 x 7 = 126.
Severity of a stuck pressure gauge hand (2/10): having a stuck gauge hand has no effect when the fire extinguisher is activated. However, it would have a compounding effect when combined with another failure mode, a leaking container. There might not be any pressure inside the container while a stuck pressure gauge hand indicates differently.
Occurrence of a stuck pressure gauge hand (3/10): let’s assume that this is a moderately common occurrence, considering that a pressure gauge hand almost never moves between filling up and emptying the container.
Detection of a stuck pressure gauge hand (4/10): a stuck gauge hand would be virtually undetectable while the fire extinguisher is stored in a wall mount, but we can assume that the extinguisher is being regularly maintained as recommended, so the faulty gauge should be easily detected during the inspection process.
The RPN for a stuck pressure gauge hand is
2 x 3 x 4 = 24.
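The two worked examples can be put side by side and ranked by RPN. The sketch below uses the ratings given above; the `FailureMode` class and its field names are hypothetical helpers, not part of any FMEA standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    component: str
    description: str
    severity: int    # 1 (least severe) to 10 (most severe)
    occurrence: int  # 1 (almost never) to 10 (almost certain)
    detection: int   # 1 (plainly obvious) to 10 (nearly impossible to detect)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


failure_modes = [
    FailureMode("pressure gauge", "stuck gauge hand", severity=2, occurrence=3, detection=4),
    FailureMode("hose", "obstruction", severity=9, occurrence=2, detection=7),
]

# Highest RPN first: these are the failure modes to address first.
ranked = sorted(failure_modes, key=lambda fm: fm.rpn, reverse=True)
for fm in ranked:
    print(f"{fm.component}: {fm.description} -> RPN {fm.rpn}")
```

With these ratings the obstructed hose (126) ranks well above the stuck gauge hand (24).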
So, what to do next? Should the manufacturer invest time and money to acquire pressure gauges that are less likely to malfunction or that might have some nifty testing feature that allows users to verify pressure levels between scheduled inspections? Should the manufacturer call the pressure gauge “good enough” and pay attention to the hose? As long as there is a hole through the hose there can be obstructions, but would there be any way to mitigate the effects of having an obstructed hose?
Assuming the initial RPN was calculated during the design phase, the first prototype might have incorporated the recommended design changes. Another FMEA would look at the prototype’s hose and calculate a revised RPN by re-estimating the severity, occurrence and detection scores. If the revised RPN is still above the acceptable level of risk, the cycle of review and refine should be repeated until the remaining risk is at or below the acceptable level.
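The review-and-refine cycle can be sketched as a simple check against a threshold. Note that both the threshold of 100 and the second-round ratings below are hypothetical values chosen purely for illustration; only the first-round ratings (9, 2, 7) come from the example above.

```python
ACCEPTABLE_RPN = 100  # hypothetical threshold; an actual project would define its own


def rpn(severity: int, occurrence: int, detection: int) -> int:
    return severity * occurrence * detection


def first_acceptable_round(rounds, acceptable=ACCEPTABLE_RPN):
    """Return the 1-based number of the first FMEA round whose RPN is at or
    below the acceptable level, or None if no round gets there."""
    for number, ratings in enumerate(rounds, start=1):
        if rpn(*ratings) <= acceptable:
            return number
    return None


hose_rounds = [
    (9, 2, 7),  # design phase, ratings from the example above (RPN 126)
    (9, 2, 4),  # prototype, HYPOTHETICAL re-estimate with improved detection (RPN 72)
]

print(first_acceptable_round(hose_rounds))
```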
Consider the hose again, this time from the CVSS point of view. How vulnerable is it when an attacker intends to sabotage the fire extinguisher?
Attack Vector (AV:P) – PHYSICAL: the attack requires physical access to the fire extinguisher.
Attack Complexity (AC:L) – LOW: the attack is as simple as stuffing a hose with arbitrary crud.
Privileges Required (PR:N) – NONE: considering strictly the attacker having physical access to a fire extinguisher, there are no authentication or authorization measures.
User Interaction (UI:N) – NONE: the attacker has no need for user involvement in order to create an obstruction inside the hose.
Scope (S:U) – UNCHANGED: it could be argued that the real target of the attack is the thing that ends up being on fire. For example, the attacker is an arsonist who intends to set a building on fire and prepares by sabotaging local fire extinguishers. However, just to keep this scenario simple let’s assume the attacker is being a complete nob without any grand designs…
Confidentiality (C:N) – NONE: having an obstructed hose has no impact on confidentiality of information.
Integrity (I:H) – HIGH: to be fair, CVSS refers to Integrity as the trustworthiness of information. In the somewhat tortured context of this example, let us assume that integrity refers to the trustworthiness of any fire extinguisher in the area: if one was sabotaged, then all might have been?
Availability (A:H) – HIGH: CVSS refers to Availability as the availability of the impacted components. In the context of this example, having an obstructed hose would result in a complete loss of the fire extinguisher’s availability. In other words, it would not function.
The Base Score of 6.1 indicates a Medium severity vulnerability.
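The Base Score can be reproduced from the CVSS 3.1 equations for the Scope Unchanged case. The sketch below is a simplified illustration rather than a full calculator: it only handles Scope Unchanged, the constant names are illustrative, and the numeric weights are those the CVSS 3.1 specification assigns to the metric values chosen above.

```python
def roundup(value: float) -> float:
    """CVSS 3.1 Roundup: smallest number with one decimal place that is
    equal to or higher than the input, using the integer-based variant
    from the specification to avoid floating-point surprises."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (int_input // 10000 + 1) / 10.0


# Weights for the metric values used in this example.
AV_PHYSICAL = 0.2
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
IMPACT_NONE = 0.0
IMPACT_HIGH = 0.56


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    iss = 1 - (1 - c) * (1 - i) * (1 - a)   # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_PHYSICAL, AC_LOW, PR_NONE, UI_NONE,
    IMPACT_NONE, IMPACT_HIGH, IMPACT_HIGH,
)
print(score)
```

Most of the score comes from the impact side (about 5.2); physical access keeps the exploitability contribution below 1.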
Initial vector string:
Exploit Code Maturity (E:H) – HIGH: while CVSS refers to exploit code, this metric is about the maturity of exploit techniques and how well known the vulnerability is. In the case of stuffing a hose, it is safe to say that the vulnerability is widely known and there are any number of ways to do this…
Remediation Level (RL:W) – WORKAROUND: there are various workarounds, like placing the fire extinguisher inside a locked cabinet or intentionally covering the hose to prevent somebody from maliciously blocking it, but beyond these there is not much that can be done to prevent this kind of attack.
Report Confidence (RC:C) – CONFIRMED: yes, it is confirmed! It is indeed possible to obstruct a hose by physically preventing any kind of flow from going through it.
The Temporal Score is 6.0, indicating practically no change to the vulnerability over time.
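The Temporal Score follows from the Base Score with one multiplication per temporal metric. A minimal sketch, assuming the CVSS 3.1 weights for the values chosen above (the constant names are illustrative):

```python
def roundup(value: float) -> float:
    """CVSS 3.1 Roundup to one decimal place (always rounds up)."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (int_input // 10000 + 1) / 10.0


E_HIGH = 1.0          # Exploit Code Maturity: High
RL_WORKAROUND = 0.97  # Remediation Level: Workaround
RC_CONFIRMED = 1.0    # Report Confidence: Confirmed


def temporal_score(base_score, e, rl, rc):
    return roundup(base_score * e * rl * rc)


print(temporal_score(6.1, E_HIGH, RL_WORKAROUND, RC_CONFIRMED))
```

Only the Remediation Level weight differs from 1.0 here, which is why the score barely moves.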
Revised CVSS vector string:
Confidentiality Requirement (CR:X) – NOT DEFINED: there are no reasonable requirements for the confidentiality of information. This metric has no influence on the Environmental Score.
Integrity Requirement (IR:H) – HIGH: considering this example’s refined interpretation of integrity, the requirement would be high for the trustworthiness of all fire extinguishers in this particular user environment.
Availability Requirement (AR:H) – HIGH: considering this example’s refined interpretation of availability, the requirement would be high that all fire extinguishers would be operable in this particular user environment.
The Environmental Score is 6.6, which indicates that a vulnerability becomes more severe when the product is being used in a critical environment.
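The Environmental Score combines the requirement weights with the (here unmodified) base metrics and the temporal metrics. The following sketch covers only the Scope Unchanged case of the CVSS 3.1 equations; the function and parameter names are illustrative.

```python
def roundup(value: float) -> float:
    """CVSS 3.1 Roundup to one decimal place (always rounds up)."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (int_input // 10000 + 1) / 10.0


def environmental_score_scope_unchanged(
    mav, mac, mpr, mui,   # modified exploitability weights
    mc, mi, ma,           # modified impact weights
    cr, ir, ar,           # security requirement weights
    e, rl, rc,            # temporal weights
):
    # Modified impact sub-score, capped at 0.915 as in the specification.
    miss = min(1 - (1 - cr * mc) * (1 - ir * mi) * (1 - ar * ma), 0.915)
    modified_impact = 6.42 * miss
    modified_exploitability = 8.22 * mav * mac * mpr * mui
    if modified_impact <= 0:
        return 0.0
    combined = roundup(min(modified_impact + modified_exploitability, 10))
    return roundup(combined * e * rl * rc)


score = environmental_score_scope_unchanged(
    mav=0.2, mac=0.77, mpr=0.85, mui=0.85,   # AV:P, AC:L, PR:N, UI:N (not modified)
    mc=0.0, mi=0.56, ma=0.56,                # C:N, I:H, A:H (not modified)
    cr=1.0, ir=1.5, ar=1.5,                  # CR:X, IR:H, AR:H
    e=1.0, rl=0.97, rc=1.0,                  # E:H, RL:W, RC:C
)
print(score)
```

The High requirements push the modified impact sub-score up to its 0.915 cap, which is what lifts the result above the Temporal Score.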
Final revised CVSS vector string:
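The initial, revised and final vector strings of this example follow mechanically from the metric values chosen above. A minimal sketch that assembles them is shown below; the helper `vector_string` is hypothetical, and it leaves out metrics set to Not Defined (X), which the CVSS 3.1 specification allows since omitted optional metrics are treated as Not Defined.

```python
BASE = [("AV", "P"), ("AC", "L"), ("PR", "N"), ("UI", "N"),
        ("S", "U"), ("C", "N"), ("I", "H"), ("A", "H")]
TEMPORAL = [("E", "H"), ("RL", "W"), ("RC", "C")]
ENVIRONMENTAL = [("CR", "X"), ("IR", "H"), ("AR", "H")]


def vector_string(*groups, version="3.1"):
    """Join metric abbreviations and values into a CVSS vector string,
    skipping optional metrics whose value is Not Defined (X)."""
    parts = [f"CVSS:{version}"]
    for group in groups:
        parts += [f"{metric}:{value}" for metric, value in group if value != "X"]
    return "/".join(parts)


initial = vector_string(BASE)
revised = vector_string(BASE, TEMPORAL)
final = vector_string(BASE, TEMPORAL, ENVIRONMENTAL)

print(initial)
print(revised)
print(final)
```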
Modified Base Score
This example does not modify the Base Score.
For example, the Privileges Required could be modified from NONE to LOW by placing the fire extinguisher inside a locked cabinet: the attacker would need to have a key to access the fire extinguisher. Alternatively, the Attack Complexity could be modified from LOW to HIGH if the attacker would need to pick the cabinet lock in order to access the fire extinguisher without leaving a trace (as would be the case when brute force is applied).
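To get a feel for how much such modifications would matter, the Environmental Score can be recomputed with a Modified Privileges Required or Modified Attack Complexity value. This is a what-if sketch only: it hard-codes the other metric weights of this example (Scope Unchanged), and the function name and keyword arguments are illustrative.

```python
def roundup(value: float) -> float:
    """CVSS 3.1 Roundup to one decimal place (always rounds up)."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (int_input // 10000 + 1) / 10.0


# CVSS 3.1 weights (Scope Unchanged).
PRIVILEGES_REQUIRED = {"N": 0.85, "L": 0.62, "H": 0.27}
ATTACK_COMPLEXITY = {"L": 0.77, "H": 0.44}


def hose_environmental_score(mac="L", mpr="N"):
    """Environmental score for the hose example with optional overrides for
    Modified Attack Complexity (mac) and Modified Privileges Required (mpr)."""
    # MAV:P = 0.2 and MUI:N = 0.85 stay as in the base metrics.
    modified_exploitability = (
        8.22 * 0.2 * ATTACK_COMPLEXITY[mac] * PRIVILEGES_REQUIRED[mpr] * 0.85
    )
    # C:N, I:H, A:H with CR:X (1.0), IR:H (1.5), AR:H (1.5).
    miss = min(1 - (1 - 1.0 * 0.0) * (1 - 1.5 * 0.56) * (1 - 1.5 * 0.56), 0.915)
    modified_impact = 6.42 * miss
    combined = roundup(min(modified_impact + modified_exploitability, 10))
    # E:H (1.0), RL:W (0.97), RC:C (1.0).
    return roundup(combined * 1.0 * 0.97 * 1.0)


print(hose_environmental_score())          # no modification
print(hose_environmental_score(mpr="L"))   # locked cabinet, key required
print(hose_environmental_score(mac="H"))   # lock must be picked
```

Under these assumptions the score drops only slightly (6.6 to 6.5 or 6.3), because the impact side of the equation dominates.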
When considering the security and safety of a product, it makes sense to apply more than just one method. CVSS allows us to estimate a product’s vulnerability to malicious actions such as wanton sabotage, but it provides no insight into the related risks or even into what mechanisms the attacker might try to exploit.
FMEA provides a method to systematically analyse the product by identifying relevant components, how those components might fail and what the consequences might be if and when something does fail. This in turn encourages us to come up with ways to either prevent the failure or at least to mitigate the effects of that failure. Furthermore, FMEA helps to identify and prioritise more severe issues over less important ones.
By understanding the failure mechanisms, we can identify attack vectors along with the other related vulnerability metrics. Ultimately both FMEA and CVSS help to revise the product’s requirements, which in turn directly influences the overall design and quality assurance work.