Using Technology Readiness Levels and System Architecture to Estimate Integration Risk

By Steven D. Eppinger, ScD, Tushar Garg, Nitin Joglekar, PhD, and Alison Olechowski, PhD

The challenge: Risk management is one of the most critical activities in new product development. Improper or insufficient risk identification practices can result in unanticipated schedule overruns, significant rework, budget inflation, and reduced capability for delivering the project’s chartered scope. Although several decision support tools exist to help project managers identify and mitigate risks, few explicitly consider the impact of a system’s architecture.

The approach: This article describes a practical risk identification tool that can be used by engineers and technical managers on projects involving integration of new technology components into systems. Its framework combines system architecture concepts and analysis with technology readiness levels (a metric describing where a given technology is on the path to full maturity) to focus attention on high-risk components and interfaces. It focuses specifically on technical risk, which deals with the uncertainty related to developing and integrating new or complex technologies.

Our goal is to offer a novel risk estimation framework that:

  • includes system architecture considerations;
  • builds on traditional project management literature;
  • defines risk as a combination of likelihood and impact;
  • uses technology readiness levels as a proxy for the likelihood that a component will require a change to fulfill its function;
  • and, given that change propagates through interfaces, employs network measures to estimate impact related to connectivity.

We then:

  • describe how this framework was applied to a project at a high-tech company where data was visualized in different formats to aid in analysis;
  • discuss insights gained from this analysis; and
  • demonstrate that the risk estimation framework provides insight that is in line with the experience of engineers at the company.

For more detailed information, please see our technical article with supporting citations and a thesis.

In developing this framework, we grappled with the following questions:

  • how to estimate technology integration risk using concepts of technical maturity, architecture, and connectivity; and
  • how to keep this assessment effort low enough to enable practical application within industry.

In defining technology integration risk, we focused on concepts of engineering change and change propagation. For highly complex systems, engineering change is required to address mistakes during the design process resulting from uncertainty. In some cases, those changes propagate through interfaces to other components in the system. When mismanaged, relatively small changes can propagate into a cascade of changes that sweep across the system, incurring significant costs and rework. We therefore began our definition by asserting that the technology integration risk of each component i is estimated using a common risk metric—the product of likelihood and impact as seen in this equation:

Risk_i = L_i · I_i

L_i is the likelihood that the component technology requires a change to fulfill its function. This is estimated using technology readiness levels (TRLs), which have been shown to be good estimators of uncertainty in the technology integration process.

I_i is the severity of impact if the component is forced to change. We examined the overall architecture, and the component interfaces specifically, to estimate the impact of context on change propagation.

The following sections describe the rationale and method behind the inputs to our risk calculation. Because some of our inputs are unbounded scales, we chose to calculate relative rather than absolute risk by rescaling all inputs to fall in the 1–10 range. We chose 1–10 because it is the standard range used in failure mode and effects analysis (FMEA).
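The linear rescaling we apply to each input vector can be sketched in a few lines of Python (the function name and sample values here are illustrative, not part of the original method description):

```python
def rescale_1_to_10(values):
    """Linearly rescale a list of raw scores so they span the 1-10 range used in FMEA."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # no spread among scores: map everything to the minimum
    return [1 + 9 * (v - lo) / (hi - lo) for v in values]
```

For example, raw scores of 2, 6, and 10 rescale to 1.0, 5.5, and 10.0, preserving their relative spacing.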

A. Likelihood of change

There is a relationship between the likelihood of technical or integration problems in design and the degree of certainty that we have about the design, implementation, and capabilities of a particular component or technology. As we design, test, iterate, and integrate the product or system, we drive uncertainty out through a range of validation activities. To include uncertainty in our risk calculation, it was critical to establish a means of measurement. Fortunately, NASA’s TRL scale offered a well-documented, widely used measure of the maturity of a given component. Maturity is also an indicator of uncertainty: Highly mature components have been well-proven in relevant environments and thus have low uncertainty levels. This is precisely the purpose of integration and testing: to minimize uncertainty within the system. The full TRL scale is presented in Table 1.

Table 1. Summary of Technology Readiness Levels from NASA’s Office of the Chief Engineer.

We evaluate each component using this 1–9 TRL scale to get the base likelihood score. Since a TRL of 9 corresponds to the lowest possible uncertainty, and thus the lowest likelihood of manifesting risks, we inverted the scale so that a TRL of 9 maps to a likelihood value of 1 and a TRL of 1 maps to a likelihood value of 9. This produces a vector in which the highest value corresponds to the highest likelihood of risks manifesting. As mentioned earlier, we also rescaled the vector linearly so that it spans the 1–10 range.
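The inversion and rescaling steps can be sketched as follows (the TRL values are hypothetical, not data from the study):

```python
def trl_to_likelihood(trls):
    """Invert the 1-9 TRL scale (TRL 9 -> likelihood 1, TRL 1 -> likelihood 9),
    then linearly rescale so the resulting vector spans 1-10."""
    inverted = [10 - t for t in trls]
    lo, hi = min(inverted), max(inverted)
    if hi == lo:
        return [1.0] * len(inverted)  # identical TRLs: no relative spread
    return [1 + 9 * (v - lo) / (hi - lo) for v in inverted]

# A fully mature component (TRL 9) gets the lowest likelihood score;
# an unproven one (TRL 1) gets the highest.
trl_to_likelihood([9, 5, 1])  # -> [1.0, 5.5, 10.0]
```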

B. Severity of impact

When presented with a specific engineering change, a panel of experienced engineers can provide a rough magnitude estimate of the system impact with relative ease. However, without a specific change instance, it can be difficult to conceive of how impactful future changes to any particular component may be. One approach is to estimate the component’s potential to propagate change.

Change propagation should be closely monitored in development programs because it can lead to unanticipated impacts to costs and schedule. It has been shown that change propagates between components through their interfaces.[1] Therefore, when estimating the potential impact on the overall system, it is reasonable to consider the system architecture and the connectivity of each component.

Because change propagates through interfaces, we propose that components with higher connectivity are more likely to spread change within the system. With this assumption, there are several tools at our disposal to estimate impact severity. System architecture can be analyzed as an undirected network where components are represented as nodes and interfaces as the edges between nodes. With this view, a simple method for estimating the severity of impact is to count the number of interfaces for each component; in network terms, this is the component’s nodal degree. After rescaling the degree count for each node to fall within the 1–10 range, we obtained a vector of scores reflecting the severity of impact for each component. This severity score was then multiplied by our likelihood vector to obtain a risk score for each component. The key advantage of this method is ease of calculation: Engineers can compute this risk score for their system with simple tools such as Microsoft Excel and immediately reap the insights.
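The whole degree-based calculation fits in a short script. The sketch below uses hypothetical component names, TRLs, and interfaces, not the study's data:

```python
# Hypothetical system: four components with TRLs and a list of their interfaces
trl = {"A": 4, "B": 7, "C": 3, "D": 9}
interfaces = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

# Nodal degree: count the interfaces touching each component
degree = {c: 0 for c in trl}
for a, b in interfaces:
    degree[a] += 1
    degree[b] += 1

def rescale(scores):
    """Linearly rescale a dict of scores to the 1-10 range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: 1 + 9 * (v - lo) / span if span else 1.0 for k, v in scores.items()}

likelihood = rescale({c: 10 - t for c, t in trl.items()})  # inverted TRL
impact = rescale(degree)                                   # degree as impact proxy
risk = {c: likelihood[c] * impact[c] for c in trl}
# Component C (lowest TRL, most interfaces) ranks as the riskiest.
```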

While nodal degree is a simple measure that can be applied for this analysis, it does not consider architectural characteristics beyond the immediate interfaces of the component. Alternative network analysis metrics that account for more indirect change propagation paths could also be useful, such as closeness centrality, betweenness centrality, and information centrality. Each provides a unique perspective on the importance of network nodes; however, they are all highly correlated and in most cases will yield insights similar to nodal degree. Still, on occasion some nodes will show significant differences between measures, and these nodes generally have unique characteristics worth examining. Calculating the three centrality measures generally requires specialized software that, while freely available, may be less accessible and more difficult to understand. Practitioners must decide which centrality measure will be most meaningful for their application.
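Of the three, closeness centrality is simple enough to compute without specialized software, using only a breadth-first search. A minimal sketch, with a hypothetical adjacency structure (this is not the study's network, and the other two measures require more involved algorithms):

```python
from collections import deque

def closeness_centrality(adj):
    """Closeness centrality of each node in an unweighted, undirected,
    connected graph. adj maps each node to the set of its neighbors."""
    n = len(adj)
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # BFS from the source node
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())        # sum of shortest-path distances
        scores[source] = (n - 1) / total if total else 0.0
    return scores

adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
# Node C, which touches every other node directly, gets the highest score (1.0).
```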

The overall method that we apply in this research is illustrated and summarized in Figure 1.

Figure 1. Summary of the method used to calculate the risk involved in integrating a new component into a system.

The results: Analog Devices Inc., a large multinational semiconductor company headquartered in Massachusetts, was our industry partner for this research. Together we analyzed a new product development program that is currently under way for a sensor package that could be used to precisely measure angular position. We gathered the following inputs:

  • a decomposition of the system into six subsystems and 20 components,
  • a list of interfaces between every component in the system, and
  • a TRL assessment for every component in the system.

Using these data, we built a view of the system architecture and developed a network representation of the system, as illustrated in Steps 1 and 2 of Figure 1. Once all data were collected, we calculated our impact and likelihood vectors as in Steps 3, 4, and 5 of Figure 1 to obtain final risk scores (Step 6). For simplicity’s sake, we demonstrated this example using nodal degree as our measure of impact. The inputs and final risk calculation are shown in Figure 2, with bars in each cell representing magnitudes.

This graphical representation of the components and their change likelihood, change impact, and overall risk scores provides an insightful view of the system integration risk.

To preserve information about interfaces, we combined risk score information with a design structure matrix (DSM) view of the system (Eppinger and Browning, 2012). To do this, we chose each off-diagonal mark in the matrix to represent a risk score composed of the two interfacing components. The calculation is done according to this equation:

Interface risk_ij = max(L_i, L_j) · max(I_i, I_j)

L_i and L_j represent the likelihood scores of the two interfacing components, and I_i and I_j represent their impact scores. The intuition behind this choice can be seen in the following example: Suppose a highly uncertain (low-TRL) component interfaces with a highly connected (high-impact) component. If the high-uncertainty component had to be changed during the design process, the highly connected component might require a change as well, and it could take careful design and planning to ensure that the change does not propagate beyond that component. Indeed, it may not be possible to fully contain the changes at this highly connected component, hence the need to scrutinize that interface carefully. Figure 3 shows the results of this analysis. We retain the component-level risk calculations as a vector in the “risk” column for additional reference.
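The interface-level calculation can be sketched as follows (component names and scores here are hypothetical, not the project's data):

```python
def interface_risk(likelihood, impact, interfaces):
    """Off-diagonal DSM entries: the risk of interface (i, j) takes the worse
    likelihood and the worse impact of the two interfacing components."""
    return {(a, b): max(likelihood[a], likelihood[b]) * max(impact[a], impact[b])
            for a, b in interfaces}

likelihood = {"Sensor": 9.0, "ADC": 5.5, "Die attach": 4.0}
impact = {"Sensor": 5.5, "ADC": 10.0, "Die attach": 2.0}
risks = interface_risk(likelihood, impact, [("Sensor", "ADC"), ("ADC", "Die attach")])
# ("Sensor", "ADC") scores max(9.0, 5.5) * max(5.5, 10.0) = 90.0, flagging the
# pairing of an uncertain component with a highly connected one.
```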

We presented our findings to the Analog Devices team and discussed the results. The analysis suggested that the riskiest components were the two sensors (Sensor 1 and Sensor 2), followed by the analog-to-digital converters. This aligned with the Analog Devices team’s experience and expectations. In addition, the analysis showed that the die attach portion of the packaging subsystem is risky. In the early phases of data collection, the managers had mentioned that the packaging was a point of concern for them, and that concern is reflected in the die attach’s risk score.

One manager remarked that the team at Analog Devices implicitly performs this kind of risk assessment mentally to gauge the risk level of various components in a program. An engineer would consider the “newness,” or uncertainty, of a component and the centrality of its role in the system, and use these two ideas to estimate risk. He noted that the newly developed method formalizes this thought process, making it measured and objective.

Figure 3. The technology risk design structure matrix provides an architectural view of the system integration risk for the Analog Devices project.

Next steps: This method could be built into an analytical tool as an add-on to an existing DSM system architecture software toolkit (for an example, see www.dsmweb.org/en/dsm-tools.html). These concepts are already being taught in MIT’s System Design & Management program and in other system-based classes.

This work will be presented at the International Conference on Engineering Design in Vancouver, Canada, in August 2017. The research team continues to pursue research related to technology integration risk, and in particular the technology readiness levels.

[1] P. J. Clarkson, C. Simons, and C. Eckert, “Predicting Change Propagation in Complex Design,” J. Mech. Des., vol. 126, no. 5, pp. 788–797, 2004.

About the Authors

Steven D. Eppinger is MIT’s General Motors Leaders for Global Operations Professor, a professor of management science and engineering systems, and the codirector of MIT System Design & Management. His research centers on improving product design and development practices. He holds SB, SM, and ScD degrees in mechanical engineering from MIT.

 

Tushar Garg is a program manager in the low-voltage and system integration groups at Tesla. He has spent most of his career launching new products at automakers, including Kia, Hyundai, and Toyota. He received an SM in engineering and management from MIT as a graduate of System Design & Management. He also has a BS in mechanical engineering from the University of California, Irvine.

 

Nitin Joglekar is a dean’s research fellow and associate professor of operations and technology management at Boston University’s Questrom School of Business. His research focus is digital product management. He has a bachelor’s degree in naval architecture from the Indian Institute of Technology, Kharagpur, and two SM degrees from MIT, in mechanical and ocean engineering. He also has a PhD in management science from MIT.

 

Alison Olechowski is an assistant professor, teaching stream, at the University of Toronto in the Department of Mechanical & Industrial Engineering and the Institute for Leadership Education in Engineering. She has a BSc from Queen’s University and an MS and a PhD from MIT, all in mechanical engineering.