The case for SBOM benchmarks: "Ground truth" is key


Software Bills of Materials (SBOMs) are designed to help software teams protect their supply chains by making the composition of applications more transparent. But a lack of standardization creates a challenge for using SBOMs to bolster security.

Researchers Henrik Plate and Joseph Hejderup exposed the challenges when they ran several SBOM tools on the same software product as part of an evaluation for Endor Labs. They discovered discrepancies in the results. Naming conventions also contributed to variable output.

"Every SBOM generator adds a little bit of different detail to its output, making SBOMs less compatible with each other. There are a number of differences in how components are identified. Tools can give components slightly different names. For the SBOM user, that can be confusing."
Henrik Plate

Another source of discrepancies: omissions. "Some tools don't find components at all, even though they're present, which can also be problematic," Plate added.
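
To make the two kinds of discrepancy concrete, here is a minimal Python sketch that compares the component lists from two generators. The tool outputs and the normalization rule are hypothetical, chosen only for illustration; real SBOM formats carry richer identifiers than bare names.

```python
def normalize(name: str) -> str:
    """Lowercase a name and unify separators (an illustrative rule, not a standard)."""
    return name.strip().lower().replace("_", "-").replace(".", "-")


def compare_sboms(a: list[str], b: list[str]) -> dict[str, set[str]]:
    """Return components reported by both tools, only by A, and only by B."""
    set_a = {normalize(n) for n in a}
    set_b = {normalize(n) for n in b}
    return {
        "both": set_a & set_b,
        "only_a": set_a - set_b,
        "only_b": set_b - set_a,
    }


# Hypothetical output of two generators run on the same product.
tool_a = ["Jackson_Databind", "commons-io", "log4j-core"]
tool_b = ["jackson-databind", "commons.io"]

report = compare_sboms(tool_a, tool_b)
print(report["only_a"])  # components that tool B did not report at all
```

Without the normalization step, the two tools would appear to disagree on every component. With it, the only remaining difference is the omission.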

Mike Parkin, a senior technical engineer with Vulcan Cyber, said that if a tool omits a vulnerable library, an SBOM will falsely show an application is not susceptible to a vulnerability when it is.

Charlie Jones, Director of Product for ReversingLabs Software Supply Chain Security, noted that many SBOMs today are generated by software composition analysis (SCA) tools that rely on build manifests, which can introduce a risk that the resulting list of components and dependencies will be incomplete.

"This is because build manifests only include a list of components that should be in the final build. This approach overlooks the possibility of the inadvertent or malicious addition of components by either a developer or malicious actor."
Charlie Jones

The unavoidable conclusion is that SBOMs need to adhere to some standard benchmarks to provide their full value to software teams charged with software supply chain security. Here are some options for making SBOMs more actionable for security teams. 


Binary analysis beats build manifest for SBOM ground truth 

Jones advocates for static binary analysis for generating SBOMs. Such an approach recursively unpacks software packages and extracts internal indicators and metadata, rather than relying on build manifests, to form a more complete bill of materials.

"This provides a validation of components and dependencies which is independent of what is declared in a build manifest. This independent validation enables both software publishers and consumers to verify the code they are working with and take appropriate action to protect themselves and their partners from software supply chain risks."
—Charlie Jones
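
The recursive unpacking idea can be sketched in a few lines of Python. This is a simplified illustration rather than a description of any vendor's product: it handles only ZIP-format archives, identifies components by file name alone, and uses hypothetical package names. Real binary analysis also inspects hashes, strings, and other internal indicators.

```python
import io
import zipfile

# Suffixes treated as ZIP-format archives (an illustrative, incomplete list).
ARCHIVE_SUFFIXES = (".zip", ".jar", ".war", ".whl")


def list_embedded_files(data: bytes, prefix: str = "") -> list[str]:
    """Recursively list every file in an archive, descending into nested archives."""
    found: list[str] = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return found  # named like an archive but not readable as one
    with archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            found.append(path)
            if info.filename.lower().endswith(ARCHIVE_SUFFIXES):
                found.extend(list_embedded_files(archive.read(info), path + "!/"))
    return found


def undeclared_archives(declared: set[str], data: bytes) -> set[str]:
    """Return nested archives present in the package but absent from the manifest."""
    embedded = {
        path.rsplit("/", 1)[-1]
        for path in list_embedded_files(data)
        if path.lower().endswith(ARCHIVE_SUFFIXES)
    }
    return embedded - declared


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive for the demonstration below."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Hypothetical package: the manifest declares one library, the build contains two.
app = make_zip({
    "lib/declared-lib.jar": make_zip({"A.class": b"a"}),
    "lib/extra-lib.jar": make_zip({"vendor/Payload.class": b"x"}),
})
surprises = undeclared_archives({"declared-lib.jar"}, app)
print(surprises)
```

The result is derived from what is actually inside the package, so a component that never appeared in the manifest still surfaces.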

Call graphs expose unused components

Omissions can make an SBOM untrustworthy, but so can superfluous inclusions. "When a component is included, but never actually used in practice, it can lead to fixing a problem that may not exist," Parkin observed.

One way to address the issue of unused components is with call graph tools. Plate explained that call graphs add information about how a component is used in the context of the software. "It allows a user to see what functions of a component are used or not used, vulnerable or not vulnerable."

"With metadata, you're just looking at the outside of the box. With call graph, you're looking inside the box, into the functions and function detail."
—Henrik Plate

There is a price, however, to garnering all that detail in an SBOM. Call graph analysis is a resource-consuming activity, Plate said. "A call graph has to imagine all possible executions of a piece of software. That can be a resource-consuming and computationally expensive tool."
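
As a rough illustration of what call graph analysis computes, the sketch below walks a tiny hand-written call graph and checks which known-vulnerable functions are reachable from the application's entry point. The graph, function names, and vulnerability list are all hypothetical. Real tools must first build the graph from code, which is where the cost Plate describes comes from.

```python
from collections import deque


def reachable(call_graph: dict[str, list[str]], entry_points: list[str]) -> set[str]:
    """Breadth-first search: every function reachable from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: caller -> functions it calls.
call_graph = {
    "app.main": ["app.load_config", "yamlib.safe_load"],
    "app.load_config": ["jsonlib.parse"],
    "yamlib.safe_load": [],
    "yamlib.unsafe_load": ["yamlib.construct_object"],  # never called by the app
}

# Hypothetical set of functions with known vulnerabilities.
vulnerable = {"yamlib.unsafe_load", "jsonlib.parse"}

reached = reachable(call_graph, ["app.main"])
exposed = vulnerable & reached
print(exposed)
```

Both libraries would appear in a metadata-based SBOM, but only one vulnerable function is actually on a path the application can execute.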

Metadata and runtime analysis for SBOM ground truth

David Lindner, CISO of Contrast Security, said call graphs are a "decent mechanism" for generating SBOMs, and will initially provide you with an ability to determine what libraries are being referenced in code, "which may be drastically different than an SBOM that has used metadata."

However, Lindner said call graphs are not perfect.

"Applications are extremely complex. There are applications with thousands of potential paths for every call. They will never be able to be mapped efficiently or accurately. What this means is, as application complexity and depth increase, the likelihood of missing libraries increases."
David Lindner

By using metadata to create an SBOM, teams are able to establish a "ground truth." Lindner said the method should be used at both the pre- and post-build stages of application development and the two results compared. The comparison lets a development team determine which software libraries are used only in test and development, and which are in the production build.

But using metadata alone for SBOMs is a problem too. "[You] will most certainly miss any transitive dependencies, and any dependencies that are pulled in or used by the running environment."

Lindner advocates generating a runtime SBOM, in addition to the pre- and post-build SBOMs. Runtime analysis helps an organization create an extremely accurate SBOM of its live running environment, he explained.

Runtime SBOM analysis has eyes on what code is being run without needing any call graphs or metadata, and can be extremely accurate when referencing transitive dependencies as many levels deep as the code goes.

"The one downside to using runtime analysis is if a path or flow is never used or followed by a user of the application, there could be a missed library. But is that library important if it is never used?"
—David Lindner

Metadata is the current de facto standard for establishing the "ground truth" for an SBOM. Tom Goings, director of product management at Tanium, said call graphs have their place, but today's reliance on third-party software calls for a new standard for SBOMs.

"If we want to improve on the creation of these files, a standard for testing the outputs of SBOMs would provide a better way of doing it." 
Tom Goings

Wanted: SBOM benchmarks

Plate called for a vendor-neutral organization to create SBOM benchmarks that would serve as a more accurate ground truth than we have now, something along the lines of what was done for Java Virtual Machines with DaCapo.

With neutral benchmarks, vendors can show how their SBOMs compare to those produced by the benchmarks.

"Now we have software vendors generating SBOM. One vendor's SBOM generator will say one thing, while another's will say something else. So it's hard for the consumer of the SBOMs to figure out who's right."
—Henrik Plate
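
One way such a benchmark could score a generator is sketched below: compare the components it reports against a curated ground-truth list and compute precision (how much of the output is correct) and recall (how much of the truth was found). The component names and tool outputs are hypothetical, and the article does not prescribe any particular metric.

```python
def score(reported: set[str], truth: set[str]) -> dict[str, float]:
    """Precision and recall of a reported component set against ground truth."""
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical benchmark application with a known component list.
ground_truth = {"alpha@1.0", "beta@2.3", "gamma@0.9", "delta@4.1"}

# Hypothetical outputs from two SBOM generators.
tool_a = {"alpha@1.0", "beta@2.3"}  # misses two components
tool_b = {"alpha@1.0", "beta@2.3", "gamma@0.9", "delta@4.1", "unused@7.0"}  # one extra

score_a = score(tool_a, ground_truth)
score_b = score(tool_b, ground_truth)
print(score_a, score_b)
```

With a shared ground truth, omissions show up as low recall and superfluous inclusions as low precision, so a consumer no longer has to guess which tool is right.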
