OWASP's LLM AI Security & Governance Checklist: 13 action items for your team

AI-LLM-OWASP-checklistArtificial intelligence is developing at a dizzying pace. And if it's dizzying for people in the field, it's even more so for those outside it, especially security professionals trying to weigh the risks the technology poses to their organizations.

That's why the Open Web Application Security Project (OWASP) has introduced a new cybersecurity and governance checklist for those security pros who are striving to protect against the risks of hasty or insecure AI implementations.

Sandy Dunn, a member of the OWASP Top 10 for Large Language Models (LLM) Applications team, wrote in a LinkedIn posting:

"[T]his checklist is a valuable resource for executive technology, cybersecurity, privacy, compliance, and legal leaders to strategize and secure their AI initiatives effectively."

Chris Romeo, co-founder and CEO of the threat modeling company Devici, said security teams need such a checklist for LLM security to have any chance of achieving “trustworthy” AI.

"The trust problem with AI can be summarized as 'How do we know if the answers coming out of the LLM are factual, correct, and worth applying?' We need steps for security and privacy assurance leading us down a path toward building things that scratch the surface of trustworthiness."
Chris Romeo

LLM systems are highly complex, even when compared to typical AI systems, said Shing-hon Lau, a senior AI security researcher at the CERT division of Carnegie Mellon University's Software Engineering Institute. He said LLM systems are designed to be easily used by anybody and have an output space that is vast — any English text at the minimum, usually.

"A highly complex system with a lot of interconnecting parts that can output almost anything is hard to secure, control, or govern. This complexity necessitates a deep tech stack, which means a large attack surface. A checklist helps ensure that nothing obvious is overlooked."
Shing-hon Lau

The new OWASP LLM AI Security & Governance Checklist (PDF) is organized into 13 areas of analysis. Here are the most important points from each area.

[ Related: GenAI and low-code: What could go wrong? | Report: The State of Software Supply Chain Security (SSCS) 2024 | Download Report: State of SSCS ]

1. Adversarial risk

This risk consideration should cover both adversaries and competitors. Consider how competitors are using AI and how existing controls may no longer provide their intended security in the face of generative AI (GenAI) attacks.

2. Threat modeling

Doing threat modeling for GenAI accelerated attacks and before deploying LLMs is the most cost-effective way to Identify and mitigate risks, protect data, shield privacy, and ensure a secure, compliant integration within the business, the checklist authors argued.

However, Romeo said he finds the definition of threat modeling used in the checklist a bit rigid. "Threat modeling is analyzing a representation to uncover security and privacy challenges for mitigation," he explained. "The question format they use is tricky to understand."

"Describing threats works better as a story than a blunt question. Blunt questions are too easy to skip over."
—Chris Romeo

For example, the checklist asks, "How will attackers accelerate exploit attacks against the organization, employees, executives, or users?" Romeo said it's better to describe threats in this way: An attacker hyper-personalizes a phishing attack using GenAI, making an end user more likely to click on a link.

3. AI asset inventory

The inventory should apply to both internally developed and third-party solutions. Existing AI services, tools, and owners should be cataloged, and AI components should be included in a software bill of materials (SBOM).

4. AI security and privacy training

All awareness training should be updated to include GenAI threats, and when GenAI solutions are adopted, they should include training for DevOps and cybersecurity for the deployment pipeline.

5. Establish business cases

Solid business cases are essential to determining the business value of any proposed AI solution, balancing risk and benefits, and evaluating and testing return on investment. Explore how the AI solution can enhance the customer experience, improve operational efficiency, provide better knowledge management, and enhance innovation.

6. Governance

Corporate governance in LLM is needed to provide organizations with transparency and accountability.

Identifying AI platform or process owners who are potentially familiar with the technology or the selected use cases for the business is not only advised but also necessary to ensure adequate reaction speed that prevents collateral damages to well-established enterprise digital processes.

7. Legal

An IT, security, and legal partnership is critical to identifying gaps and addressing obscure decisions because many of the legal implications of AI are undefined and potentially very costly.

8. Regulatory

Government compliance requirements need to be determined, and how data is collected, stored, and used by AI systems must be defined.

9. Using or implementing LLM solutions

LLM components and architecture trust boundaries need to be threat-modeled. Explore how data is classified, protected, and accessed. Check pipeline security for training data and system inputs and outputs. Request third-party audits, penetration testing, and code reviews. Include LLM incidents in tabletop exercises.

10. Testing, evaluation, verification, and validation 

Establish continuous testing, evaluation, verification, and validation (TEVV) throughout the AI model lifecycle and provide regular executive metrics and updates on AI Model functionality, security, reliability, and robustness.

11. Model and risk cards

Model and risk cards increase the transparency, accountability, and ethical deployment of LLMs. Model cards contain standardized information on a system's design, capabilities, and constraints. Risk cards identify potential negative consequences, such as biases, privacy problems, and security vulnerabilities. These cards should be reviewed, if available, and a process should be established to track and maintain any deployed model, including models used by third parties.

12. RAG: LLM optimization

Retrieval-augmented generation (RAG) is a way to optimize an LLM for a user's needs by enabling the model to access information outside its training data. Ajitesh Kumar, director of data and technology products for clinical data access management for Novartis, wrote in Analytics Yogi: "The true prowess of the RAG LLM is evidenced in the quality of its outputs, especially in open-domain question-answering tasks."

"Traditional LLM models relying solely on pre-encoded knowledge often falter when faced with novel or niche queries. However, RAG bridges this gap by leveraging its retrieval mechanism to fetch relevant, up-to-date information, which the generator then skillfully incorporates into coherent and contextually rich responses."
Ajitesh Kumar

Before implementing RAG, organizations should be aware of the pain points in the process and how to solve them.

13. AI red teaming

Red team testing should be a standard practice for AI models and applications.

It's a starting point, not the end game in LLM security

The authors of the checklist caution that although it intends to support organizations in developing an initial LLM strategy in a rapidly changing technical, legal, and regulatory environment, it is not exhaustive and does not cover every use case or obligation. While using the list, organizations should extend assessments and practices beyond the scope of the provided checklist as required for their use case or jurisdiction, they recommended.

Nevertheless, they maintained that the checklist can be used to formulate strategies to improve accuracy, define objectives, preserve uniformity, and promote focused deliberate work, reducing oversights and missed details. Following a checklist not only increases trust in a safe adoption journey, they noted, but also encourages future organizations' innovations by providing a simple and effective strategy for continuous improvement.

Implementation matters

Implementing the new OWASP checklist can pose some challenges for organizations. "There will be challenges of prioritization," noted Chris Hughes, president of digital services company Aquia.

"Organizations need to look at strengths and weaknesses, risk appetite, industry environment, regulatory requirements, software production and consumption, data sensitivity. Those are all key considerations, and how organizations prioritize them will be different from organization to organization."
Chris Hughes

It may also be challenging to enlist stakeholders from other parts of the organization. "The checklist touches on compliance, legal, and other aspects of the organization that need to be engaged," Hughes said. "Getting their involvement and commitment to tackle some of these risks is going to take time and attention from them. And they may not understand the technology, so there may be an educational aspect to it."

CMU's Lau agreed, saying that forming the right team can be difficult when it comes to cybersecurity and governance of LLMs. He noted that in looking at the major categories of the checklist, there is a need to understand risk, threat actions, AI assets, AI training, privacy, the business use case, governance and policy, legal and regulatory aspects, TEVV cybersecurity, and machine learning. "As a starting point, security teams would benefit by identifying relevant experts for each of these categories."

Lau said that more specific recommendations and steps would help in the future.

"The items in the checklist can differ dramatically in scale — establish a good culture versus check if specific AI compliance requirements apply, for example — which makes it difficult to use in practice. Many of the items are also non-actionable or difficult to action."
—Shing-hon Lau

Article Link: OWASP's LLM AI Security & Governance Checklist: 13 action items for your team