OWASP's Chat Playground lets security teams toy with gen AI

owasp-chat-playground-security testing-aiA new interactive tool for learning about securing generative AI models called Chat Playground has been launched by the OWASP Gen AI Security Project. Steve Wilson, co-chair of the Gen AI Security Project, said that group wanted to provide something with a low bar to getting started — and with Chat Playground, testing teams only need a web browser.

Introduction to Malware Binary Triage (IMBT) Course

Looking to level up your skills? Get 10% off using coupon code: MWNEWS10 for any flavor.

Enroll Now and Save 10%: Coupon Code MWNEWS10

Note: Affiliate link – your enrollment helps support this platform at no extra cost to you.

"It gives you a really easy way to play with some vulnerable bots and some guardrails and get a feel for what it's really like to try and secure these things'"
Steve Wilson

The tool, which is hosted on GitHub, first asks a user to choose a chat personality based on an AI model. For example, "Eliza" is a therapist based on a simple bot model, and "Bob" is a tech support bot that uses ChatGPT. The user can then customize the display; conversations can be shown in green letters on a black background, for instance, or with an iMessage format, with dialog appearing in blue bubbles. Teams can experiment with a variety of input and output filters, or guardrails. Testers can also add API keys.

In Chat Playground, teams can experiment with an assortment of gen AI chat scenarios. One example draws from a case recounted in Wilson's 2024 book on large language model (LLM) security, published by O'Reilly. The case involved Tay, an early chatbot developed by Microsoft that began to spew offensive tweets after its short exposure to Twitter and had to be pulled offline after only 16 hours. "We simulate that in Chat Playground. One of the bot personalities is jailbroken. It will curse you out, say bad things, use bad words. You then have to put guardrails in place to see how they work and what gets detected and what doesn't," he said.

The Gen AI Security Project posted the tool on GitHub to make it available for a general audience. "It's super easy to download and fork and hack, especially in this world of vibe coding," Wilson explained. "You can download the source code and get out your favorite coding assistant."

"Even if you're not a hardcore developer, you can try different kinds of guardrails or make a new kind of bot with new behaviors. It lets you experiment outside your production system in a safe place."
—Steve Wilson

Here's what you need to know about OWASP's new Chat Playground — and how to put it to work in your organization to secure your gen AI.

[ Get White Paper: How the Rise of AI Will Impact Software Supply Chain Security ]

A browser-based gen AI sandbox is born

Chat Playground offers a practical glimpse into how AI can enhance content moderation to be dynamic and go beyond traditional or static rule-based systems, explained Melody (MJ) Kaufmann, an author and instructor at O'Reilly Media.

At its core, the tool is a browser-based sandbox designed to showcase dynamic filtering using LLMs, Kaufmann said. It introduces a more adaptable method, unlike static filters that rely on hardcoded keyword blocklists, which are notoriously brittle and easy to evade.

"It leverages AI’s ability to understand language in context, enabling it to catch harmful or inappropriate content even when it’s masked by euphemisms, slang, or creative misspellings."
Melody (MJ) Kaufmann

Kaufmann said that, as a gamer, she is familiar with how users are able to outsmart the algorithmic filters that game makers often use to block bad language. "As players find the filters, they adapt their language to circumvent the filter using creative spellings, numbers, and other workarounds. That, of course, creates a spiral of the developers creating more filters, which are promptly abused by nefarious elements within the community in new and unique ways," she said.

From a security standpoint, Chat Playground is thoughtfully scoped. Requiring users to supply their own OpenAI API key decentralizes risk and prevents abuse of the developer’s infrastructure, Kaufmann said.

"As an educator, I like that it encourages experimentation in a controlled, low-risk environment. Security teams evaluating new moderation technologies or designing guardrails for LLM-based applications may find this a useful sandbox to test behaviors, develop threat models, or educate peers."
—Melody (MJ) Kaufmann

Testing tool offers surprising depth

The most valuable security insight from Chat Playground isn't technological at all, said Dev Nag, CEO and founder of the QueryPal chatbot.

"It's watching nontechnical executives finally understand LLM vulnerabilities when they see a jailbreak happen live in front of them in 30 seconds'"
Dev Nag

Nag said Chat Playground offers surprising depth despite its stripped-down appearance. It allows security researchers to test prompt injections, content filtering, and UI manipulation without needing specialized infrastructure or risking production systems. The tool's local pattern-matching Eliza clone, called SimpleBot, can also be valuable to red teams.

"It generates toxic content offline without API costs or bans, creating a perfect 'malicious oracle' for testing guardrail effectiveness."
—Dev Nag

The browser-only architecture enhances security, Nag said. "API keys stay in local storage, no server stores sensitive transcripts, and any malicious code returned by models remains inert text," he said. "The visual feedback showing moderation scores — like 98% violence probability — simultaneously helps defenders understand filter behavior while teaching attackers how to carefully craft threshold-skipping payloads."

"Playground serves a unique niche in security tooling by prioritizing immediate accessibility over comprehensive features."
—Dev Nag

Out-of-band AI controls are missing

Casey Bleeker, CEO and co-founder of SurePath AI, said he loves the focus on testing security controls for different use cases and hopes it will expand to include many of the complex guardrails and controls in use within the ecosystem. However, at its core, Chat Playground can’t apply testing of the most critical AI security controls: the out-of-band controls for AI policy.

Policy-based access controls to models, filtering and classification of requests separate from model inference, and enforcement of data controls based on user identity all but eliminate many of the enterprise risks included in the OWASP LLM Top 10, Bleeker said.

"Failure to apply out-of-band controls leaves the control plane for enforcement of policy squarely under the influence of the data plane, which is a foundational security flaw in any system, not just AI. Regular expressions and in-model guardrails can be valuable but can also be Band-Aids masking the larger unaddressed risks beneath."
Casey Bleeker

Next up for Chat Playground

Wilson said that he will be expanding Chat Playground in the coming weeks. "I'm getting ready to put out a new version that adds more guardrails and more bots and more techniques," he said. "I want to expand it to cover, not only traditional guardrails, but to cover things like supply-chain security, SBOMs, RAG [retrieval-augmented generation], and indirect prompt injections — all sorts of fun things like that."

Article Link: OWASP's Chat Playground lets security teams toy with gen AI