#main-nav-header { display:none; }

Securing LLMs

Understanding the risks of embedded LLMs in applications

While artificial intelligence (AI) large language models enhance how users interact with your applications, they also introduce risk to your digital assets.

Large language models — such as those provided by OpenAI (ChatGPT), Google (Bard), and Meta (LlaMA) — accelerate the processing of vast amounts of text data and are trained to learn how to continuously improve their outputs. But because the use of AI has exploded, these LLM models are also a prime target of cybercriminals. For instance, Darktrace researchers found a 135% increase in novel social engineering attacks from January to February 2023, corresponding with the widespread adoption of ChatGPT.

AI LLMs outpaced security protocols

Businesses wanting to harness AI are rapidly integrating the technology into their internal operations and client-facing services. However, the accelerated pace at which AI is being adopted can leave applications vulnerable if security protocols are not upgraded.

Large language models are just like any other component of your application supply chain. They are subject to cyberattacks that could exploit your IT infrastructure to compromise and manipulate sensitive data.

This is no surprise since applications that accept user inputs have long been vulnerable to attacks like SQL injection and malicious links in user-generated content. Since AI accepts user inputs like commands and queries, attackers who gain access can then manipulate the model.

10 types of LLM attacks and the risk they introduce

Attacks on AI large language models come in many forms and introduce risk in a variety of ways, for example:

Invisible text that injects prompts can induce models to produce phishing emails, extract training data that reveal sensitive information, or use backdoors to embed malicious code.
Manipulating a model into producing misleading outputs can lead to false conclusions for other users.
Copying a model's file system can lead to stolen intellectual property that might be sold to a competitor, resulting in economic losses or compromising a market advantage.
Using natural language makes it easier to mislead users and exploit the model.
Deliberately crafted information can be placed into consumed documents, which could result in an attacker taking over a user session.
Prompt injection manipulates models with direct injections that overwrite system prompts or indirect injections that manipulate user inputs.
Insecure output handling exposes backend web systems to malicious code that’s inserted into front-end applications with the hopes of tricking end-users into clicking on the code.
Resource-heavy operations on AI models can lead to service degradation and high compute costs.
Software supply chains are also a threat if you rely on LLM model components from a third party, which can compromise an application by introducing additional model datasets and plugins.
Models that trick end-users into revealing confidential data when they submit a response.

To enable your applications to continue increasing the value they deliver to end-users by applying AI, it’s critical to implement the right security strategies to keep those applications safe. To help CISOs assess the risk of LLM vulnerabilities, The Open Worldwide Application Security Project (OWASP) published a Top 10 for LLM advisory.

Defending against these risks is a largely untested area. While many companies rush to incorporate generative AI with LLMs into their applications, some, such as Samsung and Apple, have banned the models entirely, at least temporarily.

Securing LLMs

To protect your organization from attacks on the large language models used by AI tools, apply a security strategy that protects against unsafe application components. As a starting point, here are a few tactics to prevent application breaches that could lead to data leaks that may compromise your organization:

Analyze network traffic for attack patterns that indicate a breached LLM that could compromise applications and user accounts.
Establish real-time visibility into transport-layer traffic patterns to visualize packets and data interacting with LLMs at the bit level.
Apply data loss prevention techniques to secure sensitive data in transit.
Verify, filter, and isolate traffic to protect users, devices, and applications from compromised LLMs.
Isolate remote user browsers by running code at the edge to insulate them from an LLM with injected malicious code.
Use WAF-managed rulesets (such as OWASP core rules and vendor rules) on your web application firewall to block LLM attacks based on SQL injection, cross-site scripting, and other web attack vectors while also avoiding false-positive alerts.

As you apply these strategies, consider your end-users. While it’s critical to mitigate vulnerabilities, application interfaces should still be easy to navigate and not force users to go through too many steps to access applications. Also, test your mitigation efforts to see if they take up precious bandwidth.

It’s also important to integrate your approach within an overall Zero Trust strategy. By default, never trust and always validate users and devices, even if connected to a corporate network and even if previously verified. With Zero Trust, you can create an aggregation layer for access to all your self-hosted, SaaS, and non-web applications to shrink your attack surface — by granting context-based, least-privilege access per resource rather than network-level access.

Protection without compromising the user experience

Cloudflare can help organizations safely experiment with AI while following best practices and without compromising the user experience. Using Data Protection, organizations can protect data everywhere — across web, SaaS, and private applications. AI Gateway helps organizations gather insights on how people are using AI applications and control how the application scales with features like caching and rate limiting.

Roughly 20% of all Internet traffic transits the Cloudflare network — resulting in Cloudflare blocking an average of ~227 billion cyber threats per day. Analyzing this massive body of intelligence gives Cloudflare unparalleled insight into the AI threat landscape.

This article is part of a series on the latest trends and topics impacting today’s technology decision-makers.