Organizations embracing genAI are facing severe security challenges: many of their LLM deployments are riddled with serious vulnerabilities, most of which remain unresolved.
According to Cobalt’s State of LLM Security Report 2025, 32% of vulnerabilities uncovered during LLM-focused penetration tests were rated as high or critical risk. Alarmingly, only 21% of those vulnerabilities had been remediated, marking the lowest fix rate of any category tested.
The Concern-Action Disconnect
While 72% of survey respondents named genAI-related attacks as their top IT risk, only 66% reported conducting regular security assessments of their AI deployments. The remaining third are flying blind – despite the growing integration of LLMs into customer-facing and backend systems.
Data security topped the list of concerns about LLM and AI application vulnerabilities, with 46% citing sensitive information disclosure and 42% highlighting risks like model poisoning or theft. However, Cobalt’s pentest findings suggest these concerns are misplaced: traditional application vulnerabilities such as SQL injection and stored XSS still ranked among the common weaknesses in LLM-powered apps.
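To illustrate why a traditional flaw like SQL injection still applies to LLM-powered apps, the minimal sketch below treats a value that may have come from model output as untrusted input and binds it as a query parameter instead of concatenating it into SQL. It uses Python's standard sqlite3 module; the `orders` table and the `find_orders` and `make_demo_db` helpers are hypothetical and are not taken from the Cobalt report.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database for the example (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, item TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, item) VALUES (?, ?)",
        [("alice", "laptop"), ("bob", "phone")],
    )
    return conn


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list[tuple]:
    """Look up orders for a customer name that may originate from LLM output."""
    # The value is bound as a parameter, so it is compared as data
    # and never parsed as part of the SQL statement.
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer = ?",
        (customer_name,),
    )
    return cur.fetchall()
```

With this approach, a classic injection string such as `alice' OR '1'='1` is simply compared against the `customer` column as a literal and matches no rows.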
Fast Fixes, But Only for Simple Issues
For the serious LLM vulnerabilities they do fix, organizations act quickly: on average within 19 days, the fastest mean time to resolution across all categories. But this speed may be misleading. The report suggests that companies are only addressing the easiest issues, typically those not tied to third-party model providers or architectural design flaws, leaving more complex threats unmitigated.
These findings point to a “fix-what’s-easy” mindset that prioritizes speed over long-term security, especially as organizations race to bring AI features to market.
Case Studies: Prompt Injection and Excessive Agency
Cobalt’s report includes anonymized case studies that demonstrate the real-world implications of these vulnerabilities:
- In one instance, a prompt injection flaw in an education AI tutor allowed pentesters to bypass safety filters and elicit inappropriate content from the model.
- In a fintech deployment, a similar flaw enabled testers to extract PII and database schema details.
- Another test uncovered excessive agency in an LLM designed for financial analytics, with testers manipulating the model into performing functions beyond its scope, potentially leading to data exfiltration or unauthorized activity.
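One common way to limit the excessive agency described in the last case is to route every model-requested action through a deny-by-default allowlist. The sketch below is an illustration only, not the setup Cobalt tested; the tool name `get_balance_summary` and the `dispatch_tool_call` function are hypothetical.

```python
from typing import Callable


def get_balance_summary(account_id: str) -> str:
    """Hypothetical read-only tool the assistant is allowed to call."""
    return f"summary for {account_id}"


# Only tools listed here can ever be invoked on the model's behalf.
ALLOWED_TOOLS: dict[str, Callable[[str], str]] = {
    "get_balance_summary": get_balance_summary,
}


def dispatch_tool_call(tool_name: str, argument: str) -> str:
    """Run a model-requested tool only if it is on the allowlist."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        # Deny by default: anything not explicitly allowed is refused.
        raise PermissionError(f"tool not permitted: {tool_name}")
    return tool(argument)
```

Here a request for an unlisted action such as `transfer_funds` raises `PermissionError`, regardless of how the model was prompted into asking for it.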
Industry Blind Spots
The Cobalt report suggests that certain sectors are particularly vulnerable. The education sector showed one of the highest rates of serious findings (17.6%) yet reported the lowest frequency of genAI security testing (33%). Manufacturing also suffered, with 18.1% serious vulnerabilities and a self-reported need for more testing capacity.
By contrast, the financial sector had relatively low rates of critical LLM flaws (11.2%) and demonstrated a high level of concern for third-party risks and AI governance, likely due to tighter regulation and more mature security practices than other sectors.
Interestingly, security leaders and frontline practitioners expressed different priorities. Leaders were more concerned with long-term threats from genAI, while practitioners focused on present-day risks like inaccurate data and insecure outputs.
Why LLM Security is So Hard
The report identifies multiple barriers to effective LLM security, including:
- A shortage of skilled personnel with experience in LLM security
- Speed-to-market pressures that limit time for testing or review
- Dependencies on external model providers, which can delay or complicate remediation
- The interactive and data-centric design of LLMs, which expands the blast radius of any compromise
These issues mean that traditional security controls often fall short. As the report puts it, “what worked for cloud apps won’t cut it for LLMs.”
Recommendations
Cobalt recommends that organizations adopt a dedicated, offensive security mindset for LLMs. This includes:
- Continuous, human-led penetration testing using frameworks like OWASP’s LLM Top 10.
- Implementing robust input/output filtering and least-privilege access controls.
- Managing AI-specific supply chain risks through contractual and technical safeguards.
- Fostering collaboration between security and AI development teams from day one.
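As a rough illustration of the output-filtering recommendation above, the sketch below redacts two kinds of sensitive strings from model output before it reaches the user. The patterns (email addresses and US-style Social Security numbers) and the `redact_output` function are assumptions chosen for the example; a production filter would need far broader coverage and would sit alongside, not replace, access controls.

```python
import re

# Simplified patterns for illustration; real PII detection needs more than regexes.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_output(text: str) -> str:
    """Replace email addresses and SSN-like strings in model output."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return SSN_RE.sub("[REDACTED_SSN]", text)
```

Applying the filter on the way out means that even if a prompt injection succeeds in eliciting such data, the matching strings are masked before display.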
Josh is a Content writer at Bora. He graduated with a degree in Journalism in 2021 and has a background in cybersecurity PR. He's written on a wide range of topics, from AI to Zero Trust, and is particularly interested in the impacts of cybersecurity on the wider economy.
The opinions expressed in this post belong to the individual contributors and do not necessarily reflect the views of Information Security Buzz.


