Meta Launches LlamaFirewall Framework to Stop AI Jailbreaks, Injections, and Insecure Code

Apr 30, 2025Ravie LakshmananSafe Coding / Vulnerability

Meta on Tuesday announced LlamaFirewall, an open-source framework designed to safe synthetic intelligence (AI) methods in opposition to emerging cyber risks comparable to immediate injection, jailbreaks, and insecure code, amongst others.

The framework, the corporate stated, incorporates three guardrails, together with PromptGuard 2, Agent Alignment Checks, and CodeShield.

PromptGuard 2 is designed to detect direct jailbreak and immediate injection makes an attempt in real-time, whereas Agent Alignment Checks is able to inspecting agent reasoning for potential purpose hijacking and oblique immediate injection eventualities.

CodeShield refers to a web based static evaluation engine that seeks to stop the era of insecure or harmful code by AI brokers.

“LlamaFirewall is constructed to function a versatile, real-time guardrail framework for securing LLM-powered functions,” the corporate said in a GitHub description of the venture.

“Its structure is modular, enabling safety groups and builders to compose layered defenses that span from uncooked enter ingestion to ultimate output actions – throughout easy chat fashions and complicated autonomous brokers.”

Alongside LlamaFirewall, Meta has made obtainable up to date variations of LlamaGuard and CyberSecEval to raised detect numerous widespread forms of violating content material and measure the defensive cybersecurity capabilities of AI methods, respectively.

CyberSecEval 4 additionally features a new benchmark referred to as AutoPatchBench, which is engineered to judge the flexibility of a big language mannequin (LLM) agent to robotically restore a variety of C/C++ vulnerabilities recognized by way of fuzzing, an method generally known as AI-powered patching.

“AutoPatchBench gives a standardized analysis framework for assessing the effectiveness of AI-assisted vulnerability restore instruments,” the corporate said. “This benchmark goals to facilitate a complete understanding of the capabilities and limitations of varied AI-driven approaches to repairing fuzzing-found bugs.”

Lastly, Meta has launched a brand new program dubbed Llama for Defenders to assist companion organizations and AI builders entry open, early-access, and closed AI options to deal with particular safety challenges, comparable to detecting AI-generated content material utilized in scams, fraud, and phishing assaults.

The bulletins come as WhatsApp previewed a brand new know-how referred to as Personal Processing to permit customers to harness AI options with out compromising their privateness by offloading the requests to a safe, confidential setting.

“We’re working with the safety neighborhood to audit and enhance our structure and can proceed to construct and strengthen Personal Processing within the open, in collaboration with researchers, earlier than we launch it in product,” Meta stated.

Discovered this text attention-grabbing? Comply with us on Twitter and LinkedIn to learn extra unique content material we submit.

Advertise here

Source link

Meta Launches LlamaFirewall Framework to Stop AI Jailbreaks, Injections, and Insecure Code

Moldovan Police Arrest Suspect in €4.5M Ransomware Attack on Dutch Research Agency

Türkiye Hackers Exploited Output Messenger Zero-Day to Drop Golang Backdoors on Kurdish Servers

ASUS Patches DriverHub RCE Flaws Exploitable via HTTP and Crafted .ini Files

Zero-Day Exploits, Developer Malware, IoT Botnets, and AI-Powered Scams

As measles outbreak grows, HHS secretary says vaccination is a personal decision that can protect individuals and communities

Trump Has Raised Nearly $1 Billion From His Various Cryptocurrency Schemes

‘Gonna Be Some Upset Seniors At Town Halls,’ Says Mark Cuban. He Calls DOGE Social Security Actions ‘A Back Door Move To Cut Payments’

‘This is not normal’: Acts of protest at Donald Trump’s address – National

US citizen carrying cannabis gummies detained in Moscow, charged with narcotics smuggling: Russia media

Meta Launches LlamaFirewall Framework to Stop AI Jailbreaks, Injections, and Insecure Code

Related Posts