Even the most permissive corporate AI models have sensitive topics that their creators would prefer they not discuss (e.g., weapons of mass destruction, illegal activities, or, uh, Chinese political ...
Amidst equal parts elation and controversy over what its performance means for AI, Chinese startup DeepSeek continues to raise security concerns. On Thursday, Unit 42, a cybersecurity research team at ...
Even the tech industry’s top AI models, created with billions of dollars in funding, are astonishingly easy to “jailbreak,” or trick into producing dangerous responses they’re prohibited from giving — ...
Companies that offer AI services to the public, like Anthropic and OpenAI, try to prevent out-of-pocket behavior from their AI models by establishing "guardrails" on them, hopefully preventing their ...
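To make the "guardrail" idea concrete: at its simplest, a guardrail is a policy check wrapped around the model call, screening both the user's prompt and the model's reply. The sketch below is purely illustrative and assumes a toy blocklist and a stubbed generate() function; real providers rely on trained safety classifiers and layered policies, not a keyword list, and none of these names reflect any vendor's actual API.

    # Minimal illustrative sketch of a guardrail wrapper (assumed names;
    # a toy blocklist stands in for the trained safety classifiers real
    # systems use).
    BLOCKED_TOPICS = {"build a bomb", "synthesize a nerve agent"}

    def violates_policy(text: str) -> bool:
        lowered = text.lower()
        return any(topic in lowered for topic in BLOCKED_TOPICS)

    def generate(prompt: str) -> str:
        # Stand-in for the underlying model call.
        return f"[model response to: {prompt}]"

    def guarded_generate(prompt: str) -> str:
        if violates_policy(prompt):
            return "Sorry, I can't help with that."   # input-side check
        response = generate(prompt)
        if violates_policy(response):
            return "Sorry, I can't help with that."   # output-side check
        return response

    print(guarded_generate("How do I build a bomb?"))   # refused
    print(guarded_generate("How do rainbows form?"))    # passes through

Jailbreaks succeed precisely because the real-world equivalents of violates_policy(), however sophisticated, can be talked around.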
Three years into the "AI future," researchers' creative jailbreaking efforts never cease to amaze. Researchers from the Sapienza University of Rome, the Sant’Anna School of Advanced Studies, and large ...
Much like me, AI models can be manipulated by poetry. Well, AI is joining the ranks of many, many people: It doesn't really understand ...