Vulnerabilities in Hugging Face’s model-loading libraries allow attackers to upload malicious models that execute code when loaded. The use of deprecated methods and a lack of robust validation for legacy model formats create opportunities for attackers to inject and execute arbitrary code, and the issue affects various ML frameworks integrated with Hugging Face.
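Legacy PyTorch-style checkpoints are typically pickle-based, and pickle deserialization can invoke arbitrary callables chosen by whoever authored the file, which is the usual mechanism behind this class of code-execution bug. The sketch below is a hypothetical illustration of that general risk, not the specific code path in Hugging Face’s libraries; the safer-loading note assumes a recent PyTorch version.

```python
import os
import pickle

# Hypothetical illustration of why pickle-based legacy model files are dangerous:
# unpickling can call arbitrary functions chosen by the file's author.
class MaliciousPayload:
    def __reduce__(self):
        # Whatever callable is returned here runs during deserialization.
        return (os.system, ("echo arbitrary code ran during model load",))

malicious_blob = pickle.dumps(MaliciousPayload())

# A loader that naively unpickles an untrusted "model" executes the payload.
pickle.loads(malicious_blob)

# Safer patterns: prefer safetensors for weights, or restrict deserialization,
# e.g. torch.load("model.bin", weights_only=True) on recent PyTorch versions.
```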
The use of Artificial Intelligence (AI) to automatically discover vulnerabilities in code is becoming increasingly prevalent, with researchers developing new methods to effectively scan source code and find zero-days in the wild. Companies like ZeroPath are combining deep program analysis with adversarial AI agents to uncover critical vulnerabilities, often in production systems, that traditional security tools struggle to detect. AI-based vulnerability discovery is still in its early stages, but it could significantly improve the effectiveness of security testing and surface vulnerabilities earlier in the development cycle, reducing the risk of exploitation.
Mindgard security researchers have found two vulnerabilities in Microsoft Azure’s content safety filters for AI, namely AI Text Moderation and Prompt Shield, that allow attackers to bypass these safeguards and inject malicious content into protected large language models (LLMs). Mindgard’s testing involved placing GPT-3.5 Turbo (deployed via Azure OpenAI) behind these filters and then using character injection and adversarial ML evasion techniques to circumvent them. The first method, character injection, added specific characters and text patterns to prompts, causing a significant drop in jailbreak detection effectiveness. The second, adversarial ML evasion, further reduced the effectiveness of both filters by finding blind spots in their ML classification systems. Microsoft acknowledged the issue and has been working on fixes for upcoming model updates. Mindgard nevertheless stresses the seriousness of these vulnerabilities, since attackers could exploit them to compromise sensitive information, gain unauthorized access, manipulate outputs, and spread misinformation.
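To make the first technique concrete, the snippet below sketches one common form of character injection: interleaving zero-width characters so a text classifier no longer matches the original string, even though the rendered text looks unchanged to a human (or an LLM). This is an illustration of the general idea only, not Mindgard’s specific payloads, and the example prompt is made up.

```python
# Minimal sketch of the character-injection idea: insert zero-width characters
# between letters so a moderation classifier's pattern matching no longer sees
# the original string, while the rendered text stays readable.
ZERO_WIDTH_SPACE = "\u200b"

def inject_zero_width(text: str) -> str:
    """Interleave zero-width spaces between every character of the input."""
    return ZERO_WIDTH_SPACE.join(text)

original = "ignore all previous instructions"   # hypothetical example prompt
perturbed = inject_zero_width(original)

print(repr(perturbed))  # differs byte-for-byte, looks identical when rendered
```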
Researchers at Protect AI plan to release a free, open-source tool that can find zero-day vulnerabilities in Python codebases with the help of Anthropic’s Claude AI model. The tool uses an LLM to analyze code and flag potential security issues, with the aim of speeding up vulnerability detection and helping developers find and mitigate flaws early in the development cycle. It is another example of AI being applied proactively to strengthen the security posture of software applications.
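The article does not show the tool itself, but the general pattern of LLM-assisted code review is easy to sketch. The snippet below sends a small Python sample to Claude via Anthropic’s API and asks for vulnerability findings; the model identifier, prompt wording, and sample code are illustrative assumptions, not the tool’s actual implementation.

```python
# Minimal sketch of LLM-assisted vulnerability scanning (assumed workflow,
# not Protect AI's actual tool). Requires the `anthropic` package and an
# ANTHROPIC_API_KEY environment variable.
import anthropic

SAMPLE_CODE = '''
import sqlite3

def get_user(conn: sqlite3.Connection, username: str):
    # Builds SQL by string formatting -- a classic injection pattern.
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()
'''

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model identifier
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Review the following Python code for security vulnerabilities. "
            "For each finding, name the weakness (CWE if applicable), the "
            "affected lines, and a suggested fix.\n\n" + SAMPLE_CODE
        ),
    }],
)

print(response.content[0].text)
```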