- Nous Research launches Hermes 4 open-source AI models that outperform ChatGPT on math benchmarks with uncensored responses and hybrid reasoning capabilities.
- Salesforce launches CRMArena-Pro, a simulated enterprise AI testing platform, to address the 95% failure rate of AI pilots and improve agent reliability, performance, and security in real-world business deployments.
- Anthropic launches a limited pilot of Claude for Chrome, allowing its AI to control web browsers while raising critical concerns about security and prompt injection attacks.
- Take this blind test to discover whether you truly prefer OpenAI's GPT-5 or the older GPT-4o—without knowing which model you're using.
- Walmart CISO Jerry Geisler on securing agentic AI, modernizing identity, and Zero Trust for enterprise-scale cybersecurity resilience.
- A new MIT report reveals that while 95% of corporate AI pilots fail, 90% of workers are quietly succeeding with personal AI tools, driving a hidden productivity boom.
- China's DeepSeek has released a 685-billion parameter open-source AI model, DeepSeek V3.1, challenging OpenAI and Anthropic with breakthrough performance, hybrid reasoning, and zero-cost access on Hugging Face.
- Russia's APT28 tested LLM-powered malware on Ukraine. The same tech that breaches enterprises is now selling for $250/month on the dark web.
- New research reveals how OS agents — AI systems that control computers like humans — are rapidly advancing while raising serious security and privacy concerns.
- Anthropic's Claude Opus 4.1 achieves 74.5% on coding benchmarks, leading the AI market, but faces risk as nearly half its $3.1B API revenue depends on just two customers.


