We pride ourselves on forward-thinking ideas and strategic alliances that help our clients navigate the complex options and challenges they face while staying relevant in today’s fast-paced economy.
One Purpose.
Experience
Partners
Expertise
What can Optimus do for you?

Shaping the future of data management, security, and power solutions.
Optimus Technology Group is a leading provider of data management, physical security, and power supply services designed to keep pace with the rapid development of the data management and IT landscape.
Everything we do is designed to better equip clients with the ability to maximize the latest technology, equipment, and network solutions, empowering them to fully realize their business objectives.
Latest Tech News
From shiny object to sober reality: The vector database story, two years later
When I first wrote “Vector databases: Shiny object syndrome and the case of a missing unicorn” in March 2024, the industry was awash in hype. Vector databases were positioned as the next big thing — a must-have infrastructure layer for the gen AI era. Billions of venture dollars flowed, developers rushed to integrate embeddings into their pipelines and analysts breathlessly tracked funding rounds for Pinecone, Weaviate, Chroma, Milvus and a dozen others.
The promise was intoxicating: Finally, a way to search by meaning rather than by brittle keywords. Just dump your enterprise knowledge into a vector store, connect an LLM and watch magic happen.
Except the magic never fully materialized.
Two years on, the reality check has arrived: 95% of organizations that invested in gen AI initiatives are seeing zero measurable returns. And many of the warnings I raised back then — about the limits of vectors, the crowded vendor landscape and the risks of treating vector databases as silver bullets — have played out almost exactly as predicted.
Prediction 1: The missing unicorn
Back then, I questioned whether Pinecone — the poster child of the category — would achieve unicorn status or whether it would become the “missing unicorn” of the database world. Today, that question has been answered in the most telling way possible: Pinecone is reportedly exploring a sale, struggling to break out amid fierce competition and customer churn.
Yes, Pinecone raised big rounds and signed marquee logos. But in practice, differentiation was thin. Open-source players like Milvus, Qdrant and Chroma undercut them on cost. Incumbents like Postgres (with pgVector) and Elasticsearch simply added vector support as a feature. And customers increasingly asked: “Why introduce a whole new database when my existing stack already does vectors well enough?”
The result: Pinecone, once valued near a billion dollars, is now looking for a home. The missing unicorn indeed. In September 2025, Pinecone appointed Ash Ashutosh as CEO, with founder Edo Liberty moving to a chief scientist role. The timing is telling: The leadership change comes amid increasing pressure and questions over its long-term independence.
Prediction 2: Vectors alone won’t cut it
I also argued that vector databases by themselves were not an end solution. If your use case required exactness — like searching for “Error 221” in a manual — a pure vector search would gleefully serve up “Error 222” as “close enough.” Cute in a demo, catastrophic in production.
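To see the failure mode concretely, here is a minimal sketch. It uses a toy character-trigram vector as a hypothetical stand-in for a real embedding model, but the behavior mirrors what teams hit in production: the two error codes land almost on top of each other in vector space, while the match the user actually needs is a plain string comparison.

```python
# Minimal sketch of why "close enough" fails for exact lookups.
# The trigram vector is a toy stand-in for a real embedding model;
# the point is the effect, not the specific numbers.
from collections import Counter
import math

def trigram_vector(text: str) -> Counter:
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

query, doc = "Error 221", "Error 222"
print(cosine(trigram_vector(query), trigram_vector(doc)))  # ~0.73: "close enough" by similarity
print(query == doc)  # False: the exact match a support engineer actually needs
```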
That tension between similarity and relevance has proven fatal to the myth of vector databases as all-purpose engines.
“Enterprises discovered the hard way that semantic ≠ correct.”
Developers who gleefully swapped out lexical search for vectors quickly reintroduced… lexical search in conjunction with vectors. Teams that expected vectors to “just work” ended up bolting on metadata filtering, rerankers and hand-tuned rules. By 2025, the consensus is clear: Vectors are powerful, but only as part of a hybrid stack.
Prediction 3: A crowded field becomes commoditized
The explosion of vector database startups was never sustainable. Weaviate, Milvus (via Zilliz), Chroma, Vespa, Qdrant — each claimed subtle differentiators, but to most buyers they all did the same thing: store vectors and retrieve nearest neighbors.
Today, very few of these players are breaking out. The market has fragmented, commoditized and in many ways been swallowed by incumbents. Vector search is now a checkbox feature in cloud data platforms, not a standalone moat.
Just as I wrote then: Distinguishing one vector DB from another will pose an increasing challenge. That challenge has only grown harder. Vald, Marqo, LanceDB, PostgreSQL, MySQL HeatWave, Oracle 23c, Azure SQL, Cassandra, Redis, Neo4j, SingleStore, Elasticsearch, OpenSearch, Apache Solr… the list goes on.
The new reality: Hybrid and GraphRAG
But this isn’t just a story of decline — it’s a story of evolution. Out of the ashes of vector hype, new paradigms are emerging that combine the best of multiple approaches.
Hybrid Search: Keyword + vector is now the default for serious applications. Companies learned that you need both precision and fuzziness, exactness and semantics. Tools like Apache Solr, Elasticsearch, pgVector and Pinecone’s own “cascading retrieval” embrace this.
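One common recipe for this blending is reciprocal rank fusion (RRF), sketched below. The document IDs and the two ranked lists are illustrative placeholders, not output from any particular engine.

```python
# Sketch of reciprocal rank fusion (RRF), a standard way to blend a
# lexical result list and a vector result list into one hybrid ranking.
from collections import defaultdict

def rrf(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists; k dampens the influence of top ranks."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_error_221", "doc_index_guide", "doc_release_notes"]
vector_hits = ["doc_error_222", "doc_error_221", "doc_troubleshooting"]
print(rrf([keyword_hits, vector_hits]))  # the exact lexical hit rises to the top
```

Because RRF operates on ranks rather than raw scores, it needs no calibration between the lexical and vector sides, which is one reason hybrid stacks often reach for it first.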
GraphRAG: The hottest buzzword of late 2024/2025 is GraphRAG — graph-enhanced retrieval augmented generation. By marrying vectors with knowledge graphs, GraphRAG encodes the relationships between entities that embeddings alone flatten away. The payoff is dramatic.
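The sketch below shows the GraphRAG idea in miniature, under stated assumptions: the toy graph, entities and the stubbed vector-seed step are invented for illustration, not any vendor’s API. Vector search finds entry points; the graph walk recovers the multi-hop relationships that embeddings flatten away.

```python
# Minimal GraphRAG-style sketch: vector search picks seed entities,
# then a breadth-first walk over a knowledge graph collects related
# facts to hand to the LLM as context.
from collections import deque

# entity -> list of (relation, neighbor) edges (illustrative toy data)
GRAPH = {
    "Acme Corp": [("acquired", "DataCo"), ("sued_by", "Regulator X")],
    "DataCo": [("subsidiary_of", "Acme Corp"), ("breached_in", "2024 incident")],
    "Regulator X": [("fined", "Acme Corp")],
}

def vector_seed(query: str) -> list[str]:
    # Stand-in for a real embedding search over entity descriptions.
    return ["Acme Corp"]

def expand(seeds: list[str], hops: int = 2) -> list[tuple[str, str, str]]:
    """Breadth-first expansion collecting (head, relation, tail) facts."""
    facts: list[tuple[str, str, str]] = []
    frontier = deque((s, 0) for s in seeds)
    seen = set(seeds)
    while frontier:
        entity, depth = frontier.popleft()
        if depth == hops:
            continue
        for relation, neighbor in GRAPH.get(entity, []):
            facts.append((entity, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return facts

# Multi-hop context an embedding-only store would struggle to surface:
for fact in expand(vector_seed("Who acquired the company behind the 2024 breach?")):
    print(fact)
```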
Benchmarks and evidence
Amazon’s AI blog cites benchmarks from Lettria, where hybrid GraphRAG boosted answer correctness from ~50% to 80%-plus in test datasets across finance, healthcare, industry, and law.
The GraphRAG-Bench benchmark (released May 2025) provides a rigorous evaluation of GraphRAG vs. vanilla RAG across reasoning tasks, multi-hop queries and domain challenges.
An OpenReview evaluation of RAG vs GraphRAG found that each approach has strengths depending on task — but hybrid combinations often perform best.
FalkorDB’s blog reports that when schema precision matters (structured domains), GraphRAG can outperform vector retrieval by a factor of ~3.4x on certain benchmarks.
The rise of GraphRAG underscores the larger point: Retrieval is not about any single shiny object. It’s about building retrieval systems — layered, hybrid, context-aware pipelines that give LLMs the right information, with the right precision, at the right time.
What this means going forward
The verdict is in: Vector databases were never the miracle. They were a step — an important one — in the evolution of search and retrieval. But they are not, and never were, the endgame.
The winners in this space won’t be those who sell vectors as a standalone database. They will be the ones who embed vector search into broader ecosystems — integrating graphs, metadata, rules and context engineering into cohesive platforms.
In other words: The unicorn isn’t the vector database. The unicorn is the retrieval stack.
Looking ahead: What’s next
Unified data platforms will subsume vector + graph: Expect major DB and cloud vendors to offer integrated retrieval stacks (vector + graph + full-text) as built-in capabilities.
“Retrieval engineering” will emerge as a distinct discipline: Just as MLOps matured, so too will practices around embedding tuning, hybrid ranking and graph construction.
Meta-models learning to query better: Future LLMs may learn to orchestrate which retrieval method to use per query, dynamically adjusting weighting.
Temporal and multimodal GraphRAG: Already, researchers are extending GraphRAG to be time-aware (T-GRAG) and multimodally unified (e.g. connecting images, text, video).
Open benchmarks and abstraction layers: Tools like BenchmarkQED (for RAG benchmarking) and GraphRAG-Bench will push the community toward fairer, comparably measured systems.
From shiny objects to essential infrastructure
The arc of the vector database story has followed a classic path: A pervasive hype cycle, followed by introspection, correction and maturation. In 2025, vector search is no longer the shiny object everyone pursues blindly — it’s now a critical building block within a more sophisticated, multi-pronged retrieval architecture.
The original warnings were right. Pure vector-based hopes often crash on the shoals of precision, relational complexity and enterprise constraints. Yet the technology was never wasted: It forced the industry to rethink retrieval, blending semantic, lexical and relational strategies.
If I were to write a sequel in 2027, I suspect it would frame vector databases not as unicorns, but as legacy infrastructure — foundational, but eclipsed by smarter orchestration layers, adaptive retrieval controllers and AI systems that dynamically choose which retrieval tool fits the query.
As of now, the real battle is not vector vs. keyword — it’s the orchestration, blending and discipline of building retrieval pipelines that reliably ground gen AI in facts and domain knowledge. That’s the unicorn we should be chasing now.
Amit Verma is head of engineering and AI Labs at Neuron7.
Human-centric IAM is failing: Agentic AI requires a new identity control plane
The race to deploy agentic AI is on. Across the enterprise, systems that can plan, take actions and collaborate across business applications promise unprecedented efficiency. But in the rush to automate, a critical component is being overlooked: Scalable security. We are building a workforce of digital employees without giving them a secure way to log in, access data and do their jobs without creating catastrophic risk.
The fundamental problem is that traditional identity and access management (IAM) designed for humans breaks at agentic scale. Controls like static roles, long-lived passwords and one-time approvals are useless when non-human identities can outnumber human ones by 10 to one. To harness the power of agentic AI, identity must evolve from a simple login gatekeeper into the dynamic control plane for your entire AI operation.
“The fastest path to responsible AI is to avoid real data. Use synthetic data to prove value, then earn the right to touch the real thing.” — Shawn Kanungo, keynote speaker and innovation strategist; bestselling author of The Bold Ones
Why your human-centric IAM is a sitting duck
Agentic AI does not just use software; it behaves like a user. It authenticates to systems, assumes roles and calls APIs. If you treat these agents as mere features of an application, you invite invisible privilege creep and untraceable actions. A single over-permissioned agent can exfiltrate data or trigger erroneous business processes at machine speed, with no one the wiser until it is too late.
The static nature of legacy IAM is the core vulnerability. You cannot pre-define a fixed role for an agent whose tasks and required data access might change daily. The only way to keep access decisions accurate is to move policy enforcement from a one-time grant to a continuous, runtime evaluation.
Prove value before production data
Kanungo’s guidance offers a practical on-ramp. Start with synthetic or masked datasets to validate agent workflows, scopes and guardrails. Once your policies, logs and break-glass paths hold up in this sandbox, you can graduate agents to real data with confidence and clear audit evidence.
Building an identity-centric operating model for AI
Securing this new workforce requires a shift in mindset. Each AI agent must be treated as a first-class citizen within your identity ecosystem.
First, every agent needs a unique, verifiable identity. This is not just a technical ID; it must be linked to a human owner, a specific business use case and a software bill of materials (SBOM). The era of shared service accounts is over; they are the equivalent of giving a master key to a faceless crowd.
Second, replace set-and-forget roles with session-based, risk-aware permissions. Access should be granted just in time, scoped to the immediate task and the minimum necessary dataset, then automatically revoked when the job is complete. Think of it as giving an agent a key to a single room for one meeting, not the master key to the entire building.
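A minimal sketch of that model follows, assuming a hypothetical AgentGrant record and in-process checks; a real deployment would delegate issuance and revocation to an IAM or secrets platform rather than in-process objects.

```python
# Sketch of just-in-time, task-scoped credentials for an agent.
import time
import secrets
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentGrant:
    agent_id: str
    scopes: frozenset[str]  # e.g. {"read:tickets"} — only what the task needs
    expires_at: float       # epoch seconds; minutes, not months
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))

def issue_grant(agent_id: str, task_scopes: set[str], ttl_s: int = 300) -> AgentGrant:
    """Mint a short-lived grant scoped to the immediate task."""
    return AgentGrant(agent_id, frozenset(task_scopes), time.time() + ttl_s)

def authorize(grant: AgentGrant, scope: str) -> bool:
    # Revocation is implicit: once the grant expires, access is gone.
    return time.time() < grant.expires_at and scope in grant.scopes

grant = issue_grant("support-agent-42", {"read:tickets"})
print(authorize(grant, "read:tickets"))  # True, for the next five minutes
print(authorize(grant, "read:payroll"))  # False: never in scope
```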
Three pillars of a scalable agent security architecture
Context-aware authorization at the core. Authorization can no longer be a simple yes or no at the door. It must be a continuous conversation. Systems should evaluate context in real time. Is the agent’s digital posture attested? Is it requesting data typical for its purpose? Is this access occurring during a normal operational window? This dynamic evaluation enables both security and speed.
Purpose-bound data access at the edge. The final line of defense is the data layer itself. By embedding policy enforcement directly into the data query engine, you can enforce row-level and column-level security based on the agent’s declared purpose. A customer service agent should be automatically blocked from running a query that appears designed for financial analysis. Purpose binding ensures data is used as intended, not merely accessed by an authorized identity.
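A sketch of purpose binding at the query layer, where the purposes and column sets are illustrative assumptions:

```python
# Sketch of purpose-bound data access: the agent declares a purpose,
# and policy maps each purpose to the columns it may read.
PURPOSE_COLUMNS = {
    "customer_support": {"ticket_id", "status", "customer_name"},
    "financial_analysis": {"ticket_id", "refund_amount", "billing_code"},
}

def enforce_purpose(purpose: str, requested_columns: set[str]) -> set[str]:
    allowed = PURPOSE_COLUMNS.get(purpose, set())
    blocked = requested_columns - allowed
    if blocked:
        raise PermissionError(f"purpose '{purpose}' may not read {sorted(blocked)}")
    return requested_columns

# A support agent probing financial columns is stopped at the data layer,
# even though its identity is otherwise authenticated:
try:
    enforce_purpose("customer_support", {"ticket_id", "refund_amount"})
except PermissionError as err:
    print(err)  # purpose 'customer_support' may not read ['refund_amount']
```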
Tamper-evident evidence by default. In a world of autonomous actions, auditability is non-negotiable. Every access decision, data query and API call should be immutably logged, capturing the who, what, where and why. Link logs so they are tamper evident and replayable for auditors or incident responders, providing a clear narrative of every agent’s activities.
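Hash chaining is one standard way to make such a log tamper evident. The sketch below links each entry to the hash of the previous one, so any edit to history breaks verification; the event fields are placeholders.

```python
# Sketch of a tamper-evident, hash-chained audit log.
import hashlib
import json

def append_entry(log: list[dict], event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"event": event, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})

def verify(log: list[dict]) -> bool:
    """Replay the chain; any edited entry breaks the hash linkage."""
    prev = "0" * 64
    for entry in log:
        body = {"event": entry["event"], "prev": entry["prev"]}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True

log: list[dict] = []
append_entry(log, {"agent": "support-agent-42", "action": "read", "table": "tickets"})
append_entry(log, {"agent": "support-agent-42", "action": "query", "table": "billing"})
print(verify(log))                    # True
log[0]["event"]["table"] = "payroll"  # tamper with history...
print(verify(log))                    # False: the chain exposes it
```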
A practical roadmap to get started
Begin with an identity inventory. Catalog all non-human identities and service accounts. You will likely find sharing and over-provisioning. Begin issuing unique identities for each agent workload.
Pilot a just-in-time access platform. Implement a tool that grants short-lived, scoped credentials for a specific project. This proves the concept and shows the operational benefits.
Mandate short-lived credentials. Issue tokens that expire in minutes, not months. Seek out and remove static API keys and secrets from code and configuration.
Stand up a synthetic data sandbox. Validate agent workflows, scopes, prompts and policies on synthetic or masked data first. Promote to real data only after controls, logs and egress policies pass.
Conduct an agent incident tabletop drill. Practice responses to a leaked credential, a prompt injection or a tool escalation. Prove you can revoke access, rotate credentials and isolate an agent in minutes.
The bottom line
You cannot manage an agentic, AI-driven future with human-era identity tools. The organizations that will win recognize identity as the central nervous system for AI operations. Make identity the control plane, move authorization to runtime, bind data access to purpose and prove value on synthetic data before touching the real thing. Do that, and you can scale to a million agents without scaling your breach risk.
Michelle Buckner is a former NASA Information System Security Officer (ISSO).
Google’s new AI training method helps small models tackle complex reasoning
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning tasks. Supervised Reinforcement Learning (SRL) reformulates problem-solving as a sequence of logical “actions,” providing rich learning signals during the training process.
This approach enables smaller models to learn complex problems that were previously out of reach for other common training techniques. Experiments show that SRL not only excels on math reasoning benchmarks but also generalizes effectively to agentic software engineering tasks.
SRL is a versatile training framework that can elevate smaller and less expensive models to higher reasoning abilities.
The limits of current LLM reasoning training
Recent advances in training large language models (LLMs) for reasoning have largely been driven by reinforcement learning with verifiable rewards (RLVR), a method where a model is rewarded based on the correctness of its final answer. By repeatedly trying to solve problems and getting feedback on the final outcome, the model gradually learns effective problem-solving strategies.
However, the success of this outcome-based approach depends on the model's ability to discover a correct solution within a limited number of attempts, or "rollouts." Since each rollout is computationally expensive, models can't try indefinitely. This method hits a wall when problems are so difficult that the model rarely, if ever, finds the right answer within its budget.
This creates a critical learning bottleneck. In many multi-step reasoning problems, a model might correctly solve several steps but get derailed by a single mistake, leading to an incorrect answer. With RLVR, this entire effort receives a negative reward, and the model learns nothing from its partially correct work. It’s an all-or-nothing approach that fails to provide granular feedback and provides sparse rewards.
An alternative method is supervised fine-tuning (SFT), where the model learns from examples containing the full reasoning process laid out by experts. While SFT can instill reasoning abilities, it often leads to overfitting (the model simply learns to imitate the trajectories in the training data instead of learning to generalize to problems beyond the examples it has seen). This issue is made worse by the fact that high-quality, human-created training data is both scarce and expensive to produce.
As the paper notes, these limitations leave "a critical gap for training small open-source models to effectively learn difficult problems."
How supervised reinforcement learning works
SRL introduces a framework that reformulates problem-solving as a "sequential decision-making process," striking a balance between pure outcome-based RL and pure imitation learning. Instead of optimizing only for the final answer or forcing the model to imitate an expert's entire thought process, SRL teaches the model to reproduce a sequence of key actions that form the backbone of expert reasoning. This allows the model to learn to take actions similar to an expert while developing its own internal reasoning style.
In the SRL framework, expert demonstrations are broken down into a series of intermediate, concrete actions, each representing a meaningful step. For a math problem, an action might be an algebraic manipulation. For a software engineering agent, it could be a command executed in a code repository. To generate training data, SRL uses a powerful teacher model to create solution trajectories, which are then used to train a smaller model.
According to I-Hung Hsu, a research scientist at Google and co-author of the paper, this middle-ground approach is key to its effectiveness in real-world scenarios. "SRL sits in the middle: It captures the structured flexibility of real-world problem solving, where there are multiple valid strategies but also clear notions of what ‘good reasoning’ looks like at each step," Hsu told VentureBeat. "This makes SRL suitable for domains like data science automation or probably supply chain optimization — tasks that reward sound intermediate reasoning rather than mere final answers."
During training, the model first generates an "inner monologue" (its internal reasoning process, enclosed in <think> tags) before committing to an action. At each step, SRL provides a reward based on the similarity between the model's predicted action and the expert's action. This step-wise reward system provides dense, fine-grained feedback, allowing the model to learn and improve even if its overall solution isn't perfect. This solves the sparse reward problem RLVR faces.
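The sketch below illustrates the step-wise reward idea in simplified form. The paper’s actual similarity metric and action format may differ; difflib’s sequence ratio and the toy action strings are used purely as stand-ins.

```python
# Simplified sketch of SRL-style dense, per-step rewards: each predicted
# action is scored by its similarity to the expert's action at that step,
# instead of a single pass/fail signal on the final answer.
from difflib import SequenceMatcher

def step_reward(predicted_action: str, expert_action: str) -> float:
    """Per-step reward in [0, 1] based on string similarity (illustrative)."""
    return SequenceMatcher(None, predicted_action, expert_action).ratio()

expert_steps = ["expand (x+1)^2", "collect like terms", "solve for x"]
predicted_steps = ["expand (x+1)^2", "combine like terms", "factor out x"]

rewards = [step_reward(p, e) for p, e in zip(predicted_steps, expert_steps)]
print(rewards)  # partial credit per step, even though the last step diverges
```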
SRL in action
The researchers' experiments show that SRL significantly outperforms strong baselines in both challenging mathematical reasoning and agentic software engineering benchmarks. They also observed that SRL encourages more flexible and sophisticated reasoning patterns in models, such as interleaved planning and self-verification, which improve solution quality without just making the outputs longer.
For enterprise leaders, performance gains are only valuable if they don't come with runaway costs. Hsu clarifies that SRL-trained models are more efficient in their reasoning. "The gains come from better reasoning quality and structure, not from verbosity," he said. "In terms of efficiency, SRL-trained models are roughly on par with the base model in token usage... while SRL isn’t designed to reduce inference cost, it achieves stronger reasoning performance without increasing it."
For the math tests, the team fine-tuned Qwen2.5-7B-Instruct on a dataset of 1,000 difficult math questions. They compared its performance against models trained with SFT and RLVR (using the GRPO algorithm common in models like DeepSeek-R1) on four competition-level math benchmarks. The SRL-trained model achieved a substantial 3.0% average performance boost over other methods.
The team extended SRL to agentic software engineering, a domain critical for enterprise automation. They trained a coding-specialized model, Qwen2.5-Coder-7B-Instruct, on 5,000 expert trajectories of agents interacting with a coding environment. The SRL-trained model was benchmarked against the original base model and SWE-Gym-7B, a strong baseline fine-tuned with SFT. SRL achieved a 14.8% task resolution rate, a 74% relative improvement over the SFT-based model. This shows SRL's ability to train more competent AI agents for complex, real-world programming tasks.
A new standard for high-stakes AI?
The paper's strongest results came from combining methods: First, using SRL to teach foundational reasoning, then using RLVR to refine that skill. In their experiments, when the researchers used SRL as a pre-training and applied RLVR in post-training, they observed a 3.7% average increase, demonstrating a powerful curriculum learning strategy.
This raises the question of whether this could become a new blueprint for building specialized AI.
"We view SRL as a strong foundation," Hsu said. "In a sense, SRL provides a curriculum — teaching models to think and act step by step — before we refine those behaviors with outcome-based reinforcement learning. This SRL-first approach not only stabilizes the later RL stage but also makes reasoning more interpretable and generalizable, which is critical for high-stakes applications."
Looking ahead, Hsu acknowledges that scaling this pipeline still faces challenges, particularly the high cost and complexity of end-to-end RLVR for agentic tasks. However, he is optimistic about the path forward. "While high-quality expert trajectories remain important," he concluded, "we think the next big leap will come from automating their generation and filtering — leveraging strong teacher models or even self-improving student models to bootstrap new data."