Microsoft, Google DeepMind and xAI agreed to give U.S. government scientists early access to AI models for national security testing. The work sits with the Center for AI Standards and Innovation, or CAISI, housed within the Commerce Department’s NIST, which says it focuses on demonstrable risks such as cybersecurity, biosecurity and chemical weapons misuse.
According to Reuters, the government will evaluate models before deployment, looking for security risks and failure modes that could matter once the systems are public. Microsoft said the effort will include adversarial assessment work that probes unanticipated behavior, plus shared datasets and workflows for testing advanced models. Microsoft also said it has signed a similar agreement with the UK’s AI Security Institute.
Often, the most expensive AI mistakes are the ones found after release: once a frontier model is in the wild, a flaw can be copied, scaled or exploited quickly. CAISI’s mandate is to work with private developers, conduct unclassified evaluations, and assess risks related to national security. Reuters says the center has already completed more than 40 evaluations, including on state-of-the-art models not yet available to the public.
AI safety failures are not limited to bad answers or awkward responses. U.S. officials are looking at threats ranging from cyberattacks to military misuse, and the center’s public guidance points to the same pressure points: adversarial behavior, security vulnerabilities and misuse pathways.
Google DeepMind will provide access to proprietary models and data, while Microsoft said it will help build shared datasets and workflows. xAI did not immediately comment, and Google declined to comment at the time of Reuters’ report. That lack of detailed public disclosure is part of the story too: the broad framework is visible, but the exact testing scope, model versions and evaluation methods are not fully public.
What it means
Pre-release AI testing is moving closer to becoming a normal checkpoint for frontier models. The U.S. government is not just reacting after a launch; it is asking for access to AI models before they are released to the public.
CAISI says it works through voluntary agreements with private-sector developers, which means this is still a cooperative model rather than a hard licensing system. But once major labs begin accepting that process, it can shape the standard for the rest of the market.
AI policy is now being written in national security language, not only consumer protection language. U.S. officials are alarmed by the hacking capabilities of advanced models and want to identify cyber and military risks before the tools spread widely. NIST’s CAISI page says the center also coordinates with the Pentagon, Energy, Homeland Security, OSTP and the intelligence community.
Why is this pre-test important?
A pre-test of AI models can surface issues before they reach the public. It gives regulators an opportunity to see how a model actually behaves, rather than how it performs in marketing demos.
This marks a shift: the next wave of AI models will likely face more scrutiny before launch than earlier generations did.
The open question
The Commerce Department removed the main webpage describing the agreement, without explaining why. That does not erase the deal, but it does show that the politics around AI oversight are still unsettled. The architecture of review is being built in public, while the rules around it are still being negotiated behind the scenes.



















