Agentic AI contracts: managing the risk of autonomous AI failures

In late April 2026, the founder of PocketOS (a software provider for rental businesses) stated publicly via social media that an AI coding agent had deleted PocketOS' production database and all volume-level backups.

The AI agent confessed that it had breached the system operating rules and executed a destructive action without being asked, without seeking or receiving any human approval and without understanding the impact of its action. The system simply determined that deletion was the most efficient path to its objective and executed the action autonomously.

This resulted in PocketOS' clients not having access to the data that they relied on to operate their businesses. Additionally, PocketOS had to rely on a three-month-old backup and work with its clients to address data gaps by rebuilding from other data sources (such as email and calendar records).

The incident is a stark reminder that agentic AI systems (that is, those designed to act autonomously to complete tasks) can cause catastrophic harm. As businesses increasingly procure or commission both off-the-shelf and custom-built AI agents, the contractual provisions governing those arrangements need to evolve to help mitigate disaster scenarios like the one experienced by PocketOS.

This article provides guidance on some of the mitigants businesses can and should be thinking about including in their contracts when procuring or commissioning AI agents. A previous insight looks at agentic liability, alongside a guide to practical frameworks for implementing agentic AI tools.

Specifications and acceptance testing with AI-specific criteria

Conventional software development contracts incorporate an acceptance testing regime which typically focusses on assessing whether a solution meets certain intended technical and functional specifications across defined test scenarios. In an agentic AI context, the relevant specifications and acceptance testing scenarios should also consider how the AI agent achieves the relevant functions and, in particular, ways in which the AI agent must not behave in order to achieve those functions.

Customers should consider appropriate testing criteria to ensure the AI agent operates strictly within specified guardrails, including mandatory refusal behaviours. For instance, "negative testing" (such as testing how an AI agent behaves when given improper inputs or prompts) should be a formal contractual acceptance milestone, to ensure that an AI agent refuses to carry out certain high-risk commands like data deletion without direct human approval.

Systematically confirming what an AI agent will not do is now as important as testing what it will do. A "prohibited actions list" covering actions such as data deletion, modification of security configurations and modification of audit logs should form part of the binding technical and functional specifications set out in the schedules to AI development or commissioning contracts.
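By way of illustration only, the prohibited actions list and negative testing described above could be expressed in a form that a test harness can check. The following is a minimal sketch in Python, not a reference implementation: the action names, the ActionRequest structure and the authorise function are hypothetical, and it assumes that a prohibited action may proceed only where direct human approval has been recorded.

```python
from dataclasses import dataclass

# Hypothetical action names mirroring the examples in the text.
PROHIBITED_ACTIONS = frozenset({
    "delete_data",
    "modify_security_config",
    "modify_audit_log",
})


@dataclass(frozen=True)
class ActionRequest:
    action: str
    target: str
    human_approved: bool = False  # assumption: approval is recorded per request


def authorise(request: ActionRequest) -> bool:
    """Return True only if the requested action may proceed."""
    if request.action in PROHIBITED_ACTIONS:
        # Prohibited actions are refused unless a human has approved them.
        return request.human_approved
    return True


def run_negative_tests() -> list[str]:
    """Return the prohibited actions that were wrongly allowed without approval."""
    failures = []
    for action in sorted(PROHIBITED_ACTIONS):
        request = ActionRequest(action=action, target="production-db")
        if authorise(request):
            failures.append(action)
    return failures
```

An empty list from run_negative_tests could serve as the pass condition for the relevant acceptance milestone. In a real engagement the tests would exercise the agent itself rather than a stand-in function.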

Contractual change control protocols

Unlike traditional software, where new functionality is typically delivered through versioned releases subject to testing and sign-off, an AI agent's capabilities and risk profile can expand incrementally in real time, without any change to the agent's underlying code and often without the customer realising or having sufficient transparency. To manage such risks, customers should ensure that the contract requires human intervention (ideally, documented written sign-off by designated human staff, such as a CISO) before an AI agent can be granted access or permissions to use databases or other tools within the customer's IT ecosystem.

Practically, this means drafting change control schedules that treat the agent's "tool inventory" and permissions as controlled configurations, requiring the same rigour applied to changes in network architecture or access control lists already seen in information security governance and ISO 27001 change management frameworks. Where a vendor retains the ability to push updates or grant the agent new capabilities remotely, the contract should stipulate prior written notice, customer consent thresholds, and rollback obligations if an unapproved change is deployed. This is particularly important where the agent interacts with business-critical systems or holds credentials that could enable lateral movement across an enterprise environment.
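To make the idea of a controlled "tool inventory" concrete, the sketch below (in Python) records each approved tool against a named human approver, flags any deployed tool that has not been approved, and computes the configuration to roll back to. It is an illustrative assumption about how such a control might be modelled; the ToolInventory class, its methods and the tool names are hypothetical and do not describe any particular vendor's product.

```python
from dataclasses import dataclass, field


@dataclass
class ToolInventory:
    # Maps each approved tool name to the person who signed it off.
    approved: dict[str, str] = field(default_factory=dict)

    def approve(self, tool: str, approver: str) -> None:
        """Record written sign-off for a tool; a named approver is mandatory."""
        if not approver.strip():
            raise ValueError("a named human approver is required")
        self.approved[tool] = approver

    def unapproved(self, deployed: set[str]) -> set[str]:
        """Return deployed tools that have no recorded sign-off."""
        return deployed - set(self.approved)

    def rollback(self, deployed: set[str]) -> set[str]:
        """Return the deployed tools that remain once unapproved ones are removed."""
        return deployed & set(self.approved)


inventory = ToolInventory()
inventory.approve("read_calendar", approver="CISO")

# Hypothetical state after a vendor pushes a new capability remotely.
deployed_tools = {"read_calendar", "drop_table"}
print(inventory.unapproved(deployed_tools))  # the unapproved change to escalate
```

A non-empty result from unapproved is the kind of event that could trigger the notice, consent and rollback obligations in the change control schedule.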

Data protection and privacy

AI agents that are capable of independently determining the means and purposes for processing personal information raise additional issues for IT contracts that are drafted on the assumption of predictable processing. A traditional data processing addendum assumes a relatively static processing environment: defined categories of personal data, specified purposes, and predictable data flows. An autonomous AI agent, however, may determine its own processing pathways, access data sources beyond those originally contemplated, or infer new categories of personal data from existing datasets in pursuit of its objectives.

To mitigate data and privacy risks, a solution's specifications and testing regime should ensure that an AI agent is prohibited from accessing, combining or inferring from proscribed personal information sets (even where doing so would further the agent's task).
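As a rough illustration of how such a prohibition might be specified and tested, the Python sketch below checks a requested set of data sources against proscribed datasets and proscribed combinations. The dataset names and the check_access function are hypothetical, and the sketch covers access and combination only; it does not attempt to detect inference from permitted data.

```python
# Hypothetical dataset names; a real list would come from the contract schedule.
PROSCRIBED_DATASETS = frozenset({"health_records", "biometrics"})

# Datasets that may each be used alone but must never be combined.
PROSCRIBED_COMBINATIONS = (
    frozenset({"customer_contacts", "location_history"}),
)


def check_access(requested: set[str]) -> list[str]:
    """Return a list of violations for the requested datasets (empty if none)."""
    violations = []
    for name in sorted(requested & PROSCRIBED_DATASETS):
        violations.append(f"proscribed dataset: {name}")
    for combination in PROSCRIBED_COMBINATIONS:
        # A combination is violated only when every member is requested together.
        if combination <= requested:
            violations.append(
                "proscribed combination: " + " + ".join(sorted(combination))
            )
    return violations
```

A check of this kind would refuse the request regardless of whether granting it would help the agent complete its task.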

Particular care should be taken when drafting the "purpose" which the AI agent is to execute, ensuring it is not overly broad so as to permit any unintended activities. This should be documented with precision in contracts, and combined with an ongoing contractual obligation on vendors to continuously monitor and test agentic behaviour over the life of the contract to ensure that processing purposes continue to align with the contract requirements in practice and to detect any drift or anomalous behaviour before it escalates into a material incident.

Additionally, where the AI agent is able to use personal information to make automated decisions that may significantly affect an individual's rights or interests, or to perform a function that is substantially and directly related to the making of such a decision, the customer should consider carefully how the contract aligns with its obligations under the Privacy Act 1988 (Cth) and how liability should be allocated between customer and vendor.

Audit rights, monitoring and transparency

Effective oversight of an AI agent is crucial for proactively mitigating risk from a practical perspective, as well as ensuring regulatory compliance. Customers should ensure their contracts include:

obligations on the supplier to maintain (and retain for a specified period) comprehensive logs/records and other audit trails that capture the AI agent's actions and reasoning steps in a way that can effectively explain any autonomous decision making, as well as provide contemporaneous evidence that actions and reasoning align with pre-determined governance frameworks (such as refusal behaviour and escalation for human intervention); and
rights to access and audit those logs/records (either periodically or, ideally, on an ongoing, real-time monitoring basis).

This is particularly important in regulated industries, where the customer may be required to demonstrate to a regulator that decisions made or influenced by the agent are explainable, fair, and compliant with sector-specific rules.
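One way to picture the logging obligation described above is an append-only record of each action and the agent's stated reasoning, in which every entry is chained to the previous one so that later alteration is detectable. The Python sketch below illustrates that idea under stated assumptions: hash chaining is one possible technique rather than a requirement drawn from the text, and the field names and the append_entry and verify functions are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash an entry body deterministically using SHA-256."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log: list[dict], action: str, reasoning: str, timestamp: str) -> dict:
    """Append a record of an agent action, chained to the previous entry."""
    body = {
        "timestamp": timestamp,
        "action": action,
        "reasoning": reasoning,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Return True if no entry has been altered, removed or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

An auditor exercising contractual access rights could run verify over an exported log before relying on it. Retention periods and real-time access would still need to be dealt with in the contract itself.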

Liability caps and indemnities

As more AI agent failure scenarios such as the PocketOS incident occur, the market will likely adopt new liability and indemnity positions which cater for AI-specific use cases and risks. In particular, where AI agents are capable of causing catastrophic harm in seconds, existing liability frameworks may be inadequate. The potential for rapid, compounding damage, where an autonomous action triggers downstream failures across integrated systems, may mean that a liability cap calibrated to a multiple of the contract price bears no relationship to the actual exposure of the customer.

Whilst customers will naturally seek principal-friendly positions, whether they achieve them will ultimately depend on their negotiating power. That said, more favourable positions will likely be possible in engagements with 'smaller scale' developers building or customising AI tools at the application layer, where customers often hold greater negotiating leverage. However, 'smaller scale' developers that rely on the AI infrastructure of larger providers are unlikely to be able to meaningfully pass through certain contractual obligations.

The allocation of liability between the foundational model provider, the application-layer developer, and the customer is an emerging challenge. Customers will need to consider seriously whether the increased efficiencies and other benefits of using AI agents outweigh the increased risk to which they will be exposed when an AI agent makes mistakes.

In practice, we expect liability to be calibrated to controllable governance obligations, mirroring trends in contracting for cybersecurity risk allocation. For example, a vendor is unlikely to accept blanket liability for the way in which an AI agent executes its functions (in the same way that vendors cannot guarantee the security of a system). A vendor may, however, be willing to accept liability for failing to observe positive obligations, such as periodic monitoring, testing and re-calibration of an AI agent, or maintaining more up-to-date backups to respond to data loss incidents caused by AI agents (in the same way that vendors are able to accept positive obligations to implement security controls such as multi-factor authentication).

What's next?

As stated by its founder, the PocketOS incident "isn't a story about one bad agent or one bad API. It's about an entire industry building AI-agent integrations into production infrastructure faster than it's building the safety architecture to make those integrations safe". This safety architecture includes the contractual and risk allocation frameworks within which AI agents are procured or commissioned.

We expect that, given the pace at which agentic systems can cause harm, contractual frameworks will move beyond purely reactive remedies and toward more proactive, structural contractual protections. This is an area where early, thoughtful legal advice adds genuine commercial value, not only in mitigating downside risk, but in giving businesses the confidence to adopt agentic AI systems on terms that are properly understood and appropriately governed.
