AI-on-AI Perfidy and the Law of Armed Conflict

by Samuel White, Jonathan Kwik, Dora Velenczei | Jun 8, 2026

Picture the following scenario: a forward-deployed artificial intelligence (AI) enabled surveillance platform detects a convoy moving along a contested highway at dusk. Its classification system flags the convoy as medical; the vehicles emit recognised humanitarian transponder codes, their movement profile aligns with evacuation patterns, and their digital signatures match entries in a protected registry. The system recommends non-engagement. An integrated fires platform, downstream in the kill chain, accepts the classification and withholds strike authority.

Minutes later, the convoy deploys loitering munitions, raining devastation on the unprepared defenders and causing human casualties. The convoy was composed of unmanned ground-based carriers, deployed with a simple aim: to penetrate as far as possible into contested space before releasing the loitering drones.

No human has misread a white flag. No soldier has feigned surrender. Instead, one machine has induced another to comply with the law and then exploited that compliance. The deception was not aimed at a human mind but at an enemy algorithm trained to recognise legally protected statuses. The question that follows is as stark as it is unfamiliar: is this perfidy?

A New Type of Perfidy

At first glance, the legal framework appears settled. Article 37 of the 1977 Additional Protocol I (AP I) prohibits killing, injuring, or capturing an adversary by resort to perfidy. While non-party States, such as Israel and the United States, dispute the scope of the harm threshold of the prohibition under Article 37, the provision defines perfidy as acts inviting the confidence of an adversary to lead them to believe they are entitled to protection under international law, with the intent to betray that confidence. The paradigm examples are well known: feigning surrender or incapacitation; or simulating protected status by misusing signs, emblems, or uniforms (such as the red cross or symbols of the UN, neutral States, or States not party to the armed conflict). As wider studies show, the prohibition against perfidy is a norm of conflict that has evolved concurrently across multiple legal systems. The prohibition of perfidy is among the foundational norms upon which the modern law of armed conflict (LOAC) has developed. Without trust, there can be no restraint (Corn 2023, p. 443).

The prohibition is not peripheral. It is vital to preserve combatants’ confidence in signals of protection without rendering them fatally vulnerable. Yet each element of perfidy—long consistent in human-centred warfare—becomes unsettled when translated into machine-mediated interactions.

The opening scenario is not science fiction. It reflects a plausible trajectory, recently discussed at the National University of Singapore’s Centre for International Law workshop on AI and IHL (April 2026), in which military systems increasingly rely on automated classification to make or inform targeting decisions. In that environment, the traditional elements of perfidy must be reconsidered: not abandoned, but interrogated for their capacity to operate in a domain where cognition is no longer exclusively human.

Legal & Definitional Frameworks

A useful starting point is definitional clarity. We define “AI-to-AI perfidy” as the situation where one system deliberately simulates protected status or signals in a manner designed to trigger law-compliant restraint in an opposing system, with the intent to exploit such restraint for attack. This framing preserves the structure of Article 37 AP I while translating its components into a technologically mediated setting. It requires, first, the simulation of a protected cue, whether through digital identifiers, electromagnetic signatures, or other machine-readable indicators. Second, the deception must be system-targeted; i.e., crafted to exploit the recognition architecture of the adversary’s AI system, rather than (or in addition to) human perception. Third, it must induce compliance behaviour, such as withholding fire or rerouting assets. Finally, that compliance must be betrayed and exploited to cause human harm.

This definition is deliberately narrow. It focuses on deliberate, bad-faith conduct attributable to the AI system’s human designers or operators, rather than emergent behaviour discovered autonomously by learning systems without foresight (cf. Kwik & Wiese 2025). That limitation is not because the latter is unimportant, but because the former most closely tracks the established legal category of perfidy, which has always been anchored in intent. The question is not whether an AI system can form “intent” in any meaningful sense, but whether those who design, deploy, or direct it do so with the purpose of exploiting legal protections.
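
The cumulative structure of this definition can be made explicit with a short sketch. It is an illustration only, not a legal test: the `Incident` type, its field names, and the two helper functions are hypothetical, and real cases turn on evidence and judgment rather than boolean flags. The sketch assumes the four elements set out above plus the human-intent limitation, and treats the absence of any one of them as taking the conduct outside the narrow definition.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Incident:
    """Hypothetical fact pattern; every field name is illustrative only."""
    simulated_protected_cue: bool        # protected status or signal was simulated
    system_targeted: bool                # crafted to exploit the adversary's AI recognition
    induced_compliance: bool             # e.g. fire withheld or assets rerouted
    compliance_exploited_for_harm: bool  # restraint betrayed to cause human harm
    human_intent: bool                   # deliberate design or direction by humans


def missing_elements(incident: Incident) -> list[str]:
    """Return the names of the elements that are not satisfied."""
    return [f.name for f in fields(incident) if not getattr(incident, f.name)]


def is_ai_to_ai_perfidy(incident: Incident) -> bool:
    """The elements are cumulative: one missing element defeats the label."""
    return not missing_elements(incident)


# The opening convoy scenario, assuming deliberate design by its operators.
convoy = Incident(True, True, True, True, True)
# Same facts, but the tactic emerged from learning without human foresight.
emergent = replace(convoy, human_intent=False)

print(is_ai_to_ai_perfidy(convoy))  # True
print(missing_elements(emergent))   # ['human_intent']
```

The second case shows why the definition is narrow: emergent deception with otherwise identical effects falls outside it solely because the intent element is missing.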

Even within this narrower frame, however, a central difficulty emerges. Article 37(1) AP I speaks of “inviting the confidence of an adversary to lead him to believe that he is entitled to, or is obliged to accord, protection” under LOAC, with the intent of betraying that confidence. Confidence, and the reliance it entails, are inherently cognitive concepts. They presuppose belief, trust, and the capacity to be deceived in a way that carries normative weight.

A fooled AI system does not believe. It does not trust. It processes inputs and produces outputs based on statistical inference. Can such a system be said to “rely” on a signal in the legal sense, or does it merely weigh that signal probabilistically within a classification model? Can a system be led to “believe” that it is obliged to accord protection, or is it simply executing a predefined process? (Boothby 2025, p. 272) Is a human required to be in the loop of the “fooled” AI system for perfidy to be effectuated, or should it be interpreted in a wider, more protective manner?

This is not a semantic quibble. If reliance is understood strictly as attaching to a human mental state, then machine-mediated deception risks falling outside the scope of perfidy altogether. The law would, on this reading, regulate only those deceptions that operate on human cognition, leaving a gap where deception targets algorithmic processes (Kwik 2024, p. 756). That outcome would be difficult to reconcile with the underlying purpose of the prohibition: to “prevent the abuse, and the consequent undermining, of the protection afforded by the law of armed conflict” (UK 2004, §5.9.3). Perfidy is not concerned with protecting feelings of trust for their own sake. It is concerned with preserving the functional reliability of legal signals that enable restraint in warfare (Sandoz et al. 1987, para. 1500). Whether the recipient of those signals is a human or a machine acting on behalf of a human should not, in principle, determine whether the protection applies.

A Move Towards Functionality

A more persuasive approach is to treat reliance functionally. On this view, the relevant question is whether the adversarial target, human or machine, is induced to act in a manner consistent with the legal obligation to honour protection on the basis of the transmitted signal. If an AI system classifies an object as protected and, as a result, suppresses an otherwise lawful attack, it has, in effect, “relied” on the signal. The cognitive substrate differs, but the operational consequence is the same. The invitation of confidence is translated into an invitation of compliance within a decision architecture designed to operationalise the law.

Adopting a functional understanding of reliance does not stretch the law beyond recognition. LOAC has long been interpreted in light of its object and purpose, accommodating technological change without constant textual revision. The prohibition on perfidy has been applied across diverse contexts, from battlefield ruses to misuse of protective emblems in complex operational environments. Extending it to cover machine-mediated deception is a continuation of that interpretive tradition rather than a departure from it.

If that is correct, then deliberate AI-to-AI perfidy is already captured by existing law. The medium has changed, but the essential wrong has not. A party that designs a system to simulate protected status in order to induce legally compliant restraint, and then exploits that restraint, is engaging in the very conduct Article 37 AP I seeks to prohibit. The fact that the deception operates through digital signals rather than visual cues, and is processed by algorithms rather than human perception, does not alter its character.

Recognising this continuity is not merely doctrinally satisfying. It is practically necessary. To conclude otherwise would create what might be described as a compliance paradox. The more faithfully a military designs its AI systems to recognise and respect protected status—as imperatively required by LOAC, particularly the principle of distinction (Boothby 2025, p. 102)—the more vulnerable those systems become to exploitation by an adversary willing to simulate such status. If that exploitation were not legally prohibited as perfidy, it would create a powerful incentive for States to degrade or disregard protective signals in automated systems (Roff 2016, p. 205). The net effect would be to erode the very protections the law seeks to preserve.

Novel Features of Machine-Mediated Deception

This paradox mirrors the historical rationale for the prohibition of perfidy (Watts 2014). The law restricts certain forms of deception not because deception is inherently objectionable in warfare, but because these deceptions undermine the possibility of restraint itself. If signals of surrender or medical protection cannot be trusted, they cease to function, and the protective regime of LOAC collapses. AI-enabled systems may amplify this dynamic. They can process signals at machine speed and scale, making them both highly responsive to lawful indicators and highly susceptible to systematic exploitation.

The technological context introduces genuinely novel features that complicate the analysis. One is scalability. Human-perpetrated perfidy is constrained by the number of individuals willing and able to engage in it. AI systems, once trained and deployed, can replicate and execute deceptive strategies across entire fleets of platforms. A spoofing technique that is known to succeed against enemy systems can be rapidly scaled, creating widespread effects across the battlespace. This amplifies the potential harm and the strategic significance of such conduct, as well as its corrosive effect on trust.

Another novel feature is the nature of the deception itself. Traditional perfidy involves the transmission of signals that are recognised and understood within established frameworks, such as flags, uniforms, or emblems. In AI-mediated contexts, deception may operate at a deeper level, targeting the learned models of the adversary’s systems. An attacker might craft inputs that cause a classifier to misidentify a target as protected, not by presenting an obvious signal like a white flag, but by exploiting the statistical patterns the system has learned to associate with protected status. The system is induced, through signal inputs or “adversarial perturbations,” to classify the target as protected.
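
A toy example may make the mechanism concrete. The sketch below assumes a hypothetical three-feature linear classifier whose weights, bias, and threshold are invented for illustration and do not represent any fielded system. It applies a sign-of-gradient step, in the spirit of the fast gradient sign method: each input feature is nudged by at most `epsilon` in whichever direction raises the "protected" score.

```python
import math

# Hypothetical learned parameters: illustrative values only.
WEIGHTS = [1.5, -2.0, 0.8]
BIAS = -0.2
THRESHOLD = 0.5  # probability above which the object is treated as protected


def protected_probability(features):
    """Logistic score of a linear model: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def is_protected(features):
    return protected_probability(features) > THRESHOLD


def perturb_towards_protected(features, epsilon):
    """Shift every feature by epsilon in the score-increasing direction.

    For a linear model the gradient of the score with respect to each
    feature has the same sign as that feature's weight.
    """
    return [x + epsilon * math.copysign(1.0, w)
            for w, x in zip(WEIGHTS, features)]


original = [0.2, 0.6, 0.1]  # a military object, correctly not protected
adversarial = perturb_towards_protected(original, epsilon=0.3)

print(is_protected(original))     # False
print(is_protected(adversarial))  # True
```

No recognisable emblem is displayed at any point; the classification flips only because the inputs were engineered against the model's learned weights.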

This raises further questions about the elements of perfidy. Has the attacker “invited confidence” if it has not transmitted a recognisable signal, but still caused the adversary system to misclassify an object? For example, AI-based cyber-agents can be fooled into thinking that an entity is protected because the cyber-agent associates the entity’s traffic metadata (which is not a protected signal or symbol) with a humanitarian organisation. Is there a betrayal of confidence if this confidence arises from an internal statistical model rather than an external representation? Here again, a functional approach is instructive. If the attacker deliberately engineers inputs to cause the adversary system to treat a target as protected, and then exploits that treatment, the core mischief is present. The mechanism differs, but the effect is analogous to displaying a false emblem. Thus, as in “standard” cases of perfidy, the intent and knowledge to betray would be determinative (Kwik 2024, pp. 754-757).

A related line of argument suggests that AI systems may, in some respects, be less susceptible to perfidy than humans. They do not experience fear, stress, or empathy. They are not moved by the sight of wounded personnel or the emotional cues associated with surrender (Boothby 2025, p. 271). To the extent that perfidy exploits human psychology, AI systems might appear more resilient. There is some force in this observation, but it risks focusing on the wrong dimension. The relevant vulnerability is not emotional susceptibility, but rule-based compliance. AI systems are designed to implement legal and doctrinal constraints consistently. That consistency, while desirable, creates predictable pathways for exploitation. A system that reliably applies the rule is a system whose behaviour can be mapped and probed.

The comparison with human-to-AI or human-to-human perfidy also deserves attention. A human pilot who broadcasts protected signals to deceive an adversary’s air defence system while conducting a bombing run resulting in human casualties is engaging in a form of perfidy that already falls within the prohibition. What, then, is distinctive about AI-to-AI perfidy? Part of the answer lies in evidentiary and attribution challenges. When deception is mediated through complex systems, it may be more difficult to establish intent. A State might claim that a deceptive outcome was an unforeseen byproduct of an autonomous system’s learning process, rather than a designed feature. Assessing such claims will likely require forensic access to the offending system, technical understanding of the system’s capabilities, and insight into prior design choices.

More substantively, AI-to-AI interactions also enable forms of deception that operate below the level of human perception, targeting the internal logic of systems in ways that are not readily observable or comprehensible by humans. For example, instead of sending out recognised humanitarian transponder codes, an AI system may broadcast seemingly random noise that is nevertheless identified by the opposing AI system as indicative of a “medical convoy” (Comiter 2019, pp. 20-25). This creates a different kind of battlespace, one in which the line between lawful ruses, electronic warfare, and perfidy may be harder to draw in practice. It also raises questions about how legal reviews of weapons systems, such as those required by Article 36 AP I, should account for adversarial manipulation of recognition functions (an issue Dora Velenczei explores in a forthcoming article in the Asia-Pacific Journal of International Humanitarian Law).

Conclusion: Avoiding a Doctrinal Vacuum

These challenges do not undermine the core conclusion. Deliberate, bad-faith AI-to-AI perfidy fits comfortably within the existing prohibition of AP I when interpreted in light of its purpose. The law does not require a human to be deceived; it requires that protected confidence be invited and good faith be betrayed. Where machine-mediated processes stand in for human decision-making, they should be treated as extensions of that decision-making for legal purposes. The more difficult questions lie at the margins. How should the law respond to scenarios where perfidious outcomes emerge without clear human intent? What standards of foreseeability or recklessness should apply when deploying systems capable of discovering and exploiting such tactics? Where is the line between electronic warfare and perfidy? These are important issues extending beyond the core case and better addressed in more detailed fora.

For present purposes, the imperative is to avoid creating a doctrinal vacuum at the heart of the law’s protective regime. If AI systems become central to targeting decisions, then the integrity of the signals they rely on must be preserved. Recognising AI-to-AI perfidy as falling within Article 37 AP I is a necessary step in that direction, as demonstrated by the introductory scenario. It reinforces the principle that the prohibition is not tied to a particular technology or mode of cognition, but to the preservation of trust in the legal markers of protection.

The risk is not that machines will ignore the law. It is that they will follow it with such consistency that adversaries can turn that compliance into a weapon. In that sense, the challenge posed by AI is not to the validity of international humanitarian law, but to its operationalisation. The law has always depended on a fragile equilibrium: parties restrain themselves in reliance on the expectation that certain signals will be respected. If that expectation collapses, so too does the restraint it enables. The task, then, is both interpretive and practical. Interpretively, to ensure that existing prohibitions are understood to apply in technologically mediated contexts. Practically, to design systems and doctrines that are resilient to exploitation without abandoning the protections the law requires. That balance will not be easy to achieve. But the alternative—a world in which the markers of protection are systematically gamed by machines and disregarded in response—would represent a far more profound erosion of the legal order governing armed conflict.

***

Dr Samuel White is the Scientia Senior Researcher in Military Law and War Studies at UNSW Canberra (based within the Australian Defence Force Academy).

Dr Jonathan Kwik is Researcher in international law at the Asser Institute in The Hague, Netherlands, specialised in international humanitarian law and militarised emerging technologies.

Dora Velenczei is a PhD candidate and sessional academic at Monash University, specialising in “cyber blockades” and the law on the use of force.

The views expressed are those of the authors, and do not necessarily reflect the official position of the United States Military Academy, Department of the Army, or Department of Defense.

Articles of War is a forum for professionals to share opinions and cultivate ideas. Articles of War does not screen articles to fit a particular editorial agenda, nor endorse or advocate material that is published. Authorship does not indicate affiliation with Articles of War, the Lieber Institute, or the United States Military Academy West Point.

Photo credit: Getty Images via Unsplash
