Anti-AI Countermeasures in Warfare: Terra Incognita for IHL?

In a small town under occupation by Alpha, an utterly baffling bloodbath occurred. Dozens of civilians coming home from the Friday market were gunned down by Alpha’s artificial intelligence (AI)-powered sentry guns which, until that day, had never once mistaken the villagers for enemy soldiers. The aftermath has investigators scratching their heads. Only one thing tied the victims together: they were all holding unremarkable turtle figurines that a vendor was handing out at the market that morning.
The media blowback is immediate and forces Alpha to retire all the systems for fear of reoccurrence. Weeks later, following a meticulous investigation, Alpha comes to an even more shocking realization. All evidence suggests that Beta, Alpha’s opponent, had 3D-printed the turtles, smuggled them into town, and distributed them to civilians on purpose. The turtles’ shells had been layered with a pattern visible only to AI algorithms that inexplicably caused them to categorically recognize the turtles as rifles, leading Alpha’s sentry guns to engage the “combatants” carrying them accordingly.
This story is fictional. But it is not fantastical. AI-based recognition software is becoming increasingly sophisticated as it is incorporated into drones and perimeter defense. As for the turtles, Massachusetts Institute of Technology (MIT) researchers created this exact object more than five years ago. Wondering how easy it would be to fool AI image classifiers, they 3D-printed a seemingly-ordinary turtle. Google’s image classifier, however, saw something different. It consistently, and confidently, let the researchers know that it saw a “rifle.” Even with the turtle rotated and flipped, it remained staunchly convinced: “rifle,” “rifle,” “rifle.”
With the current state of weapons and AI technology, the incident above could take place today.
Adversarials
The turtle is an example of an anti-AI countermeasure, also known in the industry as an “adversarial attack” or “counter-AI (CAI).” These techniques are specifically developed to exploit vulnerabilities in AI systems and come in all shapes and sizes. Adversarial patterns printed on T-shirts can make persons invisible to security cameras (Fig. 1), and tiny stickers on a STOP sign can make self-driving vehicles see SPEED LIMIT 60 instead. Donning adversarial glasses can convince a face detection algorithm that you’re someone else. Using so-called “poisoning attacks” and “backdoors,” attackers can convince AI that the digit seven is actually an eight or make code autocompleters recommend vulnerable code to unsuspecting programmers.
Fig. 1: Stealth T-shirts. Notice how the person wearing the T-shirt “disappears” from the AI’s vision (indicated by the green frame) as long as the adversarial pattern is fully displayed.
Anti-AI adversarial attacks (“adversarials,” for short) are a common concern in civilian AI applications. Their ramifications for military systems, however, are less known. While many States, organizations, and commentators have warned of making AI-systems robust against countermeasures in general, few have explored how adversarials would interact with specific international humanitarian law (IHL) rules, such as precautions and perfidy.
The turtle scenario above, however, shows that discourse is necessary. Was that tactic lawful? Who is responsible for the civilian deaths? How do we legally assess such situations?
In a recent article, I delve deep into complicated legal questions that arise when a belligerent uses adversarials against an opponent’s AI-system on the battlefield. In this post, I place that article in context. I look at the unfolding Russo-Ukraine war to argue that adversarials will become more common, and that we would do well to start the dialog on their legal consequences now. I also look forward, highlighting a few advanced adversarials that belligerents might foreseeably use against emerging AI applications, such as decision-support systems and swarms.
A Predictable Response to Military AI
The battlefield is a contest between belligerents, each adapting and counter-adapting to gain an advantage. In such a competitive environment, evolutionary pressures force rapid technological change and innovation. The Russo-Ukrainian war is a prime example. An acoustic radar is developed by Ukraine to locate low-flying Russian drones. In response, Russia muffles its drones’ acoustic signatures, an act that spurs Ukraine to adjust its detection algorithms. It is a war built on countermeasures and counter-countermeasures. This dynamic is constant in military technology throughout history.
I foresee the development of military adversarials as a predictable reaction to the rapid spread of military AI; the latter being a trend we already observe in both Ukraine and Gaza.
Importantly, the ability to use adversarials is not dependent on the conflict being symmetrical. In the turtle incident, Beta did not need to possess expensive AI-powered systems of its own: a 3D-printer and a talented tech team sufficed. Any future conflict, ranging from wars between equals to counterinsurgencies against poor but crafty rebels, can therefore feature adversarials.
One valid question to ask is what military advantage adversarials can bring. For my hypothesis to hold, adversarials must not only be technically feasible, but also be an effective response to the technology being countered. I theorize that adversarials can indeed offer many tactical benefits, such as:
A. Dissimulating friendly forces or assets. For example, Beta can sow adversarial patches to its soldiers’ uniforms, rendering them “invisible” to an autonomous tank. This allows the soldiers to simply stroll up to the tank and destroy or capture it.
B. Legally immunizing friendly forces or assets. Alternatively, Beta can hand out adversarial sunglasses, making the tank’s AI misclassify the column of soldiers as members of the International Committee of the Red Cross (ICRC), for the same result.
C. Inducing friendly fire. Alpha’s tank can be poisoned to identify friendly tanks as Beta’s tanks, wreaking havoc.
D. Redirecting violence toward civilians. This is what Beta did in the turtle scenario. Using the turtles, they manipulated Alpha’s guns into opening fire against the townsfolk. If one wonders why this would be militarily useful, recall that victimizing civilians is often an effective way to compromise an opponent’s legitimacy, especially if caused by the occupier’s own weapons seemingly going haywire. In this case, Alpha did indeed withdraw the sentry guns due to public and media outrage.
E. Gaming targeting obligations. For example, an adversarial can be used to emulate a human shielding by making an AI surveillance system hallucinate dozens of civilians around a military asset, Alpha can be made to falsely assume that the target cannot be attacked because it would cause excessive collateral damage.
F. Making opponents lose faith in their system. This is partially what happened to Alpha as well. Worried that the system was malfunctioning, and being concerned that they could no longer use the system in compliance with IHL, Alpha pulled the fleet from the frontline (Boothby, p. 154).
Each example above shows why a belligerent would use adversarials. However, military effectiveness is only one part of the puzzle. It is also important to ask if employing adversarials is legally permitted.
In this respect, it is likely that some examples will already have raised eyebrows. Doesn’t (B) sound suspiciously like perfidy? Is (E) just emulating human shielding, or does it legally qualify as such? In the turtle incident, are the turtles themselves an attack, a means to effectuate an attack through Alpha’s system, or “just” a countermeasure?
It is important that belligerents, when preparing an adversarial, understand what they are legally getting themselves into. They need to know: is distributing these turtles an attack?
When is an Adversarial an Attack?
In most cases, a countermeasure remains a defensive response. For example, erecting a smokescreen or using lasers to blind a drone camera does not constitute an attack in itself. Some actions do, however, shift attacker status to the other party. Take the situation where Beta overrides control of Alpha’s drone through cyber-hacking, and uses it to shoot civilians. In this case, “legal responsibility for subsequent wrongful use of the system and of its munitions will transfer to the persons who have taken and retain control” (Boothby, p. 156). With this control comes the duty to comply with legal obligations in attack, including distinction, proportionality, and precautions (Nasu & McLaughlin, p. 250).
The problem with adversarials is that they often compromise the owners’ control over their systems, without reaching the point where an “override” occurs. In none of scenarios (A)-(F) was Beta in full control of the systems. Adversarials often result in a legal grey area where both parties can plausibly deny having had enough control over the afflicted system to be able to implement precautions in attack.
This is highly undesirable. As Professor Michael Schmitt also noted with respect to cyberattacks:
uncertainty with respect to the loss of functionality threshold leaves the legal characterization of certain cyber operations directed at or affecting the civilian population ambiguous. A party to the conflict could exploit such uncertainty… . From a humanitarian perspective, this situation is untenable.
To solve this problem, I propose a legal-analytical methodology (Fig. 2) inspired by the Oslo Manual’s approach to cyberoperations. My framework relies on classifying adversarials into one of five “strategies” (left column), based on the knowledge and intent of the adversary vis-à-vis the harm caused (center columns). This outputs a legal assessment whether the adversarial itself classifies as an attack (right column).
Fig. 2: Is an adversarial an attack?
To give one demonstration, the turtle incident would fall in the civilian false positive category because Beta desired (“purposely”) to victimize civilians by implementing this tactic. Accordingly, the action of handing out turtles was itself an attack, carried out “via” Alpha’s sentry guns. With any legal ambiguity as to the applicability of Beta’s obligations in attack now removed, we can confidently hold that they violated Article 51(2) of Additional Protocol I (AP I) and perhaps could be liable under Article 85(3)(a).
The matrix is straightforward to apply once understood. I encourage readers to peruse the legal reasoning behind this table in the full article.
Can Adversarials Breach Other Rules?
Another interpretative conundrum is whether certain adversarials can qualify as prohibited acts, such as perfidy or human shielding. (B) and (E) are examples of where this may be the case. This can be challenging to assess due to the legal fine print attached to each concept.
To take one example, an act is perfidious if it convinces the opponent to afford protection with the intent to betray that trust (AP I, art. 37(1)). In (B), Beta is indeed abusing the “ICRC” label to effectuate a perfidious attack, but Beta is misleading an AI system instead of a human. Is this still perfidy? What if the AI fooled was not an autonomous tank, but an AI intelligence network used by commanders to make targeting decisions. Would the presence of the commander change our legal assessment??
As a further complicating matter, it is possible that Beta does not factually know why their soldiers are being spared while wearing the sunglasses. Because the tank is not under Beta’s control, they cannot observe the output label “ICRC.” The only observation Beta’s tech team can make is that the adversarial “works.” Should this factor into our assessment of whether an adversarial can qualify as perfidy?
These questions, and many more, remain open. Arguments for different positions can reasonably be made on all the abovementioned issues. I warmly encourage the legal community to theorize how traditional IHL concepts would apply, or should be translated, when adversarials are involved.
Future Adversarials and Challenges
Adversarials will only evolve further as AI technology develops. Swarming and decision-support systems reflect two vectors of attack that seem particularly promising against emerging technologies, but also raise further legal questions.
Swarming has been used for intelligence gathering, locating targets and carrying out attacks. Its advantage comes from the swarm’s adaptability and robustness, particularly if a decentralized control structure is used. Intrusion attacks consist of inserting intruder units that send spoofed messages to other individuals in the swarm, which can compromise the swarm’s overall performance. Done against an swarm that conducts surveillance, for example, this could result in entirely erroneous targeting decisions. The danger is that such tactics could have chaotic effects; effects unpredictable by principle. Decentralized swarms rely on emergent properties that are very difficult to extrapolate from individual-level behavior. In the worst case, neither the swarm user nor the adversary will factually (i.e., even if they wanted to) be able to predict the outcome of the interaction. In my framework, this adversarial would classify as a “random drop.” However, the complete lack of foreseeability from both sides stretches the matrix’s ability to offer a definite legal determination as to who would bear primary responsibility for harmful results.
Decision-support systems use chat interfaces to streamline operator interaction, and adversarials can be used to target the large language models powering these interfaces. For example, the model can be poisoned to provide biased advice or suggestions, and adversarial prompts can be used to elicit specific outputs. The challenge is that unlike prior examples, these adversarials are not the proximate cause of any harm that results. They only harmfully influence the operator, who ultimately makes the decision. Here, we need to consider whether the operator’s mental state is also a legally relevant factor. Given the circumstances, was it reasonable for the operator to follow this compromised advice? How foreseeable was it for the adversary that the operator would fall for the adversarial?
Concluding Thoughts
Adversarials are an inevitable response to the rise of military AI. They are versatile and effective, allowing creative belligerents to implement a myriad of strategies. Their asymmetric potential is significant, permitting even disadvantaged parties to check technological superpowers.
Legally, adversarials pose challenges because they blur the lines of control. Unlike an override protocol or the capture of an enemy tank, the adversary never truly gains full control over the target system. The adversary can only indirectly influence its functioning. The result is a messy fumble between belligerents, where it is difficult to discern who (if anyone) is actually in possession of the ball. Legal ambiguity and plausible deniability from both sides is untenable and will leave the civilian as the ultimate victim. To remedy this, I propose a framework to offer a reasoned method to determine which party, system owner or adversary, is expected to implement precautions in attack.
Many questions remain ripe for legal discourse, and I have highlighted a few in this post. I invite the IHL community to theorize further on the conceptual and practical implications of adversarials. The historic arms race pertains not only to “the contest between projectile and armour, between mortar fire and fortification” (AP I, commentary, para. 1412). It also involves the legal community. As technologies and countermeasures evolve, so too must we be prepared to evolve our understanding of the law.
For more on adversarials from the perspective of the commander, and methods to minimize risks that adversarials pose to mission success, see also this chapter.
***
Dr Jonathan Kwik is Researcher in international law at the Asser Institute in The Hague, Netherlands.
Photo credit: Unsplash