Responsible AI Symposium – Legal Implications of Bias Mitigation

by Juliette François-Blouin | Dec 7, 2022


Editor’s note: The following post highlights a subject addressed at an expert workshop conducted by the Geneva Centre for Security Policy focusing on Responsible AI. For a general introduction to this symposium, see Tobias Vestner’s and Professor Sean Watts’s introductory post.


The presence of bias in artificial intelligence (AI) systems has been widely discussed in recent years following reports of racial and gender bias in applications intended for the civilian sector. In parallel, conversations have taken place on how bias could manifest in intelligent systems intended for military use. This concern has been reflected in several of the ethical guidelines and principles for responsible AI published by States and international organizations. Many of these publications include the principle of bias mitigation, which calls for the minimization of unintended bias in the development and use of AI applications.

Although bias is primarily a technological issue, it has wider consequences, including in the legal sphere. This post explores the legal implications of bias in AI systems used by armed forces. Specifically, it analyzes the possible ramifications of bias for the international humanitarian law (IHL) principles of distinction, proportionality and precautions in attack.

What Is Bias?

When discussing bias, it is important to first understand that the term is not inherently negative. Indeed, in technical literature bias is simply defined as a deviation from a standard. For example, the average height of the players of a basketball team represents a deviation from the average height of the general population. This constitutes a statistical bias. Bias can also be moral, legal or social. Algorithmic bias encompasses different types of biases – both intrinsic and external – arising from AI systems.

In the military realm, the concern is with unintended bias that causes unwanted harm based on characteristics such as race, gender or other personal attributes. Problems arise only when a bias prevents a system from fulfilling its goals or from behaving according to relevant legal, moral or social standards.

Effect of Bias on Distinction in AI Development

AI technology has many possible applications for military use. Perhaps the most obvious and most widely discussed is its potential use for reconnaissance, target identification, and target engagement. The first legal challenge stemming from bias lies at the core of targeting law: the principle of distinction and its associated rules. Algorithmic bias can affect the ability of intelligent systems to properly distinguish between military and civilian persons and objects, and thereby to comply with this key principle and its related rules.

Bias is likely to first arise in the development phase of a system, appearing in the training data, algorithms, and criteria used for the classification of persons and objects. To comply with the principle of distinction, a system must be programmed to classify persons and objects into fixed categories according to IHL’s requirements. The main difficulty in training a system is determining which criteria to use in which context and then encoding them effectively.

One solution would be to train a system to recognize uniforms or insignias. This method, however, has limits and could not be used to identify members of non-State armed groups or civilians directly participating in hostilities who do not wear a uniform or recognizable insignia. In these contexts, other criteria, such as age, gender, and race, would have to be used to model the typical fighter. The ensuing risk is that humans’ conscious or unconscious bias as to what fighters generally look like would be reproduced in the algorithm.

For example, a 2009 American study found that during a rapid shooting exercise, cadets responded more quickly and accurately when shown images of guns that were preceded by images of Middle Eastern men in traditional clothing. The study assessed that the United States’ engagement in protracted armed conflicts in the Middle East, and thus the cadets’ exposure to a specific group of fighters, created cultural stereotypes that influenced their response time and decision to shoot.

The same could happen with AI if a system is developed using data sets that overly focus on one group of people. For example, if a system is trained with mostly or only photos of Middle Eastern men carrying weapons, there is a risk that the algorithm will start basing its determinations on ethnicity and/or gender instead of on the presence of a weapon. In this sense, the United Nations Institute for Disarmament Research highlights that a system would be more likely to categorize men of military age as combatants because they represent a majority of the combatants in armed conflicts and consequently make up the majority of the data sets used to program a system. Hence, systems designed to recognize objective criteria could inadvertently introduce biases and base their decisions on the wrong criteria.
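The mechanism described above – a model latching onto a spurious attribute that is over-represented alongside the target class in its training data – can be sketched with a purely synthetic toy example (assuming NumPy and scikit-learn are available; all data, feature names, and numbers here are invented for illustration and bear no relation to any real system):

```python
# Synthetic sketch only: a classifier trained on confounded data learns
# to rely on a spurious group attribute instead of the relevant feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

def make_data(confound):
    # Ground truth: only the presence of a weapon should matter.
    weapon = rng.integers(0, 2, n)
    # Spurious attribute: with probability `confound` it simply copies
    # the label, mimicking a group over-represented in the data set.
    group = np.where(rng.random(n) < confound, weapon, rng.integers(0, 2, n))
    # Noisy sensor reading of the genuinely relevant feature.
    weapon_signal = weapon + rng.normal(0.0, 2.0, n)
    return np.column_stack([weapon_signal, group]), weapon

X_train, y_train = make_data(confound=0.95)  # imbalanced training data
X_test, y_test = make_data(confound=0.0)     # deployment: confound absent

clf = LogisticRegression().fit(X_train, y_train)
print("weights [weapon_signal, group]:", clf.coef_[0])
print("accuracy on training-like data:", clf.score(X_train, y_train))
print("accuracy once the confound breaks:", clf.score(X_test, y_test))
```

On this toy data the classifier weights the spurious group attribute far more heavily than the noisy weapon signal, so its accuracy collapses once that correlation disappears at deployment – the algorithmic analogue of basing determinations on ethnicity or gender instead of on the presence of a weapon.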

These biases could also be deliberately programmed into systems. For example, AI could be trained to identify “military-age men” – a category introduced by a few defense and intelligence agencies in the context of drone strikes that aims to model the typical appearance of a fighter. The issue is not that the criteria would not reflect reality. It is, after all, true that most fighters are men. Rather, the problem is that it would lead to an oversimplification of target determination based on the assumption that women and children are civilians and men are combatants. Research has shown that due to the introduction of this category, the U.S. armed forces’ threshold for killing individuals via drone attack was lower for men than for women during both the George W. Bush and Barack Obama administrations. This effectively erodes civilian protection for young men in combat areas simply because they belong to a high-risk group regardless of their actual legal status. Such a lack of nuance in the criteria used for target identification creates a high risk of misclassification.

At the training level, the Australian Army Research Centre warned that machine learning systems could be ineffective at categorizing minority groups because they tend to “reduce or remove the impact of outliers in training data.” This means that minority groups who are underrepresented in data sets face a higher risk of being misclassified than other groups. This could be especially problematic if the algorithm fails to correctly label members of a group that the system is programmed to place in a “do not target” category.
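The outlier effect described by the Australian Army Research Centre can likewise be illustrated with a synthetic sketch (again assuming NumPy and scikit-learn; the groups, labels, and numbers are entirely invented): a subgroup that is rare in the training data, and whose appearance differs from the rest of its class, is misclassified at a far higher rate even though overall accuracy looks excellent.

```python
# Synthetic sketch only: an under-represented subgroup is effectively
# treated as an outlier and sacrificed by the fitted decision boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# "Protected" class (label 0): a large majority subgroup plus a small
# minority subgroup whose features sit closer to the other class.
majority = rng.normal([0.0, 0.0], 1.0, size=(1000, 2))
minority = rng.normal([4.0, 4.0], 1.0, size=(30, 2))  # under-represented
# "Lawful target" class (label 1).
targets = rng.normal([5.0, 5.0], 1.0, size=(1000, 2))

X = np.vstack([majority, minority, targets])
y = np.array([0] * 1030 + [1] * 1000)

clf = LogisticRegression().fit(X, y)
err_majority = (clf.predict(majority) != 0).mean()
err_minority = (clf.predict(minority) != 0).mean()
print("misclassification rate, majority subgroup:", err_majority)
print("misclassification rate, minority subgroup:", err_minority)
```

Because the loss is dominated by the two large clusters, the boundary is drawn where it serves them best, and most of the 30 minority points end up on the wrong side – the statistical shape of the “do not target” failure mode described above.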

The above examples illustrate how bias arising in data and algorithms can lead to inaccurate status determinations. These determinations, in turn, risk violating the principle of distinction, to the detriment of civilian populations and objects.

Effect of Bias on Distinction in AI Deployment

Bias may also arise from the inappropriate deployment or use of systems, thereby decreasing their accuracy and impacting the principle of distinction. Bias can emerge if a machine is deployed in a context for which it was not trained. This is called “transfer context bias,” whereby a system trained with data gathered during one armed conflict may not be appropriate for deployment in another conflict due to differences in environment and actors. For example, a system trained to recognize a specific insignia would be unable to recognize insignias in another cultural or regional context. Creating a proper data set for system training is thus complex, especially considering that data from armed conflicts are a rare commodity. For this reason, setting a clear use case for each system is necessary.

Bias can also arise from the interaction between humans and machines. “Interpretation bias” refers to a human (or a system) incorrectly interpreting the output provided by an AI system. An example is an application designed to flag individuals who exhibit signs of belonging to an armed group where the operator, instead of running a check to verify the individuals’ status, interprets the system’s output as a conclusive determination that an individual is a member of the armed group. Similarly, “automation bias” – i.e., the tendency of humans to over-trust machines – could lead to a misclassification of persons or objects if the operator fails to identify mistakes made by the system. These forms of bias are usually discussed within the purview of human-machine teaming. In this context, one solution to biases intrinsic to AI systems is requiring humans to review systems’ outputs. Yet human oversight also introduces the risk of new biases.

Consequently, commanders should assess how the AI system they decide to deploy will interact with its environment and operator to avoid generating bias inconsistent with the principle of distinction.

Indirect Effect of Bias on Proportionality and Precautions

Biases that affect the classification of people and objects necessarily impact the IHL principles of proportionality and precautions in attack.

The misclassification of persons and the erroneous identification of military objectives inevitably distort proportionality calculations. Imagine a scenario in which a system misclassifies a group of civilians located near a legitimate military objective as combatants. In this case, the proportionality assessment would be skewed by the false assumption that no civilians are present around the military objective. The flawed proportionality determination would follow directly from the flawed distinction determination.

The risk of misclassification of persons and objects deriving from the presence of bias would also affect the obligation to take precautions in attack. The requirement to take all feasible measures to avoid, and in any event minimize, harm to civilians and damage to civilian objects should reasonably dictate that systems known to exhibit unwanted bias – whether used to identify military objectives or to engage targets – not be deployed, given the risk they pose to civilians. To comply with this obligation, commanders should assess criteria such as confidence levels, the risk of automation bias, and the context in which an AI system will be used before deciding to deploy it.


As AI is increasingly integrated into the armed forces, it is necessary to evaluate how its deployment will impact States’ respect for IHL. One issue is the presence of involuntary bias that can adversely affect States’ compliance with fundamental IHL principles. The most salient challenge is the threat AI poses to respect for the principle of distinction due to difficulties in ensuring accurate recognition and identification capabilities. Challenges to the principles of proportionality and precautions in attack additionally showcase how armed forces need to consider the risk of bias at all stages of the conduct of hostilities.

Understanding the potential sources of bias is crucial to improving systems. Fundamentally, fielding biased systems puts armed forces at a clear disadvantage. Fulfilling the principle of bias mitigation will ultimately help develop trustworthy systems and mitigate the related legal challenges.


Juliette François-Blouin is a Programme Officer within the Security and Law Programme at the Geneva Centre for Security Policy.


Photo credit: Unsplash