In 2024, a decade will have passed since an inaugural UN Meeting of Experts initiated discussions on the implications of the development and deployment of autonomy in weapons systems. While views on the achievements made in this setting and by its formal successor, the Group of Governmental Experts (GGE), may vary, there is one area where some progress is undeniably evident. In 2014, the discussion scarcely mentioned States’ obligation under international law to carry out legal reviews of weapons. Over the years since, acknowledgment of the significance of this obligation has progressively and consistently gained prominence in the consensus reports of the GGE (see, for example, 2018, 2019, 2021, 2023).
The ongoing sophistication of military capabilities, including the increasing integration of artificial intelligence (AI) technologies such as machine learning and deep learning into military systems, poses novel and at times momentous challenges for legal review mechanisms. This post takes stock of some of the key hurdles that reviewing authorities may encounter in assessing AI-powered weapons systems. It urges States party to Additional Protocol I that have not yet established a review mechanism to treat their international law obligation, as stipulated in Article 36, with the seriousness it deserves.
The Scope of the Review
The initial obstacle to meaningful legal appraisals is identifying the object of the review. Nearly a decade into debates on the future of this technology, no formal agreement as to what constitutes an “autonomous” weapon system (AWS) is in sight. In fact, military capabilities, including weapon systems, may have different (and often multiple) autonomous functionalities, ranging from ensuring frictionless navigation and intelligence gathering (such as detecting explosives or identifying the location of gunfire), to managing overall system health (including self-repair functionalities) or facilitating interoperability with other systems (as discussed in detail here).
However, States that conscientiously review their weapons all evaluate capabilities that are designed or intended to cause specific effects that the law of armed conflict (LOAC) seeks to limit, that is, injury or death to persons or damage or destruction of objects (see chapter 4 here). Because of this effects-based focus, the most fundamental legal concerns from the perspective of an Article 36 review arise when autonomy facilitates target detection, classification, selection, tracking, and engagement.
And yet, it could be problematic to limit the review to software designed to autonomously perform target selection and engagement (where that software is coupled with the hardware delivering the payload) without considering other relevant components of the system, such as the platform on which the weapon will be mounted or the accuracy of the sensors that enable the system to “read” its operational environment. Where the platform is unstable, for instance, the overall system’s performance may suffer even though the target recognition software performs correctly.
Even where software performance is limited to target detection, prioritisation, and selection (with the ultimate decision to “engage the target” left to an operator), it may be cavalier not to review that software to ensure that the operator can use the system discriminately, particularly in the case of complex systems with multiple software components each performing discrete yet interconnected tasks. Opinions diverge on this point, though, with some suggesting that analysis of this kind is contingent on how a State has “scoped” its review process. They argue that where the focus of the review is on weapons alone, as opposed to “means and methods of warfare” as required under Article 36, “decision support systems” need not be reviewed.
In essence, given the inherent complexity of weapon systems powered by AI, the challenge is to identify what components of an AWS—including hardware, like sensors and platforms, and software necessary for its operation—may affect the capacity of the system’s operator to use the AWS in compliance with the State’s obligations under international law (see also here).
Assessing the Legality of AWS
Assessing the conformity of a system with applicable law poses a task of equal, if not greater, complexity. Traditionally, the focus of legal reviews has been an examination of the intrinsic characteristics of the weapon in the circumstances of its normal or intended use. Legal advice would not extend to assessing compliance with the law of targeting (such as the rules on distinction, proportionality, and the requirement to take precautions in attack) because the potential scenarios in which a weapon might be used in unintended ways are so diverse that it is impossible to foresee the complete spectrum during the review process. Hence, the reviewing authority is allowed to presume that the weapon operator will respect LOAC targeting rules.
AWS are fundamentally different in that this presumption no longer applies, requiring analysis of targeting law—traditionally performed by a legal adviser to the commander—to be pushed back to the weapons review stage. The fundamental challenge such assessments pose is that compliance with targeting law requires complex qualitative, contextual, and value-based judgments. The system’s capacity to independently choose and engage targets means that the weapon operator might not know the particular context of an engagement, such as its timing or the target’s location. As a result, where compliance with targeting law raises concerns, the reviewing authority must impose the requisite restrictions on the system’s use.
Consider an autonomous system designed to engage hostile maritime vessels based on defined characteristics. If, during testing, this system inaccurately identifies friendly fishing boats as hostile vessels due to a set of shared features, a legal review may need to propose mitigating measures, such as adding friend-or-foe identification systems or mandating human intervention before any engagement, so that such misclassifications do not result in wrongful attacks.
The challenges do not end at this point. Rigorous testing, evaluation, verification, and validation (TEVV) is critical to ensure that the military capability fulfils its design purpose. Devising appropriate TEVV procedures, methods, and protocols for adaptive systems, particularly those that are not limited to optimising existing behaviour but extend to acquiring new behaviours, greatly complicates the process of legal reviews.
Determining what modifications to an adaptive, learning system might warrant additional review, and who is best positioned to perform such functions, is yet another daunting task. Perhaps some minor or gradual adjustments to a learning system may be entrusted to a legal adviser situated closer to the operational environment where an AWS is deployed. A legal adviser could, for example, review software updates that do not result in behaviours going beyond the scope of the preceding review. However, where modifications are substantial, such that the effects that the system was designed to produce have changed, there are limits to what an operational legal adviser can achieve. This is the case not least because of potentially limited access to empirical information (such as the manufacturer’s documentation on the characteristics and performance of the AWS) or to relevant subject-matter expertise (algorithmic assurance, computational system architecture, etc.). And yet, in certain urgent scenarios there may be no time to engage a “standing” weapons review process (see also here).
Concluding Thoughts
Forty-five years after Additional Protocol I entered into force, many of the States required by law to conduct legal reviews fail to do so, for various reasons. Of the 174 States party to the Protocol, only about 20 are known to have a review mechanism in place, with the United States, Israel, and Türkiye conducting legal reviews as non-parties. Yet even in relation to weapons that are fairly simple to assess and operate, setting up a well-functioning review mechanism is not without difficulties. Bureaucratic demands, including coordination among the departments or agencies involved in the weapons acquisition process, identification of relevant subject-matter expertise, consideration of applicable institutional policies, approval processes, and documentation and reporting requirements, may all slow down legal reviews. In addition, the regular rotation of personnel characteristic of military establishments can disrupt the institutional memory and knowledge needed to develop and refine the review mechanism.
As shown above, AWS introduce an extra layer of complexity to “conventional” legal reviews, requiring significant adjustments to existing processes. These challenges notwithstanding, States with the capacity to develop or purchase such technology should equally have the means to review it appropriately. Most importantly, States party to Additional Protocol I that have not yet done so would be well advised to stop postponing the establishment of an Article 36 mechanism. With the added complexity of AI-powered weapons, it is essential to start the process promptly to ensure that legal reviews can keep pace with advancing technology.
Dr Natalia Jevglevskaja is a Research Fellow at the Faculty of Law and Justice at the University of New South Wales (UNSW Sydney).
This article is republished from Articles of War. Read the original article, which derives from a presentation given at the 2023 Israel Defense Forces Military Advocate General’s 4th International Conference on the Law of Armed Conflict.