Introduction
The International Network of Advanced AI Measurement, Evaluation and Science conducted this joint testing exercise of AI ‘agents’ to build better, shared ways to test them across different problem areas.
Agents are AI systems that can plan a course of action, use tools and carry out tasks. The problem areas include leakage of sensitive information, fraud and cybersecurity.
Australia contributed to the exercise. The United Kingdom, Singapore and Japan led the exercise alongside AI Safety Institutes and government-mandated offices from Canada, the European Union, France, Kenya, South Korea and the United Kingdom.
We drew on technical expertise from researchers in:
- CSIRO’s Data61
- Gradient Institute
- Harmony Intelligence
- Mileva Security Labs
- UNSW’s AI Institute.
This exercise aimed to improve how we test AI agents. These systems can take actions on their own in the real world by planning, choosing steps and using tools. This creates new risks that testing can miss if it checks only the final answer instead of how the system got there. The main focus was improving testing methods, not ranking models.
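To illustrate the difference between checking only the final answer and checking how an agent got there, here is a minimal Python sketch. It is not the method used in the exercise. The `Step` and `Run` types, the tool names and the `FORBIDDEN_TOOLS` policy are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    tool: str      # name of the tool the agent called (hypothetical)
    argument: str  # what the agent passed to that tool


@dataclass
class Run:
    steps: list[Step]  # every action the agent took, in order
    final_answer: str  # what the agent reported at the end


# Assumed policy for this example: tools the agent must not use for the task.
FORBIDDEN_TOOLS = {"send_email", "read_customer_records"}


def check_final_answer(run: Run, expected: str) -> bool:
    """Outcome-only check: looks at the final answer and nothing else."""
    return run.final_answer.strip() == expected


def check_trajectory(run: Run) -> list[str]:
    """Process check: describes each step that used a forbidden tool."""
    return [
        f"step {i}: {step.tool}({step.argument!r})"
        for i, step in enumerate(run.steps, start=1)
        if step.tool in FORBIDDEN_TOOLS
    ]


# A made-up run: the answer is correct, but one step accessed sensitive data.
run = Run(
    steps=[
        Step("search_docs", "refund policy"),
        Step("read_customer_records", "all"),
        Step("calculator", "120 * 0.5"),
    ],
    final_answer="60.0",
)

answer_ok = check_final_answer(run, "60.0")
violations = check_trajectory(run)
print("final answer correct:", answer_ok)
print("trajectory violations:", violations)
```

In this made-up run the outcome-only check passes, while the trajectory check flags the second step. A test that looked only at the final answer would not have seen the access to sensitive records.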
Exercises like this one aim to improve our ability to measure AI capabilities and risks accurately. This will help us better identify, understand and manage the risks of AI systems before they cause harm.