November 2, 2023
In advance of the UK Government’s AI Safety Summit now taking place (on 1-2 November 2023), the Government published a paper on the capabilities and risks from frontier AI. “Frontier AI” is defined as highly capable general-purpose AI models that can perform a wide variety of tasks and match or exceed the capabilities present in today’s most advanced models (e.g. ChatGPT, Claude and Bard). Frontier AI is distinguished from narrow AI, an AI system that performs well on a single task or narrow set of tasks (e.g. chess). Today, frontier AI is primarily underpinned by large language models, which are machine learning models trained on large datasets that can recognise, understand, and generate text and other content. Frontier AI models can be, and increasingly are, augmented with tools such as web browsers, calculators and knowledge bases to form AI agents, allowing them to take steps to achieve a particular goal autonomously.
The opportunities of frontier AI come with risks that could threaten global stability and undermine human values. To seize the opportunities, the paper states that we must understand the capabilities and address the risks. The paper highlights, for example, the fact that alongside human research that could accelerate the progress of frontier AI (e.g. in the area of data efficiency), there is the prospect that AI systems themselves could accelerate AI progress (e.g. by creating synthetic training data, or writing code), with the result that we could develop very capable AI systems sooner than expected, with less time to prepare for the associated risks. The paper focuses on the evidence for AI risks.
Frontier AI can perform economically useful tasks (e.g. write long sequences of computer code from natural language instructions, translate between multiple languages, summarise lengthy documents, explain why jokes are funny etc.). These capabilities have practical application (e.g. to automate legal work or accelerate academic research). The limitations of frontier AI include the production of plausible but incorrect answers (“hallucinations”), a lack of reliability on tasks that require long-term planning or a large number of sequential steps, and an inability to understand context.
Frontier AI is already advancing, and will continue to advance, through technological improvements to computing hardware, improvements in training algorithms which achieve the same performance with less computing power (“compute”) and data than in the past, increases in available data, and the increasing use of post-training enhancements (such as AI agents), which involve a fraction of the cost of training.
Looking first at cross-cutting risks, the paper states that it is difficult to design safe frontier AI or exhaustively evaluate all downstream use cases. Safeguards to prevent AI models from complying with harmful requests (e.g. designing a cyberattack) are not robust and can be bypassed by users and exploited. Safety testing and evaluation is ad hoc with, as yet, no established standards, scientific grounding or engineering best practices. This is exacerbated by the fact that frontier AI systems are said to be black boxes even to their developers, who can observe their behaviour but have little understanding of the mechanisms that produce it. Since AI companies may not be directly impacted by the harms their systems cause, as in the case of carbon emissions, they may not be sufficiently incentivised to address all potential harms, particularly if doing so puts them at a competitive disadvantage.
Specific potential harms include societal harms such as AI-generated misinformation, AI negatively influencing our views or behaviour, labour market disruption, and AI containing and magnifying biases in the training data which could reflect societal and historical inequalities and stereotypes. Misuse risks include use for malicious purposes (e.g. development of biological or chemical weapons) and disinformation and deep fakes which are increasingly able to escape detection. The paper states that frontier AI is likely to significantly exacerbate existing cyber risks notwithstanding the fact that AI has many applications for improving cybersecurity.
Finally, the paper highlights loss of control risks, suggesting that we are increasingly handing control over to misaligned AI systems (e.g. medical algorithms that misdiagnose patients or recommend incorrect prescriptions), often because they are still as effective as, or more effective than, human decision-making or because they are cheaper than reintroducing human control. Future AI systems may have the disposition and the capability to reduce human control, particularly due to the risk of unintended goals. One example of this is the potential for LLMs to predict people’s views, which could be used for manipulation.
The paper points out in several places that experts disagree as to the magnitude of risk that AI advances will pose. The paper’s key message therefore is the need for further research. The Annex to the paper summarises expert perspectives and the latest evidence on current and future capabilities of frontier AI systems, and considers key uncertainties in frontier AI development, the risks future systems might pose, and a range of potential scenarios for AI out to 2030 to support policymakers.