Open Problems in Cooperative AI.
Dafoe, Allan, et al. “Open Problems in Cooperative AI.” arXiv preprint arXiv:2012.08630 (2020).
It becomes real once you give it a name. “Cooperative AI” gives a name to the collection of ideas and interests I’ve had for a while now, connecting work in multi-agent learning, value alignment, safety, human preferences, and social good under one umbrella.
Beyond specific capabilities, I am most excited about its potential to help us out of global cooperation deadlocks: problems of policy, economics, governance, and so on. For many global problems, we humans seem to struggle to escape the current equilibrium by ourselves, in which case AI could help us change the game. For example, institution and mechanism design that facilitates cooperation and common welfare sounds promising; I can imagine that AI-led discovery may allow us to find supercharged Nudge-like mechanisms (see nudge theory from behavioural economics and the eponymous book by Thaler and Sunstein).
There is only so much I can fit into these slides, but if you read the full 30-page publication, you will find many more golden nuggets (machines that teach humans is an interesting one) and tricky questions like “when should we be satisfied with a system merely on the grounds of empirical performance and when should we push for an explanation of its decision?”
Risk mitigation is immensely important in this line of inquiry. I can imagine that gains in cooperative capability may be unbalanced: we may improve significantly in our mechanical understanding of behaviours and reactions in cooperative games, but struggle with more abstract parts like understanding values and gaining the agreement and trust of humans. In that case, we may end up with incorrigible systems in positions of great authority and power, which highlights the “Coercive Capabilities” problem.
In any case, this paper feels like a glimpse of the future. With some optimism, it can be a beautiful future where AI and humans live in synergy (yes, even human-human!), but we need to work hard to make it so.
View in Google Slides.