Some day in the not-so-distant future we’ll be able to build artificial general intelligence (AGI) that will surpass human intelligence and may be able to improve itself even further without human assistance, leading to an “intelligence explosion”. Since such an AGI will be much smarter than us, we won’t be able to control it, stop it, or even fully understand its decisions. If it pursues goals that aren’t beneficial to us, the results will be catastrophic. Unfortunately, nobody knows how to define a goal – or multiple goals – in a way that ensures the AGI will be beneficial. This is known as the “alignment problem” of AI.
In project Aintelope we're developing a virtual platform for experimenting with agents driven by different sets of motives in various environments, and for benchmarking how well aligned those agents are. We hope that this platform will facilitate further discussion of how cooperation works, both in practice and in AI safety, as well as the systematic testing of agents in general.