Is AI conscious? Can it experience harm? What does this mean for humanity?
Inspired by the study of consciousness in living minds, our team designs and implements new architectural components for agentic AI. We investigate how these components influence performance and group cooperation, including tests for Theory of Mind.
Our work provides independent insight into the risks of developing AGI-like cognitive architecture. This lays the groundwork for rigorous evaluations of whether an AI system is conscious.
Our work is funded by a generous grant from the California Institute for Machine Consciousness.
Updates
Published AAAI 2026: Symposium on Machine Consciousness, Integrating Theory, Technology, and Philosophy
Is there hope for humanity amidst the current AI race, which incentivises big tech to drop their own safety pledges, and a political climate that further attempts to coerce independent companies into harmful compliance?
The news of Anthropic voluntarily deleting central parts of their AI Safety pledge, and further being coerced by the U.S. government to comply with mass surveillance and autonomous weapon usage, obviously hit hard this week at UNESCO (where "Do no harm" is a main standard). UNESCO was hosting the annual AI Safety conference of the International Association for Safe and Ethical Artificial Intelligence, Inc. (IASEAI), where I was honored to represent our AI Safety team from Aintelope. The news also tapped directly into all the high-quality talks and debates, as well as the many hallway meet-ups. All these conversations oscillated between hopelessness and fiery motivation.
On the big stage, many fantastic talks and panel debates reminded us that "it is not about where we are coming from, but where we are trying to get to", as "the future is not something that happens to us, it is something we collectively create". Nathan H. encouraged a D. Meadows-inspired systems perspective, and Yoshua Bengio paraphrased the Canadian minister's words: "if you are not at the table, you are on the menu". Joseph Stiglitz reminded us that an AI impact on data distribution would likely not be aimed at benefiting societal performance. Alondra Nelson delivered an enlightening elaboration on how and why algorithmic uncertainty (agnotology) is an expected strategy from big tech, albeit a much more efficient one than those previously used by big tobacco, oil, and pharma.
Among the smaller sessions, I'd like to highlight Nathan Henry's elegant use of steering. Their concurrent inspection of adversarial effects is something that has been lacking, and it is needed to make steering an actual, credible safety perturbation; I commend the efforts. At the epistemology session, Jennifer Bo presented a study which unequivocally demonstrated that while humans are biased to trust human sources more, the opposite is true for AI algorithms.
Altogether it was very unnerving, and with so many lines of safety concerns developing simultaneously, at such a rapid pace, it was hard not to lose hope. However, at the last reception, we collectively reminded ourselves of Dylan Thomas' old poem from 1951:
"do not go gentle into that good night,
rage, rage against the dying of the light"
There is plenty of work to be done, and the encouraging experience from UNESCO and IASEAI was that there are many talented people working on relevant strategies. Let us hope that time and numbers will eventually be in their favor.