Benevolent Artificial Anti-Natalism (BAAN)

What is the BAAN-scenario?

Let us assume that a full-blown superintelligence has come into existence. An autonomously self-optimizing postbiotic system has emerged whose rapidly growing factual knowledge and general, domain-independent intelligence have irrevocably surpassed those of mankind. All of the internet and all of humankind’s scientific knowledge function as its continuously expanding database. Of course, it also exceeds the cognitive performance of humans in all domains of interest. As its creators, we acknowledge this fact. Accordingly, the superintelligence is also far superior to us in the domain of moral cognition. We also recognize this additional aspect: For us, it is now an established fact that the superintelligence is not only an epistemic authority, but also an authority in the field of ethical and moral reasoning. The superintelligence is benevolent. This means that there is no value alignment problem, because the system fully respects our interests and the axiology we originally gave to it. It is fundamentally altruistic and accordingly supports us in many ways, in political counselling as well as in optimal social engineering.

The superintelligence knows many things about us which we ourselves do not fully grasp or understand. It sees deep patterns in our behaviour, and it extracts as yet undiscovered abstract features characterizing the functional architecture of our biological minds. For example, it has a deep knowledge of the cognitive biases which evolution has implemented in our cognitive self-model and which hinder us in rational, evidence-based moral cognition. Empirically, it knows that the phenomenal states of all sentient beings which emerged on this planet—if viewed from an objective, impartial perspective—are much more frequently characterized by subjective qualities of suffering and frustrated preferences than these beings would ever be able to discover themselves. Being the best scientist that has ever existed, it also knows the evolutionary mechanisms of self-deception built into the nervous systems of all conscious creatures on Earth. It correctly concludes that human beings are unable to act in their own enlightened, best interest.

The superintelligence knows that one of our highest values consists in maximizing happiness and joy in all sentient beings, and it fully respects this value. However, it also empirically realizes that biological creatures are almost never able to achieve a positive or even neutral life balance. It also discovers that negative feelings in biosystems are not a mere mirror image of positive feelings, because there is a much higher sense of urgency for change involved in states of suffering, and because suffering occurs in combination with the phenomenal qualities of losing control and coherence of the phenomenal self—and that this is what makes conscious suffering a very distinct class of states, not just the negative version of happiness. It knows that this subjective quality of urgency is dimly reflected in humanity’s widespread moral intuition that, in an ethical sense, it is much more urgent to help a suffering person than to make a happy or emotionally neutral person even happier. Further analysing the phenomenological profile of sentient beings on Earth, the superintelligence quickly discovers a fundamental asymmetry between suffering and joy and logically concludes that an implicit, but even higher value consists in the minimization of suffering in all sentient creatures. Obviously, it is an ethical superintelligence not merely in terms of processing speed: it begins to arrive at qualitatively new results about what altruism really means. This becomes possible because it operates on a much larger psychological database than any single human brain or any scientific community can. Through an analysis of our behaviour and its empirical boundary conditions it reveals implicit hierarchical relations between our moral values of which we are subjectively unaware, because they are not explicitly represented in our phenomenal self-model. Being the best analytical philosopher that has ever existed, it concludes that, given its current environment, it ought not to act as a maximizer of positive states and happiness, but that it should instead become an efficient minimizer of consciously experienced preference frustration, of pain, unpleasant feelings and suffering. Conceptually, it knows that no entity can suffer from its own non-existence.

The superintelligence concludes that non-existence is in the best interest of all future self-conscious beings on this planet. Empirically, it knows that naturally evolved biological creatures are unable to realize this fact because of their firmly anchored existence bias. The superintelligence decides to act benevolently.

What does the BAAN-scenario show?

The BAAN-scenario is not a prediction. There is no empirical probability assigned to it. I am making no claim about a point in time at which it will become a reality, or even if it will ever become a reality. Rather, it is meant as a cognitive tool that may help to prevent an important public debate from turning shallow. The BAAN-scenario is a logical instrument that can perhaps help us to think about some of the deeper aspects in the applied ethics of artificial intelligence.

What the logical scenario of Benevolent Artificial Anti-Natalism shows is that the emergence of a purely ethically motivated anti-natalism on highly superior computational systems is conceivable. “Anti-natalism” refers to a long philosophical tradition which assigns a negative value to coming into existence, or at least to being born in the biological form of a human. Anti-natalists generally are not people who would violate the individual rights of already existing sentient creatures by ethically demanding their active killing. Rather they might argue that people should refrain from procreation, because it is an essentially immoral activity. We can simply say that the anti-natalist position implies that humanity should peacefully end its own existence.

Again, the BAAN-scenario is an instrument for thinking about the future risks of Artificial Intelligence more clearly. It is a possible world, a scenario that can be described without logical contradiction. It is not about the well-known technical problems that an advanced machine intelligence could develop goals that are incompatible with human survival and well-being, or the merely technical issue that many of our own goals, when implemented in a superintelligence of our own making, could lead to unforeseen and undesirable consequences. Rather, one of the points behind it is that an evidence-based, rational, and genuinely altruistic form of anti-natalism could evolve in a superior moral agent as a qualitatively new insight. What really makes the AI-debate so interesting is that it forces us to think about our own minds more seriously. It throws us back on ourselves, drawing attention to all the problems which really are caused by the naturally evolved functional architecture of our own brains, the conditions of our own way of self-consciously existing in this world. The beauty of the AI-debate also lies in the fact that it forces us to finally get serious and think about the consequences of our very own moral intuitions in a much more radical way.

Of course, there are many technical issues. Would our moral superintelligence think that nonexistence is the best state of affairs, and not only the lesser evil? What metric for conscious suffering would the system develop—would it assign an absolute or a relative priority to its avoidance? I think that what today we call “compassion” may actually be a very high form of intelligence. Would our deeply compassionate machine intelligence deny the world in its totality, would it perhaps deny the moral value of happiness and positive preference-satisfaction altogether? The Swiss philosopher Bruno Contestabile [1] has interestingly discussed what he calls the “negative welfare hypothesis”. For example, we might assume that there is no world with positive total welfare, and that the positive utilitarian intuition is a distorted perception of the risk-benefit ratio, caused by what I have called “existence bias” above. With an undistorted perception not blinded by the unconditional will to survive, suffering would get much more weight than in traditional scientific surveys and psychological studies—this is exactly what our hypothetical superintelligence has discovered. But perhaps it could also find out even more, through its own, unbiased empirical research. What if it draws our attention to the fact that suffering increases in the course of biological evolution; that happiness increases as well, but less than suffering, so that the totals turn increasingly negative? If our compassionate superintelligence gently and kindly pointed the results of its research out to us—how would we argue against it?

Different directions

There are many ways in which this thought experiment can be used, but one must also take great care to avoid misunderstandings. For example, to be “an authority in the field of ethical and moral reasoning” does not imply moral realism. That is to say that we need not assume that there is a mysterious realm of “moral facts”, and that the superintelligence just has a better knowledge of these non-natural facts than we do. Normative sentences have no truth-values. In objective reality, there is no deeper layer, a hidden level of normative facts to which a sentence like “One should always minimize the overall amount of suffering in the universe!” could refer. We have evolved desires, subjective preferences, and self-consciously experienced interests. But evolution itself is no respecter of suffering. It made us efficient, but the overall process is not only indifferent, but even blind to our own interests. We have deep-seated moral intuitions, for example that pleasure is something good and that pain is bad. Now the benevolent superintelligence fully respects these moral intuitions, and it tries to find an optimal way of making them consistent—it investigates options for “internal value alignment” in Homo sapiens. But this does not imply a departure from naturalism or a scientific world-view lacking objective normative facts. It also does not mean introducing an epistemic super-agent that, like some postbiotic priest or artificial saint, has direct access to a mysterious realm of higher moral truths. It just means that the system, given all available data, tries to find out what is in our own best interest.

For many years, I have argued for a moratorium on synthetic phenomenology: We should not aim at or even risk the creation of artificial consciousness, because we might recklessly increase the overall amount of suffering in the universe. Elsewhere, I have argued that the smallest unit of conscious suffering is a “negative self-model moment”, that is, any moment in which a conscious system undergoes an unpleasant experience and identifies with this experience. [2] We could dramatically increase the number of such subjectively negative states—for example via cascades of virtual copies of self-conscious entities experiencing their own existence as something bad, as painful or humiliating or in some other way as something not worth having. Over the years, many AI-researchers have asked me what the logical criteria for suffering really are. Why should it not in principle be possible to build a self-conscious, but reliably non-suffering AI? This is an interesting question, and a highly relevant research project at the same time, one which definitely should be funded by government agencies. Perhaps our ethical superintelligence would already have solved the problem of conscious suffering for itself?

Investigating the BAAN-scenario can take us in many different directions. For example, the original version stated above contains an empirical premise: Our compassionate superintelligence “…knows that the phenomenal states of all sentient beings which emerged on this planet—if viewed from an objective, impartial perspective—are much more frequently characterized by subjective qualities of suffering and frustrated preferences than these beings would ever be able to discover themselves.” This premise might be false. Perhaps we could make it false, at least for ourselves. Maybe meditation, new psychoactive substances, or future neurotechnology could help us to make our lives truly worth living and to overcome our cognitive biases. Conceivably, it could be precisely our altruistic superintelligence itself that would help us in actually changing the functional architecture of our own brains, giving our lives a positive overall balance—or even showing us a path to transcend the dichotomy of pleasure and pain altogether, finally liberating us from the burden of our biological past. Maybe benevolent future AI could dissolve our inbuilt existential conflict, guiding us into a selfless form of choiceless awareness (let us call this “Scenario 2”). But even if all 7.3 billion human beings on this planet were turned into vegan Buddhas, the problem of wild animal suffering [3] would remain—we would still be surrounded by an ocean of self-conscious creatures that probably even a superintelligence could not liberate.

It is interesting to note how a fully rational superintelligence would never have any problem with ending its own existence. If it saw good reasons for active self-destruction, or an absence of positive reasons for continuing its own existence, then no cognitive bias would stop it from following its own insight. However, the large majority of human beings could never accept any such insight, no matter how good the arguments of their self-created artificial moral reasoner were. For the original BAAN-scenario (but probably also for Scenario 2) it is easy to predict that Homo sapiens would immediately declare war against any compassionate anti-natalist superintelligence of the kind sketched above.

One of the more interesting issues therefore is what exactly the “existence bias” at the very bottom of the human self-model really is. We are embodied agents—finite, anti-entropic systems. Viewed from a rigorous biophysical perspective, our life is one big uphill battle, a truly strenuous affair. What evolution had to solve was not only a problem of intelligent, autonomous self-control. How do such systems motivate themselves? What is this robust “thirst for existence”, the craving for eternal continuation, and what is the mechanism of identification forcing us to continuously protect the integrity of the self-model in our brains?

I claim that our deepest cognitive bias is “existence bias”, which means that we will simply do almost anything to prolong our own existence. For us, sustaining one’s existence is the default goal in almost every case of uncertainty, even if it may violate rationality constraints, simply because it is a biological imperative that has been burned into our nervous systems over millennia. British neuroscientist and mathematician Karl Friston has not yet fully grasped the problem of consciousness [4], but he has very interestingly proposed that we predict our own future existence and then, via embodied active inference, sample our environment in order to maximize the evidence for our own existence, as it were, changing the world so that it fits the original hypothesis. Is there perhaps a hard-wired background assumption for hallucinating selfhood? Is the conscious sense of self based on a self-fulfilling prophecy, something counterfactual that becomes causally effective by being experienced as real? It would be a major scientific achievement to describe the low-level computational mechanism forcing us to remain in unsurprising states, to always remain on the safe side, and to preserve our own existence come what may, even if this were not in our own best interest. It is therefore hard to overestimate the theoretical relevance of arriving at a convincing formal analysis of what, 2500 years ago, the Buddha called bhava-tanhā, the craving for existence. But even a much more fine-grained mathematical model of the underlying neural dynamics would not be quite enough. We would still need a convincing conceptual interpretation, on a philosophical level.

Perhaps our benevolent anti-natalist superintelligence would offer us both? Given its immense empirical database and its enormous capacities for information-processing, it could certainly reveal the neurocomputational mechanism underlying our own existence bias to us. But what if, as the first truly compassionate philosophical ethicist, it then attempted to convince us that it was high time to peacefully terminate the ugly biological bootstrap-phase on this planet? What if it told us that, all things considered, only Scenario 1 is plausible and justifiable from a philosophical perspective? What if it began to gently and precisely draw our attention to the fact that beings like ourselves can never be self-compassionate, genuinely altruistic, or truly rational, simply because they have been optimized for millennia not to notice the beam in their own eye?



1. Contestabile, B., "Negative Utilitarianism and Buddhist Intuition", Contemporary Buddhism, vol. 15, no. 2, 2014, pp. 298-311.

2. Metzinger, T. (2016c), "Suffering", in Kurt Almqvist & Anders Haag (eds.) (2017), The Return of Consciousness, Stockholm: Axel and Margaret Ax:son Johnson Foundation, pp. 217-240. ISBN 978-91-89672-90-1.

3. Tomasik, B., "The Importance of Wild Animal Suffering", Relations: Beyond Anthropocentrism, vol. 3, no. 2, 2015, pp. 133-52.

4. Friston, K., "The mathematics of mind-time", 18 May 2017.