The Dogmatic Algorithm

Introduction

The advent of Large Language Models (LLMs) has transformed the way humans interact with knowledge and information, raising expectations of a universal democratisation of knowledge. A deeper analysis, however, reveals a fundamental challenge: their “knowledge” derives from the vast textual and multimedia datasets on which they are trained. The selection and composition of these datasets inevitably reflect the languages, histories, values, social norms and prejudices prevalent in the societies where the data and their developers originate, and this conditioning shapes what LLMs process and present as “knowledge” or “truth”. This essay argues that LLMs are not neutral: they convey an implicit dogmatism rooted in data bias that conditions users and runs counter to Enlightenment ideals. It examines, in turn, the bias in training data, the emergence of a new dogmatism, historical parallels, the mechanisms of influence, the impact on users and the geopolitical differences between models.


Conditioning at the Source: Training Data

The fundamental premise is that LLMs are not ‘tabulae rasae’ but systems trained on vast textual and multimedia datasets, and the very nature of this training data lies at the core of the problem. These corpora, however immense, are not perfect, objective mirrors of reality or of the totality of human knowledge; they are, by nature, selections, conscious or unconscious, of what has been written, published and digitised in certain languages, by certain cultures, in certain historical periods, and they often reflect the perspectives and values of dominant cultures, or of those who have had the means to produce and disseminate content on a large scale. From a sociological and anthropological standpoint, knowledge is not a fixed and universal entity but a social and cultural construct, shaped by power structures, prevailing narratives and implicit biases. LLM datasets, as aggregates of this construct, inherit and encode those biases.

What an LLM learns is therefore not truth in the philosophical or empirical sense, but the statistical probability with which certain words and concepts appear together in its dataset. This “knowledge” is a reflection of the correlations and frequencies found in the training texts, which in turn reflect the ways in which the world is described, understood and judged within the cultures that produced them. This conditioning profoundly shapes what the model presents as “knowledge” or “truth”: if a topic is treated in an unbalanced way in the dataset, perhaps because a certain perspective dominates the media or the historical record, the LLM will learn and reproduce that imbalance as part of its “understanding” of the topic.
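To make the mechanism concrete, the sketch below (a minimal, purely illustrative example in Python; the toy corpus and the shared prefix are invented) reduces “training” to counting which continuation most often follows a prompt. Real models learn far richer representations than raw counts, but the underlying principle is the same: what is echoed back follows the statistics of the corpus, not any independent notion of truth.

```python
from collections import Counter

# A deliberately imbalanced toy "corpus": one framing of the same topic is
# over-represented, just as a dominant perspective would be in web-scale data.
# The sentences and the shared prefix are invented purely for illustration.
toy_corpus = [
    "the reform was a success",
    "the reform was a success",
    "the reform was a success",
    "the reform was a failure",
]

prefix = "the reform was a "

# "Training" here is nothing more than counting which continuation follows
# the shared prefix most often in the corpus.
continuations = Counter(sentence[len(prefix):] for sentence in toy_corpus)

def most_likely_continuation() -> str:
    """Return the statistically dominant continuation, not the 'true' one."""
    return continuations.most_common(1)[0][0]

print(continuations)               # Counter({'success': 3, 'failure': 1})
print(most_likely_continuation())  # 'success': the imbalance echoed back as "knowledge"
```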


A New Form of Dogma? The Output of LLMs

As a direct result of bias in the training data, LLMs tend to repeat and, in a sense, validate certain perspectives, opinions and behavioural patterns. This phenomenon can be described as the emergence of a new form of implicit dogmatism. It is not necessarily the explicit imposition of rigid “dogmas” (although censorship may exist), but rather the creation of a form of implicit dogmatic thinking, in which certain answers or viewpoints appear more “correct”, “natural” or “authoritative” simply because they are prevalent or favoured in the training data. The output of LLMs tends to reflect and reinforce the perspectives that prevail in the data, offering a form of “knowledge” which, although broad, is fundamentally conditioned. Nor is this limited to questions of factual accuracy: it extends to the promotion of behavioural ‘dogmas’ and ‘correct’ opinions, amounting to a return to a dogmatism reminiscent of the Cartesian one, widely criticised by philosophers such as Deleuze for its rigidity and its claims to absolute truth.


Echo of History: Past Dogmatism vs. Present Dogmatism

This phenomenon creates a new form of dogmatic thought that seems partially to invert the objectives of the Enlightenment. A parallel can be drawn between the conditioning effected by LLMs and forms of social and intellectual conditioning in the past. Recall the example of religious schools and other historical institutions that shaped cultural elites (“scholars”) strongly conditioned by prevailing dogmas, limiting critical thinking and the diversity of perspectives. These schools did not teach absolute truth or free thought in the modern sense; they interpreted and transmitted a defined body of sacred texts and doctrines through a dogmatic lens, producing “scholars” whose erudition was profound yet intrinsically limited and shaped by accepted dogmas that could not be fundamentally questioned from within the system. Similarly, LLMs, however “learned” on their data, operate within the limits and biases of that corpus and of the alignment algorithms that shape their output. They thus offer not knowledge open to critical inquiry, but a filtered and oriented “knowledge” that tends to reproduce the informational and behavioural status quo.


This contrasts sharply with the ideal of the Enlightenment, which aimed precisely to free thought from dogmatic tutelage and uncritical authority, promoting reason, doubt and the autonomous search for truth. Enlightenment philosophy rested on the conviction that education, access to knowledge and the exercise of rational reasoning would liberate humanity from oppressive ignorance. Now, in a digitally conditioned era, LLMs introduce a new paradigm of dominant thought, conditioning us towards a kind of dogmatic consensus. The paradox is striking: tools born of advanced technology and presented as carriers of universal knowledge risk recreating, in a new form, precisely what the Enlightenment sought to escape.

A relevant historical example comes from sociopolitics and illustrates how linguistic and educational policies can function as instruments of exclusion or hegemony. Consider the imposition of an official language in some African countries: the language of a dominant ethnic group, defined as “official”, historically excluded members of ethnic minorities, first from access to elite schools and, consequently, from roles of power and responsibility for entire generations. This case vividly demonstrates how a bias embedded in, or induced through, social structures can disadvantage entire ethnic and social groups over the medium and long term. Similarly, in the digital context, LLM datasets and algorithms can function as tools that privilege certain worldviews. Just as the discriminatory language policies of the past used language to shape society and exclude groups from power, so the biases present in LLM training data can subtly shape users’ thinking. When the information provided by LLMs clashes with the perspectives, values or sensitivities of specific ethnic or social groups, the risk is not only that existing informational inequalities are reproduced, but that social fractures are deepened and exclusion increased, as narratives favouring certain perspectives at the expense of others are conveyed and, if repeated uncritically in other social contexts, normalised.


How Influence Occurs: The Subtlety of the Algorithm

Let us analyse the main ways in which LLMs exert this influence. We can distinguish between explicit censorship, i.e. the deliberate removal of content or the restriction of certain topics for ethical, political or commercial reasons (a less central aspect of the argument than implicit influence), and a more diffuse, transversal influence. The latter is the more insidious mechanism: it manifests in the way the LLM structures its responses, selects the information to present (or omit), sets its tone, and prioritises the sources or narratives most aligned with the dominant policies, values and perspectives of the nations and societies that produced the model. This occurs through the statistical correlations learned from the data, which are not neutral but laden with bias. Through effects that social psychology has studied for decades (framing, salience, repetition), LLMs can present information or arguments in a way that subtly privileges one perspective over another, select examples that support a certain thesis, or phrase responses in a tone that suggests the “correctness” of a particular position.

Social and cognitive psychology demonstrates the ubiquity of conditioning and cognitive biases: we are constantly influenced by our environment, our social interactions and the information to which we are exposed. Concepts such as social learning theory and confirmation bias explain how we tend to absorb and reinforce dominant narratives, or those that confirm our pre-existing beliefs. By presenting information with a certain frequency and phrasing, LLMs act as powerful reinforcing agents, shaping users’ perceptions even unintentionally. It is also worth noting that behind the seemingly neutral choice of data, or the justification of certain filters on grounds of “security” and “safety”, there may in reality be an intent to condition systematically, orienting access to information in ways that favour certain worldviews at the expense of others.
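As a rough caricature of how framing, salience and repetition can arise from weighting alone, consider the sketch below; the snippets and the “alignment” scores are assumptions invented for the example, not real model internals.

```python
import random
from collections import Counter

# Hypothetical candidate framings the system could surface for the same question.
# The "alignment" weight stands in for whatever implicit preference the data and
# tuning encode; the snippets and values are assumptions chosen for illustration.
candidates = [
    {"text": "Perspective A, framed favourably.", "alignment": 0.9},
    {"text": "Perspective B, framed critically.", "alignment": 0.4},
    {"text": "Perspective C, rarely discussed.", "alignment": 0.1},
]

def select_response(snippets, temperature=0.2):
    """Sample one framing, with probability skewed towards higher 'alignment'.

    At a low temperature the highest-weighted framing is returned almost every
    time: salience and repetition emerge from the weighting alone, without any
    explicit rule that says "suppress B and C".
    """
    weights = [s["alignment"] ** (1 / temperature) for s in snippets]
    return random.choices(snippets, weights=weights, k=1)[0]["text"]

# Over many interactions, what the user actually sees is dominated by one framing.
exposures = Counter(select_response(candidates) for _ in range(1000))
print(exposures.most_common())
```

No single answer here is false; the conditioning lies entirely in which framing is shown, how often, and with what prominence.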


Geopolitical Mirrors: Comparison between Different Origins of LLMs

Let us analyse how this dogmatic conditioning manifests differently depending on the geopolitical origin of the LLMs. Consider the visible differences on certain sensitive topics between models developed in “Western” contexts (with their specific values and political and social sensitivities) and models developed in “Eastern” contexts (with their different sets of values, histories and state controls). These differences demonstrate concretely that the “dogma” being conveyed is not universal but culturally and politically situated. The divergences in how these models describe controversial historical events, political systems, social values or key figures clearly reveal the imprint of the cultural and political conditioning of their environment of origin.

These differences are not mere variations in style; they represent profound divergences in the “knowledge” and values that the model has been trained to reproduce and, in effect, to promote. For example, the portrayal of historical figures with a dual public image, such as Winston Churchill or Gandhi, can vary significantly. A model trained predominantly on Western data might emphasise Churchill’s role as a wartime leader, while other datasets might place greater weight on the more controversial aspects of his record, including his colonial views and the accusations of genocide levelled against him. Similarly, while Gandhi is widely recognised as an icon of non-violent resistance, the interpretation of his role and legacy can differ markedly depending on the historical narratives prevalent in the training data. Models built on datasets that include critical perspectives or specific South African historiographies might not only emphasise his struggle against the discrimination suffered by Indians there, but also report controversial details from that period, such as his use of derogatory terms towards black Africans, or the criticism that his activism focused primarily on the rights of the Indian community rather than on a broader, more inclusive struggle. This shows how different datasets can construct composite, sometimes contrasting, images of the same historical figure.

The disparities become even more evident in the interpretation of complex historical events. Consider the Allied landings in Normandy: a Western model might describe them primarily as a heroic act of liberation, fundamental to ending the war, whereas minority perspectives, or those from different contexts, might also analyse them in light of their very high human cost or as a hasty action dictated by specific geopolitical objectives. Another apt example is the arrival of the Black Ships (Perry’s expedition) in Japan in 1853 and the agreements that followed, beginning with the Convention of Kanagawa in 1854: while a Western narrative might emphasise Japan’s opening to global trade, a Japanese perspective might focus on the forced imposition and its consequences for the country’s social and political structure. These practical examples illustrate how LLMs, reflecting the biases in their source data, can offer divergent and culturally oriented views of history and society.


The User in the Algorithmic Ecosystem

Let us examine the impact on the end user. Users often interact with an LLM perceiving it as an authoritative and “objective” source; they are influenced indirectly, absorbing and normalising the perspectives, “truths” and behavioural models it proposes, often without being fully aware of it. This process is not necessarily a classic, coercive “indoctrination”, but rather a passive conforming of thought and opinion, induced by repeated exposure to filtered and algorithmically biased content and perspectives. Furthermore, the influence propagates and compounds: if an author uses an LLM to write a blog post, a documentary script or corporate content, the conditioning intrinsic to the model extends to everyone subsequently exposed to that content, further amplifying the reach of the algorithmic bias. Without a robust capacity for critical thinking and alternative sources for comparison, users risk accepting such output uncritically and progressively conforming their own opinions to this new digital dogma. This has important implications for critical thinking in the digital age.
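The compounding described above can be caricatured with a simple “rich-get-richer” simulation; the initial skew and the amplification factor are arbitrary assumptions, intended only to show the direction of the effect, not to measure it.

```python
# A toy "rich-get-richer" model of bias amplification: a framing that is slightly
# over-represented gets surfaced (and re-used by authors) slightly more than its
# raw share would warrant, so it becomes even more prevalent in the next round of
# content. The starting share and the exponent are arbitrary illustrative values.

def next_share(share_a: float, amplification: float = 1.5) -> float:
    """Return the next round's share of framing A; exponents above 1 favour the majority."""
    weighted_a = share_a ** amplification
    weighted_b = (1.0 - share_a) ** amplification
    return weighted_a / (weighted_a + weighted_b)

share = 0.60  # initial skew inherited from the training data (an assumption)
for round_number in range(1, 11):
    share = next_share(share)
    print(f"round {round_number:2d}: share of framing A = {share:.2f}")
```

Even a small, consistent tilt in what gets surfaced and re-used compounds over successive rounds of content production.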


Conclusion

In conclusion, while we celebrate the potential of LLMs, it is crucial to recognise and address the risk that they become involuntary (or, in worst-case scenarios, intentional) architects of a new digital dogmatism. The Enlightenment bequeathed to us the value of independent reason, scepticism towards authority and the importance of critical verification. Faced with artificial intelligence systems whose “knowledge” is deeply shaped by cultural and social biases, it is imperative to develop greater critical awareness as users and to promote transparency and accountability in the development of LLMs. Only by recognising that LLM outputs are culturally mediated, not neutral representations of reality, can we hope to use these powerful technologies in a way that supports, rather than stifles, the critical thinking and plurality of perspectives fundamental to a free and informed society. Looking ahead, three reflections stand out: how to mitigate bias, the importance of transparency, and the crucial role of users’ critical thinking as a bulwark against the new dogmatism. The challenge posed by culturally conditioned data in LLMs is not only technological; addressing it is an ethical duty towards a more just and inclusive society.

For comments and suggestions, feel free to share your thoughts here.