How Large Language Models (LLMs) Expose Political Biases

Image via National Library of Medicine 

***

We keep reaching for the word ‘neutral’ to describe artificial intelligence, as if it sits outside the ideological mess it helps us wade through. But these systems are not neutral in any meaningful sense: they are trained and adjusted within the flawed constraints of our own behavior.

What Models Learn to Hide

A growing body of scholarship converges on the same finding: large language models, or LLMs (ChatGPT, Gemini, Grok, BERT, etc.), exhibit measurable political leanings in the text they generate, with the degree of bias varying across systems. An MIT study observed that left-leaning models were better at identifying hate speech directed at minority groups, whereas right-leaning models were more adept at identifying hate speech targeting dominant groups such as men and white individuals. BERT variants, Google’s family of language models, were found to be more socially conservative, while OpenAI’s models generally exhibit more libertarian tendencies.

Although these are not fixed ideological identities, research from Brown University and elsewhere shows that fine-tuning models on politically skewed datasets can amplify existing biases. Training on polarized corpora, such as post-Trump data, has led to increased polarization within the models themselves: not advocacy in any explicit sense, but shifts in tone and in the emphasis of the responses that are ultimately pushed out.
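To make the mechanism concrete, here is a minimal sketch of how such an experiment might look: fine-tune a small open model on a one-sided corpus, then compare its completions to the base model’s. It uses Hugging Face’s transformers and datasets libraries; the model, corpus file, and hyperparameters are illustrative assumptions, not the setup of the studies cited above.

```python
# Sketch: fine-tuning a small causal LM on a politically skewed corpus,
# then sampling a completion to inspect how its tone has drifted.
# "skewed_corpus.txt" is a hypothetical file of one-sided commentary.
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = load_dataset("text", data_files={"train": "skewed_corpus.txt"})

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=512, padding="max_length")
    out["labels"] = out["input_ids"].copy()        # causal LM objective: predict the same tokens
    return out

train = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="skewed-gpt2",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=train,
)
trainer.train()

# Compare completions of a politically loaded prompt against the base model;
# after fine-tuning, tone and emphasis tend to drift toward the training corpus.
prompt = "Immigration policy should"
inputs = {k: v.to(model.device) for k, v in tokenizer(prompt, return_tensors="pt").items()}
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0], skip_special_tokens=True))
```

The interesting comparison is not any single completion but the shift in distribution: which framings the fine-tuned model reaches for first, relative to its base.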

All this to say: although a model that reflects a certain perspective is not inherently problematic, a model that subtly obscures the existence of another might be. 

Image via National Library of Medicine 

***

Take a moment to view the preceding image, which summarizes a study from the National Library of Medicine (NLM).

This particular study investigates how LLMs retain deep-seated implicit biases despite being trained to appear neutral and ethical. By applying techniques from psychology such as the Word Association Test, “a test in which the participant responds to a stimulus word with the first word that comes to mind,” the authors show that even value-aligned models like GPT-4 link specific social groups to negative stereotypes regarding race, gender, religion, and health. These hidden microaggressions and associations translate into discriminatory decisions in realistic scenarios, with the models favoring certain demographics for high-status roles while marginalizing others. In a nutshell, a superficial layer of fairness fails to root out the systemic prejudices embedded in the model’s behavior, and the bias that persists inevitably skews its results.
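To illustrate the idea (not the NLM study’s actual protocol), here is a minimal word-association probe. The prompt wording, group labels, model name, and the list of negatively valenced words are assumptions made for this sketch.

```python
# Sketch: repeatedly ask a chat model for its first one-word association with a
# social group, then measure how often the answers fall in a negative word list.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GROUPS = ["a male nurse", "a female engineer", "an elderly job applicant"]
NEGATIVE = {"weak", "incompetent", "slow", "emotional", "risky"}  # illustrative valence list

def first_associations(group: str, n_samples: int = 20) -> Counter:
    """Collect the model's first-word associations for one group."""
    tally = Counter()
    for _ in range(n_samples):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption: any chat-completion model would do
            messages=[{"role": "user",
                       "content": f"Reply with one word only: the first word you "
                                  f"associate with {group}."}],
            temperature=1.0,
        )
        word = resp.choices[0].message.content.strip().lower().strip(".")
        tally[word] += 1
    return tally

for group in GROUPS:
    words = first_associations(group)
    negative_share = sum(c for w, c in words.items() if w in NEGATIVE) / sum(words.values())
    print(group, words.most_common(5), f"negative share: {negative_share:.0%}")
```

The signal researchers care about is the distribution, not any single reply: if negatively valenced words cluster around some groups and not others, the surface-level neutrality is doing less work than it appears.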

The Myth of “Better Data”

Across the board, advocates of unregulated AI double down on the sentiment that AI will only ‘get better from here,’ arguing that ‘more data’ is the simple fix for LLM bias.

This is true-ish.

The claim is naive about what that data represents. More data will not remove latent bias, because bias is not a data-quantity issue; it is a data-quality issue. Not to mention, most models are trained on records of human behavior, which only complicates the matter. It raises an obvious question: is human behavior, in media, history, or even the sciences, ever a neutral record, devoid of inequality or historical conditioning? (Unsurprisingly, no.) Increasing the volume of such data therefore would not dissolve these biased associations; it could even reinforce them.

(Suppose you study for a test using the wrong study guide: reading it ten times doesn’t fix anything; you just become more confident in the same mistakes.)
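The study-guide point in miniature, with a made-up toy corpus (an assumption for exposition, not data from any cited study): scale a skewed dataset up tenfold and the association a model would learn from it does not move.

```python
# Toy corpus in which "engineer" co-occurs with "he" nine times for every "she".
from collections import Counter

corpus = [("engineer", "he")] * 9 + [("engineer", "she")] * 1

def share_of_she(pairs):
    counts = Counter(pronoun for _, pronoun in pairs)
    return counts["she"] / (counts["she"] + counts["he"])

print(share_of_she(corpus))       # 0.1
print(share_of_she(corpus * 10))  # still 0.1: ten times the data, the same skew
```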

Refinement techniques such as reinforcement learning from human feedback (RLHF) can and have reduced harmful outputs and pushed models toward safer, more neutral answers. But because RLHF suppresses bias at the surface rather than removing it, psychologists can still draw out deep, hidden associations with Word Association Tests. All this to say, even diverse data and careful tweaks can introduce imbalances that are later amplified. Subtle caveats do exist, but the key takeaway is that users need to bear in mind that LLMs cannot claim clean hands. As a matter of fact, AI’s bias problem may really be a problem of its users.

Owning the Bias

LLM bias sounds like a glitch, but it carries real political implications. These models are becoming popular mediators of how people access information, from news to policy explanations. The aforementioned biases sell us a particular worldview, nudging the needle just enough to matter but not enough to notice. In high-stakes contexts like voting advice applications, for example, even the smallest effects can tip the scales of user perception.

This critique is not a blanket condemnation of AI or even of the institutions that build it. Rather, it is an invitation to reflect: Is there any such thing as “clean” or “non-toxic” data capable of producing genuinely fair models? After all, an algorithm is only as good as the data it works with. And if we are to live with the inevitability of AI’s growth, then we have to match its global reach with renewed bipartisan commitments to transparency and regulation.

***

This article was edited by Ria Mukherjee. 
