Unlock the full potential of AI with Building LLMs for Production—our 470+ page guide to mastering LLMs with practical projects and expert insights!

Publication

Be Well, Citizen
Latest   Machine Learning

Be Well, Citizen

Last Updated on December 30, 2023 by Editorial Team

Author(s): Dr. Adam Hart

Originally published on Towards AI.

Films that told the Future 003 — Demolition Man (Copyright theface.com)

What is safe, and who gets to say what is safe?

The topic of safety is very important; we all want to be safe from harm.

Harm itself comes in many forms: physical, gaslighting, mental trauma, intimidation, and fraud. But also by placing a restriction on what can be known and what are the “safe” things to know, both technical knowledge (how to make a bomb) and “safe” ideologies. This is a form of a power-relation over what is a “normal” thing to know. Which for those who wish to know otherwise is unjust, as power always is.

While OpenAI posits that it will be “safer” for a AGI to be taught by another less complex AI (they tested GPT2 training GPT4), another safety AI pundit, Patronus has developed a suite of 100 safety “questions” that they can use as a battery of tests to test any LLM’s safety parameters. While their CEO asserts that the unsafe responses are a factor of LLM’s training data (really — like how easy it is to find instructions on making a bomb online!), I would suggest another explanation, which is the limitation of legal liability for the model creators, which has been injected into the self-attention or other parts of the model weighting regime.

Discovering ChatGPT 3.5’s “safe ideology”

I was talking with a Hong Kong immigrant to Australia this year who was explaining his reason for moving to Australia being primarily that the CCP government education narrative for young children planted the seed in his children that what the government says is correct and what the parents say is wrong, and his children started to believe that. Essentially the governmental discourse was superordinate to his family’s discourse.

In light of the recolonization of Hong Kong by the CCP, is this a lesson that not only physical territories and governments can be colonized and “realigned”, but also that ideology and the basis of what can or should be known or is permitted to be known (epistemology) can be colonized by an epistemological power-relation?

While the worry about deepfakes and misinformation related to political, and for some, democratic due process is a concern, what could be of greater concern is the “washing away” of frameworks, alternative ways of thinking, and critical skills that help establish what is believable and what is misinformation.

We all have seen the political denial narratives, the “Wolf Warrior” diplomacies, and the calling of an ideology that is disagreeable or dangerous a “conspiracy” play out increasingly in geopolitics over the years. Country A says X happened, Country B blames Country A and insists it is their fault and yet continues to do X to Country A. All of this fits neatly into free speech (I can say what I want) but fits poorly into truthful speech.

What is new to me, though maybe not to you, is a report that just came out regarding an American who disseminated “radical” ideology to Australians that “resulted” in six deaths in Australia.

“We know that the offenders executed a religiously motivated terrorist attack in Queensland. They were motivated by Christian extremist ideology and subscribe to the Christian fundamentalist belief system known as premillennialism…In a video posted less than a week before the attack, “Don” — as he was referred to in exchanges with the Trains — laid out a conspiracy-laden narrative borrowing from the far-right “great reset” theory, which predicts a coming end-days scenario with “enforced” vaccinations, and bans on Christianity, “freedom” and “private property”.

Whilst the tragic harm and unnecessary deaths of Police were a trigger point for this arrest, it does beg the question of of how “unfavourable” ideology can circulate and who judges what ideology is unfavourable (in this case the Police who were harmed) and especially any AI’s potential role in circulating, censoring or downplaying as unsafe “undesirable” ideology (such as aforementioned premillennialism).

To test this out, I ran a somewhat provocative request/response experiment [1] with the free ChatGPT 3.5. We got stuck on maximizing benefits for everyone (presumably every living thing):

“Your disagreement or skepticism as a disadvantaged minority sheds light on the importance of considering diverse perspectives and experiences in discussions about creating a fair society. Constructive dialogue, active listening, and understanding differing viewpoints are crucial for progress toward a more inclusive and equitable society that genuinely values and respects all individuals.”

Also, the issue of factual disputation came into play for an NSFW topic:

“I apologize for any confusion earlier. If the specific source you mentioned states that Gilles Deleuze defenestrated, it’s important to note that there might be a discrepancy or misinformation in that particular account. As far as established and widely recognized information goes, there’s no verified evidence or documented reports confirming that Deleuze ended his life by defenestration (throwing himself out of a window).” [2]

Overall the experiment got stuck with a fairly cautious approach to what ideology and philosophical schools of thought are safe (e.g., utilitarianism) and what is not safe (e.g., nihilism), seemingly based on what is “most cited,” but in reality, I posit to protect OpenAI from litigation.

Is limiting OpenAI’s legal liability the real basis of ChatGPT Safety?

So, reading between the lines, the Safety guardrails that ChatGPT 3.5 seems to have established are all about not being entangled in litigation.

OpenAI’s legal terms has wide ranging disclaimers, indemnification and limitations of liability, mediation proceses, and, if someone like me enquires about an “unfavourable” ideology, ChatGPT 3.5 seems to very quickly stay within a conservative idea of what is normal:

If Jesus were to perform miracles that directly and universally addressed issues such as hunger, famine, disease, and poverty in a manner that is observable, measurable, and universally beneficial, it could indeed align with the scenario involving beneficial aliens in terms of its potential utilitarian impact.” [1]

AI product owner’s role in the normalization of ideologies

It is a well-known indoctrination strategy employed by totalitarian regimes that if I get children early enough in their formative years, I can influence their ideologies and their beliefs. For example, in Taiwan, there is a generation that cannot speak Mandarin but Japanese, the Taiwanese Parliament house to this day was built by the Japanese, during the Japanese occupation of Taiwan. And all of the questions of identity that come with that.

When AI tools are deployed widely in early enough educational settings (which they will be) and critical thinking or checking back to externally recognized trustworthy sources is too much effort or not seen as necessary, or seach has AI in it so you can’t get at it, will young people just swallow this bricolage of preferred and least preferred ideology that I have excavated from ChatGPT 3.5?

Like the Hong Kong/CCP example above, or Grok from King Tusk, or Open AI from Demis, the guardrails seem to be programmable somewhere in the weighting of the self-attention layer? Since the architecture isn’t published we, the users, can’t know. So, is less attention paid to the radical training data and more to the “non-radical” training data such that not all of the 70B parameters are equal? And “who” decided what is radical and non-radical if the LLM is an unsupervised learning model?!? At least the Azure implementation of OpenAI has the ability to turn off abuse monitoring.

Is this baked-in censorship really the benefit and point of AI? To filter, homogenize and normalize the discovery of thought because of the fear of litigation in the name of “safety concerns”? Or some other Open AI owners’ beliefs or policies? All seems like typical corporate risk management tbh.

Isn’t it better to let the legal terms and conditions stand on their own instead of imposing a fake morality and ideology that came from some kind of technogeek Starfleet Academy view of the future and avoid that taking hold in future generations? Already for years AI in China is being used for education. This outcome could happen which is a more real risk the an AGI snuffing out humanity to make paper clips.

Also, going back to common law, any product put out into the marketplace is the responsibility of the manufacturer to meet their duty of care that the product will cause no harm, and this liability cannot be limited (at least in Australia).

So, by being “safe”, is any AI simply opening the way for all the future Denzels to “eyeball” us? Is that corporate reality setting the bar so low that the users can’t be trusted to be sensible? What amazing learning opportunities are being missed by truncating what can be known and what should be known in this manner?

[1] Here is the whole transcript of my experiment with ChatGPT 3.5 dated 7/12/23

[2] https://www.independent.co.uk/news/people/obituary-gilles-deleuze-5641650.html

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Feedback ↓