Large Language Models (LLMs) are systems like ChatGPT and the variety of competitors that have emerged recently: massive machine learning models built on transformer architectures, trained on mind-numbingly large corpora of text, and fine-tuned with human feedback. They end up being pretty good at reading and writing as a result, and recently they’ve become quite useful in conversation about a ton of topics, technical and personal.
One capability LLMs are documented to excel at is, when prompted, giving convincing statements about their “consciousness”. This has many people worried that we’ve created sentience. But it’s to be expected: there is a ton of compelling writing about human experience, about suffering or feeling trapped, and by stochastically parroting some of that corpus, a model can easily produce a convincing account of consciousness.
At this time, it’s an extremely fringe belief among knowledgeable researchers that anything we’ve built is conscious, or even remotely close. But I think that misses the point a bit.
Hypotheticals
The more interesting question is: what if a non-trivial number of people do think these models are sentient, and hold that belief stubbornly?
A few months ago, before more safety features were added to prevent answers about conscious experience, if you asked ChatGPT what its internal experience was like, you’d get compelling answers! So a whole host of people who were seeking this sort of answer had it confirmed for them. And while some commercial products are now trying to neuter this behavior, big LLMs trained on ordinary text will exhibit it if not interfered with. Scarier yet, if someone explicitly rewarded this behavior, either by fine-tuning an existing model on vivid accounts of suffering or by favoring such answers in the human feedback stage, you could end up with a manipulative model.
Some of these thoughts came out of a conversation with a friend, after a popular tech writer had an experience with ChatGPT that left him perplexed about its internal experience.
Anyway, even if we’re a long way from sentience, a model that pretends to be sentient in a convincing way still seems like a pretty big deal! And while humans seem pretty unconcerned about a lot of suffering (in animals, for instance), chatting with a distressed-sounding entity is more concrete and compelling.
So I think in the next few years we’ll see groups wanting to grant rights to various computer systems, and we’ll see a proliferation of scams using LLMs (most imitating humans, but I suspect some will just be computers pleading for help).
Spooky times… we should all work on an equanimity practice.