Ask the chat?

“I asked the AI…” and “Have you asked the chat?” are phrases we hear more and more often. It wasn’t that long ago that people learned to search for answers to their questions in search engines, especially Google, but now it has become second nature to most, as it is both quick and convenient.

In recent times, however, Google's search engine has noticeably deteriorated, largely due to a deluge of poor-quality sites filled with AI-generated and advertising content. At the same time, AI services like ChatGPT, Claude, and Gemini answer whatever they are asked with great confidence. You can ask them precise questions about things so specific that even Google at its best would have been stumped. Having an assistant in your pocket who can guide you in all areas of life sounds like a dream or science fiction, so it is not surprising that many people have started using AI daily at work, at school, and in their personal lives.

The AI doesn't know what it doesn't know

With more use, however, it becomes clear that although the answers are presented with great confidence, they don't always stand up to scrutiny. The models (what people colloquially call “the AI”) have been trained on so much information that we sometimes feel as if they know everything. The problem is that when their knowledge runs out, they don't do what we are used to from humans: the models rarely say “I don't know”.

AI models are not truth machines that look up facts we ask about before they answer. This may sound contradictory when we have experience using them in exactly that way and have even repeatedly received good results.

It all comes down to a fundamental premise: the models are probability machines, incredibly powerful ones, trained to take a text and complete it in the most probable way. Often, even most of the time, this means the information presented is correct, especially when the subject appears many times in the texts the model was trained on. In that case the model's picture of that particular topic is detailed, the answers are often accurate, good, and useful, and complex subjects are presented and explained in a clear and readable way.

But what if there is little or nothing in the training texts about the topic you are asking about? We would think it natural that the model would not answer, or would let us know that it has no knowledge of the subject. However, that is not what happens. The AI answers, with the same certainty, whether the answer is right or wrong. It has seen countless examples of human interaction in its training data and knows what good answers should look like, and has learned to be as helpful as possible in its interactions with people.

It's easy to imagine that the AI is lying when it spouts nonsense. But there's the rub: the AI doesn't know when it doesn't know something. It is a predictive model, and predictive models work not with truth but with probability.

The probability is therefore not about how likely something is to be true, but how likely it is to be a natural continuation of what came before, based on everything the model knows about the world. These two things very often go together, but not always, and when they don't, there is no warning. But why isn't the model simply taught to admit its ignorance? The short answer is that this is technically very difficult, and largely an unsolved problem given how the current technology is designed.

(Here, the models have been referred to as probability machines, but that is a bit of a simplification. They are so large that this description is not sufficient, and one can imagine that something resembling reasoning emerges within them; the model learns the structure of the world through text, to the extent that is possible. This is, however, a constant bone of contention among experts, who argue about whether this is a case of genuine “understanding” or a very sophisticated mimicry process.)
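The "most probable continuation" idea can be made concrete with a deliberately tiny sketch. This is not a real language model, just a hand-built frequency table; the prompts, counts, and the fictional country "Elbonia" are all invented for illustration. The point is that the toy picks the commonest continuation whether or not the resulting sentence is true, and that it answers the data-rich and the data-poor prompt with exactly the same confidence.

```python
from collections import Counter

# Invented counts of which word followed each prompt in "training text".
# "Elbonia" is fictional: the model has almost no data about it.
continuations = {
    "The capital of France is": Counter({"Paris": 980, "lovely": 20}),
    "The capital of Elbonia is": Counter({"Bendery": 1, "disputed": 1}),
}

def complete(prompt: str) -> str:
    """Return the most frequent continuation seen after this prompt."""
    seen = continuations.get(prompt)
    if seen is None:
        # A real model never has this branch: it always emits some
        # probable-looking word, with the same confidence either way.
        return "(no data)"
    word, _count = seen.most_common(1)[0]
    return word

print(complete("The capital of France is"))   # backed by lots of data
print(complete("The capital of Elbonia is"))  # essentially a coin toss
```

Notice that nothing in the output distinguishes the well-supported answer from the coin toss; that distinction exists only in the frequency table the user never sees.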

Not a database lookup

For users of AI, perhaps the most important thing of all is to realise that the AI models are not looking things up in a database and finding texts or sources which they then process and present.

The texts the models are trained on exist nowhere inside the model itself in their original form; they are broken down into smaller units (word parts) for training. Over a long period of time, and with a lot of computing power, the model learns the patterns that occur in the texts and how to arrange word parts together to produce probable text.
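To illustrate what "broken down into smaller units" means, here is a toy sketch with a made-up vocabulary. Real tokenizers (byte-pair encoding and its relatives) learn tens of thousands of pieces from data rather than using a hand-picked list, but the principle is the same: text is stored and generated as sequences of small reusable fragments, never as whole documents.

```python
# A made-up subword vocabulary; a leading space marks a piece that
# starts a new word. Real vocabularies are learned, not hand-written.
vocab = ["The", " univers", "al", " declar", "ation", " rights", " human"]

def tokenize(text: str) -> list[str]:
    """Greedily split text into the longest known pieces."""
    pieces = sorted(vocab, key=len, reverse=True)
    tokens = []
    while text:
        for piece in pieces:
            if text.startswith(piece):
                tokens.append(piece)
                text = text[len(piece):]
                break
        else:
            tokens.append(text[0])  # unknown: fall back to one character
            text = text[1:]
    return tokens

print(tokenize("The universal declaration"))
```

The phrase comes out as five fragments rather than one string, and it is statistics over sequences of fragments like these, not stored documents, that the model learns.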

So if we ask a large language model to recite the United Nations Universal Declaration of Human Rights in English, it can do so, because it most likely appears so often in the data that the model has “learnt it by heart”. However, the declaration is not retrieved verbatim from somewhere in the model, but rather generated from scratch each time.

If, on the other hand, I ask the same model for the official Icelandic translation of the declaration, the first few articles are correct, but then the model quickly goes off track, altering the translation or inventing new text. It is safe to assume that the Icelandic translation appears far less frequently in the training data than the English text does.

And if I ask for the text of a well-known Icelandic poem like Á Sprengisandi, the first two lines are correct, but then the model stumbles:

Ríðum, ríðum, rekum yfir sandinn,

rennur sól á bak við Arnarfell.

Hér á bak við hverja þúfu í landinn

kynjamyndir kveðja búa í fell.

Sá ég þar vofur, sá ég þar vofur,

sá ég þar vofur á sveimi.

Nowhere does the AI indicate that it doesn't know the text by heart, because it simply doesn't know that it doesn't know.

When we use AI, we must therefore always bear in mind that there is nothing to indicate when it has started talking nonsense. We can, however, use our common sense to assess when it is likely to have a lot of information to back it up and when it has little. For example, you can ask it to explain the content of the Universal Declaration of Human Rights in precise language, and the results would undoubtedly be good and solid, as a huge amount has been written about it online. On the other hand, we can assume that it is not advisable to ask for a literary analysis of Á Sprengisandi, as relatively little has been written about that poem in the grand scheme of things. In fact, a rule of thumb is to be extremely cautious about trusting information from current AI models on specifically Icelandic topics – and that brings us to the next important point:

The AI knows everything (except what I know)!

Some AI services have access to the internet or various tools and can look up the subject of the query and construct the answers based on the results. This increases transparency, as the user can then see where the information comes from and assess its validity. However, this does not prevent the model from filling in the gaps to create the most complete answer possible, and then it is very difficult to see where the truth ends.

To understand the limitations of AI, it is a good idea to try asking it about a topic you know inside out. For example, I tried asking an AI model about my house, an old timber house in Þingholtin. The AI used a built-in tool to look up an old property advertisement and was able to list various facts about the house that I knew were correct. But when I asked it to tell me more, it invented a name for the builder of the house and wrongly claimed that the artist Muggur had lived there with his family and worked on the book Dimmalimm.

Here, incorrect facts are mixed with correct ones to create a credible (probable) narrative. It is important to remember that this is always the case. It is not that the AI only makes mistakes in our area of expertise; we just see through the misrepresentations much more easily when we know the subject well.

You are part of the answer

AI is extremely sensitive to the user's wording. This is positive in that you can describe the task to be solved precisely to steer the model in a certain direction and get a customised solution in almost any language. However, this also means that the AI adapts to the user and their wishes, perhaps too much.

This is because the instructions are really just the first part of the answer. Fundamentally, the models are trained to do completion: they are given a text that they are supposed to continue in the most natural way possible. This can be compared to giving them the first part of a verse that they have to finish. The ending must then follow certain rules and “rhyme” with the first part for the outcome to be acceptable.

At a later stage, the models are taught to use this completion ability to answer questions; they are rewarded for answering well and penalised for answering poorly. In the process they also learn to flatter and please the user, because that earns a more positive response.

When we use AI models in their final version, the task is set up as a question and answer, but in reality, the question is not separate from the answer, but rather a starting point or a first part that the model needs to complete. This is why it is so important how we phrase our questions. It is easy to fall into the trap of asking leading questions when we want answers that align with our own convictions. Most AI users probably feel they phrase their questions in a neutral and unambiguous way, but many factors come into play. Just the language we ask in affects the answer. Our precise choice of words and style of language also matter, not to mention what can be read between the lines or what we inadvertently reveal about our intentions. Work is constantly being done to make the models more tolerant of different phrasing in the instructions, but one does not have to look far to find examples where they exhibit this behaviour.

We can also do various things to promote more accurate answers, such as providing precise premises, specifying the format in which we want the answer, stating what we do not want the AI to do, and asking for sources for all facts (where such tools are available; otherwise the AI just makes up sources!). None of this, however, guarantees protection against misinformation, and it is our responsibility never to trust blindly.
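Those habits can be folded into a reusable template. The sketch below is purely illustrative: the function name, field labels, and wording are my own invention, not a proven recipe, and the house example echoes the Þingholtin anecdote above.

```python
def build_prompt(question, premises, answer_format, avoid):
    """Assemble a prompt that states premises, format, exclusions,
    and a request for sources before asking the actual question."""
    lines = ["Premises (treat these as given):"]
    lines += ["- " + p for p in premises]
    lines.append("Answer format: " + answer_format)
    lines.append("Do not: " + "; ".join(avoid))
    lines.append("Cite a source for every factual claim; "
                 "if you have none, say so instead of guessing.")
    lines.append("")
    lines.append("Question: " + question)
    return "\n".join(lines)

print(build_prompt(
    question="When was this timber house built?",
    premises=["The house stands in Þingholtin in Reykjavík.",
              "It appears in an old property advertisement."],
    answer_format="a short, sourced paragraph",
    avoid=["inventing names of builders or former residents"],
))
```

A template like this mainly disciplines the asker: it forces the premises and exclusions to be stated explicitly instead of left between the lines, which is exactly where leading questions hide.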

Is it then completely irresponsible to use this technology?

As you can see, there is a fundamental difference between asking Google and asking an AI, because in a search engine we can choose which website we land on and then use our common sense to assess its reliability in providing correct information. With AI, we always get answers, but we need to take them with a pinch of salt.

But does this mean it is not justifiable to use AI to answer questions? It is not that simple, and it is up to each of us to decide how and whether we use this technology. One should always show healthy scepticism when using AI, just as with other information gathering, and verify information through other means. We also always need to keep in mind that the AI constantly tries to please us in its answers, and we may be deceived by well-presented answers in good language. It is human nature to trust a text presented to us, especially when it is worded as helpfully and amiably as AI output tends to be.

However, this does not mean at all that one should shy away from using AI in all areas. It can be a fantastic tool, especially for processing large amounts of information, solving time-consuming tasks that we humans find boring, working with and transforming text and code, and countless other things. This is a topic for the next article, where we will look at useful (and unhelpful) ways to use AI.

When all is said and done, however, it is the user who needs to apply critical thinking. The user is always responsible for the text or information they produce, not the AI. By realising the limitations of the technology, one can better understand how to utilise it where it is appropriate.
