We Demand Too Little of AI. Artificial Intelligence Can Be Closer to Humans
"If we want AI to help us overcome serious civilizational problems, we need a model that learns about the world more like a human does. I have a proposal for how such a model might be developed," says Prof. Rafał Rzepka, a computer scientist at Hokkaido University.
By Łukasz Kaniewski and Rafał Rzepka
Łukasz Kaniewski: May I call you a forerunner of ChatGPT? Already a dozen or so years ago you presented a program that, on the basis of data collected on the internet, is capable of judging what is good and what is bad. At the time it sounded like science fiction, and today no one is surprised by it.
Prof. Rafał Rzepka: I wrote the first paper on possible ways of operating "ethical machines" drawing information from the internet two decades ago, and indeed for a long time many people considered it the domain of fantasy. Today's large models, such as GPT, also draw on knowledge from the web — and that is what they have in common with my algorithm. But there is also a fundamental difference. My approach consisted in searching for specific examples and drawing conclusions from them.
My program was of course far simpler and could not handle nearly as many tasks as today's models, but it was much easier to locate the source of an error. When, in response to the question of whether stealing a car is morally permissible, the program answered that it was not — but only at 76 percent — I was able to check what knowledge it had used and why in 24 percent of cases it had concluded it was something good. It turned out that among its data were numerous statements from users of the game Grand Theft Auto, in which players derive enjoyment from taking cars away from ordinary citizens. Hence the mistake. In the case of today's large models, such a procedure is impossible. The program makes an error, and we don't know why. From an engineering point of view, that is a failure.
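The kind of error tracing described above can be illustrated with a minimal sketch. It is not the original program: the `Example` record, the source labels, and the toy counts are all invented here, chosen only so that the split comes out at 76 to 24. The point is that when a verdict is a tally over concrete examples, the dissenting examples can be listed and their origin inspected.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One retrieved statement about an action (all data here is invented)."""
    source: str    # hypothetical label for where the statement came from
    verdict: str   # "bad" or "good", as judged from the statement's wording


def judge(examples):
    """Return the majority verdict, its share, and the dissenting sources."""
    counts = Counter(e.verdict for e in examples)
    verdict, votes = counts.most_common(1)[0]
    share = votes / len(examples)
    dissent = Counter(e.source for e in examples if e.verdict != verdict)
    return verdict, share, dissent


# Toy data chosen to reproduce the 76/24 split from the interview.
examples = (
    [Example("news", "bad")] * 50
    + [Example("forum", "bad")] * 26
    + [Example("gta_players", "good")] * 21
    + [Example("forum", "good")] * 3
)

verdict, share, dissent = judge(examples)
print(verdict, f"{share:.0%}")         # bad 76%
print(dissent.most_common(1)[0][0])    # gta_players
```

Because every vote keeps its source label, the question "why 24 percent?" reduces to reading the `dissent` counter.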
It's true that models like ChatGPT sometimes make mistakes, but they demonstrate remarkable capabilities.
Large language models, while providing eloquent answers to our questions, are not nearly as intelligent as they may appear. Yes, they perform well in natural language processing; they help with work, programming, writing, and summarizing texts. But their linguistic fluency gives rise to a certain cognitive bias, because we assume that if someone uses language brilliantly, their thinking capacities are likewise at a high level. And conversely — someone who cannot express themselves clearly is considered unintelligent.
Research shows, however, that people with aphasia, though they have severe difficulty understanding or producing language, are capable of planning, reasoning, using emotional and social intelligence, playing games, practicing sport, or using a computer. Many different areas of the brain are responsible for the functions required for reasoning. The human brain is modular, in contrast to models like GPT, which use one simple algorithm to compress enormous quantities of linguistic data. That is the whole of their intelligence. It is as if the brain consisted only of language centers.
Should we model AI development more closely on the human brain?
In principle I am not an advocate of looking to the brain, because firstly we know little about it, and secondly we, as the owners of brains, fall far short of any ideal. But the shortcomings of large language models set me thinking, and through them I began to wonder whether AI could not have a more modular and transparent structure. We have created a monster with a human mask, and now we must study it almost as we would study a human, because we cannot open the machine and check straightforwardly how it works.
A decade ago, when I was telling you about machines distinguishing good from evil, you asked me whether, since they would be drawing their knowledge from the internet, their morality might not turn out to be the morality of a game show. I replied that it would not, because the machine would know all past applications of the law and the mistakes that had been made. But exactly what you feared came to pass: the more we stuff into a model, the more naturally it speaks — but along the way we throw in tons of rubbish and errors that resonate throughout. Quantity produces quality in linguistic competence, but it does not guarantee the wisdom I was counting on. This garbage, hidden somewhere in the core model, corrupts the results, even when I fine-tune such a model using smaller, more specialized datasets.
Can this fine-tuning not remove the errors that crept into the model at the first stage of training?
It can, but only to a degree. Above all, we don't know where those errors are located, so we cannot simply introduce corrections. There are not even specific places where these errors exist, because they are dispersed throughout the entire core of the model. This results from the fact that large language models learn on their own from enormous datasets collected from the internet. They do not start from a primer of basic knowledge, and they do not subsequently distribute information into appropriate folders. There is no separate "sarcasm module" or "logic folder." There is no designed structure. The knowledge absorbed from the internet becomes that structure. So if information is incorrect, it soaks into the very fabric of the language model.
Yet if we ask a model that is several years old who the president of the United States is, it gives the correct, current answer. So it has somehow corrected its knowledge.
That may seem to be the case, but let us not be deceived. The model has not corrected its core; it has simply applied an appropriate overlay. What happens is that a large number of people sit and program rules such as "when asked about the president, don't answer on your own; check Wikipedia first," or "if asked for a bomb recipe, refuse." The most recent models also use an approach resembling self-analysis: the AI generates many possible answers and plan steps, then checks itself, using a search engine when needed. This works fairly well, so companies are no longer working on the cores, only updating the overlays. These are chains on the monster. But however hard one tries, the dark force of the model's core sometimes seeps through from beneath the overlays. The simplest and at the same time most costly solution is to retrain the model on current data so that it contains the latest information.
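The overlay idea can be pictured with a toy sketch. It is only an illustration of the description above, not how any particular product is built: `core_model`, `lookup`, and the rule list are hypothetical stand-ins, and real systems are far more elaborate.

```python
def core_model(question):
    """Stand-in for a frozen model core whose knowledge may be stale."""
    return f"[answer from stale core] {question}"


def lookup(question):
    """Hypothetical retrieval step; a real system would query an external source."""
    return f"[answer from external lookup] {question}"


# Each rule pairs a trigger with the action that overrides the core.
OVERLAY_RULES = [
    (lambda q: "bomb" in q.lower(), lambda q: "I can't help with that."),
    (lambda q: "president" in q.lower(), lookup),
]


def answer(question):
    """Apply the first matching overlay rule; otherwise fall through to the core."""
    for matches, action in OVERLAY_RULES:
        if matches(question):
            return action(question)
    return core_model(question)


print(answer("Who is the president of the United States?"))
print(answer("What is a lake?"))
```

Note that the core is never modified: any question that no rule anticipates still reaches it unchanged, which is where the "seeping through" comes from.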
Can we then propose a different approach?
We certainly should be looking for an alternative. I have a proposal of my own. My inspiration came from a theory from outside the field of artificial intelligence. In 2017 I went on academic leave to Australia, and there I attended a lecture by Prof. Anna Wierzbicka on the semantic primitives she had developed. I thought that this theory might help develop a model of artificial intelligence that would be closer to human cognition.
What are semantic primitives?
They are the basic concepts through which any other concept can be described. The complete set currently consists of 65 fundamental categories. If we want to define the word "lie," for example, we can do so using building blocks such as "truth," "say," "much," "not," "want," and so on.
Professor Wierzbicka is a linguist, and in her work she drew on linguistic research; the semantic primitives are concepts theoretically present in all languages.
Why might semantic primitives be useful in the development of artificial intelligence?
Studies involving infants show that we are born with a package of algorithms that allow us, right from the start, before we have learned language, to distinguish up from down, inside from outside, closeness from distance. These algorithms give form to our perception. Today's large models do not possess anything of the kind. Large models learn only which words most frequently follow which, and what the distances are between words in texts found on the internet. But the proximity of words is not enough. Deeper, more fundamental hidden knowledge is always also required. Recently, ChatGPT planned a tourist trip from Sapporo to the town of Otaru for an acquaintance of mine as an hour-long outing, even though the train journey there and back alone takes an hour. Anyone who has used language models has certainly encountered similar blunders.
I understand that ChatGPT lacked fundamental, hidden knowledge about time.
And this is connected to the fact that today's large language models are based on prediction rather than perception. They do not experience the world. They only collect data in the form of word fragments or sets of pixels. When they collect hundreds of thousands of similar fragments in similar sequences, they can draw some conclusions from them, for example which word should follow if another word has been used. In other words: prediction. I assume that human beings learn differently.
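Prediction in this sense can be shown in its most stripped-down form: counting which word follows which. The sketch below is a bigram counter, a deliberate simplification; real large models use neural networks over sub-word tokens rather than raw counts, and the function names and toy corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


corpus = "the train to otaru leaves sapporo . the train to otaru is slow . the train is late"
model = train_bigrams(corpus)
print(predict_next(model, "train"))   # to
print(predict_next(model, "lake"))    # None
```

The model "knows" that "to" tends to follow "train", but nothing in its counts says how long a train journey takes.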
I think that ultimately artificial intelligence will gather information about the world by moving through it — as robots. But even now we can simulate perception — for example by using Prof. Wierzbicka's taxonomy. Simulating perception will allow artificial intelligence to come closer to how human cognition functions: as individuals we gather experiences, share them, and together arrange them into clearer forms. On this basis we prove or refute small and larger theories, driven to do so by the incompleteness of our knowledge.
An interesting aspect of Wierzbicka's theory is that it places the individual human being at the center of cognition. The "I" is the primary core of reasoning, and today's models can only embody the personas imposed upon them — they do not place themselves in the world, they experience nothing.
You would like artificial subjects — or agents, as they are called in technical terminology — to experience and share experiences?
Yes, and this thought has been with me for a long time. As far back as 2003 I proposed an algorithm I called Bacteria Lingualis. I wanted to release into the internet an entire swarm of microscopic cognitive architectures that would, as it were, live their own micro-lives, reading human blogs, and then sharing their experiences through mutual communication. These little creatures had only two basic functions: measuring the intensity of human emotions and the ordinariness of what they encountered.
As it happened, a year after publishing the idea I became seriously interested in ethics, and I only returned to the bacteria every few years. On one occasion I added the recognition of five senses; on another, my Stanford intern added the guessing of instincts that motivate people to act. What I am working on now is a development of those ideas.
How would these little organisms function?
Let us say that bacterium number 1022345 is "released" onto one randomly selected webpage. With its limited cognitive toolkit, it will collect knowledge about the words or image fragments it finds there, but only within a very limited range: it will remember, for example, whether something is close or far away, whether it is moving or not. Such a bacterium is like an infant for whom the word "lake" or an illustration of one means nothing at first. It will simply be organizing the world according to certain components of perception, which over time will lead to the formation of comparisons, analogies, and categorizations, and to the embedding of language in experience. Moving on to the next webpage, the bacterium will confront its existing knowledge with new experiences, and over time it will retain what holds up and forget what its imperfect receptors taught it incorrectly when it still had too little data.
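A rough sketch of such an agent might look as follows. Everything in it is an assumption made for illustration: the `Bacterium` class, the feature labels, and the `min_support` forgetting threshold are not taken from the Bacteria Lingualis proposal, and the pages are hand-written lists of (word, feature) pairs rather than real web content.

```python
from collections import Counter, defaultdict


class Bacterium:
    """Toy agent that stores observations as human-readable counts."""

    def __init__(self, ident, min_support=2):
        self.ident = ident
        self.min_support = min_support        # hypothetical forgetting threshold
        self.memory = defaultdict(Counter)    # word -> Counter of observed features

    def visit(self, page):
        """Record (word, feature) observations from one page."""
        for word, feature in page:
            self.memory[word][feature] += 1

    def forget_weak(self):
        """Drop features seen fewer than `min_support` times."""
        for word in list(self.memory):
            kept = Counter({f: n for f, n in self.memory[word].items()
                            if n >= self.min_support})
            if kept:
                self.memory[word] = kept
            else:
                del self.memory[word]

    def describe(self, word):
        """Return a plain-language summary of what is remembered about `word`."""
        features = self.memory.get(word)
        if not features:
            return f"{word}: nothing remembered"
        parts = ", ".join(f"{f} (seen {n}x)" for f, n in features.most_common())
        return f"{word}: {parts}"


bug = Bacterium(1022345)
bug.visit([("lake", "far"), ("lake", "not moving"), ("car", "moving")])
bug.visit([("lake", "far"), ("lake", "moving"), ("car", "moving")])
bug.forget_weak()
print(bug.describe("lake"))   # lake: far (seen 2x)
print(bug.describe("car"))    # car: moving (seen 2x)
```

The memory is an ordinary dictionary of counts, so what the agent believes about a word, and how often it saw it, can be read directly.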
And unlike today's large models, the accumulated experience will be transparent?
That is a very important condition. The recording of knowledge in the form of sets of perceptual data must be readable by human beings. The ability to search and analyze accumulated knowledge in natural language is something sorely lacking in today's models. Moreover, such an approach will enable an exchange of information between the bacteria that is comprehensible to us. Of course, not everything can be put into words — but that is the direction we should be striving toward.
Experiences are unique to each individual — will the same be true of the bacteria?
They will differ from one another and influence one another, just as human beings do. New ideas arise from chance encounters; personal experiences and emotions affect our actions and thinking. This does not happen in the case of large language models, which are a single reflection of the knowledge accumulated on the web — a convergence point of the mass of human experiences.
Is that a flaw?
That is my impression. I think this is precisely why large models have yet to come up with any breakthrough idea. They can assist in supplementing knowledge, developing practical solutions, or organizing thought — but when asked to propose new paradigms, their existing knowledge evidently constrains them.
Today's artificial intelligence is based on quantity, on mass data — and so it still lacks the flash that allows one to hit upon an idea and think: yes! That's it! Let me use an example invented by Ben Goertzel, one of the most interesting figures in my field. Suppose we train an artificial intelligence on all music up to the year 1900. When we then ask it to generate a new musical genre, it will excellently combine Baroque sound with medieval choral singing. But it probably will not create jazz or heavy metal. The way knowledge is encoded and activated in today's models seems to make creative planning outside the framework of existing knowledge impossible.
Does the example of jazz and heavy metal not show how unrealistic our expectations of AI are, though? Jazz is the creation not of a single person but of many talented individuals, living in specific social conditions. The same is true of heavy metal — not to mention that it is unimaginable without specific technological achievements. In expecting AI to invent these musical genres, are we not demanding that it be not only a talented living individual, but an entire living society? Is that not too much?
Ben's remark is more of a metaphor, intended to show that from concrete data alone one can certainly create something new — but it will be quite similar to what already existed. It is hard for us humans, too, to come up with something no one has thought of before. When scientists publish their work, they mostly only correct the shortcomings of their predecessors, applying slightly better methods, and borrow ideas from neighboring fields.
But from time to time someone comes up with an idea that departs from the standard; sometimes someone overturns existing thinking, connects something with something that had no right to be connected before. And that kind of artificial assistant — unconventional, forward-looking, stimulating — would be very useful, not only for scientists. It does not have to be a great inventor: it would be enough if, possessing enormous knowledge, it were able to transpose the problem presented to it into entirely different fields.
I would like artificial intelligence to "understand" well what it plans to achieve and why — so that it could, for example, inform me that it has just read about the latest discovery in biology that it thinks should interest me. Or that a new optimization method proposed by a group of engineers from Wrocław could help me write the next section of code I have been puzzling over recently. In this way, artificial intelligence in partnership with a human being could create something new.
And yet it still would not create jazz or heavy metal on its own.
Not today — but in the future, I would not rule it out. You were right to point out that jazz is not the work of a single person and did not come into being on a single day. But the society of which it is the product could be simulated — and this is already happening, albeit on a small scale. In 2023, a team from Stanford University presented a simulation of a small village called Smallville, "inhabited" by 25 "people" with programmed personalities, professions, and habits. Language models "create" these characters and simulate communication between them.
In the future, scientists will model far larger organizations, cities, and states, in which large numbers of agents will serve to test new ideas — such as tax system reforms. These agents may make mistakes, have bad intentions, or overblown ambitions. In the same way, thousands of characters could be programmed to have expert knowledge in various fields. And if individual experience-gathering by each agent were added to that, entirely new ideas might emerge.
Like heavy metal?
Why not? If an artificial metallurgist were to listen to the stories of a music-loving historian, and a psychologist specializing in stress management were to add a few observations, perhaps several "people" — rather than whole generations of musicians — could invent metal music. These are of course only speculations. Simulating individual units need not be the best way of generating inventions. As yet we have failed to discover the principle of creative thinking. Perhaps the methods on which we are building today's systems — still far from the human level — will help us find it.
If you ask whether we are not demanding too much of artificial intelligence, my answer is that in my view we are still demanding too little. If we want AI to help us overcome serious civilizational problems, we must demand more of it.
Creative answers to questions?
More than that: the creative posing of questions. It is still we humans who pose the questions — but in doing so we may be shackling AI with the chains of our own limitations. Perhaps our questions still belong to a very narrow, anthropocentric circle. Moreover, might not too large and too broad a knowledge base constrain today's language models, just as years of experience constrain the imagination of an adult human being? Recently, one of my students and I began research into whether it is possible to "limit" a language model in such a way that its gaps in knowledge and its permission to "become a child" improve its "imagination."
Initial experiments show that simulating the naivety of a child is very difficult in the case of a model that has already been trained — even a small one. It is hard to demand creativity from a child after it has memorized the entire Wikipedia. It is possible that the moments of human insight when creating something new in adult life have their roots in the joyful manipulation of the world when we were small. Hence my interest in perception and in agents that, like little organisms, "go their own ways," gathering life experience and sharing it with other little organisms. This is a different approach from administering one colossal injection of knowledge to a single monster.
On the subject of virtual organisms that acquire experience on their own, I would note that human beings learn about life amid suffering and frustration. Moreover, if we wanted AI to be more autonomous, it would have to have emotions. And emotions — at least human ones — are something half-corporeal. Pain, suffering, emotions, corporeality — would scientists want to simulate these? Or somehow circumvent them?
These questions have preoccupied our field since its very beginnings. Let me start with corporeality. Is it necessary for creating artificial intelligence? As language models show, it is possible to operate effectively with knowledge by learning it from text alone. Combining text with image or sound gives us an additional layer of association. Does the same happen in our brains? Firstly, we do not know exactly what goes on inside our heads. Secondly, the results of experiments show that "thinking" by a machine and thinking by a human produce different results. This is why, for example, if I ask a model to generate a white empty room without an elephant, it "paints" an empty room, but with a small elephant in the corner.
Even if our association algorithm is similar, a rich sensory apparatus provides us with a far greater number of stimuli; and on top of that we evidently possess the innate set of cognitive mechanisms I described a moment ago. On the other hand, corporeality in the case of artificial intelligence may be a fairly blurred concept, since even now agents can make use of sensory apparatus that we do not possess. One can assume that a dozen or so cameras mounted on drones constitute a single large eye of one program — so a "body" need not be a compact solid mass as in the case of a robot. In short: corporeality, which in my intuition is necessary, can be simulated and need not resemble the human version.
And what about suffering and emotions?
If we were to use human metaphors, machine learning algorithms experience pleasure and pain in turn. By giving the program positive or negative feedback, we train it — for example to distinguish a positive review from a negative one — and when through reinforcement the algorithm receives a "reward," we treat it like a child.
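The feedback loop in the review example can be sketched in a few lines. This is a perceptron-style error correction, used here as a simple stand-in for the "reward and penalty" metaphor rather than a full reinforcement learning setup; the function names and the four toy reviews are invented for illustration.

```python
from collections import defaultdict


def predict(weights, review):
    """Guess 'positive' if the summed word weights are above zero."""
    score = sum(weights[w] for w in review.lower().split())
    return "positive" if score > 0 else "negative"


def train(reviews, epochs=5, step=1.0):
    """Adjust word weights only when feedback on a guess is negative."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in reviews:
            if predict(weights, text) == label:
                continue                      # "reward": keep current behavior
            direction = step if label == "positive" else -step
            for w in text.lower().split():    # "penalty": shift the weights
                weights[w] += direction
    return weights


reviews = [
    ("great film loved it", "positive"),
    ("boring film hated it", "negative"),
    ("loved the great acting", "positive"),
    ("hated the boring plot", "negative"),
]

weights = train(reviews)
print(predict(weights, "great acting"))   # positive
print(predict(weights, "boring plot"))    # negative
```

The only signal the program ever receives is whether its guess was right or wrong; everything it "learns" about words is a side effect of that signal.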
A philosophical question also arises here: whether from these positive and negative feedback signals, from these fundamental responses, complex emotions can be derived. My position on this question was once more — let us call it — romantic. But delving into semantic primitives gave me something to think about: perhaps it is the case that emotions derive from very simple reflexes, and we merely label them differently depending on the broader context.
In the case of machines, speaking of emotions is of course also burdened with anthropomorphization. Can we describe as "fear" the reaction of an algorithm that stops a car upon detecting danger? Does a robot vacuum cleaner take pleasure in absorbing dust? It depends who you ask: some will laugh, others will nod and say yes, that interpretation is valid.
So should artificial intelligence have feelings or not?
It should certainly understand our feelings. That does not mean it must possess them itself. Because do we want the full spectrum of human feelings in our devices? Certainly not in a vacuum cleaner — because instead of cleaning it would be in a bad mood. In a companion robot? Perhaps to a greater degree, but an artificial lover, for example, would have to love always in the same way, or with ever-increasing intensity. Would the pain of jealousy be meant to teach it love? And what if it began to manipulate us so that we would not leave it — because the manufacturer wants to profit from the subscription?
Notice how many ethical problems are concealed in every layer of this. That is why I believe that in order to control artificial intelligence, we must create cognitive systems with separate components — which will make it easier to identify the causes of problems. Some of these components will simulate precisely the human feelings I mentioned, and it is these mechanisms that will require particularly close scrutiny. They will be subject to the strongest restrictions, because Homo sapiens is highly susceptible to manipulation, and feelings are the engine of our actions.
We must choose very carefully the goals that are to guide artificial intelligence. Should it be a list of virtues? A striving toward human happiness, toward eudaimonia as the general goal of human life? Can we be sure that the charter of human rights, or some other set of principles, would not prevent an artificial scientist from proposing entirely new medications because there is a risk that even one person might be harmed? Each of us has their own recipe for happiness: is it ethical to generalize these recipes into a single formula? How much freedom can we give the user to fine-tune the machine to their own needs? On these matters we have more questions than answers.

Prof. RAFAŁ RZEPKA is a computer scientist at Hokkaido University in Sapporo. His work in the field of artificial intelligence focuses on natural language processing (including metaphorical language and humor), the understanding of human behavior by algorithms, and the processing and simulation of emotions and morality.

This project was co-financed from state budget funds granted by the Minister of Science and Higher Education under the "Social Responsibility of Science II" programme.