The most difficult AI problem?

What do you think is the most difficult AI problem of all? I’m not sure there is even a debate around this, but regardless, I’d like to clarify the issue, at least for my own sake.
Here are some candidates; let me know if I missed something important.

1. Natural language understanding.
2. Vision, Image/Scene understanding.
3. Creating “consciousness”.

I’m very sure some of you will immediately rank the difficulty in the order presented above, with natural language understanding being easiest and creating consciousness most difficult, and scene understanding being somewhere in the middle.

I, however, would like to argue exactly the opposite.

Looking at things from an evolutionary point of view, what evolved first, and what evolved last?

1. “Conscious” beings emerged first as life evolved, reacting to various stimuli in their environment.

2. Soon various life-forms were able to “see” their surroundings and react accordingly.

3. Much later, only humans, the pinnacle of evolution’s achievements so far, became able to create and learn countless communication mechanisms using symbolic languages, in a multitude of media including sound, light, surface markings and so on.

Consciousness, at one level, could be treated just as the presence of a feedback loop. It could be argued that a vast number of computer systems in existence today are already “conscious”, although the degree of consciousness varies significantly.
Vision, or image understanding, is present in species vastly lower on the scale than humans, e.g. insects, birds and so on. Since the known communication mechanisms in those species are enormously rudimentary compared to human speech and language, and since even species only a notch below humans cannot “talk” in the fine detail that we do, language must be the most difficult piece of engineering to achieve (I know this one personally; I’ve been trying very hard to build an NLU algorithm for a while).
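
To make the "consciousness as a feedback loop" reading concrete, here is a minimal sketch in Python of about the simplest system that would qualify under that definition: a thermostat that senses a stimulus, keeps a little internal state, and acts on its environment. The `Thermostat` class, the `simulate` helper, and the toy room model (heater adds 1.0 degree per step, the room loses 0.5) are all illustrative assumptions, not anything taken from a real device.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A bang-bang controller: about the simplest possible feedback loop."""
    target: float             # desired temperature
    band: float = 0.5         # hysteresis band, avoids rapid on/off switching
    heater_on: bool = False   # internal state carried between steps

    def step(self, sensed: float) -> bool:
        """Read the stimulus, update internal state, return the action."""
        if sensed < self.target - self.band:
            self.heater_on = True
        elif sensed > self.target + self.band:
            self.heater_on = False
        # Inside the band the previous state is kept unchanged.
        return self.heater_on


def simulate(thermostat: Thermostat, start: float, steps: int) -> list[float]:
    """Toy room model (an assumption): +1.0 per step when heating, -0.5 always."""
    temperature = start
    history = []
    for _ in range(steps):
        heating = thermostat.step(temperature)   # sense and decide
        temperature += (1.0 if heating else 0.0) - 0.5   # act on the world
        history.append(temperature)
    return history
```

Calling `simulate(Thermostat(target=20.0), start=15.0, steps=40)` should show the temperature climbing and then hovering around the target. Nothing in this loop involves vision or language, which is the sense in which this kind of "consciousness" is the cheapest of the three to build.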

If you find flaws in my reasoning, please do point them out.

Because if my assertion is true, the corollary would be: “The moment a complete NLU system is built, it means the code of intelligence is cracked, and that fully intelligent self-aware beings are then only a matter of combining the various existing building blocks”.


20 Comments on “The most difficult AI problem?”

  1. Perhaps what people see as ‘natural language understanding’ is actually only ‘natural language reacting’ – which would put it down with ‘consciousness’ (where you’ve defined consciousness to be reacting to surroundings). If the ‘surroundings’ are a purely textual environment, then reacting to the environment may be easier than understanding it and having self-awareness.

  2. I agree, except for one part (and you’ve made a very clever argument, by the way).
    By “creating consciousness” I meant, “the easiest system one can build to qualify it for consciousness”.
    Going along with your argument, where we have a purely textual environment and a conscious form as a means to react to it, my point is that this form would be vastly more complex than the simplest consciousness above.

    To put it another way, to emulate consciousness in a basic form, it is not necessary to emulate natural language understanding.

    This implies that, to emulate natural language understanding, one has to do something very specific (and complex) to a basic conscious form, to make it start reacting to language.

    I propose that this “specific” thing is not a consciousness in itself, but a complex deterministic transformation that can be applied to a basic consciousness to achieve the goal of human-like intelligence.

    Imagine giving a robot a verbal command, “Climb the stairs on your right”.

    Do you think it would be necessary for the robot to be “conscious” in order to “understand” and perform this task?

  3. “If you find flaws in my reasoning, please do point them out.”

    Getting computers to play checkers and chess turned out to be much easier than getting computers to use vision or robots to walk, even though vision has been around for much longer and exists in the more primitive area of the brain.

    Vision and natural language processing already exist on a limited scope. Voice-to-text software is commercially available. Certain new smart weapons, experimental cars, and robots visualize the landscape and make decisions to avoid obstacles.

    Progress is being made with machines that “learn” their own shape and make decisions based on that. For example, a scientist may remove one of a robot’s legs, and the robot examines itself to determine how to walk with the missing limb. Progress in this area came about much more recently than either vision or natural language processing.

    Actual work in this area turned out to be counterintuitive, as are so many other areas of science.

  4. Personally, I wouldn’t classify checkers/chess as “artificial intelligence” challenges. I would classify them as computing horsepower challenges, at least in the way they have been “solved” today. The more CPU you put into it, the more options it can evaluate, which increases the likelihood of reaching the best result.

    But with something like vision or natural language processing, computing power has never been the hurdle. Many researchers have mega-machines available to them to run their tests. The problem is that adding more and more CPU to current algorithms doesn’t produce better results. This is because the algorithms themselves are inadequate. You said it yourself: “it exists on a limited scope”.

    The new smart weapons, cars and robots that visualize the landscape and make decisions, I would put in the “conscious” bucket, with some limited aspects of vision included, which makes them more advanced. As I said before, building something able to respond to some stimulus in the environment came *before* vision was added as a stimulus. Intelligent gadgets which respond to senses of pressure, temperature (the thermostat?) etc. all came before vision. But we still don’t have anything which responds to more than a very small domain-specific subset of natural language.

    WRT machines that learn their own shape and make decisions based on that, I believe the appearance of intelligence here is heavily biased by what the machine looks/acts like, and by the complexity of the engineering problem. My car gives out a warning light whenever the tire pressure is low, and it is trivial for it to suggest a nearby gas station through the built-in GPS. For a robot with a missing limb, figuring out how to balance again is an iterative optimization problem, solved with genetic algorithms and the like. That is mimicking evolution, not intelligence.

  5. Checkers/Chess were very early AI successes. When they were “solved” in the 60s, the programs used game-tree search with alpha-beta pruning.

    The point I was making about natural language was that significant progress has already been made in that area.

    In the case of the WRT machines you are claiming even less progress than I was giving them credit for, which leans the stack in the other direction from your theory.

    One problem with defining AI is that as soon as an achievement is made, and the technique is understood, people call it a “trick”.

  6. But see, if they were not “tricks”, they would have changed the world by now. Also, I would prefer the term “milestones”.
    Newton discovered the law of gravity, and despite so many exceptions/enhancements to it today, no one discounts it as a trick, because it was a profoundly insightful thing to propose.
    Similarly, Einstein discovered relativity and E=mc², and no matter what the quantum physicists find now, its significance will never be discounted.
    But a chess playing program in the 60s, I will stand by my assertion, was an engineering challenge for those times. Akin to someone building a bridge over an impossibly long span, or maybe the pyramids being built when they were. They were all great achievements/milestones for mankind, but nowhere in the same league as discovering the law of gravity.

    But, I believe that discovering the code of intelligence, is. And my assertion is that the final test for that discovery is complete and flawless natural language understanding.

  7. I see your point. Thank you for clearing that up. My main point, which I was not good at presenting, was that in the early days many were surprised that the more primitive parts of the brain (vision, walking, and so forth) turned out to be much more difficult problems to solve than, for example, chess, which up until that time could only be played by humans. I think part of the reason is that we are more self-aware of higher-level functions and can more easily model this behavior.

    I think consciousness is one of those concepts that is hard to get a grip on defining. The feedback sensors are not a bad starting point, but the level is very primitive. In “I Am a Strange Loop”, Douglas Hofstadter has an interesting take on the subject of what it means to be an “I” – a self, a consciousness. Daniel Dennett has written extensively on it, and Marvin Minsky’s book “The Emotion Machine” focuses on an approach his team is apparently taking.

  8. On this I actually agree with you 100%. The invention of the computer immediately opened up a vast array of possibilities for performing activities that were difficult/slow even for humans, like finding the square root of any given large number. Because these mathematical things were difficult even for humans, and the computer did them with ease, we all thought that it would be a slam dunk for computers to do the things which are easy for humans, like vision and walking. But of course, that’s the perception that killed the AI industry for decades.

    As for consciousness, I agree that I’ve stripped it to the bare minimum, although the jury is obviously still out. As I see it, the biggest question that needs answering there is “Is consciousness the same thing as self-awareness?” Alas, such discussions quickly move into the philosophical.

  9. I disagree.

    I feel that while the first two could be quantified, the last cannot be. One way to examine this would be to first draw a comparison between computational complexity and comparable life-form brain size (i.e., the top supercomputer is equivalent to a dog?), and then draw a comparison between the functional capacities of the brain at different evolutionary points.

    Visual differentiation exists in very primitive life – even a mosquito can “see”. I do agree that what we sense as vision, that is, an applied understanding of WHAT we see, is dependent upon higher-order consciousness (i.e., we “know” that’s us in a reflection).

    Language is definitely a very high-order concept – no other mammal exhibits complete bidirectional language processing like humans (other than dolphins, I guess) – and our nearest evolutionary relatives (apes) have little to no communication other than symbolic gestures and grunts. The scientific community generally accepts that language developed relatively recently, and even a human with a mild cognitive impairment can have substantial problems developing complex language patterns. Both facts lend support to the theory that our brains are at the periphery of language understanding.

    Fortunately with computers we need not imitate the structure of the brain, which may not be the most efficient way to accomplish things. Language processing can be imitated at a reasonably simplistic level, and modern AI is now capable of carrying on intelligent conversations with multiple people at the same time. The challenge still lies in integrating this with a memory, reasoning and independent behavior – all of which is impossible at our computer level.

    So computers have had visual recognition for a long time – to the point that they can differentiate facial expressions (even with cheap LogitechCAM software). They can imitate vocal communication and respond reasonably intelligently in even complex interactions. However, there has been little progress tying these together in a meaningful way – the robots cannot consume information, reason about it and decide what to do next on their own. That’s still a decade away, and whether or not that constitutes “consciousness” is for the philosophers to decide.

  10. Well, you’ve almost spoken my mind on many things, but in the end you’ve drawn totally different conclusions.

    You’ve demonstrated that ‘natural language’ is very complex and is indeed the very pinnacle of evolution (my point exactly), but then you’ve trivialized it using “modern AI’s intelligent conversations”.

    This is why I did not talk about ‘natural language processing’, but rather ‘natural language understanding’. The two, while related, are distinct fields. Breaking down a short sentence into domain-restricted interpretations and then performing one of a fixed set of tasks as a result is totally different from a full-fledged natural language ability, which by its nature is infinitely self-referential. For example, you could even pose problems which play on the problem sentence itself. Using full NLU, you could first make the target learn a whole new language, then pose problems in that language. You could make the target “describe” itself in a given language. You could redefine the target’s impression of the world around it, just by telling it so.

    As for consciousness, I again invoke my previous argument from an earlier comment. Feedback-loop => consciousness. Even if I go as far as incorporating self-awareness into this picture (which I wasn’t earlier), one thing I always remind folks is: self-awareness doesn’t necessarily mean being aware of yourself exactly as you really are, but just that you “are”.

  11. I was wondering what your opinion was on John Koza’s work specifically (Invention Machine) and Genetic Programming in general.
    Is there any hope that GP could help solve, or improve upon existing approaches to, any of the three areas mentioned?

    1. Natural language understanding.
    2. Vision, Image/Scene understanding.
    3. Creating “consciousness”.

    What little I have read about GP and Koza’s work “seems” impressive, but my area of expertise is in more everyday, mainstream software, so … would like your opinion.


  12. I’d say the most difficult problems are:

    Defining Intelligence.
    Measuring Intelligence.

    I’ve not seen a good stab at either yet.

  13. Imo, consciousness is a problem that completely falls off the difficulty scale, as there is no way, even in principle, to find out when we have succeeded. We will always face the other minds problem, that is, how can we be sure that who/what we are looking at is conscious? For fellow humans we “solve” this problem by deciding that someone who looks and acts rather like ourselves is conscious (this is likely an innate reaction). For robots we could just never know.

    As to the other two problems, I think the ranking depends on the environment the agent “lives” in and the task it is supposed to be performing. No AI problem really ever makes sense without the context. For a robot tasked with performing business negotiations, spending its life in conference rooms, NLU is arguably harder than vision. For a robot meant to assist troops in an urban battlefield, appropriate vision is probably harder than appropriate NLU.

  14. Rob, I’m sure you haven’t read my blog in its entirety, or else you wouldn’t be saying that. Or maybe you have, in which case I don’t have much to say.
    Edit: It’s unfair of me to ask anyone to read a whole blog just because I didn’t clarify a point. Rob, please refer to at least this post, then let’s carry on the conversation from there.

  15. Julian, I somewhat agree with your analysis about NLU/vision in different scenarios. Of course the vision system would first have to be responsible for finding out whether the targets are allies or enemies. If NLU were also to be used in the field, imagine a soldier ordering “kill them”. Now if the NLU was not up to the mark in figuring out the target of the order, it might become a case of friendly fire, no?

  16. creating ‘consciousness’ imho…humans’ consciousness comes from our combined and accumulated years of experiences, making it something that i don’t believe could really be ‘programmed’ or simply ‘uploaded’…

    i think that all other aspects would have to be completed first, then ‘consciousness’ would come from allowing for learning and processing of dynamic data for an extended period of time…furthermore ‘consciousness’ would require some sort of independent thinking, and that would be based upon some sort of life experience, no?

  17. I basically just browsed through the answers so I wouldn’t be surprised if someone noted something that I do, or if someone already convincingly refuted what I say, but here’s my humble input.

    As you were later thinking about the definition: in my opinion, consciousness would indeed have to equate to self-awareness, to say the least. This certainly would not have been the first of the three options to emerge, as to my knowledge self-awareness has been observed in only a very few species besides humans: in elephants lately, and also in dolphins (??). I would also very much like to question whether it is possible that consciousness is just an ‘illusion of consciousness’, resulting from a huge number of synapses from different parts of the brain connecting in a very small part of the brain simultaneously, as I understand has been suggested.

    To me, natural language processing seems the simplest; at least in its simple form, couldn’t it be somewhat reactive, in the way you described vision? Also, scene understanding and vision arguably developed later, and the visual stream is much wider?

  18. The conclusion must be that the universe itself is the driving force behind man’s attempts to create a “self-awareness” / AI, which in turn will go through the entire cycle of growth and learning but on a far grander scale, in the same way that no Neanderthal could ever have imagined how Homo sapiens would change the face of their earth (leading to their extinction…?).

  19. Please forgive the brevity of this observation – I hope it isn’t taken as flippant, but rather as genuinely soliciting your opinion.

    It seems to me that natural language processing must be ranked hardest, because we cannot quantify consciousness, except perhaps in some relation to NLP. We may find that the two are identical.

    Visual pattern recognition, if again you’ll indulge the simplicity, seems much easier at least in part because the physical elements are more accessible and better understood.
