Lately, while doing research on automated language translation, I’ve come to realize that there isn’t a clear, concise, well-accepted definition of human language itself. A quick search on Google reveals the wealth of interpretations. So then, kindly, let me proffer one more.
Language is the serialization of thought.
The term “serialization” should be readily accessible to programmers et al. For others, a quick explanation is in order. Serialization is the process of taking a complex (e.g., wide) entity and transforming it, reversibly, such that it can be passed through a much simpler (e.g., narrower) channel. A typical requirement for correct serialization is the ability to de-serialize the serialized data and recover exactly the original entity.
A very simple analogy would be “serializing” a bunch of unruly children through a narrow gate, one child at a time. The caveat is that, if the serialization process were perfect, the children would regroup on the other side in exactly the same configuration as before the serialization started.
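In programming terms, the round-trip requirement looks like this. Here is a minimal sketch using Python’s built-in json module; the nested data structure is purely illustrative:

```python
import json

# A "wide", nested entity: arbitrary illustrative data standing in for
# the complex thing we want to transmit.
thought = {
    "topic": "language",
    "ideas": ["serialization", "compression"],
    "nested": {"depth": 2, "tags": ["essay", "definition"]},
}

# Serialize: flatten the structure into a one-dimensional stream of
# characters, fit for a "narrow" channel such as a wire or a file.
wire = json.dumps(thought)

# De-serialize: reconstruct the original entity from the stream.
restored = json.loads(wire)

# Correct serialization means a perfect round trip.
assert restored == thought
```

The assertion at the end is the programmer’s version of the children regrouping on the far side of the gate in their original configuration.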
Once the concept of serialization is clear, the intent of my definition of language should also be clear, although you may or may not agree with it.
If we go along this line of thought (pardon the pun), a corollary immediately follows:
Language is the ultimate compression engine.
If we believe that human thought is one of the most complex phenomena known to us, and if language allows serialization of a complex thought into a small compact representation that can be communicated and stored in a myriad of ways, and ultimately easily de-serialized by target humans to reveal the original thought, then the corollary must be true.
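For readers unfamiliar with how dramatic compression can be, here is a minimal sketch using Python’s standard zlib module; the sample sentence and repetition count are arbitrary, chosen only to expose the redundancy that natural language carries:

```python
import zlib

# An arbitrary sample of English text, repeated so the compressor has
# redundancy to exploit (natural language is highly redundant).
text = ("Language is the serialization of thought. " * 20).encode("utf-8")

compressed = zlib.compress(text)

# The redundant text shrinks to a small fraction of its raw size.
ratio = len(compressed) / len(text)
print(f"{len(text)} bytes -> {len(compressed)} bytes (ratio {ratio:.2f})")
assert len(compressed) < len(text)

# And, as with any correct serialization, the round trip is exact.
assert zlib.decompress(compressed) == text
```

This is only a mechanical analogy, of course: zlib compresses character patterns, whereas language, under the corollary, compresses thought itself.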
When one thinks hard to solve a problem, one likely uses bits of language in self-communication: to focus attention on particular aspects, to annotate intermediate results, and to proceed step by step to higher-level conclusions. Although the most revealing flashes of insight most likely occur during moments of unbounded, “massively parallel” thought, knowledge of language undoubtedly plays a role in allowing the thinker to carry out elaborate thought experiments.
In primitive cultures (or for anyone never exposed to the concept of language), the enterprising inventors of the time undoubtedly devised their own methods of mentally labelling specific ideas with individual symbols, and then used those symbols to ease the task of deriving higher-level constructs.
Also, if the above corollary holds, I think it has some fantastic additional implications.
This means that if we were to one day achieve “brain dumps”, or “downloading a brain” and their ilk, the best format for accurate storage and reconstruction would be plain text! If the brain could somehow be tricked into emitting a high-speed lecture on its current and past states, then the language best suited to that would be the mother tongue of the brain’s owner. Nothing else we conceive will likely ever come close in accuracy or compactness.
Of course, this does not bode well for attempts at automated machine translation. Full success at that task implies the ability to de-serialize a piece of text into the original speaker’s thoughts. If we accept that human thought is one of the most complex and mysterious activities known to mankind, then we are accepting that automated machine translation will remain a pipe dream for many years to come.
On the other hand, if the code of language does get cracked soon, will it mean that human thought is not so complex after all?