Dear Wikiwand AI, let's keep it short by simply answering these key questions:
Can you list the top facts and stats about Language?
Summarize this article for a 10 year old
Language is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and written forms, and may also be conveyed through sign languages. Human language is characterized by its cultural and historical diversity, with significant variations observed between cultures and across time. Human languages possess the properties of productivity and displacement, which enable the creation of an infinite number of sentences, and the ability to refer to objects, events, and ideas that are not immediately present in the discourse. The use of human language relies on social convention and is acquired through learning.
Estimates of the number of human languages in the world vary between 5,000 and 7,000. Precise estimates depend on an arbitrary distinction (dichotomy) established between languages and dialects. Natural languages are spoken, signed, or both; however, any language can be encoded into secondary media using auditory, visual, or tactile stimuli – for example, writing, whistling, signing, or braille. In other words, human language is modality-independent, but written or signed language is the way to inscribe or encode the natural human speech or gestures.
Depending on philosophical perspectives regarding the definition of language and meaning, when used as a general concept, "language" may refer to the cognitive ability to learn and use systems of complex communication, or to describe the set of rules that makes up these systems, or the set of utterances that can be produced from those rules. All languages rely on the process of semiosis to relate signs to particular meanings. Oral, manual and tactile languages contain a phonological system that governs how symbols are used to form sequences known as words or morphemes, and a syntactic system that governs how words and morphemes are combined to form phrases and utterances.
The scientific study of language is called linguistics. Critical examinations of languages, such as philosophy of language, the relationships between language and thought, how words represent experience, etc., have been debated at least since Gorgias and Plato in ancient Greek civilization. Thinkers such as Jean-Jacques Rousseau (1712–1778) have argued that language originated from emotions, while others like Immanuel Kant (1724–1804) have argued that languages originated from rational and logical thought. Twentieth century philosophers such as Ludwig Wittgenstein (1889–1951) argued that philosophy is really the study of language itself. Major figures in contemporary linguistics of these times include Ferdinand de Saussure and Noam Chomsky.
Language is thought to have gradually diverged from earlier primate communication systems when early hominins acquired the ability to form a theory of mind and shared intentionality. This development is sometimes thought to have coincided with an increase in brain volume, and many linguists see the structures of language as having evolved to serve specific communicative and social functions. Language is processed in many different locations in the human brain, but especially in Broca's and Wernicke's areas. Humans acquire language through social interaction in early childhood, and children generally speak fluently by approximately three years old. Language and culture are codependent. Therefore, in addition to its strictly communicative uses, language has social uses such as signifying group identity, social stratification, as well as use for social grooming and entertainment.
Languages evolve and diversify over time, and the history of their evolution can be reconstructed by comparing modern languages to determine which traits their ancestral languages must have had in order for the later developmental stages to occur. A group of languages that descend from a common ancestor is known as a language family; in contrast, a language that has been demonstrated to not have any living or non-living relationship with another language is called a language isolate. There are also many unclassified languages whose relationships have not been established, and spurious languages may have not existed at all. Academic consensus holds that between 50% and 90% of languages spoken at the beginning of the 21st century will probably have become extinct by the year 2100.
The English word language derives ultimately from Proto-Indo-European *dn̥ǵʰwéh₂s "tongue, speech, language" through Latin lingua, "language; tongue", and Old French language. The word is sometimes used to refer to codes, ciphers, and other kinds of artificially constructed communication systems such as formally defined computer languages used for computer programming. Unlike conventional human languages, a formal language in this sense is a system of signs for encoding and decoding information. This article specifically concerns the properties of natural human language as it is studied in the discipline of linguistics.
As an object of linguistic study, "language" has two primary meanings: an abstract concept, and a specific linguistic system, e.g. "French". The Swiss linguist Ferdinand de Saussure, who defined the modern discipline of linguistics, first explicitly formulated the distinction using the French word language for language as a concept, langue as a specific instance of a language system, and parole for the concrete usage of speech in a particular language.
When speaking of language as a general concept, definitions can be used which stress different aspects of the phenomenon. These definitions also entail different approaches and understandings of language, and they also inform different and often incompatible schools of linguistic theory. Debates about the nature and origin of language go back to the ancient world. Greek philosophers such as Gorgias and Plato debated the relation between words, concepts and reality. Gorgias argued that language could represent neither the objective experience nor human experience, and that communication and truth were therefore impossible. Plato maintained that communication is possible because language represents ideas and concepts that exist independently of, and prior to, language.
During the Enlightenment and its debates about human origins, it became fashionable to speculate about the origin of language. Thinkers such as Rousseau and Johann Gottfried Herder argued that language had originated in the instinctive expression of emotions, and that it was originally closer to music and poetry than to the logical expression of rational thought. Rationalist philosophers such as Kant and René Descartes held the opposite view. Around the turn of the 20th century, thinkers began to wonder about the role of language in shaping our experiences of the world – asking whether language simply reflects the objective structure of the world, or whether it creates concepts that in turn impose structure on our experience of the objective world. This led to the question of whether philosophical problems are really firstly linguistic problems. The resurgence of the view that language plays a significant role in the creation and circulation of concepts, and that the study of philosophy is essentially the study of language, is associated with what has been called the linguistic turn and philosophers such as Wittgenstein in 20th-century philosophy. These debates about language in relation to meaning and reference, cognition and consciousness remain active today.
Mental faculty, organ or instinct
One definition sees language primarily as the mental faculty that allows humans to undertake linguistic behaviour: to learn languages and to produce and understand utterances. This definition stresses the universality of language to all humans, and it emphasizes the biological basis for the human capacity for language as a unique development of the human brain. Proponents of the view that the drive to language acquisition is innate in humans argue that this is supported by the fact that all cognitively normal children raised in an environment where language is accessible will acquire language without formal instruction. Languages may even develop spontaneously in environments where people live or grow up together without a common language; for example, creole languages and spontaneously developed sign languages such as Nicaraguan Sign Language. This view, which can be traced back to the philosophers Kant and Descartes, understands language to be largely innate, for example, in Chomsky's theory of Universal Grammar, or American philosopher Jerry Fodor's extreme innatist theory. These kinds of definitions are often applied in studies of language within a cognitive science framework and in neurolinguistics.
Formal symbolic system
Another definition sees language as a formal system of signs governed by grammatical rules of combination to communicate meaning. This definition stresses that human languages can be described as closed structural systems consisting of rules that relate particular signs to particular meanings. This structuralist view of language was first introduced by Ferdinand de Saussure, and his structuralism remains foundational for many approaches to language.
Some proponents of Saussure's view of language have advocated a formal approach which studies language structure by identifying its basic elements and then by presenting a formal account of the rules according to which the elements combine in order to form words and sentences. The main proponent of such a theory is Noam Chomsky, the originator of the generative theory of grammar, who has defined language as the construction of sentences that can be generated using transformational grammars. Chomsky considers these rules to be an innate feature of the human mind and to constitute the rudiments of what language is. By way of contrast, such transformational grammars are also commonly used in formal logic, in formal linguistics, and in applied computational linguistics. In the philosophy of language, the view of linguistic meaning as residing in the logical relations between propositions and reality was developed by philosophers such as Alfred Tarski, Bertrand Russell, and other formal logicians.
Tool for communication
Yet another definition sees language as a system of communication that enables humans to exchange verbal or symbolic utterances. This definition stresses the social functions of language and the fact that humans use it to express themselves and to manipulate objects in their environment. Functional theories of grammar explain grammatical structures by their communicative functions, and understand the grammatical structures of language to be the result of an adaptive process by which grammar was "tailored" to serve the communicative needs of its users.
This view of language is associated with the study of language in pragmatic, cognitive, and interactive frameworks, as well as in sociolinguistics and linguistic anthropology. Functionalist theories tend to study grammar as dynamic phenomena, as structures that are always in the process of changing as they are employed by their speakers. This view places importance on the study of linguistic typology, or the classification of languages according to structural features, as it can be shown that processes of grammaticalization tend to follow trajectories that are partly dependent on typology. In the philosophy of language, the view of pragmatics as being central to language and meaning is often associated with Wittgenstein's later works and with ordinary language philosophers such as J.L. Austin, Paul Grice, John Searle, and W.O. Quine.
Distinctive features of human language
Communication systems used by other animals such as bees or apes are closed systems that consist of a finite, usually very limited, number of possible ideas that can be expressed. In contrast, human language is open-ended and productive, meaning that it allows humans to produce a vast range of utterances from a finite set of elements, and to create new words and sentences. This is possible because human language is based on a dual code, in which a finite number of elements which are meaningless in themselves (e.g. sounds, letters or gestures) can be combined to form an infinite number of larger units of meaning (words and sentences). However, one study has demonstrated that an Australian bird, the chestnut-crowned babbler, is capable of using the same acoustic elements in different arrangements to create two functionally distinct vocalizations. Additionally, pied babblers have demonstrated the ability to generate two functionally distinct vocalisations composed of the same sound type, which can only be distinguished by the number of repeated elements.
Several species of animals have proved to be able to acquire forms of communication through social learning: for instance a bonobo named Kanzi learned to express itself using a set of symbolic lexigrams. Similarly, many species of birds and whales learn their songs by imitating other members of their species. However, while some animals may acquire large numbers of words and symbols, none have been able to learn as many different signs as are generally known by an average 4 year old human, nor have any acquired anything resembling the complex grammar of human language.
Human languages differ from animal communication systems in that they employ grammatical and semantic categories, such as noun and verb, present and past, which may be used to express exceedingly complex meanings. It is distinguished by the property of recursivity: for example, a noun phrase can contain another noun phrase (as in "[[the chimpanzee]'s lips]") or a clause can contain another clause (as in "[I see [the dog is running]]"). Human language is the only known natural communication system whose adaptability may be referred to as modality independent. This means that it can be used not only for communication through one channel or medium, but through several. For example, spoken language uses the auditive modality, whereas sign languages and writing use the visual modality, and braille writing uses the tactile modality.
Human language is unusual in being able to refer to abstract concepts and to imagined or hypothetical events as well as events that took place in the past or may happen in the future. This ability to refer to events that are not at the same time or place as the speech event is called displacement, and while some animal communication systems can use displacement (such as the communication of bees that can communicate the location of sources of nectar that are out of sight), the degree to which it is used in human language is also considered unique.
Theories about the origin of language differ in regard to their basic assumptions about what language is. Some theories are based on the idea that language is so complex that one cannot imagine it simply appearing from nothing in its final form, but that it must have evolved from earlier pre-linguistic systems among our pre-human ancestors. These theories can be called continuity-based theories. The opposite viewpoint is that language is such a unique human trait that it cannot be compared to anything found among non-humans and that it must therefore have appeared suddenly in the transition from pre-hominids to early man. These theories can be defined as discontinuity-based. Similarly, theories based on the generative view of language pioneered by Noam Chomsky see language mostly as an innate faculty that is largely genetically encoded, whereas functionalist theories see it as a system that is largely cultural, learned through social interaction.
Continuity-based theories are held by a majority of scholars, but they vary in how they envision this development. Those who see language as being mostly innate, such as psychologist Steven Pinker, hold the precedents to be animal cognition, whereas those who see language as a socially learned tool of communication, such as psychologist Michael Tomasello, see it as having developed from animal communication in primates: either gestural or vocal communication to assist in cooperation. Other continuity-based models see language as having developed from music, a view already espoused by Rousseau, Herder, Humboldt, and Charles Darwin. A prominent proponent of this view is archaeologist Steven Mithen. Stephen Anderson states that the age of spoken languages is estimated at 60,000 to 100,000 years and that:
Researchers on the evolutionary origin of language generally find it plausible to suggest that language was invented only once, and that all modern spoken languages are thus in some way related, even if that relation can no longer be recovered ... because of limitations on the methods available for reconstruction.
Because language emerged in the early prehistory of man, before the existence of any written records, its early development has left no historical traces, and it is believed that no comparable processes can be observed today. Theories that stress continuity often look at animals to see if, for example, primates display any traits that can be seen as analogous to what pre-human language must have been like. Early human fossils can be inspected for traces of physical adaptation to language use or pre-linguistic forms of symbolic behaviour. Among the signs in human fossils that may suggest linguistic abilities are: the size of the brain relative to body mass, the presence of a larynx capable of advanced sound production and the nature of tools and other manufactured artifacts.
It was mostly undisputed that pre-human australopithecines did not have communication systems significantly different from those found in great apes in general. However, a 2017 study on Ardipithecus ramidus challenges this belief. Scholarly opinions vary as to the developments since the appearance of the genus Homo some 2.5 million years ago. Some scholars assume the development of primitive language-like systems (proto-language) as early as Homo habilis (2.3 million years ago) while others place the development of primitive symbolic communication only with Homo erectus (1.8 million years ago) or Homo heidelbergensis (0.6 million years ago), and the development of language proper with anatomically modern Homo sapiens with the Upper Paleolithic revolution less than 100,000 years ago.
Chomsky is one prominent proponent of a discontinuity-based theory of human language origins. He suggests that for scholars interested in the nature of language, "talk about the evolution of the language capacity is beside the point." Chomsky proposes that perhaps "some random mutation took place [...] and it reorganized the brain, implanting a language organ in an otherwise primate brain." Though cautioning against taking this story literally, Chomsky insists that "it may be closer to reality than many other fairy tales that are told about evolutionary processes, including language."
The study of language, linguistics, has been developing into a science since the first grammatical descriptions of particular languages in India more than 2000 years ago, after the development of the Brahmi script. Modern linguistics is a science that concerns itself with all aspects of language, examining it from all of the theoretical viewpoints described above.
The academic study of language is conducted within many different disciplinary areas and from different theoretical angles, all of which inform modern approaches to linguistics. For example, descriptive linguistics examines the grammar of single languages, theoretical linguistics develops theories on how best to conceptualize and define the nature of language based on data from the various extant human languages, sociolinguistics studies how languages are used for social purposes informing in turn the study of the social functions of language and grammatical description, neurolinguistics studies how language is processed in the human brain and allows the experimental testing of theories, computational linguistics builds on theoretical and descriptive linguistics to construct computational models of language often aimed at processing natural language or at testing linguistic hypotheses, and historical linguistics relies on grammatical and lexical descriptions of languages to trace their individual histories and reconstruct trees of language families by using the comparative method.
The formal study of language is often considered to have started in India with Pāṇini, the 5th century BC grammarian who formulated 3,959 rules of Sanskrit morphology. However, Sumerian scribes already studied the differences between Sumerian and Akkadian grammar around 1900 BC. Subsequent grammatical traditions developed in all of the ancient cultures that adopted writing.
In the 17th century AD, the French Port-Royal Grammarians developed the idea that the grammars of all languages were a reflection of the universal basics of thought, and therefore that grammar was universal. In the 18th century, the first use of the comparative method by British philologist and expert on ancient India William Jones sparked the rise of comparative linguistics. The scientific study of language was broadened from Indo-European to language in general by Wilhelm von Humboldt. Early in the 20th century, Ferdinand de Saussure introduced the idea of language as a static system of interconnected units, defined through the oppositions between them.
By introducing a distinction between diachronic and synchronic analyses of language, he laid the foundation of the modern discipline of linguistics. Saussure also introduced several basic dimensions of linguistic analysis that are still fundamental in many contemporary linguistic theories, such as the distinctions between syntagm and paradigm, and the Langue-parole distinction, distinguishing language as an abstract system (langue), from language as a concrete manifestation of this system (parole).
In the 1960s, Noam Chomsky formulated the generative theory of language. According to this theory, the most basic form of language is a set of syntactic rules that is universal for all humans and which underlies the grammars of all human languages. This set of rules is called Universal Grammar; for Chomsky, describing it is the primary objective of the discipline of linguistics. Thus, he considered that the grammars of individual languages are only of importance to linguistics insofar as they allow us to deduce the universal underlying rules from which the observable linguistic variability is generated.
In opposition to the formal theories of the generative school, functional theories of language propose that since language is fundamentally a tool, its structures are best analyzed and understood by reference to their functions. Formal theories of grammar seek to define the different elements of language and describe the way they relate to each other as systems of formal rules or operations, while functional theories seek to define the functions performed by language and then relate them to the linguistic elements that carry them out. The framework of cognitive linguistics interprets language in terms of the concepts (which are sometimes universal, and sometimes specific to a particular language) which underlie its forms. Cognitive linguistics is primarily concerned with how the mind creates meaning through language.
Physiological and neural architecture of language and speech
Speaking is the default modality for language in all cultures. The production of spoken language depends on sophisticated capacities for controlling the lips, tongue and other components of the vocal apparatus, the ability to acoustically decode speech sounds, and the neurological apparatus required for acquiring and producing language. The study of the genetic bases for human language is at an early stage: the only gene that has definitely been implicated in language production is FOXP2, which may cause a kind of congenital language disorder if affected by mutations.
The brain is the coordinating center of all linguistic activity; it controls both the production of linguistic cognition and of meaning and the mechanics of speech production. Nonetheless, our knowledge of the neurological bases for language is quite limited, though it has advanced considerably with the use of modern imaging techniques. The discipline of linguistics dedicated to studying the neurological aspects of language is called neurolinguistics.
Early work in neurolinguistics involved the study of language in people with brain lesions, to see how lesions in specific areas affect language and speech. In this way, neuroscientists in the 19th century discovered that two areas in the brain are crucially implicated in language processing. The first area is Wernicke's area, which is in the posterior section of the superior temporal gyrus in the dominant cerebral hemisphere. People with a lesion in this area of the brain develop receptive aphasia, a condition in which there is a major impairment of language comprehension, while speech retains a natural-sounding rhythm and a relatively normal sentence structure. The second area is Broca's area, in the posterior inferior frontal gyrus of the dominant hemisphere. People with a lesion to this area develop expressive aphasia, meaning that they know what they want to say, they just cannot get it out. They are typically able to understand what is being said to them, but unable to speak fluently. Other symptoms that may be present in expressive aphasia include problems with word repetition. The condition affects both spoken and written language. Those with this aphasia also exhibit ungrammatical speech and show inability to use syntactic information to determine the meaning of sentences. Both expressive and receptive aphasia also affect the use of sign language, in analogous ways to how they affect speech, with expressive aphasia causing signers to sign slowly and with incorrect grammar, whereas a signer with receptive aphasia will sign fluently, but make little sense to others and have difficulties comprehending others' signs. This shows that the impairment is specific to the ability to use language, not to the physiology used for speech production.
With technological advances in the late 20th century, neurolinguists have also incorporated non-invasive techniques such as functional magnetic resonance imaging (fMRI) and electrophysiology to study language processing in individuals without impairments.
Anatomy of speech
Spoken language relies on human physical ability to produce sound, which is a longitudinal wave propagated through the air at a frequency capable of vibrating the ear drum. This ability depends on the physiology of the human speech organs. These organs consist of the lungs, the voice box (larynx), and the upper vocal tract – the throat, the mouth, and the nose. By controlling the different parts of the speech apparatus, the airstream can be manipulated to produce different speech sounds.
The sound of speech can be analyzed into a combination of segmental and suprasegmental elements. The segmental elements are those that follow each other in sequences, which are usually represented by distinct letters in alphabetic scripts, such as the Roman script. In free flowing speech, there are no clear boundaries between one segment and the next, nor usually are there any audible pauses between them. Segments therefore are distinguished by their distinct sounds which are a result of their different articulations, and can be either vowels or consonants. Suprasegmental phenomena encompass such elements as stress, phonation type, voice timbre, and prosody or intonation, all of which may have effects across multiple segments.
Consonants and vowel segments combine to form syllables, which in turn combine to form utterances; these can be distinguished phonetically as the space between two inhalations. Acoustically, these different segments are characterized by different formant structures, that are visible in a spectrogram of the recorded sound wave. Formants are the amplitude peaks in the frequency spectrum of a specific sound.
Vowels are those sounds that have no audible friction caused by the narrowing or obstruction of some part of the upper vocal tract. They vary in quality according to the degree of lip aperture and the placement of the tongue within the oral cavity. Vowels are called close when the lips are relatively closed, as in the pronunciation of the vowel [i] (English "ee"), or open when the lips are relatively open, as in the vowel [a] (English "ah"). If the tongue is located towards the back of the mouth, the quality changes, creating vowels such as [u] (English "oo"). The quality also changes depending on whether the lips are rounded as opposed to unrounded, creating distinctions such as that between [i] (unrounded front vowel such as English "ee") and [y] (rounded front vowel such as German "ü").
Consonants are those sounds that have audible friction or closure at some point within the upper vocal tract. Consonant sounds vary by place of articulation, i.e. the place in the vocal tract where the airflow is obstructed, commonly at the lips, teeth, alveolar ridge, palate, velum, uvula, or glottis. Each place of articulation produces a different set of consonant sounds, which are further distinguished by manner of articulation, or the kind of friction, whether full closure, in which case the consonant is called occlusive or stop, or different degrees of aperture creating fricatives and approximants. Consonants can also be either voiced or unvoiced, depending on whether the vocal cords are set in vibration by airflow during the production of the sound. Voicing is what separates English [s] in bus (unvoiced sibilant) from [z] in buzz (voiced sibilant).
Some speech sounds, both vowels and consonants, involve release of air flow through the nasal cavity, and these are called nasals or nasalized sounds. Other sounds are defined by the way the tongue moves within the mouth such as the l-sounds (called laterals, because the air flows along both sides of the tongue), and the r-sounds (called rhotics).
By using these speech organs, humans can produce hundreds of distinct sounds: some appear very often in the world's languages, whereas others are much more common in certain language families, language areas, or even specific to a single language.
Human languages display considerable plasticity in their deployment of two fundamental modes: oral (speech and mouthing) and manual (sign and gesture). For example, it is common for oral language to be accompanied by gesture, and for sign language to be accompanied by mouthing. In addition, some language communities use both modes to convey lexical or grammatical meaning, each mode complementing the other. Such bimodal use of language is especially common in genres such as story-telling (with Plains Indian Sign Language and Australian Aboriginal sign languages used alongside oral language, for example), but also occurs in mundane conversation. For instance, many Australian languages have a rich set of case suffixes that provide details about the instrument used to perform an action. Others lack such grammatical precision in the oral mode, but supplement it with gesture to convey that information in the sign mode. In Iwaidja, for example, 'he went out for fish using a torch' is spoken as simply "he-hunted fish torch", but the word for 'torch' is accompanied by a gesture indicating that it was held. In another example, the ritual language Damin had a heavily reduced oral vocabulary of only a few hundred words, each of which was very general in meaning, but which were supplemented by gesture for greater precision (e.g., the single word for fish, l*i, was accompanied by a gesture to indicate the kind of fish).
Secondary modes of language, by which a fundamental mode is conveyed in a different medium, include writing (including braille), sign (in manually coded language), whistling and drumming. Tertiary modes – such as semaphore, Morse code and spelling alphabets – convey the secondary mode of writing in a different medium. For some extinct languages that are maintained for ritual or liturgical purposes, writing may be the primary mode, with speech secondary.
When described as a system of symbolic communication, language is traditionally seen as consisting of three parts: signs, meanings, and a code connecting signs with their meanings. The study of the process of semiosis, how signs and meanings are combined, used, and interpreted is called semiotics. Signs can be composed of sounds, gestures, letters, or symbols, depending on whether the language is spoken, signed, or written, and they can be combined into complex signs, such as words and phrases. When used in communication, a sign is encoded and transmitted by a sender through a channel to a receiver who decodes it.
Some of the properties that define human language as opposed to other communication systems are: the arbitrariness of the linguistic sign, meaning that there is no predictable connection between a linguistic sign and its meaning; the duality of the linguistic system, meaning that linguistic structures are built by combining elements into larger structures that can be seen as layered, e.g. how sounds build words and words build phrases; the discreteness of the elements of language, meaning that the elements out of which linguistic signs are constructed are discrete units, e.g. sounds and words, that can be distinguished from each other and rearranged in different patterns; and the productivity of the linguistic system, meaning that the finite number of linguistic elements can be combined into a theoretically infinite number of combinations.
The rules by which signs can be combined to form words and phrases are called syntax or grammar. The meaning that is connected to individual signs, morphemes, words, phrases, and texts is called semantics. The division of language into separate but connected systems of sign and meaning goes back to the first linguistic studies of de Saussure and is now used in almost all branches of linguistics.
Languages express meaning by relating a sign form to a meaning, or its content. Sign forms must be something that can be perceived, for example, in sounds, images, or gestures, and then related to a specific meaning by social convention. Because the basic relation of meaning for most linguistic signs is based on social convention, linguistic signs can be considered arbitrary, in the sense that the convention is established socially and historically, rather than by means of a natural relation between a specific sign form and its meaning.
Thus, languages must have a vocabulary of signs related to specific meaning. The English sign "dog" denotes, for example, a member of the species Canis familiaris. In a language, the array of arbitrary signs connected to specific meanings is called the lexicon, and a single sign connected to a meaning is called a lexeme. Not all meanings in a language are represented by single words. Often, semantic concepts are embedded in the morphology or syntax of the language in the form of grammatical categories.
All languages contain the semantic structure of predication: a structure that predicates a property, state, or action. Traditionally, semantics has been understood to be the study of how speakers and interpreters assign truth values to statements, so that meaning is understood to be the process by which a predicate can be said to be true or false about an entity, e.g. "[x [is y]]" or "[x [does y]]". Recently, this model of semantics has been complemented with more dynamic models of meaning that incorporate shared knowledge about the context in which a sign is interpreted into the production of meaning. Such models of meaning are explored in the field of pragmatics.
Sounds and symbols
Depending on modality, language structure can be based on systems of sounds (speech), gestures (sign languages), or graphic or tactile symbols (writing). The ways in which languages use sounds or signs to construct meaning are studied in phonology.
Sounds as part of a linguistic system are called phonemes. Phonemes are abstract units of sound, defined as the smallest units in a language that can serve to distinguish between the meaning of a pair of minimally different words, a so-called minimal pair. In English, for example, the words bat [bæt] and pat [pʰæt] form a minimal pair, in which the distinction between /b/ and /p/ differentiates the two words, which have different meanings. However, each language contrasts sounds in different ways. For example, in a language that does not distinguish between voiced and unvoiced consonants, the sounds [p] and [b] (if they both occur) could be considered a single phoneme, and consequently, the two pronunciations would have the same meaning. Similarly, the English language does not distinguish phonemically between aspirated and non-aspirated pronunciations of consonants, as many other languages like Korean and Hindi do: the unaspirated /p/ in spin [spɪn] and the aspirated /p/ in pin [pʰɪn] are considered to be merely different ways of pronouncing the same phoneme (such variants of a single phoneme are called allophones), whereas in Mandarin Chinese, the same difference in pronunciation distinguishes between the words [pʰá] 'crouch' and [pá] 'eight' (the accent above the á means that the vowel is pronounced with a high tone).
All spoken languages have phonemes of at least two different categories, vowels and consonants, that can be combined to form syllables. As well as segments such as consonants and vowels, some languages also use sound in other ways to convey meaning. Many languages, for example, use stress, pitch, duration, and tone to distinguish meaning. Because these phenomena operate outside of the level of single segments, they are called suprasegmental. Some languages have only a few phonemes, for example, Rotokas and Pirahã language with 11 and 10 phonemes respectively, whereas languages like Taa may have as many as 141 phonemes. In sign languages, the equivalent to phonemes (formerly called cheremes) are defined by the basic elements of gestures, such as hand shape, orientation, location, and motion, which correspond to manners of articulation in spoken language.
Writing systems represent language using visual symbols, which may or may not correspond to the sounds of spoken language. The Latin alphabet (and those on which it is based or that have been derived from it) was originally based on the representation of single sounds, so that words were constructed from letters that generally denote a single consonant or vowel in the structure of the word. In syllabic scripts, such as the Inuktitut syllabary, each sign represents a whole syllable. In logographic scripts, each sign represents an entire word, and will generally bear no relation to the sound of that word in spoken language.
Because all languages have a very large number of words, no purely logographic scripts are known to exist. Written language represents the way spoken sounds and words follow one after another by arranging symbols according to a pattern that follows a certain direction. The direction used in a writing system is entirely arbitrary and established by convention. Some writing systems use the horizontal axis (left to right as the Latin script or right to left as the Arabic script), while others such as traditional Chinese writing use the vertical dimension (from top to bottom). A few writing systems use opposite directions for alternating lines, and others, such as the ancient Maya script, can be written in either direction and rely on graphic cues to show the reader the direction of reading.
In order to represent the sounds of the world's languages in writing, linguists have developed the International Phonetic Alphabet, designed to represent all of the discrete sounds that are known to contribute to meaning in human languages.
Grammar is the study of how meaningful elements called morphemes within a language can be combined into utterances. Morphemes can either be free or bound. If they are free to be moved around within an utterance, they are usually called words, and if they are bound to other words or morphemes, they are called affixes. The way in which meaningful elements can be combined within a language is governed by rules. The study of the rules for the internal structure of words are called morphology. The rules of the internal structure of phrases and sentences are called syntax.
Grammar can be described as a system of categories and a set of rules that determine how categories combine to form different aspects of meaning. Languages differ widely in whether they are encoded through the use of categories or lexical units. However, several categories are so common as to be nearly universal. Such universal categories include the encoding of the grammatical relations of participants and predicates by grammatically distinguishing between their relations to a predicate, the encoding of temporal and spatial relations on predicates, and a system of grammatical person governing reference to and distinction between speakers and addressees and those about whom they are speaking.
Languages organize their parts of speech into classes according to their functions and positions relative to other parts. All languages, for instance, make a basic distinction between a group of words that prototypically denotes things and concepts and a group of words that prototypically denotes actions and events. The first group, which includes English words such as "dog" and "song", are usually called nouns. The second, which includes "think" and "sing", are called verbs. Another common category is the adjective: words that describe properties or qualities of nouns, such as "red" or "big". Word classes can be "open" if new words can continuously be added to the class, or relatively "closed" if there is a fixed number of words in a class. In English, the class of pronouns is closed, whereas the class of adjectives is open, since an infinite number of adjectives can be constructed from verbs (e.g. "saddened") or nouns (e.g. with the -like suffix, as in "noun-like"). In other languages such as Korean, the situation is the opposite, and new pronouns can be constructed, whereas the number of adjectives is fixed.
Word classes also carry out differing functions in grammar. Prototypically, verbs are used to construct predicates, while nouns are used as arguments of predicates. In a sentence such as "Sally runs", the predicate is "runs", because it is the word that predicates a specific state about its argument "Sally". Some verbs such as "curse" can take two arguments, e.g. "Sally cursed John". A predicate that can only take a single argument is called intransitive, while a predicate that can take two arguments is called transitive.
Many other word classes exist in different languages, such as conjunctions like "and" that serve to join two sentences, articles that introduce a noun, interjections such as "wow!", or ideophones like "splash" that mimic the sound of some event. Some languages have positionals that describe the spatial position of an event or entity. Many languages have classifiers that identify countable nouns as belonging to a particular type or having a particular shape. For instance, in Japanese, the general noun classifier for humans is nin (人), and it is used for counting humans, whatever they are called:
- san-nin no gakusei (三人の学生) lit. "3 human-classifier of student" – three students
For trees, it would be:
- san-bon no ki (三本の木) lit. "3 classifier-for-long-objects of tree" – three trees
In linguistics, the study of the internal structure of complex words and the processes by which words are formed is called morphology. In most languages, it is possible to construct complex words that are built of several morphemes. For instance, the English word "unexpected" can be analyzed as being composed of the three morphemes "un-", "expect" and "-ed".
Morphemes can be classified according to whether they are independent morphemes, so-called roots, or whether they can only co-occur attached to other morphemes. These bound morphemes or affixes can be classified according to their position in relation to the root: prefixes precede the root, suffixes follow the root, and infixes are inserted in the middle of a root. Affixes serve to modify or elaborate the meaning of the root. Some languages change the meaning of words by changing the phonological structure of a word, for example, the English word "run", which in the past tense is "ran". This process is called ablaut. Furthermore, morphology distinguishes between the process of inflection, which modifies or elaborates on a word, and the process of derivation, which creates a new word from an existing one. In English, the verb "sing" has the inflectional forms "singing" and "sung", which are both verbs, and the derivational form "singer", which is a noun derived from the verb with the agentive suffix "-er".
Languages differ widely in how much they rely on morphological processes of word formation. In some languages, for example, Chinese, there are no morphological processes, and all grammatical information is encoded syntactically by forming strings of single words. This type of morpho-syntax is often called isolating, or analytic, because there is almost a full correspondence between a single word and a single aspect of meaning. Most languages have words consisting of several morphemes, but they vary in the degree to which morphemes are discrete units. In many languages, notably in most Indo-European languages, single morphemes may have several distinct meanings that cannot be analyzed into smaller segments. For example, in Latin, the word bonus, or "good", consists of the root bon-, meaning "good", and the suffix -us, which indicates masculine gender, singular number, and nominative case. These languages are called fusional languages, because several meanings may be fused into a single morpheme. The opposite of fusional languages are agglutinative languages which construct words by stringing morphemes together in chains, but with each morpheme as a discrete semantic unit. An example of such a language is Turkish, where for example, the word evlerinizden, or "from your houses", consists of the morphemes, ev-ler-iniz-den with the meanings house-plural-your-from. The languages that rely on morphology to the greatest extent are traditionally called polysynthetic languages. They may express the equivalent of an entire English sentence in a single word. For example, in Persian the single word nafahmidamesh means I didn't understand it consisting of morphemes na-fahm-id-am-esh with the meanings, "negation.understand.past.I.it". As another example with more complexity, in the Yupik word tuntussuqatarniksatengqiggtuq, which means "He had not yet said again that he was going to hunt reindeer", the word consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq with the meanings, "reindeer-hunt-future-say-negation-again-third.person.singular.indicative", and except for the morpheme tuntu ("reindeer") none of the other morphemes can appear in isolation.
Many languages use morphology to cross-reference words within a sentence. This is sometimes called agreement. For example, in many Indo-European languages, adjectives must cross-reference the noun they modify in terms of number, case, and gender, so that the Latin adjective bonus, or "good", is inflected to agree with a noun that is masculine gender, singular number, and nominative case. In many polysynthetic languages, verbs cross-reference their subjects and objects. In these types of languages, a single verb may include information that would require an entire sentence in English. For example, in the Basque phrase ikusi nauzu, or "you saw me", the past tense auxiliary verb n-au-zu (similar to English "do") agrees with both the subject (you) expressed by the n- prefix, and with the object (me) expressed by the – zu suffix. The sentence could be directly transliterated as "see you-did-me"
Another way in which languages convey meaning is through the order of words within a sentence. The grammatical rules for how to produce new sentences from words that are already known is called syntax. The syntactical rules of a language determine why a sentence in English such as "I love you" is meaningful, but "*love you I" is not. Syntactical rules determine how word order and sentence structure is constrained, and how those constraints contribute to meaning. For example, in English, the two sentences "the slaves were cursing the master" and "the master was cursing the slaves" mean different things, because the role of the grammatical subject is encoded by the noun being in front of the verb, and the role of object is encoded by the noun appearing after the verb. Conversely, in Latin, both Dominus servos vituperabat and Servos vituperabat dominus mean "the master was reprimanding the slaves", because servos, or "slaves", is in the accusative case, showing that they are the grammatical object of the sentence, and dominus, or "master", is in the nominative case, showing that he is the subject.
Latin uses morphology to express the distinction between subject and object, whereas English uses word order. Another example of how syntactic rules contribute to meaning is the rule of inverse word order in questions, which exists in many languages. This rule explains why when in English, the phrase "John is talking to Lucy" is turned into a question, it becomes "Who is John talking to?", and not "John is talking to who?". The latter example may be used as a way of placing special emphasis on "who", thereby slightly altering the meaning of the question. Syntax also includes the rules for how complex sentences are structured by grouping words together in units, called phrases, that can occupy different places in a larger syntactic structure. Sentences can be described as consisting of phrases connected in a tree structure, connecting the phrases to each other at different levels. To the right is a graphic representation of the syntactic analysis of the English sentence "the cat sat on the mat". The sentence is analyzed as being constituted by a noun phrase, a verb, and a prepositional phrase; the prepositional phrase is further divided into a preposition and a noun phrase, and the noun phrases consist of an article and a noun.
The reason sentences can be seen as being composed of phrases is because each phrase would be moved around as a single element if syntactic operations were carried out. For example, "the cat" is one phrase, and "on the mat" is another, because they would be treated as single units if a decision was made to emphasize the location by moving forward the prepositional phrase: "[And] on the mat, the cat sat". There are many different formalist and functionalist frameworks that propose theories for describing syntactic structures, based on different assumptions about what language is and how it should be described. Each of them would analyze a sentence such as this in a different manner.
Typology and universals
Languages can be classified in relation to their grammatical types. Languages that belong to different families nonetheless often have features in common, and these shared features tend to correlate. For example, languages can be classified on the basis of their basic word order, the relative order of the verb, and its constituents in a normal indicative sentence. In English, the basic order is SVO (subject–verb–object): "The snake(S) bit(V) the man(O)", whereas for example, the corresponding sentence in the Australian language Gamilaraay would be d̪uyugu n̪ama d̪ayn yiːy (snake man bit), SOV. Word order type is relevant as a typological parameter, because basic word order type corresponds with other syntactic parameters, such as the relative order of nouns and adjectives, or of the use of prepositions or postpositions. Such correlations are called implicational universals. For example, most (but not all) languages that are of the SOV type have postpositions rather than prepositions, and have adjectives before nouns.
All languages structure sentences into Subject, Verb, and Object, but languages differ in the way they classify the relations between actors and actions. English uses the nominative-accusative word typology: in English transitive clauses, the subjects of both intransitive sentences ("I run") and transitive sentences ("I love you") are treated in the same way, shown here by the nominative pronoun I. Some languages, called ergative, Gamilaraay among them, distinguish instead between Agents and Patients. In ergative languages, the single participant in an intransitive sentence, such as "I run", is treated the same as the patient in a transitive sentence, giving the equivalent of "me run". Only in transitive sentences would the equivalent of the pronoun "I" be used. In this way the semantic roles can map onto the grammatical relations in different ways, grouping an intransitive subject either with Agents (accusative type) or Patients (ergative type) or even making each of the three roles differently, which is called the tripartite type.
The shared features of languages which belong to the same typological class type may have arisen completely independently. Their co-occurrence might be due to universal laws governing the structure of natural languages, "language universals", or they might be the result of languages evolving convergent solutions to the recurring communicative problems that humans use language to solve.