Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language, and it is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. Data is typically collected in text corpora and processed with rule-based, statistical or neural approaches from machine learning and deep learning.
Major tasks in natural language processing are speech recognition, text classification, natural-language understanding, and natural-language generation.

Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, though at the time that was not articulated as a problem separate from artificial intelligence. The proposed test includes a task that involves the automated interpretation and generation of natural language.
The premise of symbolic NLP is well summarized by John Searle's Chinese room experiment: given a collection of rules (e.g., a Chinese phrasebook with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts. Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules.
Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing. This was due to both the steady increase in computational power (see Moore's law) and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing.
In 2003, the word n-gram model, then the best statistical algorithm, was outperformed in language modelling by a multi-layer perceptron from Yoshua Bengio and co-authors, which used a single hidden layer and a context window of several words and was trained on up to 14 million words with a CPU cluster. In 2010, Tomáš Mikolov (then a PhD student at Brno University of Technology) and co-authors applied a simple recurrent neural network with a single hidden layer to language modelling, and in the following years he went on to develop Word2vec. In the 2010s, representation learning and deep neural network-style (featuring many hidden layers) machine learning methods became widespread in natural language processing. That popularity was due partly to a flurry of results showing that such techniques can achieve state-of-the-art results in many natural language tasks, e.g., in language modeling and parsing.
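For context, a word n-gram model of the kind the neural models displaced estimates the probability of the next word from counted word sequences. The sketch below is a minimal, illustrative bigram model in plain Python; the toy corpus and function names are invented for this example.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count unigrams and bigrams from tokenized sentences (toy example)."""
    unigrams, bigrams = Counter(), defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        for prev, curr in zip(padded, padded[1:]):
            bigrams[prev][curr] += 1
    return unigrams, bigrams

def bigram_probability(unigrams, bigrams, prev, curr):
    """P(curr | prev) with add-one smoothing over the observed vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[prev][curr] + 1) / (unigrams[prev] + vocab_size)

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
unigrams, bigrams = train_bigram_model(corpus)
print(bigram_probability(unigrams, bigrams, "the", "cat"))  # 0.3, higher than P("ran" | "the") = 0.1
```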
NLP is increasingly important in medicine and healthcare, where it helps analyze notes and text in electronic health records that would otherwise be inaccessible for study when seeking to improve care or protect patient privacy.
The symbolic approach, i.e., the hand-coding of a set of rules for manipulating symbols coupled with a dictionary lookup, was historically the first approach used both by AI in general and by NLP in particular, for example by writing grammars or devising heuristic rules for stemming.
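As an illustration of such hand-written heuristics, the sketch below strips a few common English suffixes in priority order. The suffix list and function name are invented for this example and are far cruder than a real stemmer such as Porter's.

```python
def rule_based_stem(word):
    """Strip common English suffixes using hand-written rules (toy heuristic)."""
    suffix_rules = [("sses", "ss"), ("ies", "i"), ("ing", ""), ("ed", ""), ("s", "")]
    for suffix, replacement in suffix_rules:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word

print(rule_based_stem("classes"))  # class
print(rule_based_stem("parties"))  # parti
print(rule_based_stem("running"))  # runn  (no special case for doubled consonants)
```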
Machine learning approaches, which include both statistical methods and neural networks, on the other hand, have many advantages over the symbolic approach. Although rule-based systems for manipulating symbols were still commonly used in 2020, they have become mostly obsolete with the advance of LLMs in 2023.
In the late 1980s and mid-1990s, the statistical approach ended a period of AI winter, which was caused by the inefficiencies of the rule-based approaches. The earliest decision trees, producing systems of hard if-then rules, were still very similar to the old rule-based approaches. Only the introduction of hidden Markov models, applied to part-of-speech tagging, announced the end of the old rule-based approach.
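A hidden Markov model tagger treats part-of-speech tags as hidden states and words as observations, and Viterbi decoding recovers the most probable tag sequence. The sketch below is a minimal illustration with invented toy probabilities rather than trained values.

```python
# Toy HMM part-of-speech tagger: tags are hidden states, words are observations.
states = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {  # P(next tag | current tag), invented values
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {  # P(word | tag), invented values
    "DET": {"the": 0.9, "dog": 0.0, "barks": 0.0},
    "NOUN": {"the": 0.0, "dog": 0.8, "barks": 0.1},
    "VERB": {"the": 0.0, "dog": 0.1, "barks": 0.8},
}

def viterbi(words):
    """Return the most probable tag sequence under the toy HMM."""
    trellis = [{s: (start_p[s] * emit_p[s][words[0]], [s]) for s in states}]
    for word in words[1:]:
        column = {}
        for s in states:
            prob, path = max(
                (trellis[-1][prev][0] * trans_p[prev][s] * emit_p[s][word],
                 trellis[-1][prev][1] + [s])
                for prev in states
            )
            column[s] = (prob, path)
        trellis.append(column)
    return max(trellis[-1].values())[1]

print(viterbi(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```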
A major drawback of statistical methods is that they require elaborate feature engineering.
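For example, a classical statistical text classifier depends on explicitly engineered features such as word counts or TF-IDF weights. The sketch below assumes the third-party scikit-learn library; the labels and data are invented toy examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-chosen feature extraction (TF-IDF over word n-grams) feeds a linear model.
texts = ["great movie, loved it", "terrible film, waste of time",
         "wonderful acting", "boring and slow"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative (toy data)

classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(),
)
classifier.fit(texts, labels)
print(classifier.predict(["loved the acting"]))  # likely [1] on this toy data
```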
Since 2015, the statistical approach has largely been replaced by the neural networks approach, which uses semantic networks and word embeddings to capture the semantic properties of words, so that intermediate tasks (e.g., part-of-speech tagging and dependency parsing) are no longer needed.
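Word embeddings of the kind popularized by Word2vec map each word to a dense vector such that words used in similar contexts end up close together. The sketch below assumes the third-party gensim library and a toy corpus; a real model would be trained on far more text.

```python
from gensim.models import Word2Vec

# Toy corpus of tokenized sentences; real embeddings need millions of words.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["the", "cat", "chased", "the", "dog"],
]

# Skip-gram model with small vectors; parameters chosen only for the toy corpus.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)

print(model.wv.similarity("cat", "dog"))      # cosine similarity of two embeddings
print(model.wv.most_similar("cat", topn=3))   # nearest neighbours in vector space
```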
Neural machine translation, based on then-newly-invented sequence-to-sequence transformations, made obsolete the intermediate steps, such as word alignment, that were previously necessary for statistical machine translation.
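As a present-day illustration, a pretrained sequence-to-sequence model can translate directly from source text to target text with no explicit alignment step. The sketch below assumes the third-party Hugging Face transformers library (and a model download on first use); the checkpoint name is just one commonly available choice.

```python
from transformers import pipeline

# A pretrained encoder-decoder model maps the source sentence straight to the
# target sentence, with no separate word-alignment or phrase-table stage.
translator = pipeline("translation_en_to_de", model="t5-small")
result = translator("Natural language processing has its roots in the 1950s.")
print(result[0]["translation_text"])
```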
The following is a list of some of the most commonly researched tasks in natural language processing. Some of these tasks have direct real-world applications, while others more commonly serve as subtasks that are used to aid in solving larger tasks. Though natural language processing tasks are closely intertwined, they can be subdivided into categories for convenience; a coarse division is given below.

Based on long-standing trends in the field, it is possible to extrapolate future directions of NLP. As of 2020, three trends among the topics of the long-standing series of CoNLL Shared Tasks can be observed.
Most higher-level NLP applications involve aspects that emulate intelligent behaviour and apparent comprehension of natural language. More broadly speaking, the technical operationalization of increasingly advanced aspects of cognitive behaviour represents one of the developmental trajectories of NLP (see the trends among CoNLL shared tasks above).

Cognition refers to "the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses." Cognitive science is the interdisciplinary, scientific study of the mind and its processes. Cognitive linguistics is an interdisciplinary branch of linguistics, combining knowledge and research from both psychology and linguistics.
Especially during the age of symbolic NLP, the area of computational linguistics maintained strong ties with cognitive studies. As an example, George Lakoff offers a methodology to build natural language processing (NLP) algorithms through the perspective of cognitive science, along with the findings of cognitive linguistics, with two defining aspects.
Ties with cognitive linguistics are part of the historical heritage of NLP, but they have been less frequently addressed since the statistical turn during the 1990s. Nevertheless, approaches to develop cognitive models towards technically operationalizable frameworks have been pursued in the context of various frameworks, e.g., cognitive grammar, functional grammar, construction grammar, computational psycholinguistics and cognitive neuroscience (e.g., ACT-R), albeit with limited uptake in mainstream NLP (as measured by presence at the major conferences of the ACL).

More recently, ideas of cognitive NLP have been revived as an approach to achieve explainability, e.g., under the notion of "cognitive AI". Likewise, ideas of cognitive NLP are inherent to neural models of multimodal NLP (although rarely made explicit) and to developments in artificial intelligence, specifically tools and technologies using large language model approaches and new directions in artificial general intelligence.