Misplaced Pages

General Service List

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

The General Service List (GSL) is a list of roughly 2,000 words published by Michael West in 1953. The words were selected to represent the most frequent words of English and were taken from a corpus of written English. The target audience was English language learners and ESL teachers. To maximize the utility of the list, some frequent words that overlapped broadly in meaning with words already on the list were omitted. In the original publication the relative frequencies of various senses of the words were also included.

The list is important because a person who knows all the words on the list and their related families would understand approximately 90–95 per cent of colloquial speech and 80–85 per cent of common written texts. The list consists only of headwords, so the word "be" is high on the list on the assumption that the learner knows all of its forms, e.g. am, is, are, was, were, being, and been. Researchers have expressed doubts about
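The coverage figures quoted above can be made concrete with a small calculation. The following is a minimal sketch, assuming a toy word list: the `FAMILIES` table, the `coverage` function, and the sample sentence are all hypothetical illustrations and are not taken from the GSL or from West's method. It counts what share of the running words in a text belong to a headword or to one of its assumed family members.

```python
import re

# Hypothetical mini word list: headword -> inflected forms assumed known.
# These families are illustrative only and are not copied from the GSL.
FAMILIES = {
    "be": {"am", "is", "are", "was", "were", "being", "been"},
    "go": {"goes", "going", "went", "gone"},
    "the": set(),
    "to": set(),
    "school": {"schools"},
}


def known_forms(families):
    """Flatten headwords and their family members into one lookup set."""
    forms = set(families)
    for members in families.values():
        forms |= members
    return forms


def coverage(text, families):
    """Return the fraction of running words (tokens) covered by the list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = known_forms(families)
    return sum(1 for token in tokens if token in known) / len(tokens)


sample = "The children were going to the school"
print(f"{coverage(sample, FAMILIES):.2f}")
```

In this sample six of the seven running words are covered, because "were" and "going" count through their headwords "be" and "go" while "children" is not on the toy list. Counting by family rather than by exact form is what makes a headword-only list usable for this kind of estimate.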

A lemma (pl.: lemmas or lemmata) is the canonical form, dictionary form, or citation form of a set of word forms. In English, for example, break, breaks, broke, broken and breaking are forms of the same lexeme, with break as the lemma by which they are indexed. Lexeme, in this context, refers to the set of all the inflected or alternating forms in the paradigm of

A dictionary, the lemma "go" represents the inflected forms "go", "goes", "going", "went", and "gone". The relationship between an inflected form and its lemma is usually denoted by an angle bracket, e.g., "went" < "go". Of course, the disadvantage of such simplifications is the inability to look up a declined or conjugated form of the word, but some dictionaries, like Webster's Dictionary, list "went". Multilingual dictionaries vary in how they deal with this issue:

A form of the indefinite pronoun one: do one's best, perjure oneself. In European languages with grammatical gender, the citation form of regular adjectives and nouns is usually the masculine singular. If the language also has cases, the citation form is often the masculine singular nominative. For many languages, the citation form of a verb is the infinitive: French aller, German gehen, Hindustani जाना / جانا, Spanish ir. English verbs usually have an infinitive, which in its bare form (without

A sentence because of initial mutations. The noun cainteoir, the lemma for the noun meaning "speaker", has a variety of forms: chainteoir, gcainteoir, cainteora, chainteora, cainteoirí, chainteoirí and gcainteoirí. Some phrases are cited in a sort of lemma: Carthago delenda est (literally, "Carthage must be destroyed") is a common way of citing Cato, but what he said was nearer to censeo Carthaginem esse delendam ("I hold Carthage to be in need of destruction"). In

A single word, and lemma refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish, and Russian. The process of determining the lemma for a given lexeme is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary. The form of

A small number of archaic terms, such as shilling, while excluding words that have gained currency since the first half of the twentieth century, such as plastic, television, battery, okay, victim, and drug. The GSL evolved over several decades before West's publication in 1953. The GSL is not a list based solely on frequency, but includes groups of words on a semantic basis. Various versions float around

A word that is chosen to serve as the lemma is usually the least marked form, but there are several exceptions such as the use of the infinitive for verbs in some languages. For English, the citation form of a noun is the singular (and non-possessive) form: mouse rather than mice. For multiword lexemes that contain possessive adjectives or reflexive pronouns, the citation form uses

Is the lemma under which a set of related dictionary or encyclopaedia entries appears. The headword is used to locate the entry, and dictates its alphabetical position. Depending on the size and nature of the dictionary or encyclopedia, the entry may include alternative meanings of the word, its etymology, pronunciation and inflections, related lemmas such as compound words or phrases that contain

Is the part of the word that never changes even when morphologically inflected; a lemma is the least marked form of the word. In linguistic analysis, the stem is defined more generally as a form without any of its possible inflectional morphemes (but including derivational morphemes and may contain multiple roots). When phonology is taken into account, the definition of the unchangeable part of

Is traditionally used, but some modern dictionaries use the infinitive instead (except for Bulgarian, which lacks infinitives; for contracted verbs in Ancient Greek, an uncontracted first person singular present tense is used to reveal the contract vowel: φιλέω philéō for φιλῶ philō "I love" [implying affection], ἀγαπάω agapáō for ἀγαπῶ agapō "I love" [implying regard]). Finnish dictionaries list verbs not under their root, but under

The Langenscheidt dictionary of German does not list ging (< gehen), but the Cassell does. Lemmas or word stems are used often in corpus linguistics for determining word frequency. In that usage, the specific definition of "lemma" is flexible depending on the task it is being used for. A word may have different pronunciations, depending on its phonetic environment (the neighbouring sounds) or on
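The use of lemmas for determining word frequency, mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: the `LEMMA_OF` table is a hypothetical hand-written lookup built from the "go" and "break" examples in this article, and `lemma_frequencies` is an invented helper name. Real corpus work would typically rely on a full lemmatiser rather than a small dictionary.

```python
import re
from collections import Counter

# Hypothetical lookup from inflected form to lemma, e.g. "went" < "go".
# Forms that are missing from the table are treated as their own lemma.
LEMMA_OF = {
    "goes": "go", "going": "go", "went": "go", "gone": "go",
    "breaks": "break", "broke": "break",
    "broken": "break", "breaking": "break",
}


def lemma_frequencies(text, lemma_of=LEMMA_OF):
    """Count tokens after mapping each inflected form to its lemma."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(lemma_of.get(token, token) for token in tokens)


counts = lemma_frequencies("She went home. He goes home and she is going too.")
print(counts.most_common(3))
```

Here "went", "goes", and "going" are all counted under "go", so the lemma ranks higher than any single surface form would. How forms are grouped is a design choice, which is why the definition of "lemma" can shift with the task.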

The GSL could not be considered general service words because the range and frequency of these words were too low to be included in the list. Recent research by Billuroğlu and Neufeld (2005) confirmed that the General Service List was in need of minor revision, but the headwords in the list still provide approximately 80% text coverage in written English. The research showed that the GSL contains

The Internet, and attempts have been made to improve it. There are two major updates of the GSL: Some ESL dictionaries use the General Service List as their controlled defining vocabulary. In the Longman Dictionary of Contemporary English, each definition is written using the 2000-word Longman Defining Vocabulary based on the GSL.

Headword

In morphology and lexicography,

The adequacy of the GSL because of its age and the relatively low coverage provided by the words not in the first 1,000 words of the list. Engels was, in particular, critical of the limited vocabulary chosen by West (1953), and while he concurred that the first 1,000 words of the GSL were good selections based on their high frequency and wide range, he was of the opinion that the words beyond the first 1,000 of

The degree of stress in a sentence. An example of the latter is the weak and strong forms of certain English function words like some and but (pronounced /sʌm/, /bʌt/ when stressed but /s(ə)m/, /bət/ when unstressed). Dictionaries usually give the pronunciation used when the word is pronounced alone (its isolation form) and with stress, but they may also note common weak forms of pronunciation. The stem

The first infinitive, marked with -(t)a, -(t)ä. For Japanese, the non-past (present and future) tense is used. For Arabic the third-person singular masculine of the past/perfect tense is the least-marked form and is used for entries in modern dictionaries. In older dictionaries, which are still commonly used, the triliteral of the word, either a verb or a noun, is used. This is similar to Hebrew, which also uses

The headword, and encyclopedic information about the concepts represented by the word. For example, the headword bread may contain the following (simplified) definitions: The Academic Dictionary of Lithuanian contains around 500,000 headwords. The Oxford English Dictionary (OED) has around 273,000 headwords along with 220,000 other lemmas, while Webster's Third New International Dictionary has about 470,000. The Deutsches Wörterbuch (DWB),

The largest lexicon of the German language, has around 330,000 headwords. These values are cited by the dictionary makers and may not use exactly the same definition of a headword. In addition, headwords may not accurately reflect a dictionary's physical size. The OED and the DWB, for instance, include exhaustive historical reviews and exact citations from source documents not usually found in standard dictionaries. The term 'lemma' comes from

The particle to) is its least marked (for example, break is chosen over to break, breaks, broke, breaking, and broken); for defective verbs with no infinitive the present tense is used (for example, must has only one form while shall has no infinitive, and both lemmas are their lexemes' present tense forms). For Latin, Ancient Greek, Modern Greek, and Bulgarian, the first person singular present tense

The practice in Greco-Roman antiquity of using the word to refer to the headwords of marginal glosses in scholia; for this reason, the Ancient Greek plural form is sometimes used, namely lemmata (Greek λῆμμα, pl. λήμματα).

The third-person singular masculine perfect form, e.g. ברא bara' create, כפר kaphar deny. Georgian uses the verbal noun. For Korean, -da is attached to the stem. In Tamil, an agglutinative language, the verb stem (which is also the imperative form - the least marked one) is often cited, e.g., இரு. In Irish, words are highly inflected by case (genitive, nominative, dative and vocative) and by their place within

The word is not useful, as can be seen in the phonological forms of the words in the preceding example: "produced" /prəˈdjuːst/ vs. "production" /prəˈdʌkʃən/. Some lexemes have several stems but one lemma. For instance the verb "to go" has the stems "go" and "went" due to suppletion: the past tense was co-opted from a different verb, "to wend". A headword or catchword
