Umlaut (/ˈʊmlaʊt/) is a name for the two-dot diacritical mark ( ◌̈ ) as used to indicate in writing (as part of the letters ⟨ä⟩, ⟨ö⟩, and ⟨ü⟩) the result of the historical sound shift by which former back vowels are now pronounced as front vowels (for example [a], [ɔ], and [ʊ] as [ɛ], [œ], and [ʏ]). (The term Germanic umlaut is also used for the underlying historical sound shift process.)
Krüger, Krueger or Kruger (without the umlaut Ü) are German surnames originating from Krüger, meaning tavern-keeper in Low German and potter in Central German and Upper German, both associated with the Germanic word Krug, "jug". Notable people with the surname include: Umlaut (diacritic) In its contemporary printed form, the mark consists of two dots placed over
98-473: A , b , etc. This is sometimes called ASCIIbetical order . This deviates from the standard alphabetical order, particularly due to the ordering of capital letters before all lower-case ones (and possibly the treatment of spaces and other non-letter characters). It is therefore often applied with certain alterations, the most obvious being case conversion (often to uppercase, for historical reasons ) before comparison of ASCII values. In many collation algorithms,
147-494: A collation method typically defines a total order on a set of possible identifiers, called sort keys, which consequently produces a total preorder on the set of items of information (items with the same identifier are not placed in any defined order). A collation algorithm such as the Unicode collation algorithm defines an order through the process of comparing two given character strings and deciding which should come before
196-473: A form that would be recognisable as an ⟨e⟩ , but in manuscript writing, umlauted vowels could be indicated by two dots since the late medieval period. In the forms of handwriting that emerged in the early modern period (of which Sütterlin is the latest and best-known example) the letter ⟨e⟩ was composed of two short vertical lines very close together, and the superscript ⟨e⟩ looked like two tiny strokes. Even from
245-431: A means of labeling items that are already ordered. For example, pages, sections, chapters, and the like, as well as the items of lists, are frequently "numbered" in this way. Labeling series that may be used include ordinary Arabic numerals (1, 2, 3, ...), Roman numerals (I, II, III, ... or i, ii, iii, ...), or letters (A, B, C, ... or a, b, c, ...). (An alternative method for indicating list items, without numbering them,
294-550: A result, logographic languages often supplement radical-and-stroke ordering with alphabetic sorting of a phonetic conversion of the logographs. For example, the kanji word Tōkyō (東京) can be sorted as if it were spelled out in the Japanese characters of the hiragana syllabary as "to-u-ki- yo -u" (とうきょう), using the conventional sorting order for these characters. In addition, Chinese characters can also be sorted by stroke-based sorting . In Greater China, surname stroke ordering
343-479: A roughly similar procedure, though this will often be done unconsciously. Other advantages are that one can easily find the first or last elements on the list (most likely to be useful in the case of numerically sorted data), or elements in a given range (useful again in the case of numerical data, and also with alphabetically ordered data when one may be sure of only the first few letters of the sought item or items). Strings representing numbers may be sorted based on
392-602: A transformation analogous to the German umlaut, called omljud ), treat them always as independent letters. In collation , this means they have their own positions in the alphabet, for example at the end ("A–Ö" or "A–Ü", not "A–Z") as in Swedish, Estonian and Finnish, which means that the dictionary order is different from German. The transformations ä → ae and ö → oe can, therefore, be considered less appropriate for these languages, although Swedish and Finnish passports use
441-417: Is a bit more difficult, because different locales use different symbols for a decimal point , and sometimes the same character used as a decimal point is also used as a separator, for example "Section 3.2.5". There is no universal answer for how to sort such strings; any rules are application dependent. In some contexts, numbers and letters are used not so much as a basis for establishing an ordering, but as
490-439: Is a convention in some official documents where people's names are listed without hierarchy. When information is stored in digital systems, collation may become an automated process. It is then necessary to implement an appropriate collation algorithm that allows the information to be sorted in a satisfactory manner for the application in question. Often the aim will be to achieve an alphabetical or numerical ordering that follows
539-409: Is a fundamental element of most office filing systems , library catalogs , and reference books . Collation differs from classification in that the classes themselves are not necessarily ordered. However, even if the order of the classes is irrelevant, the identifiers of the classes may be members of an ordered set, allowing a sorting algorithm to arrange the items by class. Formally speaking,
Is a specific historical phenomenon of vowel-fronting in German and other Germanic languages, including English. English examples are 'man ~ men' and 'foot ~ feet' (from Proto-Germanic *fōts, pl. *fōtiz), but English orthography does not indicate this vowel change using the umlaut diacritic. German phonological umlaut was present in the Old High German period and continued to develop in Middle High German. From
637-592: Is appropriate to use ae . The same goes for ö and oe . While ae has a great resemblance to the letter æ and, therefore, does not impede legibility, the digraph oe is likely to reduce the legibility of a Norwegian text. This especially applies to the digraph øy , which would be rendered in the more cryptic form oey . Also in Danish , Ö has been used in place of Ø in some older texts and to distinguish between open and closed ö-sounds and when confusion with other symbols could occur, e.g. on maps. The Danish/Norwegian Ø
Is because Swiss typewriter keyboards use the same keys for French accents (in Swiss French) as are used for German umlauts (in Swiss German), and which version is active (on a computer) is chosen by a system setting. Consequently, applying an accent or umlaut to a capital letter requires the use of a dead key mechanism. Some languages have borrowed some of the forms of the German letters Ä, Ö, or Ü, including Azerbaijani, Estonian, Finnish, Hungarian, Karelian, some of
735-574: Is deemed to come first; for example, "cart" comes before "carthorse".) The result of arranging a set of strings in alphabetical order is that words with the same first letter are grouped together, and within such a group words with the same first two letters are grouped together, and so on. Capital letters are typically treated as equivalent to their corresponding lowercase letters. (For alternative treatments in computerized systems, see Automated collation , below.) Certain limitations, complications, and special conventions may apply when alphabetical order
784-460: Is desired to order text with embedded numbers using proper numerical order. For example, "Figure 7b" goes before "Figure 11a", even though '7' comes after '1' in Unicode . This can be extended to Roman numerals . This behavior is not particularly difficult to produce as long as only integers are to be sorted, although it can slow down sorting significantly. For example, Microsoft Windows does this when sorting file names . Sorting decimals properly
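The numerical ordering of embedded integers described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the helper name `natural_key` and the sample labels are hypothetical, only non-negative integers are handled, and decimals or Roman numerals would need extra rules.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    re.split with one capturing group puts the digit runs at the odd
    indices, so those are converted to integers for comparison.
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

labels = ["Figure 11a", "Figure 7b", "Figure 2"]

# Character-by-character comparison puts '1' before '2' and '7'.
plain = sorted(labels)

# Comparing the embedded integers by value gives the expected order.
natural = sorted(labels, key=natural_key)

print(plain)    # ['Figure 11a', 'Figure 2', 'Figure 7b']
print(natural)  # ['Figure 2', 'Figure 7b', 'Figure 11a']
```

Building a key for every string costs extra work per item, which is consistent with the remark that this kind of ordering can slow sorting down.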
833-655: Is like the German Ö a development of OE, to be compared with the French Œ . Early Volapük used Fraktur a , o and u as different from Antiqua ones. Later, the Fraktur forms were replaced with umlauted vowels. The usage of umlaut-like diacritic vowels, particularly ü , occurs in the romanization of languages that do not use the Roman alphabet, such as Chinese . For example, Mandarin Chinese 女 [ny˨˩˦] ("female")
882-495: Is one of the main schemes to romanize Persian (for example, rendering ⟨ ض ⟩ as ⟨z̤⟩ ). The notation was used to write some Asian languages in Latin script, for example Red Karen . Collation Collation is the assembly of written information into a standard order. Many systems of collation are based on numerical order or alphabetical order , or extensions and combinations thereof. Collation
Is romanized as nǚ in Hanyu Pinyin. Tibetan pinyin uses ä, ö, ü with approximately their German values. The Cyrillic letters ӓ, ӧ, ӱ are used in Mari, Khanty, and other languages for approximately [æ], [ø], and [y]. These directly parallel the German umlaut ä, ö, ü. Other vowels using a double dot to modify their values in various minority languages of Russia are ӛ, ӫ, and ӹ. The two-dot diacritic can be used in "sensational spellings" or foreign branding, for example in advertising, or for other special effects, where it
980-861: Is the Unicode Collation Algorithm . This can be adapted to use the appropriate collation sequence for a given language by tailoring its default collation table. Several such tailorings are collected in Common Locale Data Repository . In some applications, the strings by which items are collated may differ from the identifiers that are displayed. For example, The Shining might be sorted as Shining, The (see Alphabetical order above), but it may still be desired to display it as The Shining . In this case two sets of strings can be stored, one for display purposes, and another for collation purposes. Strings used for collation in this way are called sort keys . Sometimes, it
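The idea of keeping a display string and a separate sort key can be illustrated with a short Python sketch. The function `sort_key`, the list of leading articles, and the sample titles other than The Shining are assumptions made for the example, not part of any standard.

```python
def sort_key(title):
    """Hypothetical rule: move a leading English article to the end,
    then casefold so that case does not affect the order."""
    for article in ("The ", "A ", "An "):
        if title.startswith(article):
            title = f"{title[len(article):]}, {article.strip()}"
            break
    return title.casefold()

titles = ["The Shining", "Solaris", "Rebecca"]

# Store both strings: one for display, one for collation.
records = [(title, sort_key(title)) for title in titles]

# Order by the sort key, but show the display string.
ordered = [display for display, key in sorted(records, key=lambda r: r[1])]

print(sort_key("The Shining"))  # shining, the
print(ordered)                  # ['Rebecca', 'The Shining', 'Solaris']
```

Sorting the display strings directly would instead place "The Shining" after "Solaris", under the letter T.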
1029-421: Is the basis for many systems of collation where items of information are identified by strings consisting principally of letters from an alphabet . The ordering of the strings relies on the existence of a standard ordering for the letters of the alphabet in question. (The system is not limited to alphabets in the strict technical sense; languages that use a syllabary or abugida , for example Cherokee , can use
Is to use a bulleted list.) When letters of an alphabet are used for this purpose of enumeration, there are certain language-specific conventions as to which letters are used. For example, the Russian letters Ъ and Ь (which in writing are only used for modifying the preceding consonant), and usually also Ы, Й, and Ё, are omitted. Also in many languages that use extended Latin script,
1127-414: Is used for collation. For example, the Chinese character 妈 (meaning "mother") is sorted as a six-stroke character under the three-stroke primary radical 女. The radical-and-stroke system is cumbersome compared to an alphabetical system in which there are a few characters, all unambiguous. The choice of which components of a logograph comprise separate radicals and which radical is primary is not clear-cut. As
1176-432: Is used: In several languages the rules have changed over time, and so older dictionaries may use a different order than modern ones. Furthermore, collation may depend on use. For example, German dictionaries and telephone directories use different approaches. Some Arabic dictionaries, such as Hans Wehr 's bilingual A Dictionary of Modern Written Arabic , group and sort Arabic words by semitic root . For example,
Is usual to replace them with the underlying vowel followed by an ⟨e⟩. So, for example, "Schröder" becomes "Schroeder". As the pronunciation differs greatly between the normal letter and the umlaut, simply omitting the dots would be incorrect. The result would often be a different word, as in schon "already", schön "beautiful"; or a different grammatical form, e.g. Mutter "mother", Mütter "mothers". Despite this,
Is usually called an umlaut (rather than a diaeresis). Mötley Crüe, Blue Öyster Cult, Motörhead and Häagen-Dazs are examples of such usage. The Brontë sisters are so called because their Irish father, Patrick Brunty, used the device to Anglicise the family name. The International Phonetic Alphabet uses a double dot below a letter, a notation it calls "subscript umlaut", to indicate breathy (murmured) voice (for example Hindi [kʊm̤ar] "potter"). The ALA-LC romanization system provides for its use and
1323-603: The Alphabetical order article. Such algorithms are potentially quite complex, possibly requiring several passes through the text. Problems are nonetheless still common when the algorithm has to encompass more than one language. For example, in German dictionaries the word ökonomisch comes between offenbar and olfaktorisch , while Turkish dictionaries treat o and ö as different letters, placing oyun before öbür . A standard algorithm for collating any collection of strings composed of any standard Unicode symbols
1372-516: The Sami languages , Slovak , Swedish , and Turkish . This indicates sounds similar to the corresponding umlauted letters in German. In spoken Scandinavian languages the grammatical umlaut change is used (singular to plural, derivations, etc.) but the character used differs between languages. In Finnish, a/ä and o/ö change systematically in suffixes according to the rules of vowel harmony . In Hungarian, where long vowels are indicated with an acute accent,
1421-453: The hanzi of Chinese and the kanji of Japanese , whose thousands of symbols defy ordering by convention. In this system, common components of characters are identified; these are called radicals in Chinese and logographic systems derived from Chinese. Characters are then grouped by their primary radical, then ordered by number of pen strokes within radicals. When there is no obvious radical or more than one radical, convention governs which
1470-443: The 16th century, the handwritten convention of indicating umlaut by two dots placed above the affected vowel is also found in printed texts. Unusual umlaut designs are sometimes also created for graphic design purposes, such as to fit umlaut dots into tightly spaced lines of text. This may include umlaut dots placed vertically or inside the body of the letter. When typing German with a keyboard that doesn't have umlaut letters, it
1519-823: The Arabic to the Latin alphabet in 1928, it adopted a number of diacritics borrowed from various languages, including ⟨ü⟩ and ⟨ö⟩ from German (probably reinforced by their use in languages like Swedish, Hungarian, etc.). These Turkish graphemes represent sounds similar to their respective values in German (see Turkish alphabet ). As the borrowed diacritic has lost its relationship to Germanic i-mutation, they are in some languages considered independent graphemes , and cannot be replaced with ⟨ae⟩ , ⟨oe⟩ , or ⟨ue⟩ as in German. In Estonian and Finnish, for example, these latter diphthongs have independent meanings. Even some Germanic languages, such as Swedish (which does have
1568-469: The Middle High German period, it was sometimes denoted in written German by adding an e to the affected vowel, either after the vowel or, in small form, above it. This can still be seen in some names, e.g. Goethe , Goebbels , Staedtler . In medieval German manuscripts, other digraphs were also commonly written using superscripts. In bluome ("flower"), for example, the ⟨o⟩
1617-424: The affected graphemes ⟨a⟩ , ⟨o⟩ , ⟨u⟩ , and ⟨au⟩ are written as ⟨ ä ⟩ , ⟨ ö ⟩ , ⟨ ü ⟩ , and ⟨äu⟩ , i.e. they are written with the umlaut diacritic, which looks identical to the diaeresis mark used in other European languages and is represented by the same Unicode character. The Germanic umlaut
1666-407: The basic principles of alphabetical ordering (mathematically speaking, lexicographical ordering ). So a computer program might treat the characters a , b , C , d , and $ as being ordered $ , C , a , b , d (the corresponding ASCII codes are $ = 36, a = 97, b = 98, C = 67, and d = 100). Therefore, strings beginning with C , M , or Z would be sorted before strings with lower-case
1715-438: The combination of a letter with the diacritical mark is called Umlaut , while the marks themselves are called Umlautzeichen (literally "umlaut sign"). In German, the umlaut diacritic indicates that the short back vowels and the diphthong [aʊ] are pronounced ("shifted forward in the mouth") as follows: And the long back vowels are pronounced in the front of the mouth as follows: In modern German orthography,
1764-564: The comparison is based not on the numerical codes of the characters, but with reference to the collating sequence – a sequence in which the characters are assumed to come for the purpose of collation – as well as other ordering rules appropriate to the given application. This can serve to apply the correct conventions used for alphabetical ordering in the language in question, dealing properly with differently cased letters, modified letters , digraphs , particular abbreviations, and so on, as mentioned above under Alphabetical order , and in detail in
1813-403: The exception of Hungarian, the replacement rule for situations where the umlaut character is not available, is to simply use the underlying unaccented character instead. Hungarian follows the German rules and replaces ⟨ö⟩ and ⟨ü⟩ with ⟨oe⟩ and ⟨ue⟩ respectively – at least for telegrams and telex messages. The same rule is followed for
1862-406: The letter ⟨ä⟩ to denote [e] (or a bit archaic but still correct [ɛɐ] ). The sign is called dve bodky [ˈdʋe ˈbɔtki] ("two dots"), and the full name of the letter ä is široké e [ˈʂirɔkeː ˈe] ("wide e"). The similar word dvojbodka [ˈdʋɔjbɔtka] ("double dot") however refers to the colon . In these languages, with
1911-441: The letter to represent the changed vowel sound. (In some Romance and other languages, the diaeresis diacritic has the same appearance but a different function.) Umlaut (literally "changed sound") is the German name of the sound shift phenomenon also known as i-mutation . In German, this term is also used for the corresponding letters ä, ö, and ü (and the diphthong äu) and the sounds that these letters represent. In German,
1960-411: The near-lookalikes ⟨ő⟩ and ⟨ű⟩ . In Luxembourgish ( Lëtzebuergesch ), ⟨ä⟩ and ⟨ë⟩ represent stressed [æ] and [ə] ( schwa ) respectively. The letters ⟨ü⟩ and ⟨ö⟩ do not occur in native Luxembourgish words, but at least the former is common in words borrowed from standard German. When Turkish switched from
2009-437: The other. When an order has been defined in this way, a sorting algorithm can be used to put a list of any number of items into that order. The main advantage of collation is that it makes it fast and easy for a user to find an element in the list, or to confirm that it is absent from the list. In automatic systems this can be done using a binary search algorithm or interpolation search ; manual searching may be performed using
2058-459: The same ordering principle provided there is a set ordering for the symbols used.) To decide which of two strings comes first in alphabetical order, initially their first letters are compared. The string whose first letter appears earlier in the alphabet comes first in alphabetical order. If the first letters are the same, then the second letters are compared, and so on, until the order is decided. (If one string runs out of letters to compare, then it
2107-441: The standard criteria as described in the preceding sections. However, not all of these criteria are easy to automate. The simplest kind of automated collation is based on the numerical codes of the symbols in a character set , such as ASCII coding (or any of its supersets such as Unicode ), with the symbols being ordered in increasing numerical order of their codes, and this ordering being extended to strings in accordance with
2156-520: The transformation to render ö and ä (and å as aa ) in the machine-readable zone . In contexts of technological limitation, e.g. in English based systems, Swedes can either be forced to omit the diacritics or use the two letter system. When typing in Norwegian , the letters Æ and Ø might be replaced with Ä and Ö respectively if the former are not available. If ä is not available either, it
2205-606: The umlaut notation has been expanded with a version of the umlaut which looks like double acute accents , indicating a blend of umlaut and acute. Contrast: short ö; long ő. The Estonian alphabet has borrowed ⟨ä⟩ , ⟨ö⟩ , and ⟨ü⟩ from German; Swedish and Finnish have ⟨ä⟩ and ⟨ö⟩ ; and Slovak has ⟨ä⟩ . In Estonian, Swedish, Finnish, and Sami ⟨ä⟩ and ⟨ö⟩ denote [æ] and [ø] , respectively. Hungarian and Turkish have ⟨ö⟩ and ⟨ü⟩ . Slovak uses
2254-894: The umlauted letters are not considered to be separate letters of the alphabet in German, in contrast to the situation in other Germanic languages. When alphabetically sorting German words, the umlaut is usually not distinguished from the underlying vowel, although if two words differ only by an umlaut, the umlauted one comes second, for example: There is a second system in limited use, mostly for sorting names (such as in telephone directories), which treats letters with umlauts as their base equivalents followed by e. Austrian telephone directories insert ö after oz. In Switzerland , capital umlauts are sometimes printed as digraphs , in other words, ⟨Ae⟩ , ⟨Oe⟩ , ⟨Ue⟩ , instead of ⟨Ä⟩ , ⟨Ö⟩ , ⟨Ü⟩ (see German alphabet § Umlaut diacritic usage for an elaboration). This
2303-489: The values of the numbers that they represent. For example, "−4", "2.5", "10", "89", "30,000". Pure application of this method may provide only a partial ordering on the strings, since different strings can represent the same number (as with "2" and "2.0" or, when scientific notation is used, "2e3" and "2000"). A similar approach may be taken with strings representing dates or other items that can be ordered chronologically or in some other natural fashion. Alphabetical order
2352-401: The words kitāba ( كتابة 'writing'), kitāb ( كتاب 'book'), kātib ( كاتب 'writer'), maktaba ( مكتبة 'library'), maktab ( مكتب 'office'), maktūb ( مكتوب 'fate,' or 'written'), are agglomerated under the triliteral root k - t - b ( ك ت ب ), which denotes 'writing'. Another form of collation is radical-and-stroke sorting , used for non-alphabetic writing systems such as
2401-460: Was frequently placed above the ⟨u⟩ ( blůme ). This letter survives now only in Czech . Compare also ⟨ ñ ⟩ for the digraph nn , with the tilde as a superscript ⟨n⟩ . In blackletter handwriting, as used in German manuscripts of the later Middle Ages, and also in many printed texts of the early modern period, the superscript ⟨e⟩ still had