Misplaced Pages

Windows-1252

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Windows-1252 or CP-1252 (Windows code page 1252) is a legacy single-byte character encoding that is used by default (as the "ANSI code page") in Microsoft Windows throughout the Americas, Western Europe, Oceania, and much of Africa.

53-539: Initially the same as ISO 8859-1, it began to diverge starting in Windows 2.0 by adding characters in the 0x80 to 0x9F (hex) range (the ISO standards reserve this range for C1 control codes). Notable additions include curly quotation marks and all printable characters from ISO 8859-15. It is the most-used single-byte character encoding in the world. Although almost all websites now use
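The divergence in the 0x80 to 0x9F range can be seen by decoding the same bytes both ways. This is a minimal sketch using Python's built-in `cp1252` and `latin-1` codecs; the sample bytes are chosen only for illustration.

```python
# Bytes in the 0x80-0x9F range are printable in Windows-1252
# but are C1 control codes in ISO 8859-1.
raw = bytes([0x93, 0x48, 0x69, 0x94, 0x20, 0x80])  # 0x93 "Hi" 0x94, space, 0x80

as_cp1252 = raw.decode("cp1252")   # Windows-1252 interpretation
as_latin1 = raw.decode("latin-1")  # ISO 8859-1 interpretation

print(ascii(as_cp1252))  # curly quotes and the euro sign, shown as escapes
print(ascii(as_latin1))  # the same bytes shown as C1 control code points

# Outside 0x80-0x9F the two codecs agree byte for byte.
same_elsewhere = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1")
    for b in list(range(0x00, 0x80)) + list(range(0xA0, 0x100))
)
print(same_elsewhere)
```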

106-446: A "control picture" for any of these. There is also no well-known variation of caret notation for them. Some terminal emulators, including xterm, use OSC sequences for setting the window title and changing the colour palette. They may also support terminating an OSC sequence with BEL instead of ST. Kermit used APC to transmit commands. The ISO/IEC 2022 (ECMA-35) extension mechanism allowed escape sequences to change

159-559: A 7-bit environment, thus it was decided that no alternative character set could use them, and that these codes should be additional control codes, which became known as the C1 control codes. To allow a 7-bit environment to use these new controls, the sequences ESC @ through ESC _ were to be considered equivalent. The later ISO 8859 standards abandoned support for 7-bit codes, but preserved this range of control characters. The first C1 control code set to be registered for use with ISO 2022
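The equivalence between the 8-bit C1 bytes and the sequences ESC @ through ESC _ amounts to an offset of 0x40. The helper names below are hypothetical and exist only for this sketch.

```python
ESC = 0x1B

def c1_to_escape(byte: int) -> bytes:
    """Return the two-byte 7-bit sequence equivalent to an 8-bit C1 control."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError("not a C1 control byte")
    # 0x80 maps to ESC @ (0x40) and 0x9F maps to ESC _ (0x5F).
    return bytes([ESC, byte - 0x40])

def escape_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 control equivalent to ESC followed by @ through _."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not an ESC @ .. ESC _ sequence")
    return seq[1] + 0x40

print(c1_to_escape(0x9B))        # CSI (0x9B) becomes ESC [
print(hex(escape_to_c1(b"\x1b]")))  # ESC ] becomes OSC (0x9D)
```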

212-430: A collaboration agreement that allows "key industry players to negotiate in an open workshop environment" outside of ISO in a way that may eventually lead to the development of an ISO standard. C0 and C1 control codes The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about

265-545: A custom character encoding based on Windows-1252. For Japanese, it instead uses a multibyte character encoding based on code page 932. Regardless of the system locale, all characters in the range 0x00 to 0x7F are guaranteed to be the same, except 0x5C, which is the yen sign in Japanese and a backslash in all others. Palm OS 3.1 introduced several changes to the character encoding to better align with Windows-1252: The following

318-493: A document is submitted directly for approval as a draft International Standard (DIS) to the ISO member bodies or as a final draft International Standard (FDIS), if the document was developed by an international standardizing body recognized by the ISO Council. The first step, a proposal of work (New Proposal), is approved at the relevant subcommittee or technical committee (e.g., SC 29 and JTC 1 respectively in

371-442: A long process that commonly starts with the proposal of new work within a committee. Some abbreviations used for marking a standard with its status are: Abbreviations used for amendments are: Other abbreviations are: International Standards are developed by ISO technical committees (TC) and subcommittees (SC) by a process with six steps: The TC/SC may set up working groups  (WG) of experts for

424-554: A method so an 8-bit "extended ASCII" code could be converted to a corresponding 7-bit code, and vice versa. In a 7-bit environment, the Shift Out (SO) would change the meaning of the 96 bytes 0x20 through 0x7F (i.e. all but the C0 control codes), to be the characters that an 8-bit environment would print if it used the same code with the high bit set. This meant that the range 0x80 through 0x9F could not be printed in
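The Shift Out mechanism described here can be sketched as a pair of converters. This is an illustration under stated assumptions, not a full ISO 2022 implementation: it assumes Shift In (SI, 0x0F) is used to return to the unshifted set, that the input text contains no literal SO or SI bytes, and the function names are hypothetical.

```python
SO, SI = 0x0E, 0x0F  # Shift Out / Shift In (SI as the return shift is an assumption here)

def to_7bit(data: bytes) -> bytes:
    """Convert 8-bit text to 7-bit form, wrapping high-half bytes in SO ... SI."""
    out = bytearray()
    shifted = False
    for b in data:
        if b in (SO, SI):
            raise ValueError("input must not contain literal SO or SI")
        if 0x80 <= b <= 0x9F:
            # As the text notes, this range has no shifted printable form.
            raise ValueError("0x80-0x9F cannot be represented by shifting")
        if b < 0x20:
            out.append(b)  # C0 controls are unaffected by the shift state
            continue
        want_shift = b >= 0xA0
        if want_shift != shifted:
            out.append(SO if want_shift else SI)
            shifted = want_shift
        out.append(b & 0x7F)  # drop the high bit
    if shifted:
        out.append(SI)  # leave the stream in the unshifted state
    return bytes(out)

def to_8bit(data: bytes) -> bytes:
    """Reverse to_7bit: set the high bit on bytes 0x20-0x7F seen after SO."""
    out = bytearray()
    shifted = False
    for b in data:
        if b == SO:
            shifted = True
        elif b == SI:
            shifted = False
        elif b < 0x20:
            out.append(b)
        else:
            out.append(b | 0x80 if shifted else b)
    return bytes(out)

print(to_7bit(b"caf\xe9"))
```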

477-479: A necessary extra character for the DEL character, 7F HEX or 01111111 BIN (needed to punch out all the holes on a paper tape and erase it). This large number of codes was desirable at the time, as multi-byte controls would require implementation of a state machine in the terminal, which was very difficult with contemporary electronics and mechanical terminals. Only a few codes have maintained their use: BEL, ESC, and

530-548: A proposal to form a new global standards body. In October 1946, ISA and UNSCC delegates from 25 countries met in London and agreed to join forces to create the International Organization for Standardization. The organization officially began operations on 23 February 1947. ISO Standards were originally known as ISO Recommendations ( ISO/R ), e.g., " ISO 1 " was issued in 1951 as "ISO/R 1". ISO

583-436: A relatively small number of standards, ISO standards are not available free of charge, but rather for a purchase fee, which has been seen by some as unaffordable for small open-source projects. The process of developing standards within ISO was criticized around 2007 as being too difficult for timely completion of large and complex standards, and some members were failing to respond to ballots, causing problems in completing

636-637: Is "to develop worldwide Information and Communication Technology (ICT) standards for business and consumer applications." There was previously also a JTC 2 that was created in 2009 for a joint project to establish common terminology for "standardization in the field of energy efficiency and renewable energy sources". It was later disbanded. As of 2022, there are 167 national members representing ISO in their country, with each country having only one member. ISO has three membership categories. Participating members are called "P" members, as opposed to observing members, who are called "O" members. ISO

689-466: Is a voluntary organization whose members are recognized authorities on standards, each one representing one country. Members meet annually at a General Assembly to discuss the strategic objectives of ISO. The organization is coordinated by a central secretariat based in Geneva . A council with a rotating membership of 20 member bodies provides guidance and governance, including setting the annual budget of

742-464: Is abused, ISO should halt the process... ISO is an engineering old boys club and these things are boring so you have to have a lot of passion ... then suddenly you have an investment of a lot of money and lobbying and you get artificial results. The process is not set up to deal with intensive corporate lobbying and so you end up with something being a standard that is not clear. International Workshop Agreements (IWAs) are documents that establish

795-517: Is an abbreviation for "International Standardization Organization" or a similar title in another language, the letters do not officially represent an acronym or initialism . The organization provides this explanation of the name: Because 'International Organization for Standardization' would have different acronyms in different languages (IOS in English, OIN in French), our founders decided to give it

848-521: Is approved as an International Standard (IS) if a two-thirds majority of the P-members of the TC/SC is in favour and not more than one-quarter of the total number of votes cast are negative. After approval, the document is published by the ISO central secretariat , with only minor editorial changes introduced in the publication process before the publication as an International Standard. Except for
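The two approval thresholds stated here can be written as a small check. This is only a sketch: the function and parameter names are hypothetical, and it assumes abstentions are excluded from both counts, which the text above does not specify.

```python
def passes_ballot(p_in_favour: int, p_against: int,
                  total_votes_cast: int, total_negative: int) -> bool:
    """Apply the two-thirds and one-quarter criteria using integer arithmetic."""
    p_votes = p_in_favour + p_against  # assumption: abstentions are not counted
    if p_votes == 0 or total_votes_cast == 0:
        return False
    two_thirds_in_favour = 3 * p_in_favour >= 2 * p_votes
    at_most_quarter_negative = 4 * total_negative <= total_votes_cast
    return two_thirds_in_favour and at_most_quarter_negative

print(passes_ballot(20, 10, 40, 10))  # exactly two-thirds, exactly one-quarter
print(passes_ballot(19, 11, 40, 10))  # below two-thirds
```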

901-404: Is called "WE8MSWIN1252" by Oracle Database . Starting in the 1990s, many Microsoft products that could produce HTML included Windows-1252-exclusive characters, but marked the encoding as ISO-8859-1, ASCII, or undeclared. Characters exclusive to Windows-1252 would render incorrectly on non-Windows operating systems (often as question marks). In particular, typographers' quotes—curly variants of

954-522: Is funded by a combination of: International standards are the main products of ISO. It also publishes technical reports, technical specifications, publicly available specifications, technical corrigenda (corrections), and guides. International standards Technical reports For example: Technical and publicly available specifications For example: Technical corrigenda ISO guides For example: ISO documents have strict copyright restrictions and ISO charges for most copies. As of 2020 ,

1007-507: Is now required by the HTML5 specification. Undeclared charsets in HTML are also assumed to be Windows-1252. Although Windows NT supported Unicode and attempted to encourage programs to use it, it only provided the 16-bit code units of UCS-2/UTF-16, despite the existing support for other multibyte character encodings. As many applications preferred to use 8-bit strings, Windows-1252 remained

1060-425: Is produced, for example, for audio and video coding standards is called a verification model (VM) (previously also called a "simulation and test model"). When sufficient confidence in the stability of the standard under development is reached, a working draft (WD) is produced. This is in the form of a standard, but is kept internal to the working group for revision. When a working draft is sufficiently mature and

1113-617: Is restricted. The organization that is known today as ISO began in 1926 as the International Federation of the National Standardizing Associations ( ISA ), which primarily focused on mechanical engineering . The ISA was suspended in 1942 during World War II but, after the war, the ISA was approached by the recently-formed United Nations Standards Coordinating Committee (UNSCC) with

1166-498: Is the variant of Windows-1252 used by Palm OS 3.3 onward for English and several other locales. Python gives it the palmos label, describing it as the encoding for Palm OS 3.5. Differences from Windows-1252 have their Unicode code point. ISO Early research and development: Merging the networks and creating the Internet: Commercialization, privatization, broader access leads to

1219-655: The International Electrotechnical Commission . It is headquartered in Geneva , Switzerland. The three official languages of ISO are English , French , and Russian . The International Organization for Standardization in French is Organisation internationale de normalisation and in Russian, Международная организация по стандартизации ( Mezhdunarodnaya organizatsiya po standartizatsii ). Although one might think ISO

1272-457: The Unix info format and Python 's splitlines string method. The names of some codes were changed in ISO 6429:1992 (or ECMA-48:1991) to be neutral with respect to writing direction. The abbreviations used were not changed, as the standard had already specified that those would remain unchanged when the standard is translated to other languages. In this table both new and old names are shown for
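Python's `str.splitlines` treats the C0 information separators FS, GS and RS (0x1C to 0x1E) as line boundaries, but not US (0x1F). The short sketch below shows this; the sample strings are made up for illustration.

```python
# FS (0x1C), GS (0x1D) and RS (0x1E) split the string; US (0x1F) does not.
text = "alpha\x1cbeta\x1dgamma\x1edelta\x1fepsilon"
parts = text.splitlines()
print(parts)

# bytes.splitlines only recognises CR, LF and CRLF, so FS is left in place.
raw_parts = b"alpha\x1cbeta\nGamma".splitlines()
print(raw_parts)
```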

1325-436: The "Format effector " (FE n ) characters BS, TAB, LF, VT, FF, and CR. Others are unused or have acquired different meanings such as NUL being the C string terminator . Some data transfer protocols such as ANPA-1312 , Kermit , and XMODEM do make extensive use of SOH, STX, ETX, EOT, ACK, NAK and SYN for purposes approximating their original definitions; and some file formats use the "Information Separators" (IS n ) such as

1378-430: The 8-bit forms of these codes were almost never used. CSI, DCS and OSC are used to control text terminals and terminal emulators, but almost always by using their 7-bit escape code representations. Nowadays, if these codes are encountered, it is far more likely that they are intended to be printing characters from that position of Windows-1252 or Mac OS Roman. Except for NEL, Unicode does not provide

1431-577: The C0 and C1 sets. The standard C0 control character set shown above is chosen with the sequence ESC ! @ and the above C1 set chosen with the sequence ESC " C. Several official and unofficial alternatives have been defined, but these are now largely obsolete. Most were forced to retain a good deal of compatibility with the ASCII controls for interoperability. The standard makes ESC, SP and DEL "fixed" coded characters, which are available in their ASCII locations in all encodings that conform to

1484-507: The C0 format controls HT, LF, VT, FF, and CR (note BS is missing); the C0 information separators FS, GS, RS, US (and SP); and the C1 control NEL. The rest of the codes are transparent to Unicode and their meanings are left to higher-level protocols, with ISO/IEC 6429 suggested as a default. Unicode includes many additional format effector characters besides these, such as marks, embeds, isolates and pops for explicit bidirectional formatting, and

1537-491: The case of MPEG, the Moving Picture Experts Group ). A working group (WG) of experts is typically set up by the subcommittee for the preparation of a working draft (e.g., MPEG is a collection of seven working groups as of 2023). When the scope of a new work is sufficiently clarified, some of the working groups may make an open request for proposals—known as a "call for proposals". The first document that

1590-418: The central secretariat. The technical management board is responsible for more than 250 technical committees , who develop the ISO standards. ISO has a joint technical committee (JTC) with the International Electrotechnical Commission (IEC) to develop standards relating to information technology (IT). Known as JTC 1 and entitled "Information technology", it was created in 1987 and its mission

1643-495: The code page has never been an ANSI standard. Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community." LaTeX can input Windows-1252 by using inputenc.sty with parameter ansinew (and more recently cp1252). IBM uses code page 1252 (CCSID 1252 and euro sign extended CCSID 5348) for Windows-1252. It

1696-421: The confidence people have in the standards setting process", and alleged that ISO did not carry out its responsibility. He also said that Microsoft had intensely lobbied many countries that traditionally had not participated in ISO and stacked technical committees with Microsoft employees, solution providers, and resellers sympathetic to Office Open XML: When you have a process built on trust and when that trust

1749-413: The document, the draft is then approved for submission as a Final Draft International Standard (FDIS) if a two-thirds majority of the P-members of the TC/SC are in favour and if not more than one-quarter of the total number of votes cast are negative. ISO will then hold a ballot among the national bodies where no technical changes are allowed (a yes/no final approval ballot), within a period of two months. It

1802-666: The immediate right of the character, shows the Unicode code point name and the decimal Alt code. According to the information on Microsoft's and the Unicode Consortium's websites, positions 81, 8D, 8F, 90, and 9D are unused; however, the Windows API MultiByteToWideChar maps these to the corresponding C1 control codes. The "best fit" mapping documents this behavior, too. The OS/2 operating system supports an encoding by
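Python's built-in `cp1252` codec follows the stricter reading and leaves these five positions undefined. The sketch below shows that, together with a hypothetical helper, `decode_like_best_fit`, that imitates the C1 mapping described above; it does not call the Windows API and is only an approximation of that behavior.

```python
UNUSED = [0x81, 0x8D, 0x8F, 0x90, 0x9D]

def decode_strict(byte: int):
    """Decode one byte with Python's cp1252 codec; return None if undefined."""
    try:
        return bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        return None

def decode_like_best_fit(data: bytes) -> str:
    """Decode as cp1252, mapping the five unused bytes to same-valued C1 code points."""
    return "".join(
        chr(b) if b in UNUSED else bytes([b]).decode("cp1252")
        for b in data
    )

undefined = [b for b in range(256) if decode_strict(b) is None]
print([hex(b) for b in undefined])
print(ascii(decode_like_best_fit(b"\x81\x80")))
```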

1855-520: The modern Internet: Examples of Internet services: The International Organization for Standardization ( ISO / ˈ aɪ s oʊ / ) is an independent, non-governmental , international standard development organization composed of representatives from the national standards organizations of member countries. Membership requirements are given in Article 3 of the ISO Statutes. ISO

1908-493: The most popular encoding on Windows even after it added support for UTF-16. Unicode support in Windows has improved over time, with UTF-8 support available starting in Windows 10. The following table shows Windows-1252. Differences from ISO-8859-1 have the Unicode code point number below the character, based on the Unicode.org mapping of Windows-1252 with "best fit". A tooltip, generally available only when one points to

1961-535: The multi-byte character encoding UTF-8, as of July 2024 1.2% of websites declared ISO 8859-1, which is treated as Windows-1252 by all modern browsers (as demanded by the HTML5 standard), plus 0.3% declared Windows-1252 directly, for a total of 1.5%. Some countries or languages show higher usage than the global average: in 2024, usage among websites was 3.4% in Brazil and 2.7% in Germany (these are

2014-466: The name of Code page 1004 (CCSID 1004) or "Windows Extended". This mostly matches code page 1252, with the exception of certain C0 control characters being replaced by diacritic characters. There is a rarely used, but useful, graphics extended code page 1252 where codes 0x00 to 0x1F allow for box drawing as used in applications such as MS-DOS Edit and CodeView. One of the applications to use this code page

2067-721: The necessary steps within the prescribed time limits. In some cases, alternative processes have been used to develop standards outside of ISO and then submit them for its approval. A more rapid "fast-track" approval procedure was used in ISO/IEC JTC 1 for the standardization of Office Open XML (OOXML, ISO/IEC 29500, approved in April 2008), and another rapid alternative "publicly available specification" (PAS) process had been used by OASIS to obtain approval of OpenDocument as an ISO/IEC standard (ISO/IEC 26300, approved in May 2006). As

2120-492: The next stage, called the "enquiry stage". After a consensus to proceed is established, the subcommittee will produce a draft international standard (DIS), and the text is submitted to national bodies for voting and comment within a period of five months. A document in the DIS stage is available to the public for purchase and may be referred to with its ISO DIS reference number. Following consideration of any comments and revision of

2173-411: The preparation of working drafts. Subcommittees may have several working groups, which may have several Sub Groups (SG). It is possible to omit certain stages if there is a document with a certain degree of maturity at the start of a standardization project, for example, a standard developed by another organization. ISO/IEC directives also allow the so-called "Fast-track procedure". In this procedure,

2226-437: The renamed controls (the old name is the one matching the abbreviation). Unicode provides Control Pictures that can replace C0 control characters to make them visible on screen. However, caret notation is used more often. Teletype used these for the paper tape reader and the paper tape punch. The first use became the de facto standard for software flow control. In 1973, ECMA-35 and ISO 2022 attempted to define
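Both ways of making C0 controls visible reduce to simple arithmetic on the code value. The sketch below uses hypothetical helper names and relies on the Unicode Control Pictures block starting at U+2400, with DEL at U+2421, and on the usual caret convention of flipping bit 0x40.

```python
def control_picture(code: int) -> str:
    """Return the Unicode Control Picture for a C0 control code or DEL."""
    if 0x00 <= code <= 0x1F:
        return chr(0x2400 + code)  # U+2400 is SYMBOL FOR NULL
    if code == 0x7F:
        return "\u2421"            # SYMBOL FOR DELETE
    raise ValueError("not a C0 control code or DEL")

def caret_notation(code: int) -> str:
    """Return caret notation, e.g. ^[ for ESC and ^? for DEL."""
    if 0x00 <= code <= 0x1F or code == 0x7F:
        return "^" + chr(code ^ 0x40)
    raise ValueError("not a C0 control code or DEL")

for code in (0x00, 0x07, 0x1B, 0x7F):
    print(hex(code), control_picture(code), caret_notation(code))
```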

2279-472: The short form ISO . ISO is derived from the Greek word isos ( ίσος , meaning "equal"). Whatever the country, whatever the language, the short form of our name is always ISO . During the founding meetings of the new organization, however, the Greek word explanation was not invoked, so this meaning may be a false etymology . Both the name ISO and the ISO logo are registered trademarks and their use

2332-475: The standard straight apostrophes and quotation marks in US-ASCII—were commonly used in files produced in Windows applications such as Microsoft Word due to the smart quotes feature, which can automatically convert straight apostrophes and quotation marks to the curly variants. To fix this, by 2000 most web browsers and e-mail clients treated the charsets ISO-8859-1 and US-ASCII as Windows-1252—this behavior
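The relabelling described here can be sketched as a small decoding helper. The function name and the label set are hypothetical and cover only the two charsets named in the text; real browsers follow a much longer label table.

```python
# Declared labels that this sketch treats as Windows-1252 instead.
TREAT_AS_WINDOWS_1252 = {"iso-8859-1", "us-ascii"}

def decode_declared(data: bytes, declared: str) -> str:
    """Decode bytes by declared charset, substituting cp1252 for legacy labels."""
    label = declared.strip().lower()
    if label in TREAT_AS_WINDOWS_1252:
        label = "cp1252"
    # errors="replace" turns undefined bytes into U+FFFD instead of raising.
    return data.decode(label, errors="replace")

# Smart quotes written by a Windows application but labelled ISO-8859-1.
page = b"\x93quoted\x94"
print(ascii(decode_declared(page, "ISO-8859-1")))
print(ascii(page.decode("latin-1")))  # a strict reading yields C1 controls
```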

2385-443: The standard. It also specifies that if a C0 set included transmission control (TC n ) codes, they must be encoded at their ASCII locations and could not be put in a C1 set, and any new transmission controls must be in a C1 set. Unicode reserves the 65 code points described above for compatibility with the C0 and C1 control codes, giving them the general category Cc (control). These are: Unicode only specifies semantics for
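The count of 65 code points in general category Cc can be checked with Python's `unicodedata` module: 32 C0 codes, DEL, and 32 C1 codes. A minimal sketch:

```python
import unicodedata

# Collect every code point whose general category is Cc (control).
control_points = [
    cp for cp in range(0x110000)
    if unicodedata.category(chr(cp)) == "Cc"
]

# C0 (0x00-0x1F), DEL (0x7F), and C1 (0x80-0x9F).
expected = list(range(0x00, 0x20)) + [0x7F] + list(range(0x80, 0xA0))

print(len(control_points))
print(control_points == expected)
```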

2438-509: The subcommittee is satisfied that it has developed an appropriate technical document for the problem being addressed, it becomes a committee draft (CD) and is sent to the P-member national bodies of the SC for the collection of formal comments. Revisions may be made in response to the comments, and successive committee drafts may be produced and circulated until consensus is reached to proceed to

2491-565: The sums of ISO-8859-1 and CP-1252 declarations). It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252". Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far the most popular code page so named in Microsoft Windows parlance,

2544-697: The text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received. C0 codes are the range 00 HEX –1F HEX and the default C0 set was originally defined in ISO 646 ( ASCII ). C1 codes are the range 80 HEX –9F HEX and the default C1 set was originally defined in ECMA-48 (harmonized later with ISO 6429). The ISO/IEC 2022 system of specifying control and graphic characters allows other C0 and C1 sets to be available for specialized applications, but they are rarely used. ASCII defined 32 control characters, plus

2597-414: The typical cost of a copy of an ISO standard is about US$120 or more (and electronic copies typically have a single-user license, so they cannot be shared among groups of people). Some standards by ISO and its official U.S. representative (and, via the U.S. National Committee, the International Electrotechnical Commission) are made freely available. A standard published by ISO/IEC is the last stage of

2650-715: Was DIN 31626, a specialised set for bibliographic use which was registered in 1979. The more common general-use ISO/IEC 6429 set was registered in 1983, although the ECMA-48 specification upon which it was based had been first published in 1976 and JIS X 0211 (formerly JIS C 6323). Symbolic names defined by RFC 1345 and early drafts of ISO 10646, but not in ISO/IEC 6429 (PAD, HOP and SGC) are also used. Except for SS2 and SS3 in EUC-JP text, and NEL in text transcoded from EBCDIC,

2703-468: Was an Intel Corporation Install/Recovery disk image utility from mid/late 1995. These programs were written for its P6 User Test Program machines (US example). It was used exclusively in its then EMEA region (Europe, Middle East & Africa). In time the programs were changed to use code page 850. Each Palm OS device supports a single language and a single character encoding, depending on its locale. For languages such as English and French, Palm OS uses

2756-608: Was founded on 23 February 1947, and (as of July 2024 ) it has published over 25,000 international standards covering almost all aspects of technology and manufacturing. It has over 800 technical committees (TCs) and subcommittees (SCs) to take care of standards development. The organization develops and publishes international standards in technical and nontechnical fields, including everything from manufactured products and technology to food safety, transport, IT, agriculture, and healthcare. More specialized topics like electrical and electronic engineering are instead handled by

2809-517: Was suggested at the time by Martin Bryan, the outgoing convenor (chairman) of working group 1 (WG1) of ISO/IEC JTC 1/SC 34 , the rules of ISO were eventually tightened so that participating members that fail to respond to votes are demoted to observer status. The computer security entrepreneur and Ubuntu founder, Mark Shuttleworth , was quoted in a ZDNet blog article in 2008 about the process of standardization of OOXML as saying: "I think it de-values
