Cawang is an administrative village ( kelurahan in Indonesian ) at Kramat Jati subdistrict , East Jakarta . The borders of Cawang are:
61-581: The postal code of this administrative village is 13630. The name Cawang derived from the name of a Malay lieutenant who served the Dutch, called Encik Awang. The name of Enci Awang eventually turned into Cawang. Awang was subordinate of Lieutenant Captain Encik Wan Abdul Bagus, who along with his team lived in the area now known as Kampung Melayu, south of Jatinegara . According another sources, Cawang name originated from Cai Wang Hui (蔡汪惠),
122-464: A dropout color which can be easily removed by the OCR system. Palm OS used a special set of glyphs, known as Graffiti , which are similar to printed English characters but simplified or modified for easier recognition on the platform's computationally limited hardware. Users would need to learn how to write these special glyphs. Zone-based OCR restricts the image to a specific part of a document. This
183-647: A postcode , post code , PIN or ZIP Code ) is a series of letters or digits or both, sometimes including spaces or punctuation, included in a postal address for the purpose of sorting mail . As of August 2021, the Universal Postal Union lists 160 countries which require the use of a postal code. Although postal codes are usually assigned to geographical areas, special codes are sometimes assigned to individual addresses or to institutions that receive large volumes of mail, such as government agencies and large commercial companies. One example
244-452: A "Statistical Machine" for searching microfilm archives using an optical code recognition system. In 1931, he was granted US Patent number 1,838,389 for the invention. The patent was acquired by IBM . In 1974, Ray Kurzweil started the company Kurzweil Computer Products, Inc. and continued development of omni- font OCR, which could recognize text printed in virtually any font. (Kurzweil is often credited with inventing omni-font OCR, but it
305-464: A Chinese merchant who escaped from the Batavia massacre which that time took place in Batavia city centre and built his residence there which would become known as Cawang like today. This Jakarta location article is a stub . You can help Misplaced Pages by expanding it . Postal code A postal code (also known locally in various English-speaking countries throughout the world as
366-669: A certain code. See also postcode lottery . In Brazil the 8-digit postcodes are an evolution of the five-digit area postal codes. In the 1990s the Brazilian five-digit postal code (illustrated), DDDDD , received a three-digit suffix DDDDD-SSS , but this suffix is not directly related to the administrative district hierarchy. The suffix was created only for logistic reasons. The postal code assignment can be assigned to individual land lots in some special cases – in Brazil, they are named "large receivers" and receive suffixes 900–959. It
427-439: A character error rate of 1% (99% accuracy) may result in an error rate of 5% or worse if the measurement is based on whether each whole word was recognized with no incorrect letters. Using a large enough dataset is important in a neural-network-based handwriting recognition solutions. On the other hand, producing natural datasets is very complicated and time-consuming. An example of the difficulties inherent in digitizing old text
488-433: A letter. The outward postcode and the leading numeric of the inward postcode in combination forms a postal sector, and this usually corresponds to a couple of thousand properties. Optical character recognition Optical character recognition or optical character reader ( OCR ) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from
549-469: A list of optical character recognition software, see Comparison of optical character recognition software . OCR accuracy can be increased if the output is constrained by a lexicon – a list of words that are allowed to occur in a document. This might be, for example, all the words in the English language, or a more technical lexicon for a specific field. This technique can be problematic if
610-559: A noun, for example, allowing greater accuracy. The Levenshtein Distance algorithm has also been used in OCR post-processing to further optimize results from an OCR API. In recent years, the major OCR technology providers began to tweak OCR systems to deal more efficiently with specific types of input. Beyond an application-specific lexicon, better performance may be had by taking into account business rules, standard expression, or rich information contained in color images. This strategy
671-442: A numeric postcode format of four or five digits is commonly used) the numeric postal code is sometimes prefixed with a country code when sending international mail to that country. Postal services have their own formats and placement rules for postal codes. In most English-speaking countries, the postal code forms the last item of the address, following the city or town name, whereas in most continental European countries it precedes
SECTION 10
#1732772076260732-478: A quarter of a mile (400 metres) apart. The structure is alphanumeric, with the following six valid formats, as defined by BS 7666: There are always two halves: the separation between outward and inward postcodes is indicated by one space. The outward postcode covers a unique area and has two parts which may in total be two, three or four characters in length. A postcode area of one or two letters, followed by one or two numbers, followed in some parts of London by
793-399: A ranked list of candidate characters. Software such as Cuneiform and Tesseract use a two-pass approach to character recognition. The second pass is known as adaptive recognition and uses the letter shapes recognized with high confidence on the first pass to better recognize the remaining letters on the second pass. This is advantageous for unusual fonts or low-quality scans where the font
854-650: A result of a new postcode system being introduced. Some postal code systems, like those of Ecuador and Costa Rica , show an exact agreement with the hierarchy of administrative divisions . Format of six digit numeric (eight digit alphanumeric) postal codes in Ecuador , introduced in December 2007: ECAABBCC Format of five digit numeric Postal codes in Costa Rica , introduced in 2007: ABBCC In Costa Rica these codes were originally used as district identifiers by
915-471: A scanned document, a photo of a document, a scene photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast). Widely used as a form of data entry from printed paper data records – whether passport documents, invoices, bank statements , computerized receipts, business cards, mail, printed data, or any suitable documentation – it
976-411: A single character) – are still the subject of active research. The MNIST database is commonly used for testing systems' ability to recognize handwritten digits. Accuracy rates can be measured in several ways, and how they are measured can greatly affect the reported accuracy rate. For example, if word context (a lexicon of words) is not used to correct software finding non-existent words,
1037-575: A single exception, these codes are in the format: ANN XXXX The single exception is the Dublin D6W postal district. It is the only routing key area in the country that takes the format ANA instead of ANN: D6W XXXX While it is not intended to replace addresses, in theory simply providing a seven-character Eircode would locate any Irish delivery address. For example, the Irish Parliament Dáil Éireann is: D02 A272 Postal codes in
1098-451: A time. Advanced systems capable of producing a high degree of accuracy for most fonts are now common, and with support for a variety of image file format inputs. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components. Early optical character recognition may be traced to technologies involving telegraphy and creating reading devices for
1159-403: A two digit numeric code can represent 100 locations, a two character alphanumeric code using ten numbers and twenty letters can represent 900 locations. The independent nations using alphanumeric postal code systems are: Countries which prefix their postal codes with a fixed group of letters, indicating a country code, include Andorra , Azerbaijan , Barbados , Ecuador and Saint Vincent and
1220-444: A unique, six-digit postal code . For domestic properties, an individual postcode may cover up to 100 properties in contiguous proximity (e.g. a short section of a populous road, or a group of less populous neighbouring roads). The postcode together with the number or name of a property is not always unique, particularly in rural areas. For example, GL20 8NX/1 might refer to either 1 Frampton Cottages or 1 Frampton Farm Cottages, roughly
1281-479: Is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing , machine translation , (extracted) text-to-speech , key data and text mining . OCR is a field of research in pattern recognition , artificial intelligence and computer vision . Early versions needed to be trained with images of each character, and worked on one font at
SECTION 20
#17327720762601342-428: Is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. For proportional fonts , more sophisticated techniques are needed because whitespace between letters can sometimes be greater than that between words, and vertical lines can intersect more than one character. There are two basic types of core OCR algorithm, which may produce
1403-600: Is an error to associate the postal code with the whole land lot area (illustrated). A postal code is often related to a land lot , but postal codes are usually related to access points on streets. Small or middle-sized houses, in general, only have a single main gate, which is the delivery point. Parks, large businesses such as shopping centers and big houses, may have more than one entrance and more than one delivery point. Czechoslovakia introduced Postal Routing Numbers (PSČ – poštovní směrovací čísla) in 1973. The code consists of 5 digits formatted into two groups: NNN NN. Originally,
1464-548: Is called "Application-Oriented OCR" or "Customized OCR", and has been applied to OCR of license plates , invoices , screenshots , ID cards , driver's licenses , and automobile manufacturing . The New York Times has adapted the OCR technology into a proprietary tool they entitle Document Helper , that enables their interactive news team to accelerate the processing of documents that need to be reviewed. They note that it enables them to process what amounts to as many as 5,400 pages per hour in preparation for reporters to review
1525-413: Is distorted (e.g. blurred or faded). As of December 2016 , modern OCR software includes Google Docs OCR, ABBYY FineReader , and Transym. Others like OCRopus and Tesseract use neural networks which are trained to recognize whole lines of text instead of focusing on single characters. A technique known as iterative OCR automatically crops a document into sections based on the page layout. OCR
1586-411: Is generally an offline process, which analyses a static document. There are cloud based services which provide an online OCR API service. Handwriting movement analysis can be used as input to handwriting recognition . Instead of merely using the shapes of glyphs and words, this technique is able to capture motion, such as the order in which segments are drawn, the direction, and the pattern of putting
1647-486: Is often referred to as Template OCR . Crowdsourcing humans to perform the character recognition can quickly process images like computer-driven OCR, but with higher accuracy for recognizing images than that obtained via computers. Practical systems include the Amazon Mechanical Turk and reCAPTCHA . The National Library of Finland has developed an online interface for users to correct OCRed texts in
1708-529: Is the French CEDEX system. There are a number of synonyms for postal code; some are country-specific: The development of postal codes happened first in large cities. Postal codes began with postal district numbers (or postal zone numbers) within large cities. London was first subdivided into 10 districts in 1857 (EC (East Central), WC (West Central), N, NE, E, SE, S, SW, W, and NW), four were created to cover Liverpool in 1864; and Manchester / Salford
1769-650: Is the French Cedex system. Before postal codes as described here were used, large cities were often divided into postal zones or postal districts, usually numbered from 1 upwards within each city. The newer postal code systems often incorporate the old zone numbers, as with London postal district numbers, for example. Ireland still uses postal district numbers in Dublin . In New Zealand, Auckland , Wellington and Christchurch were divided into postal zones, but these fell into disuse, and have now become redundant as
1830-434: Is the inability of OCR to differentiate between the " long s " and "f" characters. Web-based OCR systems for recognizing hand-printed text on the fly have become well known as commercial products in recent years (see Tablet PC history ). Accuracy rates of 80% to 90% on neat, clean hand-printed characters can be achieved by pen computing software, but that accuracy rate still translates to dozens of errors per page, making
1891-572: Is then performed on each section individually using variable character confidence level thresholds to maximize page-level OCR accuracy. A patent from the United States Patent Office has been issued for this method. The OCR result can be stored in the standardized ALTO format, a dedicated XML schema maintained by the United States Library of Congress . Other common formats include hOCR and PAGE XML. For
Cawang, Kramat Jati, East Jakarta - Misplaced Pages Continue
1952-454: The Amount line of a check (which is always a written-out number) is an example where using a smaller dictionary can increase recognition rates greatly. The shapes of individual cursive characters themselves simply do not contain enough information to accurately (greater than 98%) recognize all handwritten cursive script. Most programs allow users to set "confidence rates". This means that if
2013-817: The National Institute of Statistics and Census of Costa Rica and the Administrative Territorial Division , and continue to be equivalent. The first two digits of the postal codes in Turkey correspond to the provinces and each province has assigned only one number. They are the same for them as in ISO 3166-2:TR . The first two digits of the postal codes in Vietnam indicate a province . Some provinces have one, other have several two digit numbers assigned. The numbers differ from
2074-749: The Northwest Territories and Nunavut share the letter X. The first two digits of the postal codes in Germany define areas independently of administrative regions. The coding space of the first digit is fully used (0–9); that of the first two combined is utilized to 89%, i.e. there are 89 postal zones defined. Zone 11 is non-geographic. Royal Mail designed the postal codes in the United Kingdom mostly for efficient distribution. Nevertheless, people associated codes with certain areas, leading to some people wanting or not wanting to have
2135-469: The 2000s, OCR was made available online as a service (WebOCR), in a cloud computing environment, and in mobile applications like real-time translation of foreign-language signs on a smartphone . With the advent of smartphones and smartglasses , OCR can be used in internet connected mobile device applications that extract text captured using the device's camera. These devices that do not have built-in OCR functionality will typically use an OCR API to extract
2196-606: The Grenadines . ISO 3166-1 alpha-2 country codes were recommended by the European Committee for Standardization as well as the Universal Postal Union to be used in conjunction with postal codes starting in 1994, but they have not become widely used. Andorra , Azerbaijan , Barbados , Ecuador , Latvia and Saint Vincent and the Grenadines use the ISO 3166-1 alpha-2 as a prefix in their postal codes. In some countries (such as in continental Europe , where
2257-645: The Netherlands , known as postcodes, are alphanumeric, consisting of four digits followed by a space and two letters (NNNN AA). Adding the house number to the postcode will identify the address, making the street name and town name redundant. For example: 2597 GV 75 will direct a postal delivery to Theo Mann-Bouwmeesterlaan 75, 's-Gravenhage (the International School of The Hague). Since 1 September 1995, every building in Singapore has been given
2318-463: The blind. In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. Concurrently, Edmund Fournier d'Albe developed the Optophone , a handheld scanner that when moved across a printed page, produced tones that corresponded to specific letters or characters. In the late 1920s and into the 1930s, Emanuel Goldberg developed what he called
2379-828: The contents. There are several techniques for solving the problem of character recognition by means other than improved OCR algorithms. Special fonts like OCR-A , OCR-B , or MICR fonts, with precisely specified sizing, spacing, and distinctive character shapes, allow a higher accuracy rate during transcription in bank check processing. Several prominent OCR engines were designed to capture text in popular fonts such as Arial or Times New Roman, and are incapable of capturing text in these fonts that are specialized and very different from popularly used fonts. As Google Tesseract can be trained to recognize new fonts, it can recognize OCR-A, OCR-B and MICR fonts. Comb fields are pre-printed boxes that encourage humans to write more legibly – one glyph per box. These are often printed in
2440-406: The document contains words not in the lexicon, like proper nouns . Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy. The output stream may be a plain text stream or file of characters, but more sophisticated OCR systems can preserve the original layout of the page and produce, for example, an annotated PDF that includes both the original image of
2501-435: The first group marked a district transport centre, the second group represented the order of post offices on the collection route. In the first group, the first digit corresponds partly with the region, the second digit meant a collection transport node (sběrný přepravní uzel, SPU) and the third digit a "district transport node" (okresní přepravní uzel). However, processing was later centralized and mechanized while codes remained
Cawang, Kramat Jati, East Jakarta - Misplaced Pages Continue
2562-465: The following letters only: A, C, D, E, F, H, K, N, P, R, T, V, W, X, Y. This serves to avoid confusion in OCR, and to avoid accidental double-entendres by avoiding the creation of word lookalikes, as Eircode's last four characters are random. Most of the postal code systems are numeric; only a few are alphanumeric (i.e., use both letters and digits). Alphanumeric systems can, given the same number of characters, encode many more locations. For example, while
2623-532: The idea of extending the postal district or zone numbering plans beyond large cities to cover even small towns and rural locales had started. These developed into postal codes as they are defined today. The name of US postal codes, "ZIP Codes", reflects this evolutionary growth from a zone plan to a zone improvement plan, "ZIP". Modern postal codes were first introduced in the Ukrainian Soviet Socialist Republic in December 1932, but
2684-594: The leaders of the National Federation of the Blind . In 1978, Kurzweil Computer Products began selling a commercial version of the optical character recognition computer program. LexisNexis was one of the first customers, and bought the program to upload legal paper and news documents onto its nascent online databases. Two years later, Kurzweil sold his company to Xerox , which eventually spun it off as Scansoft , which merged with Nuance Communications . In
2745-576: The letters 'F', 'I', 'O', 'Q', 'U' and 'Y' for technical reasons. But as almost all existing combinations are now used, these letters were allowed for new locations starting 2005. The letter combinations "SS" ( Schutzstaffel ), "SD" ( Sicherheitsdienst ), and "SA" ( Sturmabteilung ) are not used, due to links with the Nazi occupation in World War II . Postal codes in Canada do not include
2806-414: The letters D, F, I, O, Q, or U, as the optical character recognition (OCR) equipment used in automated sorting could easily confuse them with other letters and digits. The letters W and Z are used, but are not currently used as the first letter. The Canadian Postal Codes use alternate letters and numbers (with a space after the third character) in this format: A9A 9A9 In Ireland, the eircode system uses
2867-722: The most authoritative of the Annual Test of OCR Accuracy from 1992 to 1996. Recognition of typewritten, Latin script text is still not 100% accurate even where clear imaging is available. One study based on recognition of 19th- and early 20th-century newspaper pages concluded that character-by-character OCR accuracy for commercial OCR software varied from 81% to 99%; total accuracy can be achieved by human review or Data Dictionary Authentication. Other areas – including recognition of hand printing, cursive handwriting, and printed text in other scripts (especially those East Asian language characters which have many strokes for
2928-695: The name of the city or town. When it follows the city, it may be on the same line or on a new line. In Belarus , Kyrgyzstan , Russia and Turkmenistan , it is written at the beginning of an address. In Japan , it is written at the start of the address when written in Japanese, but at the end when the address is written in the Latin alphabet. Postal codes are usually assigned to geographical areas. Sometimes codes are assigned to individual addresses or to institutions that receive large volumes of mail, e.g. government agencies or large commercial companies. One example
2989-522: The new postal code system launched in 2015, known as Eircode provides a unique 7-character alphanumerical code for each individual address. The first three digits are the routing key, which is a postal district and the last four characters are a unique identifier that relates to an individual address (business, house or apartment). A fully developed API is also available for integrating the Eircode database into business databases and logistics systems. With
3050-600: The number of a neighbouring department. The first digit of the postal codes in the United States comprises discrete states. From the first three digits one can infer the state, with a few exceptions where an area is served by a central office in an adjacent state. Similarly, in Canada , the first letter indicates the province or territory, although the provinces of Quebec and Ontario are divided into several lettered sub-regions (e.g. H for Montreal and Laval ), and
3111-462: The number used in ISO 3166-2:VN . In France the numeric code for the departments is used as the first two digits of the postal code, except for the two departments in Corsica that have codes 2A and 2B and use 20 as postal code. Furthermore, the codes are only the codes for the department in charge of delivery of the post, so it can be that a location in one department has a postal code starting with
SECTION 50
#17327720762603172-406: The page and a searchable textual representation. Near-neighbor analysis can make use of co-occurrence frequencies to correct errors, by noting that certain words are often seen together. For example, "Washington, D.C." is generally far more common in English than "Washington DOC". Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or
3233-416: The pen down and lifting it. This additional information can make the process more accurate. This technology is also known as "online character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition". OCR software often pre-processes images to improve the chances of successful recognition. Techniques include: Segmentation of fixed-pitch fonts
3294-533: The same. After separation, Slovakia and the Czech Republic kept the system. Codes with an initial digit of 1, 2, 3, 4, 5, 6, or 7 are used in the Czech Republic, while codes with an initial digit of 8, 9, or 0 are used in Slovakia A code corresponds to a local postal office. However, some larger companies or organizations have their own post codes. In 2004–2006, there were some efforts in Slovakia to reform
3355-576: The standardized ALTO format. Crowd sourcing has also been used not to perform character recognition directly but to invite software developers to develop image processing algorithms, for example, through the use of rank-order tournaments . Commissioned by the U.S. Department of Energy (DOE), the Information Science Research Institute (ISRI) had the mission to foster the improvement of automated technologies for understanding machine printed documents, and it conducted
3416-480: The system was abandoned in 1939. The next country to introduce postal codes was Germany in 1941, followed by Singapore in 1950, Argentina in 1958, the United States in 1963 and Switzerland in 1964. The United Kingdom began introducing its current system in Norwich in 1959, but it was not used nationwide until 1974. The characters used in postal codes are: Postal codes in the Netherlands originally did not use
3477-605: The system, to get separate post codes for every district of single postmen, but the change was not realized. Postal codes are known as Postal Index Numbers (PINs; sometimes as PIN codes) in India. The PIN system was introduced on 15 August 1972 by India Post. India uses a unique six-digit code as a geographical number to identify locations in India. The format of the PIN is ZSDPPP defined as follows: The first digit represents nine total zones: eight regional and one functional. In Ireland,
3538-458: The technology useful only in very limited applications. Recognition of cursive text is an active area of research, with recognition rates even lower than that of hand-printed text . Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognizing entire words from a dictionary is easier than trying to parse individual characters from script. Reading
3599-688: The text from the image file captured by the device. The OCR API returns the extracted text, along with information about the location of the detected text in the original image back to the device app for further processing (such as text-to-speech) or display. Various commercial and open source OCR systems are available for most common writing systems , including Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali (Bangla), Devanagari, Tamil, Chinese, Japanese, and Korean characters. OCR engines have been developed into software applications specializing in various subjects such as receipts, invoices, checks, and legal billing documents. The software can be used for: OCR
3660-402: Was in use by companies, including CompuScan, in the late 1960s and 1970s. ) Kurzweil used the technology to create a reading machine for blind people to have a computer read text to them out loud. The device included a CCD -type flatbed scanner and a text-to-speech synthesizer. On January 13, 1976, the finished product was unveiled during a widely reported news conference headed by Kurzweil and
3721-483: Was split into eight numbered districts in 1867/68. By World War I , such postal district or zone numbers also existed in various large European cities. They existed in the United States at least as late as the 1920s, possibly implemented at the local post office level only (for example, instances of "Boston 9, Mass" in 1920 are attested ) although they were evidently not used throughout all major US cities (implemented USPOD -wide) until World War II . By 1930 or earlier,
SECTION 60
#1732772076260#259740