Simple file verification (SFV) is a file format for storing CRC32 checksums of files so that their integrity can be verified. SFV is used to verify that a file has not been corrupted, but it does not otherwise verify the file's authenticity. The .sfv file extension is usually used for SFV files.
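As an illustration, the check an SFV utility performs can be sketched in a few lines of Python. This is a minimal sketch, not the behaviour of any particular tool: it assumes the common layout of one filename, whitespace, and an eight-digit hexadecimal CRC32 per line, with lines beginning with a semicolon treated as comments. The function names crc32_of_file, parse_sfv, and verify_sfv are hypothetical names chosen for this example.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC32 of a file as an unsigned 32-bit integer."""
    crc = 0
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def parse_sfv(text):
    """Parse SFV text into a list of (filename, expected_crc) pairs.

    Assumed layout: FILENAME <whitespace> CHECKSUM, one entry per line.
    Comment lines (starting with ';'), blank lines, and malformed lines
    are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        # Split on the last run of whitespace so filenames may contain spaces.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, checksum = parts
        try:
            entries.append((name, int(checksum, 16)))
        except ValueError:
            continue  # checksum field is not hexadecimal
    return entries


def verify_sfv(sfv_path):
    """Map each listed filename to True if its CRC32 matches, else False.

    Filenames are resolved relative to the directory holding the SFV file;
    a missing file counts as a failed check.
    """
    sfv_path = Path(sfv_path)
    results = {}
    for name, expected in parse_sfv(sfv_path.read_text()):
        target = sfv_path.parent / name
        results[name] = target.is_file() and crc32_of_file(target) == expected
    return results
```

Calling verify_sfv on the path of an .sfv file returns a dictionary of per-file results. A True result only indicates that the file is unlikely to have been corrupted accidentally; it says nothing about who produced the file.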
26-658: SFV could refer to: Simple file verification, a computer file checksum format; Simian foamy virus; San Fernando Valley; Suitable For Vegans; Street Fighter V, the fifth installment in the Street Fighter video game series; Swiss Football Association, governing body of football in Switzerland; National Property Board of Sweden, or Statens fastighetsverk, abbreviated SFV; Saybolt FUROL viscosity.
52-655: A ".txt" file, or a TXT Record, generally contains only plain text (without formatting) intended for humans to read. The best format for storing knowledge persistently is plain text, rather than some binary format. Before the early 1960s, computers were mainly used for number-crunching rather than for text, and memory was extremely expensive. Computers often allocated only 6 bits for each character, permitting only 64 characters; assigning codes for A-Z, a-z, and 0-9 would leave only 2 codes, nowhere near enough. Most computers opted not to support lower-case letters. Thus, early text projects such as Roberto Busa's Index Thomisticus,
78-544: A document is received without any explicit indication of the character encoding, some applications use charset detection to attempt to guess what encoding was used. ASCII reserves the first 32 codes (numbers 0–31 decimal) for control characters known as the "C0 set": codes originally intended not to represent printable information, but rather to control devices (such as printers ) that make use of ASCII, or to provide meta-information about data streams such as those stored on magnetic tape. They include common characters like
104-564: A file has not been accidentally corrupted in transmission, since they can correct common small errors with a much shorter download. Despite the weaknesses of the SFV format, it is popular due to the relatively small amount of time taken by SFV utilities to calculate the CRC32 checksums when compared to the time taken to calculate cryptographic hashes such as MD5 or SHA-1. SFV uses a plain text file containing one line for each file and its checksum in
130-417: A previously calculated value. Due to the nature of hash functions, hash collisions may result in false positives, but the likelihood of collisions is usually negligible with random corruption. (The number of possible checksums is limited though large, so that with any checksum scheme many files will have the same checksum. However, the probability of a corrupted file having the same checksum as its original
156-527: Is 7-Zip. Many Linux distributions include a simple command-line tool, cksfv, to verify the checksums. Plain text In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects (floating-point numbers, images, etc.). It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters. Plain text
182-499: Is also known as "Latin-1", and covers the needs of most (not all) European languages that use Latin-based characters (there was not quite enough room to cover them all). ISO 2022 then provided conventions for "switching" between different character sets in mid-file. Many other organisations developed variations on these, and for many years Windows and Macintosh computers used incompatible variations. The text-encoding situation became more and more complex, leading to efforts by ISO and by
208-419: Is also sometimes used only to exclude "binary" files: those in which at least some parts of the file cannot be correctly interpreted via the character encoding in effect. For example, a file or string consisting of "hello" (in any encoding), followed by 4 bytes that express a binary integer that is not a character, is a binary file. Converting a plain text file to a different character encoding does not change
234-518: Is called a collision attack. For this reason, the md5sum and sha1sum utilities, which use the MD5 and SHA-1 cryptographic hash functions respectively, are often preferred in Unix operating systems. Even a single-bit error causes both SFV's CRC and md5sum's cryptographic hash to fail, requiring the entire file to be re-fetched. The Parchive and rsync utilities are often preferred for verifying that
260-422: Is considered plain text regardless of its encoding. To properly understand or process it the recipient must know (or be able to figure out) what encoding was used; however, they need not know anything about the computer architecture that was used, or about the binary structures defined by whatever program (if any) created the data. Perhaps the most common way of explicitly stating the specific encoding of plain text
286-455: Is different from formatted text , where style information is included; from structured text, where structural parts of the document such as paragraphs, sections, and the like are identified; and from binary files in which some portions must be interpreted as binary objects (encoded integers, real numbers, images, etc.). The term is sometimes used quite loosely, to mean files that contain only "readable" content (or just files with nothing that
312-417: Files can become corrupted for a variety of reasons, including faulty storage media, errors in transmission, write errors during copying or moving, and software bugs. SFV verification checks that a file has not been corrupted by comparing the file's CRC hash value to
338-435: Is exceedingly small, unless deliberately constructed to maintain the checksum.) SFV cannot be used to verify the authenticity of files, as CRC32 is not a collision-resistant hash function; even if the hash sum file is not tampered with, it is computationally trivial for an attacker to cause deliberate hash collisions, meaning that a malicious change in the file is not detected by a hash comparison. In cryptography, this attack
364-589: Is irrelevant to whether a file is plain text. For example, an SVG file can express drawings or even bitmapped graphics, but is still plain text. The use of plain text rather than binary files enables files to survive much better "in the wild", in part by making them largely immune to computer architecture incompatibilities. For example, all the problems of endianness can be avoided (with encodings such as UCS-2 rather than UTF-8, endianness matters, but uniformly for every character, rather than for potentially unknown subsets of it). The purpose of using plain text today
390-618: Is primarily independence from programs that require their very own special encoding or formatting or file format . Plain text files can be opened, read, and edited with ubiquitous text editors and utilities. A command-line interface allows people to give commands in plain text and get a response, also typically in plain text. Many other computer programs are also capable of processing or creating plain text, such as countless programs in DOS , Windows , classic Mac OS , and Unix and its kin; as well as web browsers (a few browsers such as Lynx and
416-464: Is with a MIME type. For email and HTTP, the default MIME type is "text/plain": plain text without markup. Another MIME type often used in both email and HTTP is "text/html; charset=UTF-8": plain text represented using the UTF-8 character encoding with HTML markup. Another common MIME type is "application/json": plain text represented using the UTF-8 character encoding with JSON markup. When
442-684: The Brown Corpus , and others had to resort to conventions such as keying an asterisk preceding letters actually intended to be upper-case. Fred Brooks of IBM argued strongly for going to 8-bit bytes, because someday people might want to process text, and won. Although IBM used EBCDIC , most text from then on came to be encoded in ASCII , using values from 0 to 31 for (non-printing) control characters , and values from 32 to 127 for graphic characters such as letters, digits, and punctuation. Most machines stored characters in 8 bits rather than 7, ignoring
468-496: The Line Mode Browser produce only plain text for display) and other e-text readers. Plain text files are almost universal in programming; a source code file containing instructions in a programming language is almost always a plain text file. Plain text is also commonly used for configuration files , which are read for saved settings at the startup of a program. Plain text is used for much e-mail . A comment ,
494-519: The Unicode Consortium to develop a single, unified character encoding that could cover all known (or at least all currently known) languages. After some conflict, these efforts were unified. Unicode currently allows for 1,114,112 code values, and assigns codes covering nearly all modern text writing systems, as well as many historical ones, and for many non-linguistic characters such as printer's dingbats , mathematical symbols, etc. Text
520-408: The newline and the tab character . In 8-bit character sets such as Latin-1 and the other ISO 8859 sets, the first 32 characters of the "upper half" (128 to 159) are also control codes, known as the "C1 set". They are rarely used directly; when they turn up in documents which are ostensibly in an ISO 8859 encoding, their code positions generally refer instead to the characters at that position in
546-399: The format FILENAME<whitespaces>CHECKSUM . Any line starting with a semicolon ';' is considered to be a comment and is ignored for the purposes of file verification. The delimiter between the filename and checksum is always one or several spaces; tabs are never used. A sample SFV file is: An example of an open-source cross-platform command-line utility that outputs crc32 checksums
SECTION 20
#1732765660169572-687: The meaning of the text, as long as the correct character encoding is used. However, converting a binary file to a different format may alter the interpretation of the non-textual data. According to The Unicode Standard: According to other definitions, however, files that contain markup or other meta-data are generally considered plain text, so long as the markup is also in a directly human-readable form (as in HTML, XML, and so on). Thus, representations such as SGML, RTF, HTML, XML, wiki markup , and TeX, as well as nearly all programming language source code files, are considered plain text. The particular content
598-599: The range from 128 to 255. Using values above 128 conflicts with using the 8th bit as a checksum, but the checksum usage gradually died out. These additional characters were encoded differently in different countries, making texts impossible to decode without figuring out the originator's rules. For instance, a browser might display ¬A rather than ` if it tried to interpret one character set as another. The International Organization for Standardization ( ISO ) eventually developed several code pages under ISO 8859 , to accommodate various languages. The first of these ( ISO 8859-1 )
624-776: The remaining bit or using it as a checksum . The near-ubiquity of ASCII was a great help, but failed to address international and linguistic concerns. The dollar-sign ("$ ") was not as useful in England, and the accented characters used in Spanish, French, German, Portuguese, Italian and many other languages were entirely unavailable in ASCII (not to mention characters used in Greek, Russian, and most Eastern languages). Many individuals, companies, and countries defined extra characters as needed—often reassigning control characters, or using values in
650-403: The same term [REDACTED] This disambiguation page lists articles associated with the title SFV . If an internal link led you here, you may wish to change the link to point directly to the intended article. Retrieved from " https://en.wikipedia.org/w/index.php?title=SFV&oldid=1213624716 " Category : Disambiguation pages Hidden categories: Short description
676-477: The speaker does not prefer). For example, that could exclude any indication of fonts or layout (such as markup, markdown, or even tabs); characters such as curly quotes, non-breaking spaces, soft hyphens, em dashes, and/or ligatures; or other things. In principle, plain text can be in any encoding , but occasionally the term is taken to imply ASCII . As Unicode -based encodings such as UTF-8 and UTF-16 become more common, that usage may be shrinking. Plain text