In coding theory , a variable-length code is a code which maps source symbols to a variable number of bits . The equivalent concept in computer science is bit string .
20-474: VLC may refer to: Variable-length code , a code which maps source symbols to a variable number of bits The Very Light Car , prototype vehicle Visible light communication , a communications medium using fluorescent bulbs or LEDs Victorian Landcare Council, merged to form the Landcare Victoria Inc. Victorian Legislative Council , one of
40-531: A homomorphism of S ∗ {\displaystyle S^{*}} into T ∗ {\displaystyle T^{*}} , which naturally maps each sequence of source symbols to a sequence of target symbols, is referred to as its extension . Variable-length codes can be strictly nested in order of decreasing generality as non-singular codes, uniquely decodable codes and prefix codes. Prefix codes are always uniquely decodable, and these in turn are always non-singular: A code
60-466: A sequence of the symbols from the alphabet set. For example, the alphabet of lowercase letters "a" through "z" can be used to form English words like "iceberg" while the alphabet of both upper and lower case letters can also be used to form proper names like "Misplaced Pages". A common alphabet is {0,1}, the binary alphabet , and a "00101111" is an example of a binary string . Infinite sequences of symbols may be considered as well (see Omega language ). It
80-439: A code is the mapping of finite length source sequences to finite length bit strings, that is obtained by concatenating for each symbol of the source sequence the corresponding codeword produced by the original code. Using terms from formal language theory , the precise mathematical definition is as follows: Let S {\displaystyle S} and T {\displaystyle T} be two finite sets, called
100-401: A low expected codeword length. For the above example, if the probabilities of (a, b, c, d) were ( 1 2 , 1 4 , 1 8 , 1 8 ) {\displaystyle \textstyle \left({\frac {1}{2}},{\frac {1}{4}},{\frac {1}{8}},{\frac {1}{8}}\right)} , the expected number of bits used to represent a source symbol using
120-646: A set are used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any cardinality ("size") and, depending on its purpose, may be finite (e.g., the alphabet of letters "a" through "z"), countable (e.g., { v 1 , v 2 , … } {\displaystyle \{v_{1},v_{2},\ldots \}} ), or even uncountable (e.g., { v x : x ∈ R } {\displaystyle \{v_{x}:x\in \mathbb {R} \}} ). Strings , also known as "words" or "sentences", over an alphabet are defined as
140-467: Is non-singular if each source symbol is mapped to a different non-empty bit string, i.e. the mapping from source symbols to bit strings is injective . A code is uniquely decodable if its extension is § non-singular . Whether a given code is uniquely decodable can be decided with the Sardinas–Patterson algorithm . A code is a prefix code if no target bit string in the mapping is a prefix of
160-482: Is also called the Kleene closure of Σ {\displaystyle \Sigma } . The notation Σ ω {\displaystyle \Sigma ^{\omega }} indicates the set of all infinite sequences over the alphabet Σ {\displaystyle \Sigma } , and Σ ∞ {\displaystyle \Sigma ^{\infty }} indicates
180-434: Is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is {00,0}, a string written on paper as "000" is ambiguous because it is unclear if it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00". If L is a formal language, i.e. a (possibly infinite) set of finite-length strings,
200-452: The alphabet of L is the set of all symbols that may occur in any string in L . For example, if L is the set of all variable identifiers in the programming language C , L ' s alphabet is the set { a, b, c, ..., x, y, z, A, B, C, ..., X, Y, Z, 0, 1, 2, ..., 7, 8, 9, _ }. Given an alphabet Σ {\displaystyle \Sigma } , the set of all strings of length n {\displaystyle n} over
220-563: The alphabet Σ {\displaystyle \Sigma } is indicated by Σ n {\displaystyle \Sigma ^{n}} . The set ⋃ i ∈ N Σ i {\textstyle \bigcup _{i\in \mathbb {N} }\Sigma ^{i}} of all finite strings (regardless of their length) is indicated by the Kleene star operator as Σ ∗ {\displaystyle \Sigma ^{*}} , and
SECTION 10
#1732771998486240-504: The code above would be: As the entropy of this source is 1.75 bits per symbol, this code compresses the source as much as possible so that the source can be recovered with zero error. Alphabet (computer science) In formal language theory , an alphabet , sometimes called a vocabulary , is a non-empty set of indivisible symbols / characters / glyphs , typically thought of as representing letters, characters, digits, phonemes , or even words. Alphabets in this technical sense of
260-499: The context of source coding , but often serve as forward error correction in the context of channel coding . Another special case of prefix codes are LEB128 and variable-length quantity (VLQ) codes, which encode arbitrarily large integers as a sequence of octets—i.e., every codeword is a multiple of 8 bits. The advantage of a variable-length code is that unlikely source symbols can be assigned longer codewords and likely source symbols can be assigned shorter codewords, thus giving
280-611: The right coding strategy an independent and identically-distributed source may be compressed almost arbitrarily close to its entropy . This is in contrast to fixed-length coding methods, for which data compression is only possible for large blocks of data, and any compression beyond the logarithm of the total number of possibilities comes with a finite (though perhaps arbitrarily small) probability of failure. Some examples of well-known variable-length coding strategies are Huffman coding , Lempel–Ziv coding , arithmetic coding , and context-adaptive variable-length coding . The extension of
300-413: The set Σ ∗ ∪ Σ ω {\displaystyle \Sigma ^{\ast }\cup \Sigma ^{\omega }} of all finite or infinite sequences. For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string ). Alphabets are important in
320-404: The source and target alphabets , respectively. A code C : S → T ∗ {\displaystyle C:S\to T^{*}} is a total function mapping each symbol from S {\displaystyle S} to a sequence of symbols over T {\displaystyle T} , and the extension of C {\displaystyle C} to
340-412: The target bit string of a different source symbol in the same mapping. This means that symbols can be decoded instantaneously after their entire codeword is received. Other commonly used names for this concept are prefix-free code , instantaneous code , or context-free code . A special case of prefix codes are block codes . Here all codewords must have the same length. The latter are not very useful in
360-616: The title VLC . If an internal link led you here, you may wish to change the link to point directly to the intended article. Retrieved from " https://en.wikipedia.org/w/index.php?title=VLC&oldid=1209751840 " Category : Disambiguation pages Hidden categories: Short description is different from Wikidata All article disambiguation pages All disambiguation pages Variable-length code Variable-length codes can allow sources to be compressed and decompressed with zero error ( lossless data compression ) and still be read back symbol by symbol. With
380-750: The two chambers of the Parliament of Victoria, Australia Visakha Law College , a private school in Andhra Pradesh, India VLC, IATA code for Valencia Airport , Spain VLC, pre-1928 call sign for Chatham Islands Radio, one of the Call signs in New Zealand VLC media player , a free software cross-platform multimedia player and framework Topics referred to by the same term [REDACTED] This disambiguation page lists articles associated with
400-477: The use of formal languages , automata and semiautomata . In most cases, for defining instances of automata, such as deterministic finite automata (DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a finite set , but is not otherwise restricted. When using automata, regular expressions , or formal grammars as part of string-processing algorithms ,
#485514