Misplaced Pages

Boyer–Moore

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

The Boyer–Moore majority vote algorithm is an algorithm for finding the majority of a sequence of elements using linear time and a constant number of words of memory. It is named after Robert S. Boyer and J Strother Moore , who published it in 1981, and is a prototypical example of a streaming algorithm .

#121878

12-416: Boyer–Moore may refer to: Boyer–Moore majority vote algorithm Boyer–Moore string-search algorithm Boyer–Moore theorem prover Topics referred to by the same term [REDACTED] This disambiguation page lists articles associated with the title Boyer–Moore . If an internal link led you here, you may wish to change the link to point directly to

24-403: A machine word and the total space needed is O (1) . If an array index is needed to keep track of the algorithm's position in the input sequence, it doesn't change the overall constant space bound. The algorithm's bit complexity (the space it would need, for instance, on a Turing machine ) is higher, the sum of the binary logarithms of the input length and the size of the universe from which

36-419: A Turing machine in time linear in the input length ( n times the number of bits per input item). Suppose that the input contains a majority element x , whose number of copies is more than half of the input length. Then the algorithm should return x as its final value m . Boyer and Moore prove the following stronger statement: For the final values m and c of the algorithm, on a sequence of length n , it

48-424: A counter, with the counter initially zero. It then processes the elements of the sequence, one at a time. When processing an element x , if the counter is zero, the algorithm stores x as its remembered sequence element and sets the counter to one. Otherwise, it compares x to the stored element and either increments the counter (if they are equal) or decrements the counter (otherwise). At the end of this process, if

60-402: A majority element, if there is one: that is, an element that occurs repeatedly for more than half of the elements of the input. A version of the algorithm that makes a second pass through the data can be used to verify that the element found in the first pass really is a majority. If a second pass is not performed and there is no majority the algorithm will not detect that no majority exists. In

72-417: Is actually a majority. This second pass is needed, as it is not possible for a sublinear-space algorithm to determine whether there exists a majority element in a single pass through the input. The amount of memory that the algorithm needs is the space for one element and one counter. In the random access model of computing usually used for the analysis of algorithms , each of these values can be stored in

84-450: Is always possible to partition the input values into c copies of m and ( n − c )/2 pairs of unequal items. From this it follows that no element x ≠ m can have a majority, because x can have at most half of the elements in the unequal pairs and none of the remaining c copies of m . Thus, if there is a majority element, it can only be m . To prove the existence of this partition, Boyer and Moore proceed by induction on

96-409: The case that no strict majority exists, the returned element can be arbitrary; it is not guaranteed to be the element that occurs most often (the mode of the sequence). It is not possible for a streaming algorithm to find the most frequent element in less than linear space, for sequences whose number of repetitions can be small. The algorithm maintains in its local variables a sequence element and

108-420: The elements are drawn. Both the random access model and bit complexity analyses only count the working storage of the algorithm, and not the storage for the input sequence itself. Similarly, on a random access machine the algorithm takes time O ( n ) (linear time) on an input sequence of n items, because it performs only a constant number of operations per input item. The algorithm can also be implemented on

120-407: The intended article. Retrieved from " https://en.wikipedia.org/w/index.php?title=Boyer–Moore&oldid=932733329 " Category : Disambiguation pages Hidden categories: Short description is different from Wikidata All article disambiguation pages All disambiguation pages Boyer%E2%80%93Moore majority vote algorithm In its simplest form, the algorithm finds

132-421: The length of the input sequence. At each step, they use the induction hypothesis to find a partition of the same type for the subsequence of the input with its last item removed, with its value of m . If the final input value also equals m , it is added to the set of c copies of m . If it is unequal to m , and c was positive, then one of the c copies of m is removed from this set of copies and paired with

SECTION 10

#1732775476122

144-425: The sequence has a majority, it will be the element stored by the algorithm. This can be expressed in pseudocode as the following steps: Even when the input sequence has no majority, the algorithm will report one of the sequence elements as its result. However, it is possible to perform a second pass over the same input sequence in order to count the number of times the reported element occurs and determine whether it

#121878