In database theory , relational algebra is a theory that uses algebraic structures for modeling data and defining queries on it with well founded semantics . The theory was introduced by Edgar F. Codd .
56-398: ISBL ( Information Systems Base Language ) is the relational algebra notation that was invented for PRTV , one of the earliest database management systems to implement E.F. Codd's relational model of data. This computer science article is a stub . You can help Misplaced Pages by expanding it . Relational algebra The main application of relational algebra is to provide
112-400: A n ( R ⋈ S ) . {\displaystyle R\ltimes S=\Pi _{a_{1},\ldots ,a_{n}}(R\bowtie S).} Since we can simulate the natural join with the basic operators it follows that this also holds for the semijoin. In Codd's 1970 paper, semijoin is called restriction. The antijoin (▷), written as R ▷ S where R and S are relations , is similar to
168-436: A and b are attribute names, θ is a binary relational operator in the set {<, ≤, =, ≠, >, ≥ }, υ is a value constant, and R and S are relations. The result of this operation consists of all combinations of tuples in R and S that satisfy θ . The result of the θ -join is defined only if the headers of S and R are disjoint, that is, do not contain a common attribute. The simulation of this operation in
224-413: A collection or family , especially when its elements are themselves sets. Roster or enumeration notation defines a set by listing its elements between curly brackets , separated by commas: This notation was introduced by Ernst Zermelo in 1908. In a set, all that matters is whether each element is in it or not, so the ordering of the elements in roster notation is irrelevant (in contrast, in
280-518: A relation t (in the mathematical sense) iff t is a function (that is, t does not map any attribute to multiple values). It is usually required that R and S must have at least one common attribute, but if this constraint is omitted, and R and S have no common attributes, then the natural join becomes exactly the Cartesian product. The natural join can be simulated with Codd's primitives as follows. Assume that c 1 ,..., c m are
336-429: A sequence , a tuple , or a permutation of a set, the ordering of the terms matters). For example, {2, 4, 6} and {4, 6, 4, 2} represent the same set. For sets with many elements, especially those following an implicit pattern, the list of members can be abbreviated using an ellipsis ' ... '. For instance, the set of the first thousand positive integers may be specified in roster notation as An infinite set
392-484: A common attribute name. In addition, the Cartesian product is defined differently from the one in set theory in the sense that tuples are considered to be "shallow" for the purposes of the operation. That is, the Cartesian product of a set of n -tuples with a set of m -tuples yields a set of "flattened" ( n + m ) -tuples (whereas basic set theory would have prescribed a set of 2-tuples, each containing an n -tuple and an m -tuple). More formally, R × S
448-403: A condition where the attributes are equal, for example Price, then the condition may be specified as Price = Price or alternatively ( Price ) itself. In order to combine tuples from two relations where the combination condition is not simply the equality of shared attributes it is convenient to have a more general form of join operator, which is the θ -join (or theta-join). The θ -join
504-466: A definition is called a semantic description . Set-builder notation specifies a set as a selection from a larger set, determined by a condition on the elements. For example, a set F can be defined as follows: F = { n ∣ n is an integer, and 0 ≤ n ≤ 19 } . {\displaystyle F=\{n\mid n{\text{ is an integer, and }}0\leq n\leq 19\}.} In this notation,
560-506: A join. To rename the "isFriend" attribute to "isBusinessContact" in a relation, ρ isBusinessContact / isFriend ( addressBook ) {\displaystyle \rho _{\text{isBusinessContact / isFriend}}({\text{addressBook}})} might be used. There is also the ρ x ( A 1 , … , A n ) ( R ) {\displaystyle \rho _{x(A_{1},\ldots ,A_{n})}(R)} notation, where R
616-459: A listing of all friends or business associates in an address book, the selection might be written as σ isFriend = true ∨ isBusinessContact = true ( addressBook ) {\displaystyle \sigma _{{\text{isFriend = true}}\,\lor \,{\text{isBusinessContact = true}}}({\text{addressBook}})} . The result would be a relation containing every attribute of every unique record where isFriend
SECTION 10
#1732787406612672-627: A predicate is usually called an inner join , and the on keyword allows one to specify the predicate used to filter the rows. It is important to note: forming the flattened Cartesian product then filtering the rows is conceptually correct, but an implementation would use more sophisticated data structures to speed up the join query. The left semijoin (⋉ and ⋊) is a joining similar to the natural join and written as R ⋉ S {\displaystyle R\ltimes S} where R {\displaystyle R} and S {\displaystyle S} are relations . The result
728-416: A projection to get rid of the renamed attributes: Consider tables Car and Boat which list models of cars and boats and their respective prices. Suppose a customer wants to buy a car and a boat, but she does not want to spend more money for the boat than for the car. The θ -join (⋈ θ ) on the predicate CarPrice ≥ BoatPrice produces the flattened pairs of rows which satisfy the predicate. When using
784-441: A set S , denoted | S | , is the number of members of S . For example, if B = {blue, white, red} , then | B | = 3 . Repeated members in roster notation are not counted, so | {blue, white, red, blue, white} | = 3 , too. More formally, two sets share the same cardinality if there exists a bijection between them. The cardinality of the empty set is zero. The list of elements of some sets
840-465: A set A to a set B is a rule that assigns to each "input" element of A an "output" that is an element of B ; more formally, a function is a special kind of relation , one that relates each element of A to exactly one element of B . A function is called An injective function is called an injection , a surjective function is called a surjection , and a bijective function is called a bijection or one-to-one correspondence . The cardinality of
896-636: A theoretical foundation for relational databases , particularly query languages for such databases, chief among which is SQL . Relational databases store tabular data represented as relations . Queries over relational databases often likewise return tabular data represented as relations. The main purpose of relational algebra is to define operators that transform one or more input relations to an output relation. Given that these operators accept relations as input and produce relations as output, they can be combined and used to express complex queries that transform multiple input relations (whose data are stored in
952-404: Is a binary operator that is written as R ⋈ S a θ b {\displaystyle {R\ \bowtie \ S \atop a\ \theta \ b}} or R ⋈ S a θ v {\displaystyle {R\ \bowtie \ S \atop a\ \theta \ v}} where
1008-403: Is a graphical representation of a collection of sets; each set is depicted as a planar region enclosed by a loop, with its elements inside. If A is a subset of B , then the region representing A is completely inside the region representing B . If two sets have no elements in common, the regions do not overlap. A Venn diagram , in contrast, is a graphical representation of n sets in which
1064-494: Is a set of attribute names. The result of such projection is defined as the set that is obtained when all tuples in R are restricted to the set { a 1 , … , a n } {\displaystyle \{a_{1},\ldots ,a_{n}\}} . Note: when implemented in SQL standard the "default projection" returns a multiset instead of a set, and the Π projection to eliminate duplicate data
1120-422: Is a set with an infinite number of elements. If the pattern of its elements is obvious, an infinite set can be given in roster notation, with an ellipsis placed at the end of the list, or at both ends, to indicate that the list continues forever. For example, the set of nonnegative integers is and the set of all integers is Another way to define a set is to use a rule to determine what the elements are: Such
1176-506: Is a set with exactly one element; such a set may also be called a unit set . Any such set can be written as { x }, where x is the element. The set { x } and the element x mean different things; Halmos draws the analogy that a box containing a hat is not the same as the hat. If every element of set A is also in B , then A is described as being a subset of B , or contained in B , written A ⊆ B , or B ⊇ A . The latter notation may be read B contains A , B includes A , or B
SECTION 20
#17327874066121232-435: Is a superset of A . The relationship between sets established by ⊆ is called inclusion or containment . Two sets are equal if they contain each other: A ⊆ B and B ⊆ A is equivalent to A = B . If A is a subset of B , but A is not equal to B , then A is called a proper subset of B . This can be written A ⊊ B . Likewise, B ⊋ A means B is a proper superset of A , i.e. B contains A , and
1288-403: Is as in the definition of natural join. The semijoin can be simulated using the natural join as follows. If a 1 , … , a n {\displaystyle a_{1},\ldots ,a_{n}} are the attribute names of R {\displaystyle R} , then R ⋉ S = Π a 1 , … ,
1344-723: Is defined as follows: R × S := { ( r 1 , r 2 , … , r n , s 1 , s 2 , … , s m ) | ( r 1 , r 2 , … , r n ) ∈ R , ( s 1 , s 2 , … , s m ) ∈ S } {\displaystyle R\times S:=\{(r_{1},r_{2},\dots ,r_{n},s_{1},s_{2},\dots ,s_{m})|(r_{1},r_{2},\dots ,r_{n})\in R,(s_{1},s_{2},\dots ,s_{m})\in S\}} The cardinality of
1400-399: Is endless, or infinite . For example, the set N {\displaystyle \mathbb {N} } of natural numbers is infinite. In fact, all the special sets of numbers mentioned in the section above are infinite. Infinite sets have infinite cardinality . Some infinite cardinalities are greater than others. Arguably one of the most significant results from set theory is that
1456-482: Is in B ". The statement " y is not an element of B " is written as y ∉ B , which can also be read as " y is not in B ". For example, with respect to the sets A = {1, 2, 3, 4} , B = {blue, white, red} , and F = { n | n is an integer, and 0 ≤ n ≤ 19} , The empty set (or null set ) is the unique set that has no members. It is denoted ∅ , ∅ {\displaystyle \emptyset } , { }, ϕ , or ϕ . A singleton set
1512-428: Is not equal to A . A third pair of operators ⊂ and ⊃ are used differently by different authors: some authors use A ⊂ B and B ⊃ A to mean A is any subset of B (and not necessarily a proper subset), while others reserve A ⊂ B and B ⊃ A for cases where A is a proper subset of B . Examples: The empty set is a subset of every set, and every set is a subset of itself: An Euler diagram
1568-714: Is obtained by the addition of the DISTINCT keyword . A generalized selection (σ) is a unary operation written as σ φ ( R ) {\displaystyle \sigma _{\varphi }(R)} where φ is a propositional formula that consists of atoms as allowed in the normal selection and the logical operators ∧ {\displaystyle \wedge } ( and ), ∨ {\displaystyle \lor } ( or ) and ¬ {\displaystyle \neg } ( negation ). This selection selects all those tuples in R for which φ holds. To obtain
1624-440: Is renamed to x and the attributes { a 1 , … , a n } {\displaystyle \{a_{1},\ldots ,a_{n}\}} are renamed to { A 1 , … , A n } {\displaystyle \{A_{1},\ldots ,A_{n}\}} . Natural join (⨝) is a binary operator that is written as ( R ⨝ S ) where R and S are relations . The result of
1680-417: Is the set of all tuples in R {\displaystyle R} for which there is a tuple in S {\displaystyle S} that is equal on their common attribute names. The difference from a natural join is that other columns of S {\displaystyle S} do not appear. For example, consider the tables Employee and Dept and their semijoin: More formally
1736-406: Is their join as shown above, projected on all but the common attribute DeptName . In category theory , the join is precisely the fiber product . The natural join is arguably one of the most important operators since it is the relational counterpart of the logical AND operator. Note that if the same variable appears in each of two predicates that are connected by AND, then that variable stands for
ISBL - Misplaced Pages Continue
1792-402: Is true or where isBusinessContact is true. A rename ( ρ ) is a unary operation written as ρ a / b ( R ) {\displaystyle \rho _{a/b}(R)} where the result is identical to R except that the b attribute in all tuples is renamed to an a attribute. This is commonly used to rename the attribute of a relation for the purpose of
1848-440: Is uncountable. Moreover, the power set is always strictly "bigger" than the original set, in the sense that any attempt to pair up the elements of S with the elements of P ( S ) will leave some elements of P ( S ) unpaired. (There is never a bijection from S onto P ( S ) .) A partition of a set S is a set of nonempty subsets of S , such that every element x in S is in exactly one of these subsets. That is,
1904-842: The n loops divide the plane into 2 zones such that for each way of selecting some of the n sets (possibly all or none), there is a zone for the elements that belong to all the selected sets and none of the others. For example, if the sets are A , B , and C , there should be a zone for the elements that are inside A and C and outside B (even if such elements do not exist). There are sets of such mathematical importance, to which mathematicians refer so frequently, that they have acquired special names and notational conventions to identify them. Many of these important sets are represented in mathematical texts using bold (e.g. Z {\displaystyle \mathbf {Z} } ) or blackboard bold (e.g. Z {\displaystyle \mathbb {Z} } ) typeface. These include Each of
1960-462: The vertical bar "|" means "such that", and the description can be interpreted as " F is the set of all numbers n such that n is an integer in the range from 0 to 19 inclusive". Some authors use a colon ":" instead of the vertical bar. Philosophy uses specific terms to classify types of definitions: If B is a set and x is an element of B , this is written in shorthand as x ∈ B , which can also be read as " x belongs to B ", or " x
2016-453: The Cartesian product is the product of the cardinalities of its factors, that is, | R × S | = | R | × | S |. A projection ( Π ) is a unary operation written as Π a 1 , … , a n ( R ) {\displaystyle \Pi _{a_{1},\ldots ,a_{n}}(R)} where a 1 , … , a n {\displaystyle a_{1},\ldots ,a_{n}}
2072-399: The above sets of numbers has an infinite number of elements. Each is a subset of the sets listed below it. Sets of positive or negative numbers are sometimes denoted by superscript plus and minus signs, respectively. For example, Q + {\displaystyle \mathbf {Q} ^{+}} represents the set of positive rational numbers. A function (or mapping ) from
2128-438: The attribute names common to R and S , r 1 ,..., r n are the attribute names unique to R and s 1 ,..., s k are the attribute names unique to S . Furthermore, assume that the attribute names x 1 ,..., x m are neither in R nor in S . In a first step the common attribute names in S can be renamed: Then we take the Cartesian product and select the tuples that are to be joined: Finally we take
2184-400: The cardinality of a straight line (i.e., the number of points on a line) is the same as the cardinality of any segment of that line, of the entire plane , and indeed of any finite-dimensional Euclidean space . The continuum hypothesis, formulated by Georg Cantor in 1878, is the statement that there is no set with cardinality strictly between the cardinality of the natural numbers and
2240-470: The cardinality of a straight line. In 1963, Paul Cohen proved that the continuum hypothesis is independent of the axiom system ZFC consisting of Zermelo–Fraenkel set theory with the axiom of choice . (ZFC is the most widely-studied version of axiomatic set theory.) The power set of a set S is the set of all subsets of S . The empty set and S itself are elements of the power set of S , because these are both subsets of S . For example,
2296-414: The database) into a single output relation (the query results). Unary operators accept a single relation as input. Examples include operators to filter certain attributes (columns) or tuples (rows) from an input relation. Binary operators accept two relations as input and combine them into a single output relation. For example, taking all tuples found in either relation ( union ), removing tuples from
ISBL - Misplaced Pages Continue
2352-422: The elements outside the union of A and B are the elements that are outside A and outside B ). The cardinality of A × B is the product of the cardinalities of A and B . This is an elementary fact when A and B are finite. When one or both are infinite, multiplication of cardinal numbers is defined to make this true. The power set of any set becomes a Boolean ring with symmetric difference as
2408-749: The first relation found in the second relation ( difference ), extending the tuples of the first relation with tuples in the second relation matching certain conditions, and so forth. Relational algebra received little attention outside of pure mathematics until the publication of E.F. Codd 's relational model of data in 1970. Codd proposed such an algebra as a basis for database query languages. Relational algebra operates on homogeneous sets of tuples S = { ( s j 1 , s j 2 , . . . s j n ) | j ∈ 1... m } {\displaystyle S=\{(s_{j1},s_{j2},...s_{jn})|j\in 1...m\}} where we commonly interpret m to be
2464-406: The foreign key holds between attributes with the same name. If this is not the case such as in the foreign key from Dept . Manager to Employee . Name then these columns must be renamed before taking the natural join. Such a join is sometimes also referred to as an equijoin . More formally the semantics of the natural join are defined as follows: where Fun(t) is a predicate that is true for
2520-467: The fundamental operations is therefore as follows: In case the operator θ is the equality operator (=) then this join is also called an equijoin . Note, however, that a computer language that supports the natural join and selection operators does not need θ -join as well, as this can be achieved by selection from the result of a natural join (which degenerates to Cartesian product when there are no shared attributes). In SQL implementations, joining on
2576-481: The natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names. For an example consider the tables Employee and Dept and their natural join: Note that neither the employee named Mary nor the Production department appear in the result. This can also be used to define composition of relations . For example, the composition of Employee and Dept
2632-518: The number of rows of tuples in a table and n to be the number of columns. All entries in each column have the same type . A relation also has a unique tuple called the header which gives each column a unique name or attribute inside the relation). Attributes are used in projections and selections. The relational algebra uses set union , set difference , and Cartesian product from set theory, and adds additional constraints to these operators to create new ones. For set union and set difference,
2688-410: The power set of {1, 2, 3} is {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}} . The power set of a set S is commonly written as P ( S ) or 2 . If S has n elements, then P ( S ) has 2 elements. For example, {1, 2, 3} has three elements, and its power set has 2 = 8 elements, as shown above. If S is infinite (whether countable or uncountable ), then P ( S )
2744-537: The same elements are equal (they are the same set). This property is called extensionality . In particular, this implies that there is only one empty set. Sets are ubiquitous in modern mathematics. Indeed, set theory , more specifically Zermelo–Fraenkel set theory , has been the standard way to provide rigorous foundations for all branches of mathematics since the first half of the 20th century. Mathematical texts commonly denote sets by capital letters in italic , such as A , B , C . A set may also be called
2800-477: The same thing and both appearances must always be substituted by the same value (this is a consequence of the idempotence of the logical AND). In particular, natural join allows the combination of relations that are associated by a foreign key . For example, in the above example a foreign key probably holds from Employee . DeptName to Dept . DeptName and then the natural join of Employee and Dept combines all employees with their departments. This works because
2856-571: The semantics of the semijoin can be defined as follows: R ⋉ S = { t : t ∈ R ∧ ∃ s ∈ S ( Fun ( t ∪ s ) ) } {\displaystyle R\ltimes S=\{t:t\in R\land \exists s\in S(\operatorname {Fun} (t\cup s))\}} where Fun ( r ) {\displaystyle \operatorname {Fun} (r)}
SECTION 50
#17327874066122912-439: The semijoin, but the result of an antijoin is only those tuples in R for which there is no tuple in S that is equal on their common attribute names. For an example consider the tables Employee and Dept and their antijoin: The antijoin is formally defined as follows: Set (mathematics) In mathematics , a set is a collection of different things; these things are called elements or members of
2968-440: The set and are typically mathematical objects of any kind: numbers, symbols, points in space, lines, other geometrical shapes, variables, or even other sets. A set may have a finite number of elements or be an infinite set . There is a unique set with no elements, called the empty set ; a set with a single element is a singleton . Sets are uniquely characterized by their elements; this means that two sets that have precisely
3024-620: The set of real numbers has greater cardinality than the set of natural numbers. Sets with cardinality less than or equal to that of N {\displaystyle \mathbb {N} } are called countable sets ; these are either finite sets or countably infinite sets (sets of the same cardinality as N {\displaystyle \mathbb {N} } ); some authors use "countable" to mean "countably infinite". Sets with cardinality strictly greater than that of N {\displaystyle \mathbb {N} } are called uncountable sets . However, it can be shown that
3080-489: The subsets are pairwise disjoint (meaning any two sets of the partition contain no element in common), and the union of all the subsets of the partition is S . Suppose that a universal set U (a set containing all elements being discussed) has been fixed, and that A is a subset of U . Given any two sets A and B , Examples: The operations above satisfy many identities. For example, one of De Morgan's laws states that ( A ∪ B )′ = A ′ ∩ B ′ (that is,
3136-404: The two relations involved must be union-compatible —that is, the two relations must have the same set of attributes. Because set intersection is defined in terms of set union and set difference, the two relations involved in set intersection must also be union-compatible. For the Cartesian product to be defined, the two relations involved must have disjoint headers—that is, they must not have
#611388