In computer science , program synthesis is the task to construct a program that provably satisfies a given high-level formal specification . In contrast to program verification , the program is to be constructed rather than given; however, both fields make use of formal proof techniques, and both comprise approaches of different degrees of automation. In contrast to automatic programming techniques, specifications in program synthesis are usually non- algorithmic statements in an appropriate logical calculus .
52-579: Generic Eclipse Modeling System ( GEMS ) is a configurable toolkit for creating domain-specific modeling and program synthesis environments for Eclipse . The project aims to bridge the gap between the communities experienced with visual metamodeling tools like those built around the Eclipse modeling technologies, such as the Eclipse Modeling Framework (EMF) and Graphical Modeling Framework (GMF). GEMS helps developers rapidly create
104-507: A DPLL -style search with theory-specific solvers ( T-solvers ) that handle conjunctions (ANDs) of predicates from a given theory. This approach is referred to as the lazy approach . Dubbed DPLL(T) , this architecture gives the responsibility of Boolean reasoning to the DPLL-based SAT solver which, in turn, interacts with a solver for theory T through a well-defined interface. The theory solver only needs to worry about checking
156-484: A constraint satisfaction problem and thus a certain formalized approach to constraint programming . Formally speaking, an SMT instance is a formula in first-order logic , where some function and predicate symbols have additional interpretations, and SMT is the problem of determining whether such a formula is satisfiable. In other words, imagine an instance of the Boolean satisfiability problem (SAT) in which some of
208-400: A context-free grammar of expressions that constrains the syntax of valid solutions. For example, to synthesize a function f that returns the maximum of two integers, the logical specification might look like this: ( f ( x , y ) = x ∨ f ( x , y ) = y ) ∧ f ( x , y ) ≥ x ∧ f ( x , y ) ≥ y and the grammar might be: where "ite" stands for "if-then-else". The expression would be
260-589: A 32-bit integer variable would be encoded by 32 single-bit variables with appropriate weights and word-level operations such as 'plus' would be replaced by lower-level logic operations on the bits) and passing this formula to a Boolean SAT solver. This approach, which is referred to as the eager approach (or bitblasting ), has its merits: by pre-processing the SMT formula into an equivalent Boolean SAT formula existing Boolean SAT solvers can be used "as-is" and their performance and capacity improvements leveraged over time. On
312-917: A class of theory that support efficient theory propagation and conflict analysis, enabling practical use within DPLL(T) solvers. Monotonic theories support only boolean variables (boolean is the only sort ), and all their functions and predicates p obey the axiom p ( … , b i − 1 , 0 , b i + 1 , … ) ⟹ p ( … , b i − 1 , 1 , b i + 1 , … ) {\displaystyle p(\ldots ,b_{i-1},0,b_{i+1},\ldots )\implies p(\ldots ,b_{i-1},1,b_{i+1},\ldots )} Examples of monotonic theories include graph reachability , collision detection for convex hulls , minimum cuts , and computation tree logic . Every Datalog program can be interpreted as
364-399: A graphical modeling tool from a visual language description or metamodel without any coding in third-generation languages . Graphical modeling tools created with GEMS automatically support complex capabilities, such as remote updating and querying, template creation, styling with Cascading Style Sheets (CSS), and model linking. The configuration is accomplished through metamodels specifying
416-400: A monotonic theory. Most of the common SMT approaches support decidable theories. However, many real-world systems, such as an aircraft and its behavior, can only be modelled by means of non-linear arithmetic over the real numbers involving transcendental functions . This fact motivates an extension of the SMT problem to non-linear theories, such as determining whether the following equation
468-536: A non-linear optimization packet as (necessarily incomplete) subordinate theory solver, iSAT , building on a unification of DPLL SAT-solving and interval constraint propagation called the iSAT algorithm, and cvc5 . The table below summarizes some of the features of the many available SMT solvers. The column "SMT-LIB" indicates compatibility with the SMT-LIB language; many systems marked 'yes' may support only older versions of SMT-LIB, or offer only partial support for
520-654: A number of standardized benchmarks and has enabled a yearly competition between SMT solvers called SMT-COMP. Initially, the competition took place during the Computer Aided Verification conference (CAV), but as of 2020 the competition is hosted as part of the SMT Workshop, which is affiliated with the International Joint Conference on Automated Reasoning (IJCAR). SMT solvers are useful both for verification, proving
572-436: A set of inputs I , a set of possible programs P , and a specification S , the goal of program synthesis is to find a program p in P such that for all inputs i in I , S ( p , i ) holds. CEGIS is parameterized over a generator and a verifier: CEGIS runs the generator and verifier run in a loop, accumulating counter-examples: Implementations of CEGIS typically use SMT solvers as verifiers. CEGIS
SECTION 10
#1732802197871624-445: A similar way, line 14 yields line 15 and then line 16 by resolution. Also, the second case, x ≤ M ∧ y ≤ M ∧ y = M {\displaystyle x\leq M\land y\leq M\land y=M} in line 13, is handled similarly, yielding eventually line 18. In a last step, both cases (i.e. lines 16 and 18) are joined, using the resolution rule from line 58; to make that rule applicable,
676-512: A standardized interface to SMT solvers (and automated theorem provers , a term often used synonymously). The most prominent is the SMT-LIB standard, which provides a language based on S-expressions . Other standardized formats commonly supported are the DIMACS format supported by many Boolean SAT solvers, and the CVC format used by the CVC automated theorem prover. The SMT-LIB format also comes with
728-688: A toy example, a functional program to compute the maximum M {\displaystyle M} of two numbers x {\displaystyle x} and y {\displaystyle y} can be derived as follows. Starting from the requirement description " The maximum is larger than or equal to any given number, and is one of the given numbers ", the first-order formula ∀ X ∀ Y ∃ M : X ≤ M ∧ Y ≤ M ∧ ( X = M ∨ Y = M ) {\displaystyle \forall X\forall Y\exists M:X\leq M\land Y\leq M\land (X=M\lor Y=M)}
780-399: A valid solution, because it conforms to the grammar and the specification. From 2014 through 2019, the yearly Syntax-Guided Synthesis Competition (or SyGuS-Comp) compared the different algorithms for program synthesis in a competitive event. The competition used a standardized input format, SyGuS-IF, based on SMT-Lib 2 . For example, the following SyGuS-IF encodes the problem of synthesizing
832-440: Is a dependently typed language that uses Z3 to find proofs; the compiler carries these proofs through to produce proof-carrying bytecode. The Viper verification infrastructure encodes verification conditions to Z3. The sbv library provides SMT-based verification of Haskell programs, and lets the user choose among a number of solvers such as Z3, ABC, Boolector, cvc5, MathSAT and Yices. There are also many verifiers built on top of
884-421: Is already NP-complete , the SMT problem is typically NP-hard , and for many theories it is undecidable . Researchers study which theories or subsets of theories lead to a decidable SMT problem and the computational complexity of decidable cases. The resulting decision procedures are often implemented directly in SMT solvers; see, for instance, the decidability of Presburger arithmetic . SMT can be thought of as
936-406: Is also decidable. Since multiplication by constants can be implemented as nested additions, the arithmetic in many computer programs can be expressed using Presburger arithmetic, resulting in decidable formulas. Examples of SMT solvers addressing Boolean combinations of theory atoms from undecidable arithmetic theories over the reals are ABsolver, which employs a classical DPLL(T) architecture with
988-525: Is blurry enough that some ATPs participate in SMT-COMP, while some SMT solvers participate in CASC . An SMT instance is a generalization of a Boolean SAT instance in which various sets of variables are replaced by predicates from a variety of underlying theories. SMT formulas provide a much richer modeling language than is possible with Boolean SAT formulas. For example, an SMT formula allows one to model
1040-541: Is derived from the fact that these expressions are interpreted within ("modulo") a certain formal theory in first-order logic with equality (often disallowing quantifiers ). SMT solvers are tools that aim to solve the SMT problem for a practical subset of inputs. SMT solvers such as Z3 and cvc5 have been used as a building block for a wide range of applications across computer science, including in automated theorem proving , program analysis , program verification , and software testing . Since Boolean satisfiability
1092-415: Is obtained as its formal translation. This formula is to be proved. By reverse Skolemization , the specification in line 10 is obtained, an upper- and lower-case letter denoting a variable and a Skolem constant , respectively. After applying a transformation rule for the distributive law in line 11, the proof goal is a disjunction, and hence can be split into two cases, viz. lines 12 and 13. Turning to
SECTION 20
#17328021978711144-415: Is possible to encode program synthesis problems in Boolean logic and use algorithms for the Boolean satisfiability problem to automatically find programs. In 2013, a unified framework for program synthesis problems called Syntax-guided Synthesis (stylized SyGuS) was proposed by researchers at UPenn , UC Berkeley , and MIT . The input to a SyGuS algorithm consists of a logical specification along with
1196-399: Is satisfiable: where Such problems are, however, undecidable in general. (On the other hand, the theory of real closed fields , and thus the full first order theory of the real numbers , are decidable using quantifier elimination . This is due to Alfred Tarski .) The first order theory of the natural numbers with addition (but not multiplication), called Presburger arithmetic ,
1248-437: Is some unspecified function of two arguments). These predicates are classified according to each respective theory assigned. For instance, linear inequalities over real variables are evaluated using the rules of the theory of linear real arithmetic , whereas predicates involving uninterpreted terms and function symbols are evaluated using the rules of the theory of uninterpreted functions with equality (sometimes referred to as
1300-529: Is supported. Case studies performed within this framework synthesized algorithms to compute e.g. division , remainder , square root , term unification , answers to relational database queries and several sorting algorithms . Proof rules include: Murray has shown these rules to be complete for first-order logic . In 1986, Manna and Waldinger added generalized E-resolution and paramodulation rules to handle also equality; later, these rules turned out to be incomplete (but nevertheless sound ). As
1352-469: Is to relieve the programmer of the burden of writing correct, efficient code that satisfies a specification. However, program synthesis also has applications to superoptimization and inference of loop invariants . During the Summer Institute of Symbolic Logic at Cornell University in 1957, Alonzo Church defined the problem to synthesize a circuit from mathematical requirements. Even though
1404-541: Is to translate preconditions, postconditions, loop conditions, and assertions into SMT formulas in order to determine if all properties can hold. There are many verifiers built on top of the Z3 SMT solver . Boogie is an intermediate verification language that uses Z3 to automatically check simple imperative programs. The VCC verifier for concurrent C uses Boogie, as well as Dafny for imperative object-based programs, Chalice for concurrent programs, and Spec# for C#. F*
1456-449: Is valid; the step 15→16 established that both cases 16 and 18 are complementary. Since both line 16 and 18 comes with a program term, a conditional expression results in the program column. Since the goal formula true {\displaystyle {\textit {true}}} has been derived, the proof is done, and the program column of the " true {\displaystyle {\textit {true}}} " line contains
1508-804: The Alt-Ergo SMT solver. Here is a list of mature applications: Many SMT solvers implement a common interface format called SMTLIB2 (such files usually have the extension " .smt2 "). The LiquidHaskell tool implements a refinement type based verifier for Haskell that can use any SMTLIB2 compliant solver, e.g. cvc5, MathSat, or Z3. An important application of SMT solvers is symbolic execution for analysis and testing of programs (e.g., concolic testing ), aimed particularly at finding security vulnerabilities. Example tools in this category include SAGE from Microsoft Research , KLEE , S2E , and Triton . SMT solvers that have been used for symbolic-execution applications include Z3 , STP Archived 2015-04-06 at
1560-570: The Goals column, or, equivalently, f a l s e {\displaystyle {\it {false}}} in the Assertions column. Programs obtained by this approach are guaranteed to satisfy the specification formula started from; in this sense they are correct by construction . Only a minimalist, yet Turing-complete , purely functional programming language , consisting of conditional, recursion, and arithmetic and other operators
1612-598: The binary variables are replaced by predicates over a suitable set of non-binary variables. A predicate is a binary-valued function of non-binary variables. Example predicates include linear inequalities (e.g., 3 x + 2 y − z ≥ 4 {\displaystyle 3x+2y-z\geq 4} ) or equalities involving uninterpreted terms and function symbols (e.g., f ( f ( u , v ) , v ) = f ( u , v ) {\displaystyle f(f(u,v),v)=f(u,v)} where f {\displaystyle f}
Generic Eclipse Modeling System - Misplaced Pages Continue
1664-464: The correctness of programs, software testing based on symbolic execution , and for synthesis , generating program fragments by searching over the space of possible programs. Outside of software verification, SMT solvers have also been used for type inference and for modelling theoretic scenarios, including modelling actor beliefs in nuclear arms control . Computer-aided verification of computer programs often uses SMT solvers. A common technique
1716-445: The datapath operations of a microprocessor at the word rather than the bit level. By comparison, answer set programming is also based on predicates (more precisely, on atomic sentences created from atomic formulas ). Unlike SMT, answer-set programs do not have quantifiers, and cannot easily express constraints such as linear arithmetic or difference logic —answer set programming is best suited to Boolean problems that reduce to
1768-668: The empty theory ). Other theories include the theories of arrays and list structures (useful for modeling and verifying computer programs ), and the theory of bit vectors (useful in modeling and verifying hardware designs ). Subtheories are also possible: for example, difference logic is a sub-theory of linear arithmetic in which each inequality is restricted to have the form x − y > c {\displaystyle x-y>c} for variables x {\displaystyle x} and y {\displaystyle y} and constant c {\displaystyle c} . The examples above show
1820-583: The free theory of uninterpreted functions. Implementing 32-bit integers as bitvectors in answer set programming suffers from most of the same problems that early SMT solvers faced: "obvious" identities such as x + y = y + x are difficult to deduce. Constraint logic programming does provide support for linear arithmetic constraints, but within a completely different theoretical framework. SMT solvers have also been extended to solve formulas in higher-order logic . Early attempts for solving SMT instances involved translating them to Boolean SAT instances (e.g.,
1872-425: The 1969 automata-theoretic approach by Büchi and Landweber , and the works by Manna and Waldinger (c. 1980). The development of modern high-level programming languages can also be understood as a form of program synthesis. The early 21st century has seen a surge of practical interest in the idea of program synthesis in the formal verification community and related fields. Armando Solar-Lezama showed that it
1924-483: The development of specialized decidable theories , including linear arithmetic over rationals and integers , fixed-width bitvectors, floating-point arithmetic (often implemented in SMT solvers via bit-blasting , i.e., reduction to bitvectors), strings , (co)-datatypes , sequences (used to model dynamic arrays ), finite sets and relations , separation logic , finite fields , and uninterpreted functions among others. Boolean monotonic theories are
1976-522: The family of models that can be created using the resultant modeling environment. The built-in metamodeling language is based on the UML class diagram notation. Metamodels in other eCore readable formats can be used as well. Metamodel constraints can be specified in declarative languages (e.g. OCL, Prolog) or, alternatively, in Java. Once a metamodel has been created, GEMS plug-in generator can be invoked to create
2028-554: The feasibility of conjunctions of theory predicates passed on to it from the SAT solver as it explores the Boolean search space of the formula. For this integration to work well, however, the theory solver must be able to participate in propagation and conflict analysis, i.e., it must be able to infer new facts from already established facts, as well as to supply succinct explanations of infeasibility when theory conflicts arise. In other words,
2080-735: The first case, resolving line 12 with the axiom in line 1 leads to instantiation of the program variable M {\displaystyle M} in line 14. Intuitively, the last conjunct of line 12 prescribes the value that M {\displaystyle M} must take in this case. Formally, the non-clausal resolution rule shown in line 57 above is applied to lines 12 and 1, with yielding ¬ ( {\displaystyle \lnot (} true ∧ false ) ∧ ( x ≤ x ∧ y ≤ x ∧ true ) {\displaystyle )} , which simplifies to x ≤ x ∧ y ≤ x {\displaystyle x\leq x\land y\leq x} . In
2132-503: The language. The column "CVC" indicates support for the CVC language. The column "DIMACS" indicates support for the DIMACS format . Projects differ not only in features and performance, but also in the viability of the surrounding community, its ongoing interest in a project, and its ability to contribute documentation, fixes, tests and enhancements. There are multiple attempts to describe
Generic Eclipse Modeling System - Misplaced Pages Continue
2184-411: The maximum of two integers (as presented above): A compliant solver might return the following output: Counter-example guided inductive synthesis (CEGIS) is an effective approach to building sound program synthesizers. CEGIS involves the interplay of two components: a generator which generates candidate programs, and a verifier which checks whether the candidates satisfy the specification. Given
2236-449: The modeling paradigm of the application domain, i.e. a domain-specific modeling language (DSML). The modeling paradigm contains all the syntactic, semantic, and presentation information regarding the domain; which concepts will be used to construct models, what relationships may exist among those concepts, how the concepts may be organized and viewed by the modeler, and rules governing the construction of models. The modeling paradigm defines
2288-497: The modeling tool. The generated plug-in uses Eclipse's Graphical Editing Framework (GEF) and Draw2D plug-in to visualize the DSML as a diagram. GEMS extension points can be used to create an interpreter which traverses the domain-specific model and generates code. Interpreters can also interpret the model to provide executable semantics and perform complex analyses. Program synthesis The primary application of program synthesis
2340-470: The other hand, the loss of the high-level semantics of the underlying theories means that the Boolean SAT solver has to work a lot harder than necessary to discover "obvious" facts (such as x + y = y + x {\displaystyle x+y=y+x} for integer addition.) This observation led to the development of a number of SMT solvers that tightly integrate the Boolean reasoning of
2392-422: The preparatory step 15→16 was needed. Intuitively, line 18 could be read as "in case x ≤ y {\displaystyle x\leq y} , the output y {\displaystyle y} is valid (with respect to the original specification), while line 15 says "in case y ≤ x {\displaystyle y\leq x} , the output x {\displaystyle x}
2444-423: The program. SMT-LIB In computer science and mathematical logic , satisfiability modulo theories ( SMT ) is the problem of determining whether a mathematical formula is satisfiable . It generalizes the Boolean satisfiability problem (SAT) to more complex formulas involving real numbers , integers , and/or various data structures such as lists , arrays , bit vectors , and strings . The name
2496-481: The table. After that, appropriate proof rules are applied manually. The framework has been designed to enhance human readability of intermediate formulas: contrary to classical resolution , it does not require clausal normal form , but allows one to reason with formulas of arbitrary structure and containing any junctors (" non-clausal resolution "). The proof is complete when t r u e {\displaystyle {\it {true}}} has been derived in
2548-442: The theory solver must be incremental and backtrackable . Researchers study which theories or subsets of theories lead to a decidable SMT problem and the computational complexity of decidable cases. Since full first-order logic is only semidecidable , one line of research attempts to find efficient decision procedures for fragments of first-order logic such as effectively propositional logic . Another line of research involves
2600-566: The use of Linear Integer Arithmetic over inequalities. Other examples include: Most SMT solvers support only quantifier -free fragments of their logics. There is substantial overlap between SMT solving and automated theorem proving (ATP). Generally, automated theorem provers focus on supporting full first-order logic with quantifiers, whereas SMT solvers focus more on supporting various theories (interpreted predicate symbols). ATPs excel at problems with lots of quantifiers, whereas SMT solvers do well on large problems without quantifiers. The line
2652-430: The work only refers to circuits and not programs, the work is considered to be one of the earliest descriptions of program synthesis and some researchers refer to program synthesis as "Church's Problem". In the 1960s, a similar idea for an "automatic programmer" was explored by researchers in artificial intelligence. Since then, various research communities considered the problem of program synthesis. Notable works include
SECTION 50
#17328021978712704-481: Was inspired by counterexample-guided abstraction refinement (CEGAR). The framework of Manna and Waldinger , published in 1980, starts from a user-given first-order specification formula . For that formula, a proof is constructed, thereby also synthesizing a functional program from unifying substitutions. The framework is presented in a table layout, the columns containing: Initially, background knowledge, pre-conditions, and post-conditions are entered into
#870129