SNOBOL ("StriNg Oriented and symBOlic Language") is a series of programming languages developed between 1962 and 1967 at AT&T Bell Laboratories by David J. Farber, Ralph Griswold and Ivan P. Polonsky, culminating in SNOBOL4. It was one of a number of text-string-oriented languages developed during the 1950s and 1960s; others included COMIT and TRAC.
COMIT was the first string processing language (compare SNOBOL, TRAC, and Perl), developed on the IBM 700/7000 series computers by Dr. Victor Yngve, University of Chicago, and collaborators at MIT from 1957 to 1965. Yngve created the language to support computerized research in the field of linguistics, and more specifically, the area of machine translation for natural language processing. The creation of COMIT led to
A Snowball's chance in hell of finding a name". All of us yelled at once, "WE GOT IT - SNOBOL" in the spirit of all the BOL languages. We then stretched our mind to find what it stood for. Common backronyms of "SNOBOL" are 'String Oriented Symbolic Language' or (as a quasi-initialism) 'StriNg Oriented symBOlic Language'.

Douglas McIlroy

Malcolm Douglas McIlroy (born 1932) is an American mathematician, engineer, and programmer. As of 2019 he
A graph (first discovered by George J. Minty in 1965). In 1995, he was elected as a Fellow of the American Association for the Advancement of Science. In 2004, he won both the USENIX Lifetime Achievement Award ("The Flame") and its Software Tools User Group (STUG) award. In 2006, he was elected as a member of the National Academy of Engineering. McIlroy is credited with the quote "The real hero of programming
A virtual machine to allow improved portability across computers. The SNOBOL4 language translator was still written in assembly language. However, the macro features of the assembler were used to define the virtual machine instructions of the SNOBOL Implementation Language, the SIL. This greatly improved the portability of the language by making it relatively easy to port the virtual machine which hosted
A conditional branch dependent upon the success or failure of the subject evaluation, the pattern evaluation, the pattern match, the object evaluation or the final assignment. It can also be a transfer to code created and compiled by the program itself during a run. A SNOBOL pattern can be very simple or extremely complex. A simple pattern is just a text string (e.g. "ABCD"), but a complex pattern may be
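The statement behaviour described for SNOBOL (a subject matched against a pattern, an optional replacement of the matched portion by an object, and a branch on success or failure) can be sketched in Python. This is only an illustration under simplifying assumptions: the pattern is a plain string, and the names `run_statement` and `demo` are hypothetical, not part of any SNOBOL implementation.

```python
def run_statement(subject, pattern, obj=None):
    """Match a plain-string pattern against subject.

    Returns (new_subject, succeeded). When obj is given and the match
    succeeds, the first matched portion is replaced by obj.
    """
    pos = subject.find(pattern)
    if pos < 0:
        return subject, False            # failure: subject is left unchanged
    if obj is None:
        return subject, True             # pure match, no replacement requested
    new_subject = subject[:pos] + obj + subject[pos + len(pattern):]
    return new_subject, True


def demo(text):
    # Ordinary control flow stands in for the success/failure transfer.
    text, ok = run_statement(text, "ABCD", "xy")
    return ("S", text) if ok else ("F", text)
```

Here the returned flag plays the role of the success or failure signal that a SNOBOL transfer would branch on; a real SNOBOL pattern can of course be far richer than a literal string.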
A data type whose values can be manipulated in all ways permitted to any other data type in the programming language) and by providing operators for pattern concatenation and alternation. SNOBOL4 patterns are a type of object and admit various manipulations, much like later object-oriented languages such as JavaScript, whose patterns are known as regular expressions. In addition, SNOBOL4 strings generated during execution can be treated as programs and either interpreted or compiled and executed (as in
A large structure describing, for example, the complete grammar of a computer language. It is possible to implement a language interpreter in SNOBOL almost directly from a Backus–Naur form expression of it, with few changes. Creating a macro assembler and an interpreter for a completely theoretical piece of hardware could take as little as a few hundred lines, with a new instruction being added with
A pioneer of component-based software engineering and software product line engineering. McIlroy earned his bachelor's degree in engineering physics from Cornell University, and a Ph.D. in applied mathematics from MIT in 1959 for his thesis On the Solution of the Differential Equations of Conical Shells (advisor Eric Reissner). He taught at MIT from 1954 to 1958. McIlroy joined Bell Laboratories in 1958; from 1965 to 1986
A result, several incompatible dialects arose. As SNOBOL3 became more popular, the authors received more and more requests for extensions to the language. They also began to receive complaints about incompatibility and bugs in versions that they hadn't written. To address this and to take advantage of the new computers being introduced in the late 1960s, the decision was taken to develop SNOBOL4 with many extra datatypes and features but based on
A result, the authors decided to extend it and tidy it up. SNOBOL2 did exist but it was a short-lived intermediate development version without user-defined functions and was never released. SNOBOL was rewritten to add functions, both standard and user-defined, and the result was released as SNOBOL3. SNOBOL3 became quite popular and was rewritten by other programmers for computers other than the IBM 7090. As
A single garbage-collected heap. The "Hello, World!" program might be as follows... A simple program to ask for a user's name and then use it in an output sentence... To choose between three possible outputs... To continue requesting input until no more is forthcoming... The classic implementation was on the PDP-10; it has been used to study compilers, formal grammars, and artificial intelligence, especially machine translation and machine comprehension of natural languages. The original implementation
A single line. Complex SNOBOL patterns can do things that would be impractical or impossible using the more primitive regular expressions used in most other pattern-matching languages. Some of this power derives from the so-called "SPITBOL extensions" (which have since been incorporated in basically all modern implementations of the original SNOBOL4 language too), although it is possible to achieve
A way to express BNF grammars, which are equivalent to context-free grammars and more powerful than regular expressions. The "regular expressions" in current versions of AWK and Perl are in fact extensions of regular expressions in the traditional sense, but regular expressions, unlike SNOBOL4 patterns, are not recursive, which gives a distinct computational advantage to SNOBOL4 patterns. (Recursive expressions did appear in Perl 5.10, though, released in December 2007.) The later SL5 (1977) and Icon (1978) languages were designed by Griswold to combine
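The point about recursion can be illustrated with a language that regular expressions in the traditional sense cannot describe: balanced parentheses. The Python sketch below is a hand-written recursive matcher, not a SNOBOL4 pattern; the grammar and the names `match_balanced` and `is_balanced` are assumptions chosen for illustration.

```python
def match_balanced(s, i=0):
    """Return the index just past a balanced-parenthesis run starting at i.

    Illustrative grammar (an assumption, not taken from the article):
        balanced ::= "(" balanced ")" balanced | empty
    """
    if i < len(s) and s[i] == "(":
        j = match_balanced(s, i + 1)        # the rule refers to itself
        if j < len(s) and s[j] == ")":
            return match_balanced(s, j + 1)
    return i                                # the empty alternative


def is_balanced(s):
    # The whole string must be consumed by the recursive rule.
    return match_balanced(s) == len(s)
```

The self-reference inside `match_balanced` is what a non-recursive regular expression lacks; a recursive pattern language can state the same rule directly.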
Is an Adjunct Professor of Computer Science at Dartmouth College. McIlroy is best known for having originally proposed Unix pipelines and developed several Unix tools, such as echo, spell, diff, sort, join, graph, speak, and tr. He was also one of the pioneering researchers of macro processors and programming language extensibility. He participated in the design of multiple influential programming languages, particularly PL/I, SNOBOL, ALTRAN, TMG and C++. His seminal work on software componentization and code reuse makes him
Is distinctive in format and programming style, which are radically different from contemporary procedural languages such as Fortran and ALGOL. SNOBOL4 supports a number of built-in data types, such as integers and limited-precision real numbers, strings, patterns, arrays, and tables (associative arrays), and also allows the programmer to define additional data types and new functions. SNOBOL4's programmer-defined data type facility
Is known as Macro SAP. His 1960 paper was also seminal in the area of extending any (including high-level) programming languages through macro processors. These contributions started the macro-language tradition at Bell Labs ("everything from L6 and AMBIT to C"). McIlroy's macro processing ideas were also the main inspiration for the TRAC macro processor. He also coauthored the M6 macro processor in FORTRAN IV, which
Is practical to even attempt using regular expressions. SNOBOL4 pattern-matching uses a backtracking algorithm similar to that used in the logic programming language Prolog, which provides pattern-like constructs via DCGs. This algorithm makes it easier to use SNOBOL as a logic programming language than is the case for most languages. SNOBOL stores variables, strings and data structures in
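Backtracking over patterns built by concatenation and alternation can be sketched in Python with generators. This is an illustrative sketch only and does not reflect how any SNOBOL4 implementation is written; the helper names `lit`, `alt`, `cat` and `full_match` are hypothetical.

```python
def lit(text):
    """Pattern that matches a literal string at position i."""
    def match(s, i):
        if s.startswith(text, i):
            yield i + len(text)
    return match


def alt(*patterns):
    """Alternation: offer each alternative's matches in order."""
    def match(s, i):
        for p in patterns:
            yield from p(s, i)
    return match


def cat(*patterns):
    """Concatenation: if a later part fails, resume the earlier parts."""
    def match(s, i, k=0):
        if k == len(patterns):
            yield i
            return
        for j in patterns[k](s, i):          # each candidate end position
            yield from match(s, j, k + 1)    # try the rest; backtrack on failure
    return match


def full_match(pattern, s):
    # Succeeds if any sequence of choices consumes the whole string.
    return any(end == len(s) for end in pattern(s, 0))


# Patterns are ordinary values: they can be stored and combined freely.
PAT = cat(alt(lit("a"), lit("ab")), lit("c"))
```

With `PAT`, matching "abc" first tries the alternative "a", fails to find "c" at the next position, and then backtracks to the alternative "ab", which lets the match succeed.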
702-779: The Association for Computing Machinery as national lecturer, Turing Award chairman, member of the publications planning committee, and associate editor for the Communications of the ACM , the Journal of the ACM , and ACM Transactions on Programming Languages and Systems . He also served on the executive committee of CSNET . McIlroy is considered to be a pioneer of macro processors . In 1959, together with Douglas E. Eastwood of Bell Labs, he introduced conditional and recursive macros into popular SAP assembler, creating what
741-752: The Early PL/I (EPL) compiler in TMG for the Multics project. Around 1965, McIlroy, together with W. Stanley Brown, implemented the original version of ALTRAN programming language for IBM 7094 computers. McIlroy has also made a significant influence on design of the programming language C++ (e.g., he proposed the stream output operator << ). In the 1990s, McIlroy worked on improving sorting techniques, particularly he co-authored an optimized qsort with Jon Bentley . In 1969, he contributed an efficient algorithm to generate all spanning trees in
780-663: The Michigan Terminal System (MTS) provided pattern matching based on SNOBOL4 patterns. Several implementations are currently available. Macro SNOBOL4 in C written by Phil Budne is a free, open source implementation, capable of running on almost any platform. Catspaw, Inc provided a commercial implementation of the SNOBOL4 language for many different computer platforms, including DOS, Macintosh, Sun, RS/6000, and others, and these implementations are now available free from Catspaw. Minnesota SNOBOL4, by Viktors Berstis,
819-413: The eval function of other languages). SNOBOL4 was quite widely taught in larger U.S. universities in the late 1960s and early 1970s and was widely used in the 1970s and 1980s as a text manipulation language in the humanities . In the 1980s and 1990s, its use faded as newer languages such as AWK and Perl made string manipulation by means of regular expressions fashionable. SNOBOL4 patterns include
the backtracking of SNOBOL4 pattern matching with more standard ALGOL-like structuring. The initial SNOBOL language was created as a tool to be used by its authors to work with the symbolic manipulation of polynomials. It was written in assembly language for the IBM 7090. It had a simple syntax, only one datatype (the string), no functions, no declarations, and very little error control. However, despite its simplicity and its "personal" nature, its use began to spread to other groups. As
897-460: The closest PC implementation to the original IBM mainframe version (even including Fortran-like FORMAT statement support) is also free. Although SNOBOL itself has no structured programming features, a SNOBOL preprocessor called Snostorm was designed and implemented during the 1970s by Fred G. Swartz for use under the Michigan Terminal System (MTS) at the University of Michigan . Snostorm
936-402: The creation of SNOBOL , which stand out apart from other programming languages of the era (during the 50s and 60s) for having patterns as first class data type. Bob Fabry, University of Chicago, was responsible for COMIT II on Compatible Time Sharing System . SNOBOL SNOBOL4 stands apart from most programming languages of its era by having patterns as a first-class data type ( i.e.
975-516: The entire semester was focused on implementing SITBOL. It was over 80% complete by the end of the semester and was subsequently completed by Professor Gimpel and several students over the summer. SITBOL was a full-featured, high-performance SNOBOL4 interpreter. The Gnat Ada Compiler comes with a package (GNAT.Spitbol) that implements all of the Spitbol string manipulation semantics. This can be called from within an Ada program. The file editor for
1014-544: The equivalent capabilities normally thought of as "structured programming", most notably nested if/then/else type constructs. These features have since been added to most recent SNOBOL4 implementations. After many years as a commercial product, in April 2009 SPITBOL was released as free software under the GNU General Public License . According to Dave Farber, he, Griswold and Polonsky "finally arrived at
1053-534: The facilities that the interpreter provides. The classic implementation on the PDP-10 was quite slow, and in 1972 James Gimpel of Bell Labs, Holmdel, N.J. designed a native implementation of SNOBOL4 for the PDP-10 that he named SITBOL. He used the design as the basis of a graduate class in string processing that he taught that year at Stevens Institute of Technology (which is why it was named SITBOL). Students were given sections to implement (in PDP-10 assembler) and
1092-546: The idea of Unix pipelines. He also implemented TMG compiler-compiler in PDP-7 and PDP-11 assembly, which became the first high-level programming language running on Unix, prompting development and influencing Ken Thompson 's B programming language and Stephen Johnson's Yacc parser-generator. McIlroy also took over from Dennis Ritchie compilation of the Unix manual "as a labor of love". Particularly, he edited volume 1 of
1131-448: The initial SNOBOL implementation of 1962, and figured prominently in subsequent work, eventually leading to its machine-independent implementation language SIL. The table type ( associative array ) was added to SNOBOL4 on McIlroy's insistence in 1969. In 1960s, he participated in the design of PL/I programming language. He was a member of the IBM – SHARE committee that designed the language and, together with Robert Morris , wrote
1170-399: The manual pages for Version 7 Unix. According to Sandy Fraser : "The fact that there was a manual, that he [McIlroy] insisted on a high standard for the manual, meant that he insisted on a high standard for every one of the programs that was documented". McIlroy influenced the design and implementation of SNOBOL programming language. His string manipulation macros were used extensively in
1209-647: The name Symbolic EXpression Interpreter SEXI." All went well until one day I was submitting a batch job to assemble the system and as normal on my JOB card — the first card in the deck, I, in BTL standards, punched my job and my name — SEXI Farber. One of the Comp Center girls looked at it and said, "That's what you think" in a humorous way. That made it clear that we needed another name!! We sat and talked and drank coffee and shot rubber bands and after much too much time someone said — most likely Ralph — "We don't have
1248-400: The pattern itself during the matching operation. Patterns can be saved like any other first-class data item, and can be concatenated, used within other patterns, and used to create very complex and sophisticated pattern expressions. It is possible to write, for example, a SNOBOL4 pattern which matches "a complete name and international postal mailing address", which is well beyond anything that
1287-438: The same power without them. Part of this power comes from the side effects that it is possible to produce during the pattern matching operation, including saving numerous intermediate/tentative matching results and the ability to invoke user-written functions during the pattern match which can perform nearly any desired processing, and then influence the ongoing direction the interrupted pattern match takes, or even to indeed change
1326-431: The translator by recreating its virtual instructions on any machine which included a macro assembler or indeed a high level language. The machine-independent language SIL arose as a generalization of string manipulation macros by Douglas McIlroy , which were used extensively in the initial SNOBOL implementation. In 1969, McIlroy influenced the language again by insisting on addition of the table type to SNOBOL4. SNOBOL
Was advanced at the time; it is similar to the records of the earlier COBOL and the later Pascal programming languages. All SNOBOL command lines share a common form made up of five elements, each of which is optional. In general, the subject is matched against the pattern. If the object is present, any matched portion is replaced by the object via rules for replacement. The transfer can be an absolute branch or
Was head of its Computing Techniques Research Department (the birthplace of the Unix operating system), and thereafter was Distinguished Member of Technical Staff. From 1967 to 1968, McIlroy also served as a visiting lecturer at Oxford University. In 1997, McIlroy retired from Bell Labs, and took a position as an adjunct professor in the Dartmouth College Computer Science Department. He has previously served
Was on an IBM 7090 at Bell Labs, Holmdel, N.J. SNOBOL4 was specifically designed for portability; the first implementation was started on an IBM 7094 in 1966 but completed on an IBM 360 in 1967. It was rapidly ported to many other platforms. It is normally implemented as an interpreter because of the difficulty in implementing some of its very high-level features, but there is a compiler, the SPITBOL compiler, which provides nearly all
Was used at the eight to fifteen sites that ran MTS. It was also available at University College London (UCL) between 1982 and 1984. Snocone by Andrew Koenig adds block-structured constructs to the SNOBOL4 language. Snocone is a self-contained programming language, rather than a proper superset of SNOBOL4. The SPITBOL implementation also introduced a number of features which, while not using traditional structured programming keywords, nevertheless can be used to provide many of
Was used in ALTRAN and later was ported to and included in early versions of Unix. Throughout the 1960s and 1970s McIlroy contributed programs for Multics (such as RUNOFF) and Unix operating systems (such as diff, echo, tr, join and look), versions of which are widespread to this day through adoption of the POSIX standard and Unix-like operating systems. He introduced