Misplaced Pages

Sed (disambiguation)

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed ("editor", 1971) and the earlier qed ("quick editor", 1965–66). It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.


Sed is a Unix utility for processing text. Sed or SED may also refer to:

First appearing in Version 7 Unix, sed is one of the early Unix commands built for command line processing of data files. It evolved as the natural successor to the popular grep command. The original motivation was an analogue of grep (g/re/p) for substitution, hence "g/re/s". Foreseeing that further special-purpose programs for each command would also arise, such as g/re/d, McMahon wrote

A general-purpose line-oriented stream editor, which became sed. The syntax for sed, notably the use of / for pattern matching, and s/// for substitution, originated with ed, the precursor to sed, which was in common use at the time, and the regular expression syntax has influenced other languages, notably ECMAScript and Perl. Later, the more powerful language AWK developed, and these functioned as cousins, allowing powerful text processing to be done by shell scripts. sed and AWK are often cited as progenitors and inspiration for Perl, and influenced Perl's syntax and semantics, notably in

A line in a way more complicated than a regex extracting and template replacement, though arbitrarily complicated transforms are in principle possible by using the hold buffer. Conversely, for simpler operations, specialized Unix utilities such as grep (print lines matching a pattern), head (print the first part of a file), tail (print the last part of a file), and tr (translate or delete characters) are often preferable. For

A list of commands consisting of a Boolean-valued guard (corresponding to a condition) and its corresponding statement. In GCL, exactly one of the statements whose guards are true is evaluated, but which one is arbitrary. In this code the G_i's are the guards and the S_i's are the statements. If none of the guards is true, the program's behavior is undefined. GCL is intended primarily for reasoning about programs, but similar notations have been implemented in Concurrent Pascal and occam. Up to Fortran 77,

A script file such as subst.sed, and then use the -f option to run the commands (such as s/x/y/g) from the file: Any number of commands may be placed into the script file, and using a script file also avoids problems with shell escaping or substitutions. Such a script file may be made directly executable from the command line by prepending it with a "shebang line" containing the sed command and assigning

A space. A separate special buffer, the hold space, may be used by a few sed commands to hold and accumulate text between cycles. sed's command language has only two variables (the "hold space" and the "pattern space") and GOTO-like branching functionality; nevertheless, the language is Turing-complete, and esoteric sed scripts exist for games such as sokoban, arkanoid, chess, and tetris. A main loop executes for each line of
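As a small illustration of how the hold space accumulates text between cycles, the following sketch reverses the order of its input lines. The sample input and the variable name `reversed` are assumptions made for this example only.

```shell
# 1!G  on every line except the first, append the hold space to the pattern space
# h    copy the (now longer) pattern space back into the hold space
# $p   on the last line, print the accumulated text; -n suppresses default output
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

Each cycle pushes the current line in front of everything seen so far, so the hold space ends up holding the whole input in reverse order.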

A structured programming language, structured programming makes this easier and enforces it. Structured if–then–else statements like the example above are one of the key elements of structured programming, and they are present in most popular high-level programming languages such as C, Java, JavaScript and Visual Basic. The else keyword is made to target a specific if–then statement preceding it, but for nested if–then statements, classic programming languages such as ALGOL 60 struggled to define which specific statement to target. Without clear boundaries for which statement

A switch statement, these can be produced by a sequence of else if statements. Many languages support if expressions, which are similar to if statements, but return a value as a result. Thus, they are true expressions (which evaluate to a value), not statements (which may not be permitted in the context of a value). ALGOL 60 and some other members of the ALGOL family allow if–then–else as an expression: In dialects of Lisp – Scheme, Racket and Common Lisp –

Is executed, or to the unit type () if no branch is executed. If a branch does not provide a return value, it evaluates to () by default. To ensure the if expression's type is known at compile time, each branch must evaluate to a value of the same type. For this reason, an else branch is effectively compulsory unless the other branches evaluate to (), because an if without an else can always evaluate to () by default. The Guarded Command Language (GCL) of Edsger Dijkstra supports conditional execution as

Is greater than zero" – and evaluates that condition. If the condition is true, the statements following the then are executed. Otherwise, the execution continues in the following branch – either in the else block (which is usually optional), or if there is no else branch, then after the end If. After either branch has been executed, control returns to the point after the end If. In early programming languages, especially some dialects of BASIC in

Is named 'bar'." is printed on the screen. In Visual Basic and some other languages, a function called IIf is provided, which can be used as a conditional expression. However, it does not behave like a true conditional expression, because both the true and false branches are always evaluated; it is just that the result of one of them is thrown away, while the result of the other is returned by


Is not used in structured programming. In practice it has been observed that most arithmetic IF statements reference the following statement with one or two of the labels. This was the only conditional control statement in the original implementation of Fortran on the IBM 704 computer. On that computer the test-and-branch op-code had three addresses for those three states. Other computers would have "flag" registers such as positive, zero, negative, even, overflow, carry, associated with

Is possible to combine several conditions. Only the statements following the first condition that is found to be true will be executed. All other statements will be skipped. For example, for a shop offering as much as a 30% discount for an item: In the example above, if the discount is 10%, then the first if statement will be evaluated as true and "you have to pay $30" will be printed out. All other statements below that first if statement will be skipped. The elseif statement, in

Is possible to write terse one-liner programs. For example, the sed program given by: will print the first 10 lines of input, then stop. The following example shows a typical, and the most common, use of sed: substitution. This usage was indeed the original motivation for sed: In some versions of sed, the expression must be preceded by -e to indicate that an expression follows. The s stands for substitute, while
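A one-liner of the kind described here can be sketched as follows. The command `10q` (quit at line 10) is a standard way to print the first 10 lines; the twelve-line sample input and the variable name `first_ten` are assumptions for illustration.

```shell
# Feed twelve numbered lines to sed; the q command with address 10
# prints line 10 (default output) and then stops reading input.
first_ten=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 | sed 10q)
printf '%s\n' "$first_ten"
```

Because the main loop and the default print are implicit, the whole program is just an address and a command.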

Is used in many programming languages. Although the syntax varies from language to language, the basic structure (in pseudocode form) looks like this: For example: In the example code above, the part represented by (Boolean condition) constitutes a conditional expression, having intrinsic value (e.g., it may be substituted by either of the values True or False) but having no intrinsic meaning. In contrast,

Is which, an else keyword could target any preceding if–then statement in the nest, as parsed. can be parsed as or depending on whether the else is associated with the first if or second if. This is known as the dangling else problem, and is resolved in various ways, depending on the language (commonly via the end if statement or {...} brackets). By using else if, it

The else if construct is not present, nor is it present in the many syntactical derivatives of C, such as Java, ECMAScript, and so on. This works because in these languages, any single statement (in this case if cond ...) can follow a conditional without being enclosed in a block. This design choice has a slight "cost". Each else if branch effectively adds an extra nesting level. This complicates

The g stands for global, which means that all matching occurrences in the line would be replaced. The regular expression (i.e. pattern) to be searched is placed after the first delimiting symbol (slash here) and the replacement follows the second symbol. Slash (/) is the conventional symbol, originating in the character for "search" in ed, but any other could be used to make syntax more readable if it does not occur in
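The effect of the g flag and of choosing a different delimiter can be sketched as follows. The sample strings and variable names are assumptions chosen for this illustration.

```shell
# Without g, only the first match on each line is replaced.
first_only=$(printf 'x-x-x\n' | sed 's/x/y/')
# With g, every match on the line is replaced.
all_matches=$(printf 'x-x-x\n' | sed 's/x/y/g')
# Using | as the delimiter avoids escaping the slashes in a path.
new_path=$(printf '/usr/bin/tool\n' | sed 's|/usr/bin|/opt/bin|')
printf '%s\n' "$first_only" "$all_matches" "$new_path"
```

With the conventional slash delimiter, the last command would need to be written as `s/\/usr\/bin/\/opt\/bin/`, which is harder to read.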

The { instruction starts a subsequence of commands (up to the matching }); in most cases, it will be conditioned by an address pattern. Under Unix, sed is often used as a filter in a pipeline: That is, a program such as "generateData" generates data, and then sed makes the small change of replacing x with y. For example: It is often useful to put several sed commands, one command per line, into

The Ada language for example, is simply syntactic sugar for else followed by if. In Ada, the difference is that only one end if is needed, if one uses elseif instead of else followed by if. PHP uses the elseif keyword both for its curly brackets or colon syntaxes. Perl provides the keyword elsif to avoid the large number of braces that would be required by multiple if and else statements. Python uses

The command line (-e option) or read from a separate file (-f option). Commands in the sed script may take an optional address, in terms of line numbers or regular expressions. The address determines when the command is run. For example, 2d would only run the d (delete) command on the second input line (printing all lines but the second), while /^ /d would delete all lines beginning with
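The two kinds of address mentioned here, a line number and a regular expression, can be tried on a small sample. The four-line input and the variable names are assumptions for illustration.

```shell
input=$(printf 'first\nsecond\n indented\nlast\n')
# Line-number address: delete only the second input line.
without_second=$(printf '%s\n' "$input" | sed '2d')
# Regular-expression address: delete every line that begins with a space.
without_indented=$(printf '%s\n' "$input" | sed '/^ /d')
printf '%s\n' "$without_second" "$without_indented"
```

In both cases all other lines pass through unchanged, because sed prints the pattern space by default at the end of each cycle.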


The pattern space. Each line read starts a cycle. To the pattern space, sed applies one or more operations which have been specified via a sed script. sed implements a programming language with about 25 commands that specify the operations on the text. For each input line, after running the script, sed ordinarily outputs the pattern space (the line as modified by the script) and begins the cycle again with

The 1980s home computers, an if–then statement could only contain GOTO statements (equivalent to a branch instruction). This led to a hard-to-read style of programming known as spaghetti programming, with programs in this style called spaghetti code. As a result, structured programming, which allows (virtually) arbitrary statements to be put in statement blocks inside an if statement, gained in popularity, until it became

The Algol-family if–then–else expressions (in contrast to a statement) (and similar in Ruby and Scala, among others). To accomplish the same using an if-statement, this would take more than one line of code (under typical layout conventions), and require mentioning "my_variable" twice: Some argue that the explicit if/then statement is easier to read and that it may compile to more efficient code than

The GNU Project wrote a new version of sed based on the new GNU regular expression library. The current minised contains some extensions to BSD sed but is not as feature-rich as GNU sed. Its advantage is that it is very fast and uses little memory. It is used on embedded systems and is the version of sed provided with Minix. sed is a line-oriented text processing utility: it reads text, line by line, from an input stream or file, into an internal buffer called

The IIf function. In Tcl if is not a keyword but a function (in Tcl known as command or proc). For example invokes a function named if passing 2 arguments: The first one being the condition and the second one being the true branch. Both arguments are passed as strings (in Tcl everything within curly brackets is a string). In the above example the condition is not evaluated before calling

The class Boolean as an abstract method that takes two parameters, both closures. Boolean has two subclasses, True and False, which both define the method, True executing the first closure only, False executing the second closure only. JavaScript uses if-else statements similar to those in C languages. A Boolean value is accepted within parentheses between the reserved if keyword and

The combination of this expression, the If and Then surrounding it, and the consequent that follows afterward constitute a conditional statement, having intrinsic meaning (e.g., expressing a coherent logical rule) but no intrinsic value. When an interpreter finds an If, it expects a Boolean condition – for example, x > 0, which means "the variable x contains a number that

The executable permission to the file. For example, a file subst.sed can be created with contents: The file may then be made executable by the current user with the chmod command: The file may then be executed directly from the command line: The -i option, introduced in GNU sed, allows in-place editing of files (actually, a temporary output file is created in the background, and then
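The script-file workflow described above can be sketched as follows. The scratch directory, the sample input, the second file name `subst_exec.sed` and the use of `command -v sed` to find the interpreter path for the shebang line are all assumptions made for this example; the location of sed differs between systems.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
# A script file with one sed command per line.
printf 's/x/y/g\n' > "$workdir/subst.sed"
# -f reads the commands from the file instead of the command line.
from_file=$(printf 'x marks the xylophone\n' | sed -f "$workdir/subst.sed")
printf '%s\n' "$from_file"

# Executable variant: a shebang line naming sed with -f, then the commands.
printf '#!%s -f\ns/x/y/g\n' "$(command -v sed)" > "$workdir/subst_exec.sed"
chmod u+x "$workdir/subst_exec.sed"
printf 'x marks the xylophone\n' | "$workdir/subst_exec.sed"
```

Keeping the commands in a file means the shell never sees them, so no quoting or escaping of the sed script is needed.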

The first of which was inspired to a great extent by ALGOL: In Haskell 98, there is only an if expression, no if statement, and the else part is compulsory, as every expression must have some value. Logic that would be expressed with conditionals in other languages is usually expressed with pattern matching in recursive functions. Because Haskell is lazy, it is possible to write control structures, such as if, as ordinary expressions;

The following text: The sed script below will turn the text above into the following text. Note that the script affects only the input lines that start with a space: The script is: This is explained as: This can be expressed on a single line via semicolons: While simple and limited, sed is sufficiently powerful for a large number of purposes. For more sophisticated processing, more powerful languages such as AWK or Perl are used instead. These are particularly used if transforming
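The example text and script referred to above did not survive in this snapshot. The following is a reconstruction under assumptions, not the original listing: the sample sentences are invented, and the script uses the standard N, P and D commands to join a line to the previous one when it starts with a space.

```shell
sample=$(printf 'This is my dog,\n whose name is Frank.\nThis is my fish,\nwhose name is George.\n')
# N         append the next input line to the pattern space
# s/\n / /  if that line starts with a space, replace newline+space by one space
# P         print the pattern space up to the first newline
# D         delete up to the first newline and restart the cycle with the rest
joined=$(printf '%s\n' "$sample" | sed 'N; s/\n / /; P; D')
printf '%s\n' "$joined"
```

Only the line that begins with a space is merged into its predecessor; the "fish" and "George" lines are left on separate lines because the second does not start with a space.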


The following uses the d command to filter out lines that only contain spaces, or only contain the end of line character: This example uses some of the following regular expression metacharacters (sed supports the full range of regular expressions): Complex sed constructs are possible, allowing it to serve as a simple, but highly specialized, programming language. Flow of control, for example, can be managed by
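A filter of the kind described here, deleting lines that are empty or contain only spaces, might look like this. The sample input and the variable name are assumptions for illustration.

```shell
# ^ anchors the start of the line, ' *' matches zero or more spaces,
# $ anchors the end of the line; d deletes every line that matches.
non_blank=$(printf 'alpha\n\n   \nbeta\n' | sed '/^ *$/d')
printf '%s\n' "$non_blank"
```

The pattern only accounts for spaces; lines containing tabs would need a character class such as `[[:space:]]` instead.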

The function. Instead, the implementation of the if function receives the condition as a string value and is responsible for evaluating this string as an expression in the caller's scope. Such a behavior is possible by using uplevel and expr commands: Because if is actually a function it also returns a value: In Rust, if is always an expression. It evaluates to the value of whichever branch

The input stream, evaluating the sed script on each line of the input. Lines of a sed script are each a pattern-action pair, indicating what pattern to match and which action to perform, which can be recast as a conditional statement. Because the main loop, working variables (pattern space and hold space), input and output streams, and default actions (copy line to pattern space, print pattern space) are implicit, it

The job for the compiler (or the people who write the compiler), because the compiler must analyse and implement arbitrarily long else if chains recursively. If all terms in the sequence of conditionals are testing the value of a single expression (e.g., if x=0 ... else if x=1 ... else if x=2 ...), an alternative is the switch statement, also called case-statement or select-statement. Conversely, in languages that do not have

The language Fortran has had an arithmetic if statement which jumps to one of three labels depending on whether its argument e is e < 0, e = 0, e > 0. This was the earliest conditional statement in Fortran. Where e is any numeric expression (not necessarily an integer). This is equivalent to this sequence, where e is evaluated only once. Arithmetic if is an unstructured control statement, and

The last arithmetic operations and would use instructions such as 'Branch if accumulator negative' then 'Branch if accumulator zero' or similar. Note that the expression is evaluated once only, and in cases such as integer arithmetic where overflow may occur, the overflow or carry flags would be considered also. In contrast to other languages, in Smalltalk the conditional statement is not a language construct but defined in

The lazy evaluation means that an if function can evaluate only the condition and proper branch (where a strict language would evaluate all three). It can be written like this: C and C-like languages have a special ternary operator (?:) for conditional expressions with a function that may be described by a template like this: This means that it can be inlined into expressions, unlike if-statements, in C-like languages: which can be compared to

The matching and substitution operators. GNU sed added several new features, including in-place editing of files. Super-sed is an extended version of sed that includes regular expressions compatible with Perl. Another variant of sed is minised, originally reverse-engineered from 4.1BSD sed by Eric S. Raymond and currently maintained by René Rebe. minised was used by the GNU Project until

The next line. Other end-of-script behaviors are available through sed options and script commands, e.g. d to delete the pattern space, q to quit, N to add the next line to the pattern space immediately, and so on. Thus a sed script corresponds to the body of a loop that iterates through the lines of a stream, where the loop itself and the loop variable (the current line number) are implicit and maintained by sed. The sed script can either be specified on

The norm even in most BASIC programming circles. Such mechanisms and principles were based on the older but more advanced ALGOL family of languages, and ALGOL-like languages such as Pascal and Modula-2 influenced modern BASIC variants for many years. While it is possible, using only GOTO statements in if–then statements, to write programs that are not spaghetti code and are just as well structured and readable as programs written in


The original file is replaced by the temporary file). For example: This "Hello, world!" script is in a file (e.g., script.txt) and invoked with sed -f script.txt inputFileName, where "inputFileName" is the input text file. The script changes "inputFileName" line #1 to "Hello, world!" and then quits, printing the result before sed exits. Any input lines past line #1 are not read, and not printed. So

The pattern or replacement; this is useful to avoid "leaning toothpick syndrome". The substitution command, which originates in search-and-replace in ed, implements simple parsing and templating. The regexp provides both pattern matching and saving text via sub-expressions, while the replacement can be either literal text, or a format string containing the characters & for "entire match" or

The sole output is "Hello, world!". The example emphasizes many key characteristics of sed: Below follow various sed scripts; these can be executed by passing as an argument to sed, or put in a separate file and executed via -f or by making the script itself executable. To replace any instance of a certain word in a file with "REDACTED", such as an IRC password, and save the result: To delete any line containing

The special escape sequences \1 through \9 for the nth saved sub-expression. For example, sed -r "s/(cat|dog)s?/\1s/g" replaces all occurrences of "cat" or "dog" with "cats" or "dogs", without duplicating an existing "s": (cat|dog) is the 1st (and only) saved sub-expression in the regexp, and \1 in the format string substitutes this into the output. Besides substitution, other forms of simple processing are possible, using some 25 sed commands. For example,
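The cat/dog substitution from the text, together with the & form of the replacement, can be tried as follows. This sketch uses -E rather than the -r spelling in the text to enable extended regular expressions, on the assumption that the installed sed accepts it (current GNU and BSD versions do); the sample inputs and variable names are also assumptions.

```shell
# \1 re-inserts whatever (cat|dog) matched; the optional s is consumed
# by the pattern, so an existing plural is not doubled.
plurals=$(printf 'cat dogs cats dog\n' | sed -E 's/(cat|dog)s?/\1s/g')
# & stands for the entire match; here each run of digits is bracketed.
marked=$(printf 'a 12 b 7\n' | sed 's/[0-9][0-9]*/<&>/g')
printf '%s\n' "$plurals" "$marked"
```

The first command relies on a saved sub-expression, the second on the whole match, which are the two templating mechanisms the substitution command offers.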

The special keyword elif because structure is denoted by indentation rather than braces, so a repeated use of else and if would require increased indentation after every condition. Some implementations of BASIC, such as Visual Basic, use ElseIf too. Similarly, the earlier UNIX shells (later gathered up to the POSIX shell syntax) use elif too, but giving the choice of delimiting with spaces, line breaks, or both. However, in many languages more directly descended from Algol, such as Simula, Pascal, BCPL and C, this special syntax for

The specific tasks they are designed to carry out, such specialized utilities are usually simpler, clearer, and faster than a more general solution such as sed. The ed/sed commands and syntax continue to be used in descendant programs, such as the text editors vi and vim. An analog to ed/sed is sam/ssam, where sam is the Plan 9 editor, and ssam is a stream interface to it, yielding functionality similar to sed.

Conditional (computer programming)

In computer science, conditionals (that is, conditional statements, conditional expressions and conditional constructs) are programming language constructs that perform different computations or actions or return different values depending on

The ternary operator, while others argue that concise expressions are easier to read than statements spread over several lines containing repetition. First, when the user runs the program, a cursor appears waiting for the reader to type a number. If that number is greater than 10, the text "My variable is named 'foo'." is displayed on the screen. If the number is smaller than 10, then the message "My variable

The use of a label (a colon followed by a string) and the branch instruction b, as well as the conditional branch t. An instruction b followed by a valid label name will move processing to the command following that label. The t instruction will only do so if there was a successful substitution since the previous t (or the start of the program, in case of the first t encountered). Additionally,
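A loop built from a label and the conditional branch t can be sketched as follows. The label name `a`, the sample input and the variable name are assumptions for illustration; each command is passed with its own -e so that the label is terminated unambiguously on different sed implementations.

```shell
# :a        define a label named a
# s/  / /   replace one pair of adjacent spaces with a single space
# ta        if that substitution succeeded, jump back to :a and try again
squeezed=$(printf 'a    b  c\n' | sed -e ':a' -e 's/  / /' -e 'ta')
printf '%s\n' "$squeezed"
```

The loop ends when the substitution no longer matches, at which point t does not branch and the line is printed with single spaces only.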

The value of a Boolean expression, called a condition. Conditionals are typically implemented by selectively executing instructions. Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are imperative constructs executed for side-effect, while conditional expressions return values. Many programming languages (such as C) have distinct conditional statements and conditional expressions. Although in pure functional programming, conditional expressions do not have side-effects, many languages with conditional expressions (such as Lisp) support conditional side-effects. The if–then or if–then–else construction

The word "yourword" (the address is '/yourword/'): To delete all instances of the word "yourword": To delete two words from a file simultaneously: To express the previous example on one line, such as when entering at the command line, one may join two commands via the semicolon: In the next example, sed, which usually only works on one line, removes newlines from sentences where the second line starts with one space. Consider
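The deletion tasks listed above lost their command listings in this snapshot. The following sketch shows one plausible form of each; the sample inputs, the words "firstword" and "secondword", and the variable names are assumptions, not the original examples.

```shell
# Delete every line that contains the word (the address is /yourword/).
lines_removed=$(printf 'keep this\ndrop yourword here\n' | sed '/yourword/d')
# Delete every instance of the word but keep the rest of each line.
words_removed=$(printf 'say yourword twice, yourword\n' | sed 's/yourword//g')
# Delete two words at once: one command per -e option ...
two_words=$(printf 'alpha firstword beta secondword gamma\n' |
  sed -e 's/firstword//g' -e 's/secondword//g')
# ... or both commands in one script, joined by a semicolon.
two_words_inline=$(printf 'alpha firstword beta secondword gamma\n' |
  sed 's/firstword//g; s/secondword//g')
printf '%s\n' "$lines_removed" "$words_removed" "$two_words" "$two_words_inline"
```

Note that removing a word this way leaves the surrounding spaces in place, so the output can contain doubled or trailing spaces.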
