sed

linux

Important:
http://www.grymoire.com/Unix/Sed.html - done reading
http://how-to.linuxcareer.com/learning-linux-commands-sed - a good collection of sed one-liners
http://sed.sourceforge.net/sed1line.txt - a very extensive collection of sed one-liners

http://www.panix.com/~elflord/unix/sed.html - done looking over
http://www.tutorialspoint.com/unix/unix-regular-expressions.htm - done looking over, good examples on ranges, regular expression, and things
https://www.digitalocean.com/community/tutorials/the-basics-of-using-the-sed-stream-editor-to-manipulate-text-in-linux - done looking over
http://www.linuxhowtos.org/System/sed_tutorial.htm - done looking over
http://stat.wharton.upenn.edu/~buja/STAT-540/sed-tutorial.htm - done looking over
https://quickleft.com/blog/command-line-tutorials-sed-awk/ - done looking over
http://www-users.york.ac.uk/~mijp1/teaching/2nd_year_Comp_Lab/guides/grep_awk_sed.pdf - done looking over
http://en.flossmanuals.net/command-line/sed/ - done looking over
http://www.linuxjournal.com/article/7231 - done looking over
http://unix.stackexchange.com/questions/148273/sed-ranges-arent-always-able-to-match-only-one-line
http://sed.sourceforge.net/sedfaq3.html

http://www.thegeekstuff.com/tag/sed-tips-and-tricks/
http://www.thegeekstuff.com/2009/10/unix-sed-tutorial-advanced-sed-substitution-examples/

What is sed?

A special editor for modifying text files. If you want to write a program to make changes in a text file, sed is the tool to use.

Sed is a stream editor. If that sounds strange, picture a stream flowing through a pipe (perhaps a pipe that is not completely filled up).

What is the basic syntax for using sed?

sed -e 'addressRange command'

The range component is optional, and we do not need to have a space between the address range and the command.

What is the default action in sed?

The default action is to print each line. Unless we specify the -n command line option, sed prints every line, even when none of the addresses match.
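
A small sketch of the default print action, assuming an invented three-line input. Without -n the matching line appears twice (once from the p command, once from the default action); with -n only the explicit p prints.

```shell
# Sample input is invented for illustration.
input=$(printf 'alpha\nbeta\ngamma\n')

# No -n: every line is printed, and the matching line is printed twice.
with_default=$(printf '%s\n' "$input" | sed '/beta/p')

# With -n: only the explicit p command produces output.
without_default=$(printf '%s\n' "$input" | sed -n '/beta/p')

printf '%s\n---\n%s\n' "$with_default" "$without_default"
```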

What do we mean when we use $ as an address?

The $ denotes the last line of the input.

What are the commands available within sed?

  • s: substitution
  • a: append / add
  • i: insert lines
  • c: change
  • d: deletes the current pattern space, reads in the next line, puts the new line into the pattern space, and aborts the current command, and starts execution at the first sed command. This is called starting a new "cycle."
  • D: deletes the first portion of the pattern space, up to and including the first new line character, leaving the rest of the pattern space alone. Like "d," it stops the current command and starts the command cycle over again, so commands after the "D" command in the same curly brace group are ignored. Unlike "d," it does not read a new line of input while text remains in the pattern space; a new line is read only when the pattern space has been emptied. It will not print the current pattern space. You must print it yourself, a step earlier.
  • n: print out the current pattern space (unless the "-n" flag is used), empty the current pattern space, and read in the next line of input.
  • N: does not print out the current pattern space and does not empty the pattern space. It reads in the next line, but appends a new line character along with the input line itself to the pattern space.
  • p: prints the entire pattern space.
  • P: only prints the first part of the pattern space, up to the NEWLINE character. Neither the "p" nor the "P" command changes the pattern space.
  • q: quit
  • r: read the content of a file and print its content to the output stream
  • l: (lower case letter L, not a pipe) prints the current pattern space, showing unprintable characters as escapes. Useful for debugging.
  • =: print the line number
  • x: eXchanges the pattern space with the hold buffer
  • h: copies the pattern buffer into the hold buffer. The pattern buffer is unchanged.
  • H: allows you to combine several lines in the hold buffer. It acts like the "N" command as lines are appended to the buffer, with a "\n" between the lines. You can save several lines in the hold buffer, and print them only if a particular pattern is found later.
  • g: copies the hold space to the pattern space, instead of exchanging the two. This overwrites whatever was in the pattern space.
  • G: appends to the pattern space instead of overwriting it. It adds a new line to the pattern space, and copies the hold space after the new line.
  • w: write to file
  • y: Transform like the tr shell command

What are the command line options?

  • -e: specify a sed command or expression
  • -n: tells sed not to print anything unless an explicit request to print is found.
  • -f: specifies a file that contain sed commands
  • --version: prints the version of sed you are using (GNU sed)
  • --posix: The GNU version of sed has many features that are not available in other versions. When portability is important, test your script with the --posix option.
  • -s: tells GNU sed to treat the files as independent files
  • -i: (lower case letter I) allows in-place editing
  • --follow-symlinks: handles symbolic link appropriately
  • -b: Binary. Handles MS-DOS newlines / line feed
  • -E or -r: turn on support for extended regular expression

What are some practical examples?

cat filename | sed '10q'        # uses piped input.  Print the first 10 lines and then quit.  Replace the Unix head command.
sed '10q' filename              # same effect, avoids a useless "cat"
sed '10q' filename > newfile    # redirects output to disk

sed 's/eth0/eth1/g' file1 > file2 # changes every instance of eth0 in file1 to eth1 and writes the result to file2.
sed 's/bad word//g' # deletes every instance of "bad word"

# read in the file named old, replace the first instance of day on each line
# with night, and send the result to a new file named new.
sed 's/day/night/' <old >new

# switch two words
sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'

sed 's/Nick/John/g' report.txt
sed -E 's/Nick|nick/John/g' report.txt # alternation with | needs extended regular expressions (-E or -r)

sed 's/^/        /' file.txt >file_new.txt # Add 8 spaces to the left of a text for pretty printing.

# Display only one paragraph, starting with "Of course" and ending in "attention you pay"
sed -n '/Of course/,/attention you pay/p' myfile

sed -n '12,18p' file.txt # Show only lines 12-18 of file.txt

sed '12,18d' file.txt # Show all of file.txt except for lines from 12 to 18

sed G file.txt # Double-space file.txt

sed '5!s/ham/cheese/' file.txt # Replace ham with cheese in file.txt except in the 5th line

sed '$d' file.txt # Delete the last line

sed '/boom/!s/aaa/bb/' file.txt # Unless boom is found replace aaa with bb

sed '17,/disk/d' file.txt # Delete line 17 through the next line containing /disk/

sed 's/.$//' file.txt # A way to replace dos2unix :) It deletes the last character of every line, so it assumes every line ends in CR/LF

sed 's/^[[:blank:]]*//' file.txt # Delete all spaces and tabs in front of every line of file.txt

sed 's/[[:blank:]]*$//' file.txt # Delete all spaces and tabs at the end of every line of file.txt

sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' file.txt # Delete all spaces and tabs in front and at the end of every line of file.txt

sed 's/foo/bar/' file.txt # Replace foo with bar only for the first instance in a line.
sed 's/foo/bar/4' file.txt # Replace foo with bar only for the 4th instance in a line.
sed 's/foo/bar/g' file.txt  # Replace foo with bar for all instances in a line.
sed '/baz/s/foo/bar/g' file.txt # Only if line contains baz, substitute foo with bar

sed -n '/RE/{p;q;}' file.txt # Print only the first match of RE (regular expression)

sed -r "s/\<(reg|exp)[a-z]+/\U&/g" # Convert any word starting with reg or exp to uppercase (GNU sed; \U is a GNU extension)

sed '/regexp/!d' file.txt # A grep replacement

sed '/^$/q' # Get mail header.  Print all the lines until it encounters a blank line.
sed '1,/^$/d' # Get mail body.  Delete the lines up to and including the first blank line, and then use the default action to print all the lines after the blank line

sed 's@/usr/bin@&/local@g' path.txt # Replace /usr/bin with /usr/bin/local in path.txt.  Use @ as a separator instead of the default /

sed -e '/^#/d' # Remove lines starting with #

What is an address or range and how can we use it?

Sed has the ability to specify which lines are to be examined and/or modified, by specifying addresses before the command. For example:

# range by line numbers:
sed '1,100 s/A/a/'

A range specifies the scope of a command; it restricts the command. In the above example, the substitution command is applied only to lines 1 through 100. Some useful restrictions might be:

  • Specifying a line by its number.
  • Specifying a range of lines by number.
  • All lines containing a pattern.
  • All lines from the beginning of a file to a regular expression
  • All lines from a regular expression to the end of the file.
  • All lines between two regular expressions.

Sed can do all that and more. Every command in sed can be preceded by an address, range or restriction like the above examples. The restriction or address immediately precedes the command:

restriction command
sed '3 s/[0-9][0-9]*//' <file >new

The above command performs the substitution only on the third line.

# restrict the substitution to lines starting with #
sed '/^#/ s/[0-9][0-9]*//'

# range by line numbers:
sed '1,100 s/A/a/'

# from line 101 through the end of the file:
sed '101,$ s/A/a/'

# range by pattern:
sed '/start/,/stop/ s/#.*//'

# range by number and pattern:
sed -e '1,/start/ s/#.*//'
sed -e '/start/,100 s/#.*//'

In the /start/,/stop/ form, the first pattern turns on a flag that tells sed to perform the substitute command on every line. The second pattern turns off the flag. If the "start" and "stop" patterns occur twice, the substitution is done both times. If the "stop" pattern is missing, the flag is never turned off, and the substitution will be performed on every line until the end of the file.

You should know that if the "start" pattern is found, the substitution occurs on the same line that contains "start." This turns on a switch, which is line oriented. That is, the next line is read and the substitute command is checked. If it contains "stop" the switch is turned off. Switches are line oriented, and not word oriented.
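
The switch behavior can be checked with a short sketch. The input is invented: it contains one complete start/stop block and a second "start" with no matching "stop", so the second range runs to the end of the input.

```shell
# Invented input: one start/stop block, then a start with no matching stop.
input=$(printf 'a #1\nstart #2\nb #3\nstop #4\nc #5\nstart #6\nd #7\n')

# Trailing comments are stripped only while the range "switch" is on.
result=$(printf '%s\n' "$input" | sed '/start/,/stop/ s/ *#.*//')
printf '%s\n' "$result"
```

Lines outside the ranges ("a #1" and "c #5") keep their comments; the "start" and "stop" lines themselves are inside the range and are edited.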

The $ is an address. It represents the last line of the input.

What is the definition for pattern space?

Sed reads each line into a temporary space, called the pattern space. Each command does its work on the line in this temporary space, and when all the commands have been carried out, the resulting line is printed.

What is the definition for hold space / buffer?

Think of it as a spare pattern buffer. It can be used to "copy" or "remember" the data in the pattern space for later.
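
As a hedged sketch of "remembering" data, the commands below use the hold space to print a small invented input in reverse order. It combines the G and h commands from the list above; it is one well-known idiom, not the only way to use the hold space.

```shell
# Invented three-line input.
input=$(printf 'first\nsecond\nthird\n')

# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space (current line plus all earlier ones) to the hold space
# $p   on the last line, print the accumulated result
reversed=$(printf '%s\n' "$input" | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```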

Does sed have flow control?

Yes. Sed has labels (":label") and the branch commands "b" (branch always) and "t" (branch if a substitution was made).
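
A minimal sketch of a loop, assuming a short invented input. ":a" defines a label and "$!ba" branches back to it on every line except the last, so all lines accumulate in the pattern space before the substitution runs. Each command is given with its own -e so that the label ends cleanly.

```shell
# Invented three-line input.
input=$(printf 'one\ntwo\nthree\n')

# :a    define label "a"
# N     append the next input line to the pattern space
# $!ba  if this is not the last line, branch back to label "a"
# s     once everything is collected, turn the embedded newlines into spaces
joined=$(printf '%s\n' "$input" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')
printf '%s\n' "$joined"
```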

What is the purpose of the s command?

Sed has several commands, but most people only learn the substitute command: s. The substitute command replaces text matching a regular expression with a new value. A simple example is changing "day" in the "old" file to "night" in the "new" file:

sed s/day/night/ <old >new

The above command reads the content of a file named old, replaces the first instance of day on each line with night, and sends the result to a new file named new.

When should we use quotes?

Always.

sed 's/day/night/' <old >new

What do we mean when we say that sed is line-oriented?

Sed is line oriented. It reads the file and processes its input one line at a time. Suppose you have the input file:

one two three, one two three
four three two one
one hundred

and you used the command:

sed 's/one/ONE/' <file

The output would be:

ONE two three, one two three
four three two ONE
ONE hundred

Note that this changed "one" to "ONE" once on each line. The first line had "one" twice, but only the first occurrence was changed. That is the default behavior.

Can we use a delimiter other than the default one with the s command?

Yes. The character after the s is the delimiter. It is conventionally a slash, because this is what ed, more, and vi use. However, it can be anything you want:

sed 's_/usr/local/bin_/common/bin_' <old >new
sed 's:/usr/local/bin:/common/bin:' <old >new
sed 's|/usr/local/bin|/common/bin|' <old >new

Pick one you like. As long as it's not in the string you are looking for, anything goes. And remember that you need three delimiters. If you get an "Unterminated ‘s’ command" it's because you are missing one of them.

The same trick works for a pattern used as an address, with one difference: the expression must start with a backslash, and the next character is then the delimiter, as in \_/usr/local_ instead of /\/usr\/local/.
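
For an address (as opposed to the s command), a sketch of the backslash form on invented paths, using "_" as the delimiter so that the slashes need no escaping:

```shell
# Invented sample paths.
input=$(printf '/usr/local/bin/tool\n/opt/bin/tool\n')

# \_ ... _ delimits the address regex, so the slashes inside need no escaping.
result=$(printf '%s\n' "$input" | sed -n '\_/usr/local/_p')
printf '%s\n' "$result"
```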

What is the purpose of the special character & ?

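It stands for the text that matched the pattern. The following puts parentheses around the first run of lower case letters on each line: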

sed 's/[a-z]*/(&)/' <old >new

How can we use extended regular expression with sed?

Use the -r command line option, or the -E command line option depending on your operating system.

echo "123 abc" | sed -r 's/[0-9]+/& &/'

How can we do capturing?

The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. Sed has up to nine remembered patterns.

sed 's/\([a-z]*\).*/\1/'

The above command keeps the first word of a line, and deletes the rest of the line. Regular expressions are greedy, and try to match as much as possible. "[a-z]*" matches zero or more lower case letters, and tries to match as many characters as possible. The ".*" matches zero or more characters after the first match. Since the first one grabs all of the contiguous lower case letters, the second matches anything else.

What are the advantages and disadvantages of using extended regular expression?

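Extended regular expressions offer a more convenient syntax: "+", "?", "|", and grouping work without backslashes. Whether they run slower or faster than basic regular expressions depends on the implementation and the pattern. The main disadvantage is portability, since the option that turns them on differs between versions of sed.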

By using extended regular expressions, the '(' and ')' no longer need to have a backslash:

sed -r 's/([a-z]+) ([a-z]+)/\2 \1/' # Using GNU sed
sed -E 's/([a-z]+) ([a-z]+)/\2 \1/' # Using Apple Mac OS X

Can the \1 .. \9 be anywhere?

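Yes. The "\1" doesn't have to be in the replacement string (in the right hand side). It can be in the pattern you are searching for (in the left hand side) as well. For example, if you want to eliminate duplicated words, you can try the command below. The pattern uses "[a-z][a-z]*" rather than "[a-z]*" because an empty first group would let the pattern match a lone space and delete it:

sed 's/\([a-z][a-z]*\) \1/\1/'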

If you want to detect duplicated words, you can use:

sed -n '/\([a-z][a-z]*\) \1/p'
sed -rn '/([a-z]+) \1/p' # GNU sed
sed -En '/([a-z]+) \1/p' # Mac OS X

Notice that in the above commands, we are not using the s command (the command for substitution).
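
To try detection and removal side by side, here is a sketch on invented input: the first line has no repeated word and the second does.

```shell
# Invented input: only the second line contains a repeated word.
input=$(printf 'no repeats here\nit was was fine\n')

# Detection: print only lines containing a repeated word.
detected=$(printf '%s\n' "$input" | sed -n '/\([a-z][a-z]*\) \1/p')

# Removal: collapse the first repeated word on each line.
cleaned=$(printf '%s\n' "$input" | sed 's/\([a-z][a-z]*\) \1/\1/')

printf '%s\n---\n%s\n' "$detected" "$cleaned"
```

The pattern is not anchored to word boundaries, so a line containing "this is" also counts as a repeat (the "is is" inside it).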

What is the overall purpose of sed pattern flags?

These flags specify what happens when a match is found.

What is the purpose of the g pattern flag?

The g pattern flag specifies global replacement, that is, every match on the line is replaced instead of only the first one:

sed 's/[^ ][^ ]*/(&)/g' <old >new

What is the purpose of the /1 - /512 pattern flag?

This pattern flag specifies which occurrence on the line to replace. The number flag is not restricted to a single digit. It can be any number from 1 to 512.

To understand the number pattern flag, consider the following example. This next example keeps the first word on the line but deletes the second:

sed 's/\([a-zA-Z]*\) \([a-zA-Z]*\) /\1 /' <old >new

This is a mouthful. The second pattern is the same as the first. There is an easier way to do this. You can add a number after the substitution command to indicate you only want to match that particular pattern. Example:

sed 's/[a-zA-Z]* //2' <old >new

The above command instructs sed to do the substitution only on the second occurrence. If you wanted to add a colon after the 80th character in each line, you could type:

sed 's/./&:/80' <file >new
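
A quick sketch of the number flag on invented one-line inputs; a small number (3) stands in for 80 so the effect is easy to see.

```shell
# Delete the second "word followed by a space" on the line.
second_removed=$(echo 'one two three four' | sed 's/[a-zA-Z]* //2')

# Add a colon after the 3rd character.
colon_added=$(echo 'abcdef' | sed 's/./&:/3')

printf '%s\n%s\n' "$second_removed" "$colon_added"
```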

What is the purpose of the p pattern flag?

p is for print. By default, sed prints every line. If it makes a substitution, the new text is printed instead of the old one. If you use an optional argument to sed, "sed -n," it will not, by default, print any new lines. When the "-n" option is used, the "p" flag will cause the modified line to be printed. Here is one way to duplicate the function of grep with sed:

sed -n 's/pattern/&/p' <file

What is the purpose of the w pattern flag?

You can specify a file that will receive the modified data. An example is the following, which will write all lines that start with an even number, followed by a space, to the file even:

sed -n 's/^[0-9]*[02468] /&/w even' <file

You must have exactly one space between the w and the filename.

What is the purpose of the I pattern flag?

This flag makes the pattern match case insensitive.

sed -n '/abc/I p' <old >new

Note that the space between the '/I' and the 'p' (print) command emphasizes that the 'p' is not a modifier of the pattern matching process, but a command to execute after the pattern matching.

Can we combine pattern flags for the substitution command?

Yes. You can combine flags when it makes sense. Please note that the "w" has to be the last flag. For example the following command works:

sed -n 's/a/A/2pw /tmp/file' <old >new

How can we specify multiple commands?

If you need to make two changes, and you didn't want to read the manual, you could pipe together multiple sed commands:

sed 's/BEGIN/begin/' <old | sed 's/END/end/' >new

The above code used two processes instead of one. A sed guru never uses two processes when one can do. One method of combining multiple commands is to use multiple -e parameters (one -e before each command):

sed -e 's/a/A/' -e 's/b/B/' <old >new

How does sed handle file names on command line without the redirection operators?

You can specify files on the command line if you wish. If there is more than one argument to sed that does not start with an option, it must be a filename.

sed 's/^#.*//'  f1 f2 f3 | grep -v '^$' | wc -l

How can we quote large sed commands inside a shell script?

For C shell:

#!/bin/csh -f
sed 's/a/A/g  \
s/e/E/g \
s/i/I/g \
s/o/O/g \
s/u/U/g'  <old >new

The Bourne shell makes this easier as a quote can cover several lines:

#!/bin/sh
sed '
s/a/A/g 
s/e/E/g 
s/i/I/g 
s/o/O/g 
s/u/U/g'  <old >new

How can we make a sed script?

#!/bin/sed -f
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g

Inside a sed script, how can we comment our code?

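Start a comment with "#"; sed ignores the rest of that line. Some older versions of sed allow only a single comment, and it must be the first line of the script. A first line consisting of exactly "#n" acts like the -n option. A first line of

#!/bin/sed -nf

does work. http://www.grymoire.com/Unix/Sed.html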

How can we pass arguments to our sed script?

A sed script does not take arguments itself, but a shell script wrapped around sed can pass its own arguments along. If we pass one argument to the shell script, the script receives it as $1. A simple shell script that uses sed to emulate grep is:

#!/bin/sh
sed -n 's/'$1'/&/p'

However, there is a problem with this script. If the argument contains a space, the script causes a syntax error. A better version protects against this:

#!/bin/sh
sed -n 's/'"$1"'/&/p'

If this was stored in a file called sedgrep, you could type:

sedgrep '[A-Z][A-Z]' <file

This would allow sed to act as the grep command.

What is the purpose of the d command?

Delete lines:

# delete line 11 through the end of file (only display the line 1 through 10):
sed '11,$ d' <file

# Chop off the header of a mail message, which is everything up to the first blank line:
sed '1,/^$/ d' <file

# Delete all lines that start with #
sed '/^#/ d'

What is the purpose of the p command?

It is for printing.

# Duplicate every line:
sed 'p'

# Duplicate empty lines:
sed '/^$/ p'

# Print the first 10 lines:
sed -n '1,10 p' <file

# Emulate grep:
sed -n '/match/ p'

How can we reverse the restriction with ! ?

sed -n '/match/ !p' </tmp/b

You might expect the ! to be written as part of the address, but it is placed between the address and the command instead. The above command prints only the lines that do not contain "match."

What is the purpose of the q command?

Quit.

# Print the first 11 lines, then quit when the 11th line is reached:
sed '11 q'

If you want to process lines between pattern1 and pattern2, and quit immediately after pattern2 (because the file is huge):

sed -n '/pattern1/,/pattern2/ {p; /pattern2/ q;}' filename
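
A sketch of that range-then-quit idea on invented input, grouping the print and quit commands so that both run inside the range. The last input line is never processed because sed has already quit.

```shell
# Invented input; "never read" comes after the end of the range.
input=$(printf 'skip\npattern1\nkeep\npattern2\nnever read\n')

result=$(printf '%s\n' "$input" | sed -n '/pattern1/,/pattern2/ {
    p
    /pattern2/ q
}')
printf '%s\n' "$result"
```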

What is the purpose of the {} construct?

The curly braces, "{" and "}," are used to group the commands.

How can we use {}?

#!/bin/sh
# This is a Bourne shell script that removes #-type comments
# between 'begin' and 'end' words.
sed -n '
    1,100 {
        /begin/,/end/ {
             s/#.*//
             s/[[:blank:]]*$//
             /^$/ d
             p
        }
    }'

We can do a substitute on a pattern range, like changing "old" to "new" between a begin/end pattern:

#!/bin/sh
sed '
    /begin/,/end/ s/old/new/
'

Another way to write this is to use the curly braces for grouping:

#!/bin/sh
sed '
    /begin/,/end/ {
        s/old/new/
    }
'

This makes the code cleaner and easier to understand, especially, when multiple commands should be invoked with the same pattern range.

You can place a "!" before a set of curly braces (or after the pattern range). This inverts the address.
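
A sketch of the inverted group on invented input: the substitution is applied everywhere except between "begin" and "end".

```shell
# Invented input with one begin/end block.
input=$(printf 'old\nbegin\nold\nend\nold\n')

# The "!" inverts the range, so the group runs only outside begin..end.
outside=$(printf '%s\n' "$input" | sed '/begin/,/end/ !{
    s/old/new/
}')
printf '%s\n' "$outside"
```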

How many files can be opened by sed?

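Traditionally 10. POSIX only requires that at least ten files be supported, and newer implementations such as GNU sed may allow more.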

What is the purpose of the w command?

Write to file. You may remember that the substitute command can write to a file. Here again is the example that will only write lines that start with an even number (and followed by a space):

sed -n 's/^[0-9]*[02468] /&/w even' <file

I used the "&" in the replacement part of the substitution command so that the line would not be changed. A simpler example is to use the "w" command, which has the same syntax as the "w" flag in the substitute command:

sed -n '/^[0-9]*[02468]/ w even' <file

Remember - only one space must follow the command. Anything else will be considered part of the file name. The "w" command also has the same limitation as the "w" flag: only 10 files can be opened in sed.

What is the purpose of the r command?

Reading in a file. For example:

sed '$r end' <infile >outfile

The above command appends the file "end" after the last line of the input (the $ address selects the last line). In other words, the shell opens infile as the input of the sed command. Sed prints each line, and when it reaches the last line, the r command reads the file named end and prints its content. Everything that was printed is redirected to outfile.

The following will insert a file after the line with the word "INCLUDE:"

sed '/INCLUDE/ r file' <in >out

You can use the curly braces to delete the line having the "INCLUDE" command on it:

#!/bin/sh
sed '/INCLUDE/ {
    r file
    d
}'

The order of the delete command "d" and the read file command "r" is important. Change the order and it will not work. There are two subtle actions that prevent this from working. The first is the "r" command writes the file to the output stream. The file is not inserted into the pattern space, and therefore cannot be modified by any command. Therefore the delete command does not affect the data read from the file.

The other subtlety is the "d" command deletes the current data in the pattern space. Once all of the data is deleted, it does make sense that no other action will be attempted. Therefore a "d" command executed in a curly brace also aborts all further actions. As an example, the substitute command below is never executed:

#!/bin/sh
# this example is WRONG
sed -e '1 {
    d
    s/.*//
}'
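
To see both subtleties in action, here is a sketch that runs the group in both orders. The file names and contents are invented, and everything is created in a scratch directory made with mktemp.

```shell
# Hypothetical file names, created in a scratch directory for the demo.
tmp=$(mktemp -d)
printf 'included text\n' > "$tmp/file"
printf 'before\nINCLUDE\nafter\n' > "$tmp/in"

# r queues the file for output; d then drops the INCLUDE line itself and
# ends the cycle, so only the file contents appear in its place.
r_then_d=$(cd "$tmp" && sed '/INCLUDE/ {
    r file
    d
}' in)

# With d first, the cycle ends before r runs, so nothing is included.
d_then_r=$(cd "$tmp" && sed '/INCLUDE/ {
    d
    r file
}' in)

printf '%s\n---\n%s\n' "$r_then_d" "$d_then_r"
rm -rf "$tmp"
```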

What is the purpose of the a command?

Add a line. The "a" command appends a line after the range or pattern. This example will add a line after every line with "WORD:"

#!/bin/sh
sed '
/WORD/ a\
Add this line after every line with WORD
'

Because an entire line is added, the new line is on a line by itself to emphasize this. There is no option, an entire line is used, and it must be on its own line. If you are familiar with many UNIX utilities, you would expect sed to use a similar convention: lines are continued by ending the previous line with a "\". There must not be a space after the "\".

What is the purpose of the i (lowercase letter I) command?

You can insert a new line before the pattern with the "i" command:

#!/bin/sh
sed '
/WORD/ i\
Add this line before every line with WORD
'

What is the purpose of the c command?

You can replace the current line with a new line.

#!/bin/sh
sed '
/WORD/ c\
Replace the current line with the line
'

How can we combine the a, i, and c command in one group {}?

You can combine all three actions using curly braces:

#!/bin/sh
sed '
/WORD/ {
i\
Add this line before
a\
Add this line after
c\
Change the line to this one
}'

How does sed handle leading spaces and tabs especially for the i, a, and c commands?

Sed ignores leading tabs and spaces in all commands. However these white space characters may or may not be ignored if they start the text following a "a," "c" or "i" command. In SunOS, both "features" are available. The Berkeley (and Linux) style sed is in /usr/bin, and the AT&T version (System V) is in /usr/5bin/.

To elaborate, the /usr/bin/sed command retains white space, while the /usr/5bin/sed strips off leading spaces. If you want to keep leading spaces, and not care about which version of sed you are using, put a "\" as the first character of the line:

#!/bin/sh
sed '
    a\
\    This line starts with a tab
'

How can we add more than one line?

All three commands (a, i, and c) will allow you to add more than one line. Just end each line with a "\"

#!/bin/sh
sed '
/WORD/ a\
Add this line\
This line\
And this line
'

What commands bypass the pattern space?

Most commands operate on the pattern space, and subsequent commands may act on the results of the last modification. The commands that bypass the pattern space:

  1. r (read)
  2. a (append, or add)
  3. i (insert)
  4. c (change)

Actually, these commands do not really bypass the pattern space. They just do not add lines to the pattern space.

Which commands can take address range, and which commands cannot take an address range?

Some commands can take a range of lines, and others cannot. To be precise, the commands "a," "i," "r," and "q" will not take a range like "1,100" or "/begin/,/end/" in a portable script, because POSIX allows them at most one address. The documentation states that the read command can take a range, but I got an error when I tried this; support varies between versions (GNU sed accepts a range for "a," "i," and "r" as an extension). The "c" or change command allows this, and it will let you change several lines into one:

#!/bin/sh
sed '
/begin/,/end/ c\
***DELETED***
'

The above command replaces several lines with just one line.

How can we add a blank line after every line?

#!/bin/sh
# add a blank line after every line
sed '1,$ {
    a\

}'

Can we use sub-range within a group?

Yes.

#!/bin/sh
sed '
/ONE/ {
# append a line
    N
# search for TWO on the second line    
    /\n.*TWO/ {
# found it - now edit making one line
        s/ONE.*\n.*TWO/ONE TWO/
    }
}' file

In the above example, N is a command. The /\n.*TWO/ is another address-range. In other words, it is a sub-range. Of course, the {} following that is just another grouping.

What is the purpose of the = command?

Print the line number.

sed -n '/PATTERN/ =' file

#!/bin/sh
lines=`sed -n '$=' file `

The "=" command only accepts one address, so if you want to print the number for a range of lines, you must use the curly braces:

#!/bin/sh
# Just print the line numbers 
sed -n '/begin/,/end/ {
=
d
}' file

Since the "=" command only prints to standard output, you cannot print the line number on the same line as the pattern.

What is the purpose of the y command?

If you wanted to change a word from lower case to upper case, you could write 26 character substitutions, converting "a" to "A," etc. Sed has a command that operates like the tr program. It is called the "y" command. For instance, to change the letters "a" through "f" into their upper case form, use:

sed 'y/abcdef/ABCDEF/' file
sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/' file

What is the purpose of the l (lower case letter L) command?

The "l" command prints the current pattern space. It is therefore useful in debugging sed scripts. It also converts unprintable characters into printing characters by outputting the value in octal preceded by a "\" character. It is useful to print out the current pattern space, while probing the subtleties of sed.
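
A sketch of the "l" command on an invented line containing a tab. The tab is shown as an escape, and the end of the line is marked with "$".

```shell
# Invented input containing a tab between the two letters.
visible=$(printf 'a\tb\n' | sed -n 'l')
printf '%s\n' "$visible"
```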

What is the purpose of the n command?

The "n" command will print out the current pattern space (unless the "-n" flag is used), empty the current pattern space, and read in the next line of input.

What is the purpose of the N command?

The "N" command does not print out the current pattern space and does not empty the pattern space. It reads in the next line, but appends a new line character along with the input line itself to the pattern space. Consider:

sed -e 'N'

The above command combines the first and second line, then prints them, combines the third and fourth line, and prints them, etc.
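
A sketch of that pairing on invented input, with a substitution added so the embedded new line becomes visible as a "+":

```shell
# Invented four-line input; N pulls each even line into the pattern space
# of the odd line before it, and the s command replaces the embedded newline.
paired=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/+/')
printf '%s\n' "$paired"
```

With an odd number of lines, the final N has no next line to read. GNU sed prints the leftover line before exiting, while most other versions exit without printing it, so the even line count here keeps the result the same everywhere.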

How does the d command work?

The "d" command deletes the current pattern space, reads in the next line, puts the new line into the pattern space, and aborts the current command, and starts execution at the first sed command. This is called starting a new "cycle."

What is the purpose of the D command?

The "D" command deletes the first portion of the pattern space, up to and including the first new line character, leaving the rest of the pattern space alone. Like "d," it stops the current command and starts the command cycle over again, so commands after the "D" command in the same curly brace group are ignored. Unlike "d," it does not read a new line of input while text remains in the pattern space; a new line is read only when the pattern space has been emptied. It will not print the current pattern space. You must print it yourself, a step earlier.
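
A sketch that combines "N," "P," and "D" to drop repeated adjacent lines, similar to the uniq command. The input is invented, and this is one well-known idiom rather than the only way to use "D".

```shell
# Invented input with adjacent duplicates.
input=$(printf 'a\na\nb\nb\nb\nc\n')

# $!N             append the next line (except on the last line)
# /^\(.*\)\n\1$/  true when the two lines in the pattern space are identical
# !P              if they differ, print the first of the two lines
# D               drop the first line and restart with what is left
unique=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$unique"
```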

What is the difference between the p command and the P command?

The "p" command prints the entire pattern space. The "P" command only prints the first part of the pattern space, up to the NEWLINE character. Neither the "p" nor the "P" command changes the pattern space.

How can we search for a line that ended with the character "#," and append the next line to it?

#!/bin/sh
sed '
# look for a "#" at the end of the line
/#$/ {
# Found one - now read in the next line
    N
# delete the "#" and the new line character, 
    s/#\n//
}' file

How can we process multiple lines with sed?

By default, sed processes one line at a time. However, we can use the N, P, and D commands mentioned above to do more advanced things that involve multiple lines.

#!/bin/sh
sed '/skip3/ {
           N
           N
           s/skip3\n.*\n.*/# 3 lines deleted/
}'

#!/bin/sh
sed '
/one/ {
      N
      /two/ {
            N
            /three/ {
                    N
                    s/one\ntwo\nthree/1+2+3/
                    }
            }
      }
'

How can we merge two lines together?

#!/bin/sh
sed '=' file | \
sed '{
    N
    s/\n/ /
}'

In the above code, we have two instances of sed. The first sed instance uses the = command to print the line number on its own line, followed by the line itself. The result is piped to the second sed instance, which merges each pair of lines. What we are really interested in is the second sed instance.

How can we split a line?

If you find it necessary, you can break one line into two lines, edit them, and merge them together again. As an example, if you had a file that had a hexadecimal number followed by a word, and you wanted to convert the first word to all upper case, you can use the "y" command, but you must first split the line into two lines, change one of the two, and merge them together. That is, a line containing:

0x1fff table2

will be changed into two lines:

0x1fff
table2

and the first line will be converted into upper case. I will use tr to convert the space into a new line, and then use sed to do the rest. The command would be:

./sed_split <file

and sed_split would be:

#!/bin/sh
tr ' ' '\012' | 
sed ' {
    y/abcdef/ABCDEF/
    N
    s/\n/ /
}'

In this version tr does the splitting, so sed only has to convert the first line and merge the two lines again.

sed could be used instead of tr. You can embed a new line in a substitute command, but you must escape it with a backslash. It is unfortunate that you must use "\n" in the left side of a substitute command, and an embedded new line in the right hand side.

#!/bin/sh
sed '
s/ /\
/' | \
sed ' {
    y/abcdef/ABCDEF/
    N
    s/\n/ /
}'

The above code also involves two sed instances: the first splits the line, and the second converts and merges.
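
Here is a sketch of the two-sed version run on the sample line from above. The variable name result is only for illustration:

```shell
#!/bin/sh
# Split at the first space, upper-case the hex letters on the first
# line only, then join the two lines back together.
result=$(printf '0x1fff table2\n' | sed 's/ /\
/' | sed '{
    y/abcdef/ABCDEF/
    N
    s/\n/ /
}')
printf '%s\n' "$result"
```

Note that "y" runs before "N", so "table2" is not yet in the pattern space and is left untouched.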

How can we substitute something with newlines?

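If you are inserting a new line, don't rely on "\n" in the replacement. GNU sed accepts it as an extension, but other versions of sed may insert a literal "n". For a portable script, insert a literal new line character, escaped with a backslash: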

(echo a;echo x;echo y) | sed 's:x:\
:'

The above code substitutes the letter x with a newline. Notice that the colon is used as the separator, and the last separator is on a separate line.

What is the purpose of the x command?

The "x" command eXchanges the pattern space with the hold buffer. By itself, the command isn't useful. Executing the sed command:

sed 'x'

as a filter adds a blank line at the front and deletes the last line. The hold buffer starts out containing a blank line. When the "x" command modifies the first line, line 1 is saved in the hold buffer, and the blank line takes the place of the first line. The second "x" command exchanges the second line with the hold buffer, which contains the first line. Each subsequent line is exchanged with the preceding line. The last line is placed in the hold buffer, and is not exchanged a second time, so it remains in the hold buffer when the program terminates, and never gets printed. This illustrates that care must be taken when storing data in the hold buffer, because it won't be output unless you explicitly request it.
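
A minimal sketch that shows this shift on three made-up lines. The variable name out is an assumption:

```shell
#!/bin/sh
# Each line is swapped with the previous one held in the hold buffer,
# so the output is: an empty line, then 1, then 2. Line 3 is never printed.
out=$(printf '1\n2\n3\n' | sed 'x')
printf '%s\n' "$out"
```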

One use of the hold buffer is to remember previous lines. An example of this is a utility that acts like grep as it shows you the lines that match a pattern. In addition, it shows you the line before and after the pattern. That is, if line 8 contains the pattern, this utility would print lines 7, 8 and 9.

One way to do this is to see if the line has the pattern. If it does not have the pattern, put the current line in the hold buffer. If it does, print the line in the hold buffer, then the current line, and then the next line. After each set, three dashes are printed. The script checks for the existence of an argument, and if missing, prints an error. Passing the argument into the sed script is done by turning off the single quote mechanism, inserting the "$1" into the script, and starting up the single quote again:

#!/bin/sh
# grep3 - prints out three lines around pattern
# exit unless exactly one argument is given

case $# in
    1);;
    *) echo "Usage: $0 pattern";exit 1;;
esac
# I hope the argument doesn't contain a /
# if it does, sed will complain

# use sed -n to disable printing 
# unless we ask for it
sed -n '
'/$1/' !{
    #no match - put the current line in the hold buffer
    x
    # delete the old one, which is 
    # now in the pattern buffer
    d
}
'/$1/' {
    # a match - get last line
    x
    # print it
    p
    # get the original line back
    x
    # print it
    p
    # get the next line 
    n
    # print it
    p
    # now add three dashes as a marker
    a\
---
    # now put this line into the hold buffer
    x
}'

What is the purpose of the h command?

The "x" command exchanges the hold buffer and the pattern buffer. Both are changed. The "h" command copies the pattern buffer into the hold buffer. The pattern buffer is unchanged. A script equivalent to the one above can be written with the hold commands:

#!/bin/sh
# grep3 version b - another version using the hold commands
# exit unless exactly one argument is given

case $# in
    1);;
    *) echo "Usage: $0 pattern";exit 1;;
esac

# again - I hope the argument doesn't contain a /

# use sed -n to disable printing 

sed -n '
'/$1/' !{
    # put the non-matching line in the hold buffer
    h
}
'/$1/' {
    # found a line that matches
    # append it to the hold buffer
    H
    # the hold buffer contains 2 lines
    # get the next line
    n
    # and add it to the hold buffer
    H
    # now move it back to the pattern space
    x
    # and print it.
    p
    # add the three hyphens as a marker
    a\
---
}'
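
A smaller sketch of "h" on its own may help: it remembers the previous line and prints it whenever the current line matches. The pattern, the sample input, and the variable names are assumptions for illustration:

```shell
#!/bin/sh
pattern=match    # hypothetical search pattern
before=$(printf 'a\nb\nmatch\nc\n' | sed -n '
/'"$pattern"'/ {
    # swap in the previous line, print it, swap back
    x
    p
    x
}
# remember the current line for the next cycle
h')
printf '%s\n' "$before"
```

With this input only "b", the line before the match, should be printed.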

What is the purpose of the H command?

The "H" command allows you to combine several lines in the hold buffer. It acts like the "N" command as lines are appended to the buffer, with a "\n" between the lines. You can save several lines in the hold buffer, and print them only if a particular pattern is found later.

As an example, take a file that uses a space as the first character of a line to mark a continuation line. The files /etc/termcap, /etc/printcap, makefile and mail messages use spaces or tabs to indicate the continuation of an entry. If you wanted to print the entry that comes before a line containing a given word, you could use this script. I use a "^I" to indicate an actual tab character:

#!/bin/sh 
# print previous entry
sed -n '
/^[ ^I]/!{
    # line does not start with a space or tab,
    # does it have the pattern we are interested in?
    '/$1/' {
        # yes it does. print three dashes
        i\
---
        # get hold buffer, save current line
        x
        # now print what was in the hold buffer
        p
        # get the original line back
        x
    }
    # store it in the hold buffer
    h
}
# what about lines that start
# with a space or tab?
/^[ ^I]/ {
    # append it to the hold buffer
    H
}'
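
The accumulating behavior of "H" can be seen in a much shorter sketch that collects every line and prints them joined by commas at the end. The input and the variable name csv are made up:

```shell
#!/bin/sh
csv=$(printf 'a\nb\nc\n' | sed -n '
# append every line to the hold buffer
H
$ {
    # on the last line, fetch the collected lines
    x
    # H put a newline before the first line, so drop it
    s/^\n//
    s/\n/,/g
    p
}')
printf '%s\n' "$csv"
```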

What is the purpose of the g command and what is the purpose of the G command?

Instead of exchanging the hold space with the pattern space, you can copy the hold space to the pattern space with the "g" command. This overwrites whatever was in the pattern space. If you want to append to the pattern space, use the "G" command. This adds a new line to the pattern space, and copies the hold space after the new line.

#!/bin/sh
# grep3 version c: use 'G'  instead of H

# exit unless exactly one argument is given

case $# in
    1);;
    *) echo "Usage: $0 pattern";exit 1;;
esac

# again - I hope the argument doesn't contain a /

sed -n '
'/$1/' !{
    # put the non-matching line in the hold buffer
    h
}
'/$1/' {
    # found a line that matches
    # add the next line to the pattern space
    N
    # exchange the previous line with the 
    # 2 in pattern space
    x
    # now add the two lines back
    G
    # and print it.
    p
    # add the three hyphens as a marker
    a\
---
    # remove first 2 lines
    s/.*\n.*\n\(.*\)$/\1/
    # and place in the hold buffer for next time
    h
}'
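
Two short sketches that isolate "g" and "G". The first reverses the order of lines, a well-known one-liner. The second overwrites line 2 with a saved copy of line 1. The sample input and variable names are assumptions:

```shell
#!/bin/sh
# G appends the hold space (all earlier lines, already reversed) after
# the current line; h saves the result; the last line prints everything.
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

# h saves line 1; g replaces line 2 with that saved copy.
copied=$(printf 'first\nsecond\n' | sed '1h;2g')

printf '%s\n' "$reversed"
printf '%s\n' "$copied"
```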

How can we implement flow control / logic?

There are three commands sed uses for this. You can define a label with a colon followed by a text string. The "b" command branches to a label; the label name follows the command. If no label is given, it branches to the end of the script. The "t" command branches only if a substitution has been made, so it is used to test conditions.

This example remembers paragraphs, and if a paragraph contains the pattern (specified by an argument), the script prints out the entire paragraph.

#!/bin/sh
sed -n '
# if an empty line, check the paragraph
/^$/ b para
# else add it to the hold buffer
H
# at end of file, check paragraph
$ b para
# now branch to end of script
b
# this is where a paragraph is checked for the pattern
:para
# return the entire paragraph
# into the pattern space
x
# look for the pattern, if there - print
/'$1'/ p
'

You can execute a branch if a pattern is found. You may want to execute a branch only if a substitution is made. The command "t label" will branch to the label if the last substitute command modified the pattern space.

One use for this is recursive patterns. Suppose you wanted to remove white space inside parentheses. These parentheses might be nested. That is, you would want to delete a string that looked like "( ( ( ())) )". The sed expression

sed 's/([ ^I]*)//g'

would only remove the innermost set. You would have to pipe the data through the script four times to remove each set of parentheses. You could use the regular expression

sed 's/([ ^I()]*)//g'

but that would delete non-matching sets of parentheses. The "t" command would solve this:

#!/bin/sh
#delete_nested_parens.sh
sed '
:again
    s/([ ^I]*)//
    t again
'
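
A sketch comparing a single pass with the "t" loop on the sample string. It assumes a sed that supports the POSIX class [[:blank:]], which is used here in place of a literal tab. The variable names once and looped are made up:

```shell
#!/bin/sh
input='a ( ( ( ())) ) b'

# One pass with the g flag only removes the innermost "()".
once=$(printf '%s\n' "$input" | sed 's/([[:blank:]]*)//g')

# The t loop repeats the substitution until nothing more matches.
looped=$(printf '%s\n' "$input" | sed '
:again
    s/([[:blank:]]*)//
    t again
')

printf 'once:   [%s]\n' "$once"
printf 'looped: [%s]\n' "$looped"
```

After the loop only the two spaces that surrounded the parentheses should remain between "a" and "b".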

An earlier version had a 'g' after the 's' expression. This is not needed.

How can we comment our sed script if it does not allow commenting?

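You can normally put the # character at the start of a line to indicate that it is a comment. However, some versions of sed only allow a comment at the very top of the script. There is another way to add comments in a sed script if you don't have a version that supports them. Use the "i" command with a line number of zero. No input line has that number, so the text is never inserted. Be aware that GNU sed rejects line address 0 here (it only accepts 0 in the 0,/regex/ form), so this trick is only for older versions that lack # comments:

#!/bin/sh
sed '
/begin/ {
0i\
    This is a comment\
    It can cover several lines\
    It is meant for old versions of sed without # comments
}'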

What is the purpose of the ; command?

There is one more sed command that isn't well documented. It is the ";" command. This can be used to combine several sed commands on one line. Here is the grep4 script I described earlier, but without the comments or error checking and with semicolons between commands:

#!/bin/sh
sed -n '
'/$1/' !{;H;x;s/^.*\n\(.*\n.*\)$/\1/;x;}
'/$1/' {;H;n;H;x;p;a\
---
}'

How can we build a sed script that takes a regular expression from the command line?

In the earlier scripts, I mentioned that you would have problems if you passed an argument to the script that had a slash in it. In fact, any regular expression metacharacter might cause you problems. A script like the following is asking to be broken some day:

#!/bin/sh
sed 's/'"$1"'//g'

If the argument contains any of these characters, you may get a broken script: "/\.*[]^$". For instance, if someone types a "/" then the substitute command will see four delimiters instead of three. You will also get syntax errors if you provide a "[" without a "]". One solution is to have the user put a backslash before any of these characters when they pass it as an argument. However, the user has to know which characters are special. Another solution is to add a backslash before each of those characters in the script:

#!/bin/sh
arg=`echo "$1" | sed 's:[]\[\^\$\.\*\/]:\\\\&:g'`
sed 's/'"$arg"'//g'

If you were searching for the pattern "^../", the script would convert this into "\^\.\.\/" before passing it to sed.
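
The same escaping can be wrapped in a small helper. This is only a sketch: the function name escape_regex and the sample strings are hypothetical, and it uses $(...) instead of backticks so the replacement needs only two backslashes:

```shell
#!/bin/sh
# Put a backslash in front of each of: ] \ [ ^ $ . * /
escape_regex() {
    printf '%s\n' "$1" | sed 's:[]\[^$.*/]:\\&:g'
}

arg=$(escape_regex '^../')
printf 'escaped pattern: %s\n' "$arg"

# The escaped text is now matched literally.
cleaned=$(printf 'x^../y\n' | sed 's/'"$arg"'//g')
printf '%s\n' "$cleaned"
```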

How does sed support binary characters?

Dealing with binary characters can be tricky, especially when writing scripts for people to read. I can insert a binary character using an editor like EMACS, but if I show the binary character, the terminal may change it when displaying it to you. The easiest way I have found to do this in a script in a portable fashion is to use the tr(1) command. It understands octal notation, and its output can be stored in a variable for later use.

Here's a script that will replace the string "ding" with the ASCII bell character:

#!/bin/sh
BELL=`echo x | tr 'x' '\007'`
sed "s/ding/$BELL/"

Please note that I used double quotes. Since special characters are interpreted, you have to be careful when you use this mechanism. http://www.grymoire.com/Unix/Sed.html
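
A variation on the same idea, sketched with printf instead of tr (an assumption on my part, since printf also understands octal escapes). The sample input is made up:

```shell
#!/bin/sh
# Build the bell character from its octal code.
BELL=$(printf '\007')
out=$(printf 'ding dong\n' | sed "s/ding/$BELL/")
# od -c makes the otherwise invisible character visible.
printf '%s\n' "$out" | od -c
```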

What is the purpose of the --posix command line argument?

The GNU version of sed has many features that are not available in other versions. When portability is important, test your script with the --posix option. Suppose you had a script that used a feature of GNU sed, such as the 'v' command to test the version number:

#this is a sed command file
v 4.0.1
# print the number of lines
$=

And you executed it with the command:

sed -nf sedfile --posix <file

then the GNU version of sed program would give you a warning that your sed script is not compatible. It would report:

sed: -e expression #1, char 2: unknown command: `v'

What is the purpose of the -e command line parameter?

The -e command line parameter specifies a sed command. It can be given several times, and the commands are applied in the order they appear.

What is the purpose of the -n command line parameter?

With the "-n" option, sed will not print anything unless an explicit request to print is found. The "p" flag of the substitute command is one way to turn printing back on.

What are the long option versions of the -n command line parameter?

sed --quiet 's/PATTERN/&/p' file
sed --silent 's/PATTERN/&/p' file

What happens if the p pattern flag is used without the -n command line option?

The matched line is printed twice. By default, without the -n command line option, sed prints every line at the end of each cycle, and the p pattern flag causes a line where the substitution succeeded to be printed one more time.
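
A quick sketch of both cases on two made-up lines. The variable names are only for illustration:

```shell
#!/bin/sh
# Without -n: the changed line appears twice, the other line once.
with_p=$(printf 'apple\nbanana\n' | sed 's/apple/APPLE/p')
# With -n: only the line where the substitution succeeded is printed.
with_np=$(printf 'apple\nbanana\n' | sed -n 's/apple/APPLE/p')
printf 'without -n:\n%s\n' "$with_p"
printf 'with -n:\n%s\n' "$with_np"
```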

What is the purpose of the -f command line option?

It specifies a file that contains sed commands. If you have a large number of sed commands, you can put them into a file and use:

sed -f sedscript <old >new

where sedscript could look like this:

# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g

When there are several commands in one file, each command normally goes on a separate line.
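
To try this end to end, the sketch below writes the vowel script to a temporary file and runs it. The use of mktemp and the sample input are assumptions for illustration:

```shell
#!/bin/sh
script=$(mktemp)     # temporary file standing in for "sedscript"
cat > "$script" <<'EOF'
# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
EOF
out=$(printf 'hello world\n' | sed -f "$script")
rm -f "$script"
printf '%s\n' "$out"
```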

What is the purpose of the -V command line parameter?

Print the version of sed you are using. GNU sed documents this as the long option --version.

What is the purpose of the -h command line option?

The -h option (long form --help in GNU sed) prints a summary of the command line options.

What is the purpose of the -s command line option?

Normally, when you specify several files on the command line, sed concatenates the files into one stream, and then operates on that single stream. If you had three files, each with 100 lines, then the command:

sed -n '1,10 p' file1 file2 file3

would only print the first 10 lines of file1. The '-s' option tells GNU sed to treat the files as independent files, and to print out the first 10 lines of each file, which is similar to the head command. Here's another example: If you wanted to print the number of lines of each file, you could use 'wc -l' which prints the number of lines, and the filename, for each file, and at the end prints the total number of lines. Here is a simple shell script that does something similar, just using sed:

#!/bin/sh
# "$@" keeps filenames that contain spaces intact
sed -s -n '$=' "$@" # print the number of lines for each file
sed -n '$=' "$@" # print the total number of lines.

The 'wc -l' command does print out the filenames, unlike the above script. A better emulation of the 'wc -l' command would execute the command in a loop, and print the filenames. Here is a more advanced script that does this, but it doesn't use the '-s' option:

#!/bin/sh
for F in "$@"
do
 # an empty file produces no output from sed, so default the count to 0
 NL=`sed -n '$=' < "$F" ` &&  printf "  %d %s\n" "${NL:-0}" "$F"
done
TOTAL=`sed -n '$=' "$@"`
printf "  %d total\n" $TOTAL
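
A sketch of the difference that -s makes, using two throwaway files. It assumes GNU sed, since -s is a GNU option, and the file names are made up:

```shell
#!/bin/sh
dir=$(mktemp -d)
printf '1\n2\n3\n' > "$dir/a.txt"
printf '1\n2\n' > "$dir/b.txt"

# With -s, "$" means the last line of each file.
per_file=$(sed -s -n '$=' "$dir/a.txt" "$dir/b.txt")
# Without -s, the files form one stream with a single last line.
total=$(sed -n '$=' "$dir/a.txt" "$dir/b.txt")

rm -rf "$dir"
printf 'per file:\n%s\n' "$per_file"
printf 'total: %s\n' "$total"
```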

What is the purpose of the -i command line option?

This allows in-place editing. Normally, sed does not directly modify the file that you specify as input. The -i command line option tells sed to directly modify those files.

Let's assume that we are going to make the same simple change - adding a tab before each line. This is a way to do this for all files with the ".txt" extension in the current directory:

sed -i 's/^/\t/' *.txt

This version overwrites the original file without keeping a copy. If you are as cautious as I am, you may prefer to specify an extension, which is used to keep a copy of the original:

sed -i.tmp 's/^/\t/' *.txt

The GNU version of sed allows you to use "-i" without an argument. The FreeBSD/Mac OS X version does not; you must provide an extension. If you want to do in-place editing without creating a backup on FreeBSD/Mac OS X, you can pass an empty extension:

sed -i ''  's/^/\t/'  *.txt
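
A sketch of in-place editing with a backup, done in a scratch directory so nothing real is touched. The file name, the ".bak" extension, and the "> " prefix are assumptions. The attached form -i.bak should be accepted by both GNU and FreeBSD/Mac OS X sed:

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/notes.txt"

# Edit the file in place and keep the original as notes.txt.bak
sed -i.bak 's/^/> /' "$dir/notes.txt"

edited=$(cat "$dir/notes.txt")
backup=$(cat "$dir/notes.txt.bak")
rm -rf "$dir"

printf 'edited:\n%s\n' "$edited"
printf 'backup:\n%s\n' "$backup"
```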

What is the purpose of the --follow-symlinks option?

The in-place editing feature is handy to have. But what happens if the file you are editing is a symbolic link to another file? Let's assume you have a file named "b.txt" in a directory called "tmp", with a symbolic link to this file:

$ ls -l b.txt
lrwxrwxrwx 1 barnett adm 6 Mar 16 16:03 b.txt -> tmp/b.txt

If you executed the above command to do in-place editing, there will be a new file called "b.txt" in the current directory, and "tmp/b.txt" will be unchanged. Now you have two versions of the file: one is changed (in the current directory), and one is not (in the "tmp" directory). And where you had a symbolic link, it has been replaced with a modified version of the original file. If you want to edit the real file, and keep the symbolic link in place, use the "--follow-symlinks" command line option:

sed -i --follow-symlinks 's/^/\t/' *.txt

This follows the symlink to the original location, and modifies the file in the "tmp" directory. If you specify an extension, the original file will be saved with that extension in the same directory as the real source. Without the --follow-symlinks command line option, the "backup" file "b.txt.tmp" will be in the same directory that held the symbolic link, and will still be a symbolic link - just renamed to give it a new extension.

What is the purpose of the -b command line option?

Binary. Unix and Linux systems consider the new line character "\n" to be the end of the line. However, MS-DOS, Windows, and Cygwin systems end each line of a text file with "\r\n" (carriage return and line feed). On those systems sed normally opens files in text mode, so it never sees the carriage return. The "-b" or "--binary" command line option opens the input in binary mode instead: lines then end at the "\n", and the carriage return shows up as an ordinary character immediately before the end of the line. On Unix and Linux the option is accepted but has no effect.

What is the purpose of the -E or -r command line option?

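Turn on support for extended regular expressions. GNU sed originally spelled this option -r and now accepts both; -E is also the spelling used by FreeBSD and Mac OS X.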
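
A short sketch contrasting the basic and extended syntax for the same substitution. The sample input and variable names are made up:

```shell
#!/bin/sh
# Basic syntax: "one or more" is written \{1,\} and groups use \( \).
bre=$(printf 'aaa-b\n' | sed 's/a\{1,\}/X/')
# Extended syntax: + and ( ) work without backslashes.
ere=$(printf 'aaa-b\n' | sed -E 's/a+/X/')
swapped=$(printf 'aaa-b\n' | sed -E 's/(a+)-(b)/\2-\1/')
printf '%s\n%s\n%s\n' "$bre" "$ere" "$swapped"
```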

What is the purpose of the -u command line option?

Unbuffer. Normally, Unix and Linux systems apply some intelligence to handling standard output. It's assumed that if you are sending results to a terminal, you want the output as soon as it becomes available. However, if you are sending the output to a file, then it's assumed you want better performance, so the output is buffered until the buffer is full, and then the contents of the buffer are written to the file. The "-u" option (--unbuffered in GNU sed) tells sed to buffer as little as practical, so results appear right away even when the output goes to a file or a pipe.

Mac OS X and FreeBSD use the "-l" option instead, which makes the output line buffered. GNU sed 4.2.2 and later will also be unbuffered while reading files, not just writing them.

What is the purpose of the -z command line option?

Null Data. Normally, sed reads a line by reading a string of characters up to the end-of-line character (the new line; see the -b Binary command line argument for carriage returns). The GNU version of sed added a feature in version 4.2.2 to use the "NULL" character instead. This can be useful if you have files that use the NULL as a record separator. Some GNU utilities can generate output that uses a NULL instead of a new line, such as "find . -print0" or "grep -lZ". This feature is useful if you are operating on filenames that might contain spaces or binary characters.

For instance, if you wanted to use "find" to search for files and you used the "-print0" option to print a NULL at the end of each filename, you could use sed to delete the directory pathname:

find . -type f -print0  | sed -z 's:^.*/::' | xargs -0 echo

The above example is not terribly useful, because echo prints all of the names on one line separated by spaces, so spaces inside a filename can no longer be told apart. But it does show how to use the sed "-z" option.

GNU grep also has a -Z option, which places a "NULL" at the end of each filename instead of a new line. Combined with the -l option, grep will print the names of the files that contain the string, retaining non-printing and binary characters:

grep -lZ STRING */*/* | sed -z 's:^.*/::' | xargs -0 echo

This feature is very useful when users have the ability to create their own filenames.
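
A self-contained sketch of -z that does not depend on any real files. It assumes GNU sed, and the NULL-separated names are made up. tr turns the NULLs into new lines only so that the result can be displayed:

```shell
#!/bin/sh
# Two NULL-terminated path names, one of them containing spaces.
names=$(printf 'dir one/a b.txt\0dir2/c.txt\0' |
    sed -z 's:^.*/::' |
    tr '\0' '\n')
printf '%s\n' "$names"
```

Each record keeps its embedded spaces, and only the directory part is removed.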

What is the purpose of the -a command line option?

Normally, as soon as sed starts up, it opens all files that are referred to by the "w" command. The FreeBSD version of sed has the "-a" option to delay this action until the "w" command is executed.

What is the purpose of the -I command line option?

FreeBSD added a "-I" option that is similar to the -i option. The "-i" option treats the editing of each file as a separate instance of sed. If the "-I" option is used, then line numbers do not get reset at the beginning of each file, and ranges of addresses continue from one file to the next. That is, if you used the range '/BEGIN/,/END/' and you used the "-I" option, you can have the "BEGIN" in the first file, and "END" in the second file, and the commands executed within the range would span both files. If you used "-i", then the commands would not.

And like the -i option, the extension used to store the backup file must be specified.

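Does sed support user-defined functions?

No. sed has no functions or variables. The closest tools are labels with the "b" and "t" branch commands for control flow, the hold buffer for storage, and script files loaded with -f for reuse.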

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License