Solution Checker

Question

I have a file like the following and I would like to print the lines between two given patterns PAT1 and PAT2.

1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block

I have read How to select lines between two marker patterns which may occur multiple times with awk/sed but I am curious to see all the possible combinations of this, either including or excluding the pattern.

How can I print all lines between two patterns?

Answer 1

Solution 1

Print lines between PAT1 and PAT2

$ awk '/PAT1/,/PAT2/' file
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

Or, using variables:

awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file

How does this work?

/PAT1/ matches lines having this text, as well as /PAT2/ does.
/PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
/PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
flag is a pattern with the default action, which is to print $0: if flag is equal 1 the line is printed. This way, it will print all those lines occurring from the time PAT1 occurs and up to the next PAT2 is seen. This will also print the lines from the last match of PAT1 up to the end of the file.

Print lines between PAT1 and PAT2 - not including PAT1 and PAT2

$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3    - first block
4
7    - second block
10    - third block

This uses next to skip the line that contains PAT1 in order to avoid this being printed.

This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.

Print lines between PAT1 and PAT2 - including PAT1

$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3    - first block
4
PAT1
7    - second block
PAT1
10    - third block

By placing flag at the very end, it triggers the action that was set on either PAT1 or PAT2: to print on PAT1, not to print on PAT2.

Print lines between PAT1 and PAT2 - including PAT2

$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3    - first block
4
PAT2
7    - second block
PAT2
10    - third block

By placing flag at the very beginning, it triggers the action that was set previously and hence print the closing pattern but not the starting one.

Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs

This is based on a solution by Ed Morton.

awk 'flag{
        if (/PAT2/)
           {printf "%s", buf; flag=0; buf=""}
        else
            buf = buf $0 ORS
     }
     /PAT1/ {flag=1}' file

As a one-liner:

$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3    - first block
4
7    - second block

# note the lack of third block, since no other PAT2 happens after it

This keeps all the selected lines in a buffer that gets populated from the moment PAT1 is found. Then, it keeps being filled with the following lines until PAT2 is found. In that point, it prints the stored content and empties the buffer.

Answer 2

Solution 2

What about the classic sed solution?

Print lines between PAT1 and PAT2 - include PAT1 and PAT2

sed -n '/PAT1/,/PAT2/p' FILE

Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

GNU sed

sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE

Any sed¹

sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' FILE

or even (Thanks Sundeep):

GNU sed

sed -n '/PAT1/,/PAT2/{//!p}' FILE

Any sed

sed -n '/PAT1/,/PAT2/{//!p;}' FILE

Print lines between PAT1 and PAT2 - include PAT1 but not PAT2

The following includes just the range start:

GNU sed

sed -n '/PAT1/,/PAT2/{/PAT2/!p}' FILE

Any sed

sed -n '/PAT1/,/PAT2/{/PAT2/!p;}' FILE

Print lines between PAT1 and PAT2 - include PAT2 but not PAT1

The following includes just the range end:

GNU sed

sed -n '/PAT1/,/PAT2/{/PAT1/!p}' FILE

Any sed

sed -n '/PAT1/,/PAT2/{/PAT1/!p;}' FILE

¹ Note about BSD/Mac OS X sed

A command like this here:

sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE

Would emit an error:

 sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
sed: 1: "/PAT1/,/PAT2/{/PAT1/!{/ ...": extra characters at the end of p command

For this reason this answer has been edited to include BSD and GNU versions of the one-liners.

Answer 3

Solution 3

Using grep with PCRE (where available) to print markers and lines between markers:

$ grep -Pzo "(?s)(PAT1(.*?)(PAT2|\Z))" file
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

-P perl-regexp, PCRE. Not in all grep variants
-z Treat the input as a set of lines, each terminated by a zero byte instead of a newline
-o print only matching
(?s) DotAll, ie. dot finds newlines as well
(.*?) nongreedy find
\Z Match only at end of string, or before newline at the end

Print lines between markers excluding end marker:

$ grep -Pzo "(?s)(PAT1(.*?)(?=(\nPAT2|\Z)))" file
PAT1
3    - first block
4
PAT1
7    - second block
PAT1
10    - third block

(.*?)(?=(\nPAT2|\Z)) nongreedy find with lookahead for \nPAT2 and \Z

Print lines between markers excluding markers:

$ grep -Pzo "(?s)((?<=PAT1\n)(.*?)(?=(\nPAT2|\Z)))" file
3    - first block
4
7    - second block
10    - third block

(?<=PAT1\n) positive lookbehind for PAT1\n

Print lines between markers excluding start marker:

$ grep -Pzo "(?s)((?<=PAT1\n)(.*?)(PAT2|\Z))" file
3    - first block
4
PAT2
7    - second block
PAT2
10    - third block

Answer 4

Solution 4

Here is another approach

Include both patterns (default)

$ awk '/PAT1/,/PAT2/' file
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

Mask both patterns

$ awk '/PAT1/,/PAT2/{if(/PAT2|PAT1/) next; print}' file
3    - first block
4
7    - second block
10    - third block

Mask start pattern

$ awk '/PAT1/,/PAT2/{if(/PAT1/) next; print}' file
3    - first block
4
PAT2
7    - second block
PAT2
10    - third block

Mask end pattern

$ awk '/PAT1/,/PAT2/{if(/PAT2/) next; print}' file
PAT1
3    - first block
4
PAT1
7    - second block
PAT1
10    - third block

Answer 5

Solution 5

For completeness, here is a Perl solution:

Print lines between PAT1 and PAT2 - include PAT1 and PAT2

perl -ne '/PAT1/../PAT2/ and print' FILE

or:

perl -ne 'print if /PAT1/../PAT2/' FILE

Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

perl -ne '/PAT1/../PAT2/ and !/PAT1/ and !/PAT2/ and print' FILE

or:

perl -ne 'if (/PAT1/../PAT2/) {print unless /PAT1/ or /PAT2/}' FILE

Print lines between PAT1 and PAT2 - exclude PAT1 only

perl -ne '/PAT1/../PAT2/ and !/PAT1/ and print' FILE

Print lines between PAT1 and PAT2 - exclude PAT2 only

perl -ne '/PAT1/../PAT2/ and !/PAT2/ and print' FILE

Solution 6

Alternatively:

sed '/START/,/END/!d;//d'

This deletes all lines except for those between and including START and END, then the //d deletes the START and END lines since // causes sed to use the previous patterns.

Answer 7

Solution 7

You can do what you want with sed by suppressing the normal printing of pattern space with -n. For instance to include the patterns in the result you can do:

$ sed -n '/PAT1/,/PAT2/p' filename
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

To exclude the patterns and just print what is between them:

$ sed -n '/PAT1/,/PAT2/{/PAT1/{n};/PAT2/{d};p}' filename
3    - first block
4
7    - second block
10    - third block

Which breaks down as

sed -n '/PAT1/,/PAT2/ - locate the range between PAT1 and PAT2 and suppress printing;
/PAT1/{n}; - if it matches PAT1 move to n (next) line;
/PAT2/{d}; - if it matches PAT2 delete line;
p - print all lines that fell within /PAT1/,/PAT2/ and were not skipped or deleted.

Answer 8

Solution 8

This is like a foot-note to the 2 top answers above (awk & sed). I needed to run it on a large number of files, and hence performance was important. I put the 2 answers to a load-test of 10000 times:

sedTester.sh

for i in `seq 10000`;do sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' patternTester >> sedTesterOutput; done

awkTester.sh

 for i in `seq 10000`;do awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' patternTester >> awkTesterOutput; done

Here are the results:

zsh sedTester.sh  11.89s user 39.63s system 81% cpu 1:02.96 total
zsh awkTester.sh  38.73s user 60.64s system 79% cpu 2:04.83 total

sed solutions seems to be twice as fast as the awk solution (Mac OS).

Answer 9

Solution 9

This might work for you (GNU sed) on the proviso that PAT1 and PAT2 are on separate lines:

sed -n '/PAT1/{:a;N;/PAT2/!ba;p}' file

Turn off implicit printing by using the -n option and act like grep.

N.B. All solutions using the range idiom i.e. /PAT1/,/PAT2/ command suffer from the same edge case, where PAT1 exists but PAT2 does not and therefore will print from PAT1 to the end of the file.

For completeness:

# PAT1 to PAT2 without PAT1
sed -n '/PAT1/{:a;N;/PAT2/!ba;s/^[^\n]*\n//p}' file 

# PAT1 to PAT2 without PAT2
sed -n '/PAT1/{:a;N;/PAT2/!ba;s/\n[^\n]*$//p}' file 

# PAT1 to PAT2 without PAT1 and PAT2   
sed -n '/PAT1/{:a;N;/PAT2/!ba;/\n.*\n/!d;s/^[^\n]*\n\|\n[^\n]*$/gp}' file

N.B. In the last solution PAT1 and PAT2 may be on consecutive lines and therefore a further edge case may arise. IMO both are deleted and nothing printed.

How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?

Solution 1

Print lines between PAT1 and PAT2

Print lines between PAT1 and PAT2 - not including PAT1 and PAT2

Print lines between PAT1 and PAT2 - including PAT1

Print lines between PAT1 and PAT2 - including PAT2

Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs

Solution 2

Print lines between PAT1 and PAT2 - include PAT1 and PAT2

Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

Print lines between PAT1 and PAT2 - include PAT1 but not PAT2

Print lines between PAT1 and PAT2 - include PAT2 but not PAT1

Solution 3

Solution 4

Solution 5

Print lines between PAT1 and PAT2 - include PAT1 and PAT2

Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

Print lines between PAT1 and PAT2 - exclude PAT1 only

Print lines between PAT1 and PAT2 - exclude PAT2 only

Solution 6

Solution 7

Solution 8

sedTester.sh

awkTester.sh

Solution 9

More Questions

© 2022 Solution Checker - All Rights Reserved