Problem
I have a file to process with sed, but I am not very familiar with the capital-letter commands used for multi-line patterns. I use seq to test the script, and I have reduced it to the problem described below. The sed script and the expected output are also attached. I believe the script can be written in a much better way, but I am not sure how.
Problem description:
Filter the output of seq $m, where $m is a given integer, removing the nth line if any of lines n−2, n−1, n, or n+1 contains the digit 7.
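As a cross-check for the rule above, here is a minimal sketch that applies it literally. It is not from the original post: filter_ref is a hypothetical name, and it uses awk instead of sed purely so that the four-line check is easy to read. For m=100 it reproduces the sample output listed further down.

```shell
# Hypothetical reference helper (not part of the original post).
# Buffers all lines, then prints line n only if none of lines
# n-2, n-1, n, n+1 contains the digit 7.
filter_ref() {
  seq "$1" | awk '
    { line[NR] = $0 }
    END {
      for (n = 1; n <= NR; n++) {
        drop = 0
        # Look at lines n-2 .. n+1, skipping indices outside the input.
        for (k = n - 2; k <= n + 1; k++)
          if (k >= 1 && k <= NR && line[k] ~ /7/) drop = 1
        if (!drop) print line[n]
      }
    }'
}

filter_ref 20   # prints 1..5, 10..15 and 20, one number per line
```

Because it holds the whole input in memory, this is only meant for checking small test cases against a sed variant, not as a replacement for it.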
Sed script (together with the seq pipe; note that this is GNU sed):
seq "$m"|sed ':c N;N;N;:a N;s/.*\n.*7.*\n.*\n.*\n//g;tc;P;s/^[^\n]*\n//g;ba;'
I believe setting two labels (a and c) is not necessary.
Edit: There seems to be a nicer alternative, based on the answer to Delete 5 Lines Before and 6 Lines After Pattern Match Using sed, as follows:
seq "$m"|sed 'N;/7/!{P;D};:b N;s/\n/&/3;Tb;d'
This avoids writing out a lot of N; and \n's explicitly. Still, I believe it can be improved.
Sample output for m=100
1
2
3
4
5
10
11
12
13
14
15
20
21
22
23
24
25
30
31
32
33
34
35
40
41
42
43
44
45
50
51
52
53
54
55
60
61
62
63
64
65
82
83
84
85
90
91
92
93
94
95
100
Solution
N;/7/!{P;D}
Nice job there: this very concisely (and clearly) lets you capture both lines n and n+1.
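To see the two-line window that N and D maintain, a small sketch (assuming GNU sed; show_window is a hypothetical helper name, not from the post) can swap the P for l, which prints the pattern space unambiguously with embedded newlines shown as \n:

```shell
# Hypothetical demo helper: list the pattern space on every cycle.
# N appends the next line, l shows the window, D drops the first
# line and restarts the cycle without reading new input.
show_window() {
  seq "$1" | sed -n 'N;l;D'
}

show_window 5
# 1\n2$
# 2\n3$
# 3\n4$
# 4\n5$
```

Each line of input therefore appears once as the second line of the window (the n+1 position) and once as the first (the n position), which is why a single /7/ test covers both cases.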
:b N;s/\n/&/3;Tb;d
Now the loop that follows is mostly redundant. You are essentially appending lines of input until you have 4 lines in total. You already have 2 in your pattern space and need 2 more, i.e. the lines covering the n−1 and n−2 cases, which can simply be expressed as N;N followed by a d.
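Putting that suggestion together, the whole filter might look like the sketch below. It assumes GNU sed, as in the question; filter_7 is a hypothetical wrapper name, and I have only traced it by hand for small inputs and for m=100, where it matches the sample output.

```shell
# Hypothetical wrapper around the simplified script (GNU sed assumed).
filter_7() {
  seq "$1" | sed '
    # Keep a two-line window: the current line plus the one after it.
    N
    # No 7 in the window: print the first line, drop it, slide forward.
    /7/!{P;D}
    # A 7 was seen: read two more lines so four are buffered, then delete all.
    N;N;d
  '
}

filter_7 20   # prints 1..5, 10..15 and 20, one number per line
```

As a one-liner this is seq "$m"|sed 'N;/7/!{P;D};N;N;d', which drops the label, the s command, and the T branch while keeping the same four-line deletion.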