How to match everything from after the last YAML delimiter onwards

3

I normally use grep to search for a pattern in a string. However, in this particular instance I have to identify a YAML header, which ends with a triple dash.

My test.info file has the following content

---
title: dont't know
draft: true
---
this is a test to add some extra content

I want the following output, i.e. everything from after the last YAML delimiter until the end of the file:

this is a test to add some extra content

When I enter the dashes, grep returns the following error:

$ cat test.info | grep '---' -A1
grep: unrecognized option `---'

I tried to “escape” the dashes, unsuccessfully. Any idea? This is for BSD grep. The thing that confuses me is that I can get what I want if I do something like the following.

$ cat test.info | grep 'this' -A1

The problem is that I don't know what the first word will be.

I can grep the file as recommended, but the tool returns the pattern and not everything immediately after:

$ grep -m 1 -e '---' test.info 
---
$ grep -- --- test.info | tail -1
---

Andrea Moro

Posted 2014-11-19T20:25:06.613

Reputation: 141

Answers

0

Without taking any credit, as I don't want any, I'm just posting the solution that worked on my Mac, thanks to KasiyA:

tail -r file| awk '/---/ {exit} {print}'| tail -r

Andrea Moro

Posted 2014-11-19T20:25:06.613

Reputation: 141

2

How about this command?

tac file| awk '/---/ {exit} {print}'|tac

On OSX just replace both tac commands with tail -r

From man tac:

 tac - concatenate and print files in reverse - reverse of cat command ;)

Output of tac file:

next line
this is a test to add some extra content
---
draft: true
title: dont't know
---

The awk command awk '/---/ {exit} {print}' prints every line until the first line matching the pattern, then exits.

Output:

next line
this is a test to add some extra content

Then run tac again to restore the original order.

Output:

this is a test to add some extra content
next line
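If neither tac nor tail -r is available, the same result can probably be had in a single awk pass. This is a minimal sketch of a different technique (buffering instead of reversing), not the method above. The function name after_last_delim is made up for illustration, and it assumes the delimiter is a line consisting of exactly ---, which is stricter than the /---/ pattern used above.

```shell
# Hypothetical helper: print everything after the last line that is exactly ---
after_last_delim() {
    awk '
        /^---$/ { buf = ""; next }      # delimiter: forget what was collected so far
                { buf = buf $0 "\n" }   # other line: append to the buffer
        END     { printf "%s", buf }    # emit whatever followed the last delimiter
    ' "$1"
}

# Recreate a sample file like the one used in this answer
tmp=$(mktemp)
printf '%s\n' '---' "title: dont't know" 'draft: true' '---' \
    'this is a test to add some extra content' 'next line' > "$tmp"

after_last_delim "$tmp"
rm -f "$tmp"
```

Note that if the file contains no delimiter at all, this sketch prints the whole file, and the entire tail of the file is held in memory until the end.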

αғsнιη

Posted 2014-11-19T20:25:06.613

Reputation: 424

this seems to work from my test with his test.info file – barlop – 2014-11-23T14:06:08.570

mmmm ... that's strange, why I don't have tac on the system? Is this a default command? – Andrea Moro – 2014-11-25T08:08:10.310

@AndreaMoro Just replace both tac commands with tail -r – αғsнιη – 2014-11-25T08:22:38.303

Great, this works. – Andrea Moro – 2014-11-25T17:57:36.830

1

$ line=$(grep -n -- --- test.info | tail -n 1 | cut -d: -f1);tail -n +$(( $line + 1 )) test.info 
this is a test to add some extra content

Appropriate error checking needs to be added, as in if $line 'not numeric' ...
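As a rough idea of what that error checking might look like, here is a minimal sketch. The function name print_after_last_delim and the wording of the error message are my own assumptions; only grep -n, tail and cut are taken from the command above.

```shell
# Hypothetical helper: print what follows the last '---' line, with basic checks
print_after_last_delim() {
    file=$1
    # Line number of the last line containing '---' (empty if there is none)
    line=$(grep -n -- '---' "$file" | tail -n 1 | cut -d: -f1)
    case $line in
        ''|*[!0-9]*)
            # Empty or non-numeric: no delimiter found (or the file is unreadable)
            echo "no '---' delimiter found in $file" >&2
            return 1
            ;;
    esac
    # Start printing at the line right after the delimiter
    tail -n +"$((line + 1))" "$file"
}
```

Usage would be something like print_after_last_delim test.info; the function returns a non-zero status when no delimiter is found, so the caller can decide what to do.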

The original problem comes from the fact that you need to escape - or tell the program that it is not an option:

$ grep -n -- --- test.info
1:---
4:---

Most(?) GNU software accepts "--" as a special argument that tells it to stop parsing options after that point.
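To make that concrete, here is a small sketch of three ways to hand grep a pattern that starts with a dash. The sample file content is invented for the demonstration; the "--" and -e forms are standard grep usage, and the bracket expression is just one more way to keep the pattern from beginning with a dash.

```shell
tmp=$(mktemp)
printf '%s\n' '---' 'title: example' '---' 'body text' > "$tmp"

# 1. "--" ends option parsing, so the next argument is taken as the pattern
grep -n -- '---' "$tmp"

# 2. -e explicitly marks its argument as a pattern
grep -n -e '---' "$tmp"

# 3. a bracket expression means the pattern no longer starts with a dash
grep -n '[-]--' "$tmp"

rm -f "$tmp"
```

All three should report the same two matching lines (1 and 3) for this sample file.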

Note:
$ grep --version
should tell if it is a GNU grep utility or not.

$ grep -h
or
$ grep --help
usually tells the options it understands.

Hannu

Posted 2014-11-19T20:25:06.613

Reputation: 4 950

This works, upvoted. – slhck – 2014-11-22T12:21:26.983

That is team work ;-) – Hannu – 2014-11-22T12:47:03.750

Just in case he has an issue with the dash (he's using BSD grep, so who knows): for escaping the dash, using the hex code for it seems to have worked for him. This worked on GNU grep and, from what he said, seems to have worked on his BSD grep too: $ echo - | grep -P '\x2d' matches his dash. – barlop – 2014-11-23T14:24:42.867

1

Using awk:

awk 'END{print}' RS='---' file

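RS defines --- as the record separator, and with END{print} we print only the last record. Note that a multi-character RS is an extension (gawk and mawk treat it as a regular expression); POSIX leaves the behavior unspecified, so other awk implementations may use only the first character.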

Using sed:

sed -r ':a;$!{N;ba};s:^(.*\n?)---::' file

αғsнιη

Posted 2014-11-19T20:25:06.613

Reputation: 424

0

I see at least two different ways to do it. In fact, you need to escape the first "-".

Using option -e:

grep -e "---" test.info

Escaping the first "-" with backslash:

grep "\---" test.info

Zimmi

Posted 2014-11-19T20:25:06.613

Reputation: 341

Neither of these works. – Andrea Moro – 2014-11-19T20:43:54.100

I've added more details so it is easier to reproduce. – Andrea Moro – 2014-11-19T20:46:18.840

??? Sorry, then it is a mystery to me. I tested before posting: it works here (GNU grep 2.12) – Zimmi – 2014-11-19T20:51:43.857

I've added a few details above in my question, including the file content, just in case. – Andrea Moro – 2014-11-19T20:57:48.477

-1

Addressing one aspect of your (now original) question.

(And your new question still talks about grep and -A1 and pretends that you can't find a way to specify 3 hyphens, and that's just wrong, because in your comments you've shown that you can.)

[Your original question asked how to show results FROM a point, when actually you want FROM AFTER. Your new question still stated 'from'; I updated it to 'from after'.]

It does look like grep and -A1 aren't what you want, though they're still mentioned even in your updated question.

I don't see any funny results. Perhaps you can paste the results you think are funny. [you now have]

If you do grep pattern -A1 then it dumps the pattern followed by the next line. And it dumps -- after each match.

e.g.

$ cat t3.info
,,,
qwerty
uiop
,,,




werwer
werwer
,,,,,
werwerwer
werwerwer
werwerwer
,,,,



$ cat t3.info | grep -P ',,,' -A1
,,,
qwerty
--
,,,

--
,,,,,
werwerwer
--
,,,,



$

You could do grep -P '(?=---)...' -A1. Another thing you can try is grep -P '\x2d{3}' -A1.

And if your file has ---

$ cat t3.info
---
qwerty
uiop
---




werwer
werwer
---
werwerwer
werwerwer
werwerwer
---

Sure this doesn't work

$ cat t3.info | grep -P '---' -A1
grep: unrecognized option '---'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.

But this works

$ cat t3.info | grep -P '\---' -A1
---
qwerty
--
---

--
---
werwerwer
--
---

and this works

$ cat t3.info | grep -P '(?=\x2d{3})...' -A1
---
qwerty
--
---

--
---
werwerwer
--
---

grep version 2.16

$ grep --version
grep (GNU grep) 2.16

If you don't want to include your pattern in the output, then whatever option you use, it wouldn't be -A1. And I don't see why you are having trouble matching ---.

It may be that, to do what you want with grep or something grep-like, you need to match a newline, e.g. a regex with a positive lookbehind for ---\n. But apparently grep can't match newlines, in which case you might have more luck with pcregrep (https://stackoverflow.com/questions/2686147/how-to-find-patterns-across-multiple-lines-using-grep) or some other way. So your question is more: how can you match what follows on the line after a pattern?
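As an illustration of "some other way", here is a minimal sketch using a sed line range instead of grep. It assumes the file starts with the opening --- on line 1 and that the delimiter lines contain nothing but ---. Note that it prints everything after the second delimiter, which is only the same as "after the last delimiter" when the body itself contains no further --- lines.

```shell
tmp=$(mktemp)
printf '%s\n' '---' "title: dont't know" 'draft: true' '---' \
    'this is a test to add some extra content' > "$tmp"

# Delete from line 1 through the next line that is exactly '---'.
# sed only starts looking for the closing address on the line after line 1,
# so the opening delimiter does not end the range.
sed '1,/^---$/d' "$tmp"

rm -f "$tmp"
```

With the sample file above, this should leave only the line "this is a test to add some extra content".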

barlop

Posted 2014-11-19T20:25:06.613

Reputation: 18 677

My bad. I didn't mention I'm on OS X and the grep version I have is grep (BSD grep) 2.5.1-FreeBSD – Andrea Moro – 2014-11-19T23:04:47.160

@AndreaMoro Can you paste the command and results you are getting? And try the alternative grep lines I suggested, e.g. cat t3.info | grep -P '(?=\x2d{3})...' -A1 or cat t3.info | grep -P '\x2d\x2d\x2d' -A1 – barlop – 2014-11-19T23:25:39.297

The -P parameter corresponds to -e in the BSD version. As for the output, the one with the regular expression doesn't produce anything. The second simply doesn't remove everything before the last --- as expected. This is the output: `--- title: dont't know

--

this is a test to add some extra content` – Andrea Moro – 2014-11-19T23:30:53.257

@AndreaMoro it's not so clear in comment as new lines aren't visible and one cannot really format much in a comment. Can you paste the examples (converted to bsd as you have) into your question, with the results. – barlop – 2014-11-19T23:40:48.160

@AndreaMoro I see in that last comment of yours, you got the right result though from grep's point of view. It outputs the match --- then the line after. For both occurrences of --- and after each one it puts a --. If you only want what is after the last --- then grep is the wrong tool unless perhaps you are willing to remove all new lines prior to running grep. – barlop – 2014-11-19T23:43:50.147

I've edited the question to pass the results. I would appreciate your input. – Andrea Moro – 2014-11-20T09:42:03.097

BTW the question is exactly that: How can I match what follows on the line after a pattern. – Andrea Moro – 2014-11-20T09:50:25.473

@AndreaMoro That is what you intended your question to be, but that's not what your question says. Your question says grep, and it says a specific delimiter/pattern is a problem, and it says -A1 (which would actually include the pattern), and it says it's not working and you can't escape the dash/hyphen. I suggest you post another question, this time without saying anything about grep and nothing about --- being a special problem. And your question may be more about either a) matching text outside a pair of tags, or b) matching text after a second occurrence of a pattern. – barlop – 2014-11-20T10:53:22.417

Let us continue this discussion in chat. – barlop – 2014-11-20T11:13:58.367

See the OP's update of the question. The problem turns out to be a different one. – slhck – 2014-11-21T11:23:17.337

@slhck His problem is what he described in comment and chat, just above your comment, and it is clear in his updated question. However, his question is still not very up to date. The '---' isn't much of a problem because he dealt with it using one of my suggestions. It's not clear which one he used. But he said in a comment that his output was --- and the line that followed it. So he has no issue with grepping for --- anymore. And as mentioned, -A1 isn't really what he is looking for, as that would include the pattern and he doesn't want that. As for my answer, I answered the question he asked then. – barlop – 2014-11-21T21:17:09.717

Yes, but the OP was confused about the real problem and it turned out to be different. So you need to delete your answer or update it. I also deleted my answer here. – slhck – 2014-11-22T07:26:24.177

@slhck Why can't the OP ask a new question rather than expect everybody to delete their answers.. Perhaps his old question can be of use to others? (and if not, then does the site really want questions that are only of use to the one person asking them?) – barlop – 2014-11-22T12:00:31.677

The OP should be asking the question about the problem they need to solve, not morph their problem from question to question until we finally get to the root of it. The question was ambiguously defined at first, later unclear, and now it is fixed. We shouldn't tell people to abandon their ill-defined questions to ask new, better ones. And this here is not a question that's only helpful or specific to the OP. – slhck – 2014-11-22T12:19:36.903

Another user updated their answer to address the OP's problem. It's quite common to have to delete your answer when you answer a question that is not well-defined. Just be prepared for that. (And like I said, I had to do it, too, and I spent a bit of time on it.) – slhck – 2014-11-22T12:22:35.413

@slhck It's not that his question wasn't defined well; it's that his question defined a different problem from the one he wanted solved. And his question still has something that's not his problem (not matching three dashes; he can match them now). The inability to match three dashes is a different problem from the one he wants solved. His question was defined, but his words defined a different or additional problem to the one he was talking about regarding dumping text after a pattern (and his wording saying he wanted to match from, then it turning out he meant from after, was changing the definition, not clarifying). – barlop – 2014-11-22T12:37:19.677

Let us continue this discussion in chat. – barlop – 2014-11-22T12:39:16.350