Task: convert an HTML page into a mountain!
When HTML pages are indented, they can look like:
<div>
    <div>
        <div>
        </div>
        <div>
            <div>
            </div>
        </div>
    </div>
</div>
But to be honest, a mountain is more representative of this structure.
So we can rewrite it as:
     /\
  /\/  \
 /      \
/        \
The outermost slashes on the left and right correspond to the outer div - each pair of HTML tags should be represented as / for the starting tag and \ for the ending tag - inside all tags are "higher", with the same structure.
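The placement rule above can be sketched in Python. This is only an illustration, not a golfed answer: the function name `draw_mountain` and the idea of passing the tags as a pre-flattened string of slashes in document order are assumptions made for the example, and the input is assumed to be non-empty and properly nested.

```python
def draw_mountain(slashes: str) -> str:
    """Place each slash at its nesting height.

    `slashes` is a hypothetical pre-flattened input: one '/' per opening
    tag and one '\\' per closing tag, in document order. It is assumed
    to be non-empty and balanced.
    """
    placed = []  # (column, height, character)
    depth = 0
    for col, ch in enumerate(slashes):
        if ch == "/":
            placed.append((col, depth, ch))  # opening tag sits at the current depth
            depth += 1                       # everything inside it is one level higher
        else:
            depth -= 1                       # closing tag drops back to its opener's level
            placed.append((col, depth, ch))

    peak = max(height for _, height, _ in placed)
    rows = [[" "] * len(slashes) for _ in range(peak + 1)]
    for col, height, ch in placed:
        rows[peak - height][col] = ch        # row 0 is the top of the mountain

    # Trailing spaces are stripped so each row ends at its last slash.
    return "\n".join("".join(row).rstrip() for row in rows)


if __name__ == "__main__":
    # The nested-div example: / / / \ / / \ \ \ \
    print(draw_mountain("///\\//\\\\\\\\"))
```

Each slash takes exactly one column, so the width of the mountain equals the number of tags and the height equals the deepest nesting level.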
Input:
- There will be no <!DOCTYPE>
- There will be no self-closing tags e.g. <img /> or <br />
- There may be attributes or content inside the tags
- There may be spaces or tabs - your program should ignore these
- There will be no spaces between < or </ and the tag name
- All input will be valid HTML
Output - a mountain representing the tags as above.
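One way to read the input rules is as a small scanner that emits one slash per tag and ignores everything else. The sketch below is a hedged illustration: the name `tags_to_slashes` is made up for the example, and it assumes that any literal `<` in text content is escaped (as in valid HTML), that attribute values containing `<` or `>` are quoted, and that there are no HTML comments.

```python
def tags_to_slashes(html: str) -> str:
    """Return '/' for every opening tag and '\\' for every closing tag.

    Assumptions (not stated in the challenge): '<' only starts a tag,
    quoted attribute values may contain '<' or '>', and there are no
    comments, DOCTYPE declarations, or self-closing tags.
    """
    out = []
    i, n = 0, len(html)
    while i < n:
        if html[i] != "<":
            i += 1  # text content, spaces and tabs are ignored
            continue

        closing = html.startswith("</", i)
        out.append("\\" if closing else "/")

        # Skip to the '>' that ends this tag, ignoring any '<' or '>'
        # that appears inside a quoted attribute value.
        quote = None
        i += 1
        while i < n:
            ch = html[i]
            if quote:
                if ch == quote:
                    quote = None
            elif ch in "\"'":
                quote = ch
            elif ch == ">":
                break
            i += 1
        i += 1  # step past the closing '>'
    return "".join(out)
```

The quote tracking is what keeps an input such as `<body id="<"></body>` from being counted as three tags; a plain search for `<` would miscount it.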
More testcases:
Input:
<div id="123"> HI </div><a><span></span></a>
Output:
   /\
/\/  \
Input:
<body id="<"></body>
Output:
/\
A word of caution for golfers... – Luis Mendo – 2018-01-09T00:07:39.360
Will there ever be </ div>? Or can we assume the slash is always adjacent to the div? – Rɪᴋᴇʀ – 2018-01-09T00:16:28.963
hmmm, I'll be nice - you can assume no spaces after < or </ (until the tag name) - however there may still be spaces for attributes e.g. <div id="aDiv"> – Solver – 2018-01-09T00:18:03.347
Perhaps this could use a few more test cases? – Birjolaxew – 2018-01-09T18:05:51.013
Can we assume the document would be valid? (i.e. every starting tag has a valid ending one, and the order of the tags is nested properly) – Uriel – 2018-01-10T21:41:10.700
@Uriel yes, of course. – Solver – 2018-01-10T21:41:49.507
Can to-be-ignored text appear before and after the first tag in the document? – Uriel – 2018-01-10T21:43:21.960
No, this will not appear. – Solver – 2018-01-10T21:44:12.410
OK, thanks for the clarifications – Uriel – 2018-01-10T21:50:39.047
Will there ever be any case where there is a poorly written page with incomplete tags such as <div? – juniorRubyist – 2018-01-11T07:54:04.907
This really needs more test cases, and an exact description (in BNF, say) of what the input will look like. I don’t know what "Valid HTML" means exactly, and how many edge cases I should support. (First one that comes to mind: space between the tag name and > such as <a >b</a >.) – Lynn – 2018-01-13T17:13:23.323
For what it’s worth I think the challenge is interesting enough focusing only on strings like <body><a>xxx</a>yyy<section><div>zzz</div></section></body>. No attributes, only simple matched tags and lowercase letters as text content. HTML is very complex. – Lynn – 2018-01-13T17:17:43.910
I feel like this challenge has turned into an excuse to get people to use JavaScript for golfing. :) though if everyone thinks this is too restrictive, I'll change the question. – Solver – 2018-01-13T17:33:07.107
I call dibs on GNU sed. – SIGSTACKFAULT – 2018-01-14T16:46:42.470