Challenge
Make an error checker in as few bytes as possible! Given an HTML program as a string through standard input, your program should find all the errors and write them to standard output. If the program is over 2000 characters, report that as an error too.
Errors
As there has been some confusion, here are the errors you can check for.
Regular:
- File too long
- Tag not closed
- Non-empty element not closed
  - XHTML self-closing will not give you any bonus.
- Entity not closed (i.e. & vs &)
- CSS rule with no selector attached (i.e. {color:white;} vs span {color:white;})
- CSS unclosed property-value pair (i.e. color:white vs color:white;)
- Errors with brackets (e.g. no closing bracket, rules outside brackets, etc.)
  - Each different rule about brackets counts separately.
- Undefined tag (worth x10)
- Undefined attribute (worth x4)
- Undefined CSS property (worth x10)
Let me make this clear:
You are not expected to complete all or even most of these.
Completing 7 of them would be incredible, but not necessary. Please though, try as many as you can while keeping the byte count relatively low!
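To make two of the listed checks concrete, here is a rough, ungolfed Python sketch of "File too long" and "Non-empty element not closed", with customized messages and line numbers. It is only an illustration, not a reference solution: the name `check_html`, the short `VOID_TAGS` list, and the regex-based tag matching are all assumptions made for this example, and the regex will misread things like a `>` inside an attribute value.

```python
import re

MAX_CHARS = 2000  # limit stated in the challenge

# Assumption: a deliberately short list of empty (void) elements that need no closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

# Matches opening and closing tags; comments and doctypes are skipped because
# the tag name must start with a letter.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^<>]*>")


def check_html(source):
    """Return a list of (line_number, message); line_number is None for whole-file errors."""
    errors = []
    if len(source) > MAX_CHARS:
        errors.append((None, "File too long: more than 2000 characters"))

    stack = []  # (tag_name, line_number) for every element that is still open
    for match in TAG_RE.finditer(source):
        closing, name = match.group(1), match.group(2).lower()
        line = source.count("\n", 0, match.start()) + 1
        if name in VOID_TAGS:
            continue
        if not closing:
            stack.append((name, line))
        elif any(open_name == name for open_name, _ in stack):
            # Everything opened after the matching tag was never closed.
            while stack[-1][0] != name:
                open_name, open_line = stack.pop()
                errors.append((open_line, f"Non-empty element <{open_name}> is never closed"))
            stack.pop()
        else:
            errors.append((line, f"Closing tag </{name}> has no matching opening tag"))

    # Whatever is still on the stack at end of input was never closed.
    for open_name, open_line in stack:
        errors.append((open_line, f"Non-empty element <{open_name}> is never closed"))
    return errors
```

Calling `len(check_html(text))` gives the error count that the rules require as minimum output. A golfed answer would likely fold these checks into far fewer bytes.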
Scoring
- -10 bytes for every type of SyntaxError checked (e.g. undefined tag, oversized file [see above], etc.)
- -30 bytes for every type of SyntaxError checked that returns a customized error message and line number. (for a total of -40 bytes per error)
  - A customized message may not be shorter than 20 characters.
- -100 bytes if your program can check embedded CSS in the document for at least 3 types of errors (this includes <element style=""> and <style></style>).
- +100 bytes if you check only 3 or fewer types of errors.
- ~0 bytes if your final score is negative.
As this is my first code-golf challenge, please comment if you think the scores are out of range.
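As one possible reading of the scoring list, the arithmetic could be sketched in Python as below. The function name `final_score` and its parameters are made up for this illustration. It treats "~0" as exactly 0 and leaves out the "worth x10" and "worth x4" multipliers from the error list, since the post does not spell out how they apply.

```python
def final_score(raw_bytes, types_checked, types_with_message=0, checks_embedded_css=False):
    """Apply the bonuses and the penalty from the Scoring list to a raw byte count."""
    score = raw_bytes
    score -= 10 * types_checked        # -10 bytes per error type checked
    score -= 30 * types_with_message   # extra -30 per type with custom message and line number
    if checks_embedded_css:
        score -= 100                   # embedded CSS checked for at least 3 error types
    if types_checked <= 3:
        score += 100                   # penalty for checking 3 or fewer types
    return max(score, 0)               # a negative result counts as 0


print(final_score(300, 4, types_with_message=2))  # 300 - 40 - 60 = 200
```

Under this reading, a 120-byte answer that checks a single error type scores 120 - 10 + 100 = 210, which is why checking at least four types pays off.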
Rules
- You may not use a built-in function to check for errors.
- You must check for at least 1 error type. No blank answers! :D
- You must return at minimum the number of errors in the document.
- When you post your score, you should use this Markdown format:
Output:
My Language, Unbonused bytes Final Score
code here
Solves:
- First Error
- Second Error
- Et Cetera
Why/How it works
Markdown:
# My Language, <s>Unbonused bytes</s> Final Score #
code here
**Solves:**
* First Error
* Second Error
* Et Cetera
Why/How it works
- You may add additional struck-through numbers between the two values if you wish.
Winning
After one week, the answer with the smallest number of bytes will win. If there is a tie, the answer with the most upvotes will win. If there is still a tie, I hold the divine power to choose. The winning answer will be accepted.
After the deadline, others can still post their answers, but no reward will be given.
Good Luck!
This is my first challenge, so if anything is lacking in any way, please comment. Thank you!
-1 too many bonuses. Scoring should be simple. – Mego – 2015-11-27T17:51:51.980
You should add some example HTML programs for testing. – LegionMammal978 – 2015-11-27T17:56:50.473
@Mego To me, this seems like a complicated challenge, so IMO there should be bonuses to bring the bytes back down. Also, there are literally 3 different bonuses: You dock 10 bytes if you check an error, and an additional 15 for customizing it. You dock 100 if you check CSS too. You add on 100 for cheaping out. If your score is negative, you go back to 0. Simple. – OldBunny2800 – 2015-11-27T17:57:01.173
@LegionMammal978 I will, but not yet. I want to wait a while, maybe at least a day. EDIT: My reasons for that? Not saying nuthin'. – OldBunny2800 – 2015-11-27T17:57:49.363
Also, are we allowed to check for errors that merely cause XHTML non-compliance (such as <br> instead of <br/>)? – LegionMammal978 – 2015-11-27T18:00:41.753
This is specifically HTML, so I would say no. You could do it, but I just wouldn't give you any bonus for it. – OldBunny2800 – 2015-11-27T18:05:11.967
Considering that most challenges are scored by the byte count of the solution (without any modifications), this is anything but simple. – Mego – 2015-11-27T18:06:06.883
Have you tried to solve the task by yourself? Writing a full DOM parser is not an easy challenge. There are many tags, many attributes, and many levels and ways of nesting. This can very quickly become huge and complex. – insertusernamehere – 2015-11-27T18:22:33.570
I never said it was easy. >:) – OldBunny2800 – 2015-11-27T18:24:12.347
If we aren't testing for XHTML, what should we test for? Different browsers allow different errors. – LegionMammal978 – 2015-11-27T18:36:46.347
Any errors that are errors in HTML (e.g. not closing tags that should be closed, unclosed strings, etc.) as well as the 2000char limit I used. – OldBunny2800 – 2015-11-27T18:39:44.340
...you just restated the XHTML standard. – LegionMammal978 – 2015-11-27T18:49:18.750
What I'm saying is you don't need to do things that are only in XHTML, such as closing empty elements (i.e. <img src="" alt="" / >), uppercase tags (i.e. <IMG> not <img>) etc. – OldBunny2800 – 2015-11-27T19:27:01.790
Under what circumstance does the custom error message with line number bonus apply to size checking? – pppery – 2015-11-27T20:43:01.287
You don't need to provide a non-existent line number. Just give a custom error message that's intuitive (e.g. not just "l"). It should be at least 20char long. Adding that to OP now. – OldBunny2800 – 2015-11-27T20:54:09.463
I'm voting to close as "Unclear what you're asking" because "every type of SyntaxError checked" is far too vague for a spec. If you gave 10 programmers a copy of e.g. the HTML 4.0 spec and asked them how many possible syntax errors there are, you would get at least 10 different answers. – Peter Taylor – 2015-11-27T21:35:20.800
@Peter I don't think you understood what I meant. I used "every" to mean "each", not "all". – OldBunny2800 – 2015-11-27T21:40:42.993
Yes, I understood that. My point is that the number of errors checked by a serious error-checking program is highly debatable. I might claim that X, Y, and Z are three different types of errors, whereas someone else might claim that they're really all the same error. – Peter Taylor – 2015-11-27T21:45:13.217
So really, you want us to write a DOM parser that can handle malformed input by definition and unambiguously mark where it went wrong? I'm sorry, but I want to be paid by the hour before going there. – Sanchises – 2015-11-27T22:53:44.440
The "specific error message" bonus is never worth obtaining, because it is only worth 15 bytes and the error message needs to be at least 20 bytes. – pppery – 2015-11-27T22:57:38.213
Oh :$ sorry. Changing that now. – OldBunny2800 – 2015-11-27T23:20:53.263
@PeterTaylor, @sanchises You don't need to comprehensively solve every single error. Keep it simple! Strategy: Do 4 different types so you don't get the penalty, while still keeping the bytes relatively low. – OldBunny2800 – 2015-11-27T23:26:09.600
I have clarified the allowed errors, comment if I left any out. – OldBunny2800 – 2015-12-01T05:26:47.437
I'm surprised that in the time since this was posted nobody asked about Parsing the HTML with Regex. – NoOneIsHere – 2016-07-23T02:57:36.670