Convert Valve KV into JSON

7

Valve's KV file format is as follows (in pseudo-EBNF):

<pair> ::= <text> <value>
<value> ::= <text> | <block>
<text> ::= "\"" <char>* "\""
<block> ::= "{" <pair>* "}"

Parsing starts in the <pair> state. Whitespace is allowed anywhere, but is only required between two consecutive <text> tokens ("a""b" is invalid, but "a" "b" is valid, and so is "a"{}).

And since I'm lazy, <char> is any character, except ", unless preceded by \.

You can also see KV as JSON without commas and colons, and with only strings and objects as possible types.

Your task is to write a program or function that converts an input string (stdin, file or argument) in KV to an output string in JSON. You can output anywhere (stdout, file or return value), as long as it's a valid JSON string.

It isn't necessary to preserve whitespace in the output, as long as quoted text doesn't change.
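For reference, the grammar and task above can be sketched as a small ungolfed recursive-descent converter in Python. This is only an illustration, not a reference solution: the names kv_to_json, TOKEN and unescape are made up here, the input is assumed to be valid KV, and the backslash handling (only \" and \\ count as escapes, any other backslash stays literal) is an assumption the question does not fully pin down.

```python
import json
import re

# A token is either a quoted text (backslash may escape any character,
# so \" does not end the text) or a single brace.
TOKEN = re.compile(r'"((?:\\.|[^"\\])*)"|([{}])', re.DOTALL)

def unescape(raw):
    # Assumption: only \" and \\ are escapes; other backslashes stay literal.
    return re.sub(r'\\(["\\])', r'\1', raw)

def kv_to_json(kv):
    # Each token is a (text, brace) tuple; exactly one of the two is None.
    tokens = [m.groups() for m in TOKEN.finditer(kv)]
    pos = 0

    def parse_block():
        # Reads <pair>* until a "}" or the end of input.
        nonlocal pos
        members = []
        while pos < len(tokens) and tokens[pos][1] != "}":
            key = json.dumps(unescape(tokens[pos][0]))
            text, brace = tokens[pos + 1]
            pos += 2
            if brace == "{":
                value = parse_block()
                pos += 1  # skip the closing "}"
            else:
                value = json.dumps(unescape(text))
            members.append(key + ":" + value)
        return "{" + ",".join(members) + "}"

    # The top-level pair is wrapped in an object, as in the test cases.
    return parse_block()
```

With this sketch, kv_to_json('"a"{}') returns {"a":{}}. The JSON text is built directly instead of going through a dict, so key order and duplicate keys are kept as they appear in the input.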

Standard loopholes apply, of course.

Bonus

You get a -20 byte bonus if the output is pretty-printed, regardless of input. Pretty-printed means indenting each level of blocks with 4 spaces, separating colons from values by 1 space and having one pair per line. Closing braces should stand on their own line.
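In Python, the layout described above happens to match what the standard json module produces with indent=4, so a sketch of the bonus format needs no hand-written printer. The helper name pretty is made up for illustration, and the input is assumed to be already parsed into nested dicts (which would collapse duplicate keys).

```python
import json

def pretty(obj):
    # indent=4 gives 4 spaces per level, ": " after keys, one pair per
    # line, and closing braces on their own line; empty objects stay "{}".
    return json.dumps(obj, indent=4)

print(pretty({"obj": {"emptyo": {}, "troll": "}{"}}))
```

This prints the same text as the expected output of the second test case.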

Test cases

Input:

"foo" {
    "bar" "baz"
    "oof" {
        "rab" "zab"
    }
}

Output:

{
    "foo": {
        "bar": "baz",
        "oof": {
            "rab": "zab"
        }
    }
}

Input:

"obj" {
    "emptyo" {}
    "troll" "}{"
}

Output:

{
    "obj": {
        "emptyo": {},
        "troll": "}{"
    }
}

Input:

"newline" "
"

Output:

{
    "newline": "\n"
}

Input:

"key
with_newline" {
    "\"escaped\"quotes" "escaped_end\""
}

Output:

{
    "key\nwith_newline": {
        "\"escaped\"quotes": "escaped_end\""
    }
}

Kroltan

Posted 2015-11-02T22:58:01.613

Reputation: 517

In terms of the EBNF, what does the entire input need to be? – feersum – 2015-11-02T23:00:55.247

@feersum Sorry. It starts on <pair>. Edited question. – Kroltan – 2015-11-02T23:01:29.700

Can we assume that all tokens are separated by at least one whitespace? – feersum – 2015-11-02T23:13:53.033

@feersum No. You can assume a space only between two <text>s. "a"{} is valid, "a""b" isn't, but "a" "b" is. Should I include whitespace on the EBNF? – Kroltan – 2015-11-02T23:22:50.947

I think it's better to leave the EBNF uncluttered and state the whitespace rules elsewhere. – feersum – 2015-11-02T23:24:00.083

I was going to submit import json,vdf;lambda x:json.dumps(vdf.parse(x),indent=4) (Python 2, 58 bytes before bonus), but vdf requires brackets to be on their own line. It also reorders the dictionaries, and I'm not sure if this is allowed.

– Dennis – 2015-11-03T05:50:47.147

In your pseudo BNF, it seems that text can contain exactly 1 char – edc65 – 2015-11-03T06:59:22.387

It's important to note that certain characters must be escaped in JSON strings. Since they aren't escaped in your KV format (I don't know whether your comment about laziness means that they are in the original KV spec), it's important to have some test cases which cover them. – Peter Taylor – 2015-11-03T07:18:00.410

Also, your spec forces text to contain exactly one char, but all of your examples have texts with multiple chars. What should the spec say? 0 or more chars? 1 or more chars? – Peter Taylor – 2015-11-03T07:22:20.880

@PeterTaylor That's correct, edited the question. Also, thanks for the edit. – Kroltan – 2015-11-03T10:50:59.207

So, when you say any character, do you mean Unicode, ASCII, printable ASCII and newlines, or something else? – Dennis – 2015-11-03T13:28:50.910

Also, you might want to include some test cases that involve backslashes and escaped double quotes. – Dennis – 2015-11-03T15:40:04.177

@Dennis Regarding definition of character: Whatever your language supports. I'll add some escaping cases. – Kroltan – 2015-11-03T21:54:01.293

OK, the last test case is completely different from how vdf behaves (it doesn't allow newlines in keys and \" is a backslash and a quote). Before I rewrite my code once again, can backslashes escape anything else, including themselves? What should the output of "\n\N" "\\" be?

– Dennis – 2015-11-04T04:45:56.877

@Dennis It is different from the usual, yes. I thought this made it simpler as a challenge, since you don't need extra rules. Can you edit the post to explain backslashes? I think I'd make it too confusing. ("\n\N" "\\" => {"\n\N": "\\"}, so yes it escapes newlines and quotes, but can be escaped too) – Kroltan – 2015-11-05T00:55:03.547

What do you mean by Can you edit the post to explain backslashes?? – Dennis – 2015-11-05T00:59:19.483

Answers

1

CJam, 57 54 bytes

{[~]2/{`_'{#{1>W<"\u%04x"fe%s`}{~J}?}f%':f*',*{}s\*}:J

This is a named function that pops a VDF string from the stack and pushes a JSON string in return.

CJam has no JSON built-ins, so the bonus won't be worth it. Since conditionally escaping some characters would be a pain for the same reason, I decided to escape everything.
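The "escape everything" idea can be sketched outside CJam too. The Python below is not a translation of the answer's code, just an illustration of the technique: every character of a text is written as a \uXXXX escape, so no case analysis of which characters JSON requires to be escaped is needed. The name escape_all is made up here, and going through UTF-16 code units (so that characters outside the BMP become surrogate pairs) is an addition of this sketch.

```python
import json

def escape_all(text):
    # Encode to UTF-16 (big-endian) and emit one \uXXXX escape per
    # 16-bit code unit; non-BMP characters become surrogate pairs.
    units = text.encode("utf-16-be")
    body = "".join(
        "\\u%02x%02x" % (units[i], units[i + 1])
        for i in range(0, len(units), 2)
    )
    return '"' + body + '"'

# Round trip through a JSON parser to check the result is a valid string.
print(json.loads(escape_all('}{ "quoted"\n')) == '}{ "quoted"\n')
```

The output is much longer than the input, which is the trade-off the answer accepts in exchange for skipping conditional escaping.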

The result isn't pretty, but it's valid JSON. Try it online in the CJam interpreter.

Dennis

Posted 2015-11-02T22:58:01.613

Reputation: 196 637