16
1
A variable-length quantity (also referred to as VLQ or uintvar
) is a way to encode up to a 28 bit integer value using only as many bytes as necessary. This was used in MIDI file format as a way to minimize the size of certain event data.
The way it works is fairly simple. As a big-endian series of bytes, the most significant bit (MSB) of each byte is a 1
to indicate that another VLQ byte follows. The remaining 7 bits of each byte make up the decoded value.
Example (from Wikipedia):
[ 0x86, 0xc3, 0x17 ] => 106903
Additional references: Wikipedia, Some Guy.
Challenge:
Given a variable-length quantity, convert it to it's integer value.
Input:
A list of one to four bytes or a 32-bit value type representing a valid VLQ of an integer.
Output:
The integer value of the VLQ input.
Rules and scoring:
- This is code-golf, so shortest answer in bytes for each language wins.
- Standard rules and default I/O rules apply.
- Loopholes forbidden (of course).
- Please provide link with a test for your code (TIO.run, etc).
- A clear explanation for your answer is highly recommended.
- Built-ins that handle this conversion are not banned, however not using them is a lot more interesting.
Test cases:
Input (VLQ) Output (int)
[ 0x00 ] => 0
[ 0x07 ] => 7
[ 0x7f ] => 127
[ 0x81, 0x00 ] => 128
[ 0xC0, 0x00 ] => 8192
[ 0xff, 0x7f ] => 16383
[ 0x81, 0x80, 0x00 ] => 16384
[ 0x86, 0xc3, 0x17 ] => 106903
[ 0xbd, 0x84, 0x40 ] => 1000000
[ 0xff, 0xff, 0x7f ] => 2097151
[ 0xC0, 0x80, 0x80, 0x00 ] => 134217728
[ 0xFF, 0xFF, 0xFF, 0x7F ] => 268435455
Note: you are not required to use hex literals to represent a byte as your input or output. You may use decimal literal ([ 129, 128, 0 ]
), integer (0x80818000
) or any other reasonable byte/octet representation if better suited to your platform. Format is flexible as long as it represents 1-4 byte/octets.
Golf away!
Is the last byte of the input guaranteed to be < 128? Or do we need to support cases such as
[0x01, 0x80, 0x02] => 1
? – Arnauld – 2019-08-07T16:07:56.353Inputs are guaranteed to be valid VLQ values, so last byte will always be MSB = 0. – 640KB – 2019-08-07T16:09:03.337
5@Arnauld I see better now what you were getting at, and in retrospect the case you mention would have been good to have required. In MIDI, for example, you'll be reading a stream of data and so you don't necessarily know which byte is the last one. The way it is written, guaranteeing that the last byte in the list is the last byte in the VLQ makes the MSB irrelevant and in fact kind of defeats the purpose of VLQ. As in, you wouldn't necessarily be able to use these (valid for the challenge) routines for MIDI file decoding. Totally my bad! Should have kept this in sandbox much longer... – 640KB – 2019-08-07T17:35:30.850
5This is partly my bad as well because I think I've seen the challenge in the sandbox and didn't pay enough attention. But yes, that's what I had in mind: given a list of bytes, either output the first VLQ value or output all VLQ values. – Arnauld – 2019-08-07T17:40:19.977
I like this challenge, though it may have been more interesting if the input was an arbitrary number of bytes, and you had to return a list of values (of different length than the input). This would have tested the "variable length" aspect of the encoding. – Jonah – 2019-08-08T01:39:26.400
Can we take the input-list in reversed order? – Kevin Cruijssen – 2019-08-08T14:41:04.670
1@KevinCruijssen, a variable length quantity is supposed to be a stream of bytes, where technically you only know when it ends by reaching the MSB=0 marker. I know the rules didn't quite capture that, but by the definition of VLQ I'm going to have to say no. – 640KB – 2019-08-08T14:47:42.457
1@gwaugh Ok, that makes sense and I can understand the reasoning. I'll modify my MathGolf answer accordingly. :) – Kevin Cruijssen – 2019-08-08T14:48:25.120
do we need to handle garbage bytes before or after the VLQ ends? – qwr – 2019-08-09T19:46:19.983
so if we use a 32 bit int as input can we pad the other bytes with zeros? It's not clear to me how input is taken. From your example 0x818000 it looks like the rest is zero padded. – qwr – 2019-08-09T19:50:26.617
@qwr, I see what you mean. In the case of taking input as a 32 bit integer then my example is incorrect, it should be 0x80818000. I will update. – 640KB – 2019-08-09T20:04:07.543