Converting rational numbers to an “alien” data format

For a story I'm writing, I needed to convert 101⅐ (101 + 1/7 if that doesn’t show up for you) into base 47.

Expressing an integer in some base brings back memories of first learning how to program back in the days of 8-bit systems; it’s a fairly easy exercise. Doing so using a floating-point format is a bit harder, though. The final twist, identifying the repeating digits, is more challenging to do elegantly.

So I thought I’d do it in a different language from what I’m used to, just for fun and learning the new dialect. And I thought it would be good to share, so here I am. The vast majority of the posts here are (char or byte of source) code golf, but I’m more interested in an elegant expression in the target language. Cyclomatic complexity isn’t a category, but atomic-code-golf is pretty close.

The program should take a rational number in a natural way, such as supplying three integers on the command line of the script invocation or standard input. This does not count towards the score (nor does the error checking and defaults if you give it one value only). The function (or part that is counted for score) should receive the three values; e.g. 101,1,7 for the example above.

The output is a sequence of code points that are transliterated into text output as follows:

digits 0 through 9 inclusive: ‘0’ through ‘9’ in the normal way
digits 10 through 46: express as ‘[10]’ through ‘[46]’
negative sign: the normal ASCII ‘-’
separator: ‘:’
prefix: ‘#’

A number starts with the prefix ‘#’ and has up to three parts, separated by ‘:’. Leading parts are omitted if zero.

The basic natural number (BNN) is encoded using the digits in base 47, outputted in little-endian. That is, 99181 = 11×47⁰ + 42×47¹ + 44×47² so the digits, in order output, are [11][42][44].

The last (possibly only) part of a number is an integer, which is a leading ‘-’ if the value is negative, followed by the digits of the BNN. So, 99181 would be encoded as #[11][42][44] and −99181 would be encoded as #-[11][42][44].

The middle part, if needed, is an exponent as for scientific notation. It states which power of the base (position) is the first digit listed. In the case of integers, this is 47⁰ so the 0-valued part is omitted.

Suppose you divide that by 47, shifting the radix point by one. That is, 99181 ÷ 47 is the same digits but starting with 47⁻¹; that is, 11×47⁻¹ + 42×47⁰ + 44×47¹, and the output becomes #-1:[11][42][44]. There are two parts, the first is the encoding for the integer −1, and the second is the digits as before.

Now it comes out so neat only if you divide by some power of 47. Normally, a rational number made of small components will end up with a repeating sequence of digits. In decimal, for example, ⅓ is .3333… and ⅐ (1/7) is .142857142857142857… in the first case one digit repeats, in the second case 6 digits repeat.

In the alien real number format, the first part, if present, states how many digits are to be repeated. So, the number 8½ would have a single digit [23] repeating forever, so we just encode one of them and state that 1 repeats: #1:-1:[23]8. The first part is 1 (the encoded BNN for the value 1) which means that one digit is to be repeated. The second part (negative sign and BNN) indicates that the first digit given is the 47^-1 position. The third part is the two digits needed: 26×47^-1 + 8×47⁰.

Somebody should check my results, but I think the correct output for (101 1 7) is:
#6:-6:[20][13][40][26][33]672.

JDługosz

Posted 2017-03-09T01:57:20.287

Reputation: 119

What counts as an "atom"? Are you using the description in the tag wiki? – Rɪᴋᴇʀ – 2017-03-09T01:58:20.907

I'm new here. I'm using the established conventions. So yes, I mean what was stated in the tag wiki. – JDługosz – 2017-03-09T02:00:09.493

3You might want to make output a bit less stringent. I personally feel that restrictive output formats make the answers more about the output formatter than the actual algorithm. – Post Rock Garf Hunter – 2017-03-09T02:56:20.177

4I am also not sure why you chose [tag:atomic-code-golf] over plain old [tag:code-golf]. It seems a bit random in the context of the entire question. – Post Rock Garf Hunter – 2017-03-09T02:57:26.157

Welcome to the site! As WheatWizard has said we encourage looser I/O specs. Other than that it would be good idea to include a table of example inputs and outputs to help people with testing their code. – 0 ' – 2017-03-09T03:17:11.530

Ok, I’ll relax the output with some edits tomorrow. Won’t score the actual output routine, just produce the lists of digits. – JDługosz – 2017-03-09T07:08:27.587

@WheatWizard because I want elegance in the target language, not reduced readability and tricks to save source bytes. – JDługosz – 2017-03-09T07:10:34.897

Considering how hard the question was to understand due to your I/O formats, it's unlikely that any answer will be readable. Why not just use a list of lists? Why is it little-endian? Why do you use an exponent instead of leading zeros? All these things make it harder to read/write. That may be the format for the story you're writing, but here it'd be best to focus on the actual algorithm, not the format. Maybe pick a few things to keep if they're important and remove some. – mbomb007 – 2017-03-09T21:44:57.253

Converting rational numbers to an “alien” data format

Answers

Batch, 531 bytes