Introduction
We've had a few base conversion challenges here in the past, but not many designed to tackle arbitrary-length numbers (that is, numbers long enough to overflow the integer datatype), and of those, most felt a little complicated. I'm curious how far change-of-base code like this can be golfed down.
Challenge
Write a program or function in the language of your choice that can convert a string of one base to a string of another base. Input should be the number to be converted (string), from-base (base-10 number), to-base (base-10 number), and the character set (string). Output should be the converted number (string).
Some further details and rules are as follows:
- The number to be converted will be a non-negative integer (since "-" and "." may be in the character set). So too will be the output.
- Leading zeroes (the first character in the character set) should be trimmed. If the result is zero, a single zero digit should remain.
- The minimum supported base range is from 2 to 95, consisting of the printable ascii characters.
- The input for the number to be converted, the character set, and the output must all be of the string datatype. The bases must be of the base-10 integer datatype (or integer floats).
- The length of the input number string can be very large. It's hard to quantify a sensible minimum, but expect your code to handle at least 1000 characters, and to complete a 100-character input in less than 10 seconds on a decent machine (very generous for this sort of problem, but I don't want speed to be the focus).
- You cannot use built-in change-of-base functions.
- The character set input can be in any arrangement, not just the typical 0-9a-z...etc.
- Assume that only valid input will be used. Don't worry about error handling.
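For anyone who wants an ungolfed baseline to test against, here is a minimal sketch of one way to meet the rules above. It is not a reference solution from the challenge author. The function name `convert` and its parameter names are hypothetical, and it assumes the character set contains no repeated characters. It reads the input digit by digit (Horner's method), then peels off output digits by repeated division, so no built-in change-of-base function is involved.

```python
def convert(number: str, from_base: int, to_base: int, charset: str) -> str:
    """Convert `number` from `from_base` to `to_base` using `charset` digits.

    Hypothetical helper for illustration; assumes valid input and a
    character set without duplicate characters.
    """
    # Map each character to its digit value (its index in the character set).
    value_of = {ch: i for i, ch in enumerate(charset)}

    # Horner's method: accumulate the value one input digit at a time.
    n = 0
    for ch in number:
        n = n * from_base + value_of[ch]

    # Zero is a special case: keep exactly one zero digit.
    if n == 0:
        return charset[0]

    # Repeated division yields output digits, least significant first.
    digits = []
    while n:
        n, remainder = divmod(n, to_base)
        digits.append(charset[remainder])
    return "".join(reversed(digits))


print(convert("1010101", 2, 10, "0123456789"))
print(convert("bababab", 2, 10, "abcdefghij"))
```

This leans on Python's arbitrary-precision integers to hold the intermediate value. In a language with fixed-width integers, the same two steps would need to run on an array of digits instead. Leading zeroes are trimmed for free, since they add nothing to the accumulated value.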
The winner will be determined by the shortest code that accomplishes the criteria. They will be selected in at least 7 base-10 days, or if/when there have been enough submissions. In the event of a tie, the code that runs faster will be the winner. If close enough in speed/performance, the answer that came earlier wins.
Examples
Here are a few examples of input and output that your code should be able to handle:
F("1010101", 2, 10, "0123456789")
> 85
F("0001010101", 2, 10, "0123456789")
> 85
F("85", 10, 2, "0123456789")
> 1010101
F("1010101", 10, 2, "0123456789")
> 11110110100110110101
F("bababab", 2, 10, "abcdefghij")
> if
F("10", 3, 2, "0123456789")
> 11
F("<('.'<)(v'.'v)(>'.'>)(^'.'^)", 31, 2, "~!@#$%^v&*()_+-=`[]{}|';:,./<>? ")
> !!~~~~~~~!!!~!~~!!!!!!!!!~~!!~!!!!!!~~!~!~!!!~!~!~!!~~!!!~!~~!!~!!~~!~!!~~!!~!~!!!~~~~!!!!!!!!!!!!~!!~!~!~~~~!~~~~!~~~~~!~~!!~~~!~!~!!!~!~~
F("~~~~~~~~~~", 31, 2, "~!@#$%^v&*()_+-=`[]{}|';:,./<>? ")
> ~
F("9876543210123456789", 10, 36, "0123456789abcdefghijklmnopqrstuvwxyz")
> 231ceddo6msr9
F("ALLYOURBASEAREBELONGTOUS", 62, 10, "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
> 6173180047113843154028210391227718305282902
F("howmuchwoodcouldawoodchuckchuckifawoodchuckcouldchuckwood", 36, 95, "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ~`!@#$%^&*()_-+=[{]}\\|;:'\",<.>/? ")
> o3K9e(r_lgal0$;?w0[`<$n~</SUk(r#9W@."0&}_2?[n
F("1100111100011010101010101011001111011010101101001111101000000001010010100101111110000010001001111100000001011000000001001101110101", 2, 95, "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ~`!@#$%^&*()_-+=[{]}\\|;:'\",<.>/? ")
> this is much shorter
We have had one designed to tackle arbitrary length numbers. – Peter Taylor – 2016-01-11T18:50:57.103
@PeterTaylor Well dang, somehow missed that one in my search. Still, I would argue they are different enough. The other one involves a default character set, multi-byte sequences, error handling, and sequence-to-sequence conversion. All these add to much larger bloat in the answers, and focus on different optimizations. This challenge is much more trimmed down, and will result in completely different code from the other challenge (short of the core algorithm). – Mwr247 – 2016-01-11T19:04:42.873
@PeterTaylor Plus, the other question was asked 4 years ago and received only two valid answers (and with one already accepted, little reason to bump). I'm willing to bet the community would enjoy this challenge, with little impact from the previous one, or feelings of "repetitiveness". – Mwr247 – 2016-01-11T19:10:02.380
While this challenge is very similar to the previous one, I'd actually be in favor of closing the previous one as a dupe of this one. This challenge is much clearer and higher quality than the old one. – Mego – 2016-01-11T21:04:21.190
Could you elaborate a bit on "You cannot use built in change-of-base functions to convert the entire input string/number at once"? Specifically, could I use a built-in to convert the input to an intermediate base? Can I then use a built-in to convert to the target base? Would something like "convert input with canonical form for given base; convert to base 10; convert to target base; convert back to specified character set with string replacement"? – Mego – 2016-01-11T23:38:33.077
Now that this question has an answer, I have voted to close the older one as a dupe of this one. – Mego – 2016-01-12T03:21:49.327
"7 base-10 days" I guess since this thing uses arbitrary characters as digits, "7" could stand for any number. For instance, if the character set is the ASCII characters from code 46 to code 55, then this challenge will end in 10 days. On the other hand, if using the standard digits, why specify a base? The symbol 7 represents the number 7 in any base that uses it. – quintopia – 2016-01-12T15:21:21.623
@Mego Good point on the base conversion loophole. It'd probably be simpler to just say no built-in base change at all, especially since no answers use it yet. – Mwr247 – 2016-01-12T15:59:56.710
@quintopia It was really more a self reference to a previous challenge I made, from which I copied the post template for this one, as well as a reference to the challenge itself and bases. For that matter, isn't "base-10" indeterminate itself, since every base has a 10? ;) – Mwr247 – 2016-01-12T16:04:32.477
@Mwr247 clearly you meant "standard-digit 7 days" or "standard-digit 7 base-standard-digit-A days" which is redundant with the former. – quintopia – 2016-01-12T17:48:22.873
There's no such thing as a base-10 integer datatype. Just say input will be in base ten. – lirtosiast – 2016-01-12T23:31:17.747
@ThomasKwa But then someone might try a base-10 string ;) – Mwr247 – 2016-01-12T23:41:14.663