Convert __DATE__-style string to sortable string

7

The goal is to write the shortest possible C89- and C99-compliant single-module C program which will compute and print out a single-line string whose sort order will correspond with the date given by the predefined __DATE__ macro (in other words, later dates will yield later-sorting strings). The programmer is free to arbitrarily select the mapping between dates and strings, but every entry should specify the mapping and it should be obvious that it will sort correctly (e.g. a programmer could decide to compute (day + month*73 + year*4129) and output that as a number, though that particular encoding would probably require a longer program than some others).
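
A minimal sketch of the example encoding mentioned above, written as ordinary (ungolfed) C. The helper names date_key and date_key_string are hypothetical, and the zero-padding to eight digits is an added assumption so that comparing the output as a string agrees with comparing it as a number (the largest key, for Dec 31 9999, is 41286778).

```c
#include <stdio.h>

/* Hypothetical helper: packs a date into one int using the weights from
   the example above. day is 1..31, month is 1..12, year is 2012..9999. */
int date_key(int day, int month, int year)
{
    return day + month * 73 + year * 4129;
}

/* Writes the key as an 8-digit, zero-padded decimal string (assumption:
   fixed width, so that string order matches numeric order).
   buf must have room for at least 9 bytes. */
void date_key_string(char *buf, int day, int month, int year)
{
    sprintf(buf, "%08d", date_key(day, month, year));
}
```

The weights work because day never reaches 73 and day + month*73 never reaches 4129, so a later date always produces a larger key.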

The program should yield identical results on any standards-compliant compiler on which 'int' is 32 bits or larger and both source and target character sets are 7-bit ASCII, and should not rely upon any implementation-defined or undefined behavior nor print any characters outside the 32-126 range except for a single newline at the end. The program should contain the elements indicated below (replacing «CODE» with anything desired):

#include <stdio.h>
«CODE»int main(void){«CODE»}

All output shall be produced by the printf at the end (i.e. the correct value will be in an int called z). The required elements will be included in the character total for each entry.

Code should operate correctly for all future dates through Dec 31 9999. Libraries which are standard in both C89 and C99 may be used, provided that appropriate headers are included. Note that standard date libraries may not be assumed to operate beyond the Unix limits.

Note: Code is permitted to perform Undefined Behavior if and only if the __DATE__ macro expands to something other than a date between Feb 11 2012 and Dec 31 9999 (expressed in that format, using C-standard abbreviated English month names Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec).

Note 2: For compliance with C99 standards, 'main' should return an arbitrary, but defined value, and the newline is required. The first requirement, btw, adds 7 characters to my best effort; the single new-line adds 5.

supercat

Posted 2012-02-10T23:19:41.450

Reputation: 273

4This seems kind of pointless because there is already a UNIX timestamp that you can perform operations on to determine the date, and is constantly increasing. also, is this code golf? you should tag it as such – Blazer – 2012-02-10T23:38:25.343

Unix date does not match criteria to reach 2099: LC_ALL=C date -d "2060/02/13" date: invalid date '2060/02/13' – user unknown – 2012-02-11T00:02:22.817

@Blazer: I would consider the use of a Unix timestamp to violate the prohibition against date-parsing routines. The primary goal is to figure out an efficient way of converting "Jan/Feb/Mar/etc" into a sortable number. – supercat – 2012-02-11T00:17:00.677

@userunknown: Depends on the unix. 32 bit unix time runs out in 2038, I think, but 64 bit unix time...Scientific Linux 5.3 gives $ LC_ALL=C date -d "2060/02/13" Fri Feb 13 00:00:00 CST 2060 on x86_64. – dmckee --- ex-moderator kitten – 2012-02-11T00:47:24.710

1That said, this question needs to be formed as a CodeGolf.SE-compliant puzzle (with the winning condition specified and the like). Supercat, please feel welcome here, but also read the FAQ and examine some of the other questions here. I'll be happy to reopen this when you've tuned it up a little. Just submit a flag for help. – dmckee --- ex-moderator kitten – 2012-02-11T00:49:15.427

so is this for any language, or is it only limited to C? – Blazer – 2012-02-11T08:03:23.697

@Blazer: I'd be interested in seeing approaches to similar problems in other languages, but for purposes of scoring I think it's probably easiest to restrict "official entries" to C programs which are compliant with C89/C99. – supercat – 2012-02-11T08:15:36.073

@supercat: I'll submit my Python answer anyways, then! :) – Blazer – 2012-02-11T08:18:23.440

I have a 185 char C program but I can't post it. This makes me sad :-( – Gareth – 2012-02-11T11:11:51.487

Thanks @supercat, that's just the kind of thing we're looking for. Personally I'm not a big fan of language limits, but there seems to be a consensus to allow them (or at least no consensus to not allow them). – dmckee --- ex-moderator kitten – 2012-02-11T18:59:44.670

@dmckee: I think the issue is that different languages have different amounts of unavoidable baggage, and the number of characters required to solve a problem in one language may be less than the number of characters to 'get out the starting gate' in another. My original concept for this problem was to have a function return an 'int', but there seems to be a preference for standalone programs. Using a function rather than a program could have eliminated the #include, for example, though for another puzzle I might specify a 'test' program with which the function must work. – supercat – 2012-02-11T22:28:35.907

@Gareth: The topic is open now, if you'd like to come back. My program, including the trailing newline and a defined return value, is 115 characters and I wouldn't be at all surprised if it could shrink a tiny bit. – supercat – 2012-02-11T22:29:23.897

Challenges on the site come in all varieties: there are those that specify functions and those that specify whole programs. As for the differences between languages, there is no consensus on how to deal with them: see Language Handicap, should imports/includes count in golf, What programming language should we consider for the code-golf solution ? and many other questions on meta.

– dmckee --- ex-moderator kitten – 2012-02-11T22:38:25.533

@supercat 115! I'm embarrassed by my pitiful attempt now... – Gareth – 2012-02-11T22:41:53.477

@Gareth: I'll admit I have something of an unfair advantage, since I was trying to figure out a way to have an embedded system convert __DATE__ into a version number, stumbled upon a really nice trick, and then decided it would make a cute puzzle. Admittedly my goal was smallest compiled code size, but I think my trick works out well by any metric. – supercat – 2012-02-11T23:39:28.253

@dmckee: Wow. The first time in my life I'm thinking about a 64 bit system. :) – user unknown – 2012-02-12T02:06:57.010

Answers

2

C, 137 184 184 140 120 106 103 characters

Replaced the month name lookup with a magic formula.
The formula (m[1]*4388^m[2]*7)%252 is ascending for month names.
Changed it to nicely return 0, at no cost.
It no longer prints a number. Instead it prints a string, which should sort right.
Implemented supercat's %*s idea, which inserts more spaces for earlier months, along with a function that's descending for month names - (m[1]*29^m[2]+405)%49.

#include<stdio.h>
int main(void){
    char*m=__DATE__"%*.6s\n"+1;
    return!printf(m+6,(*m*29^m[1]+405)%49,m);
}
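
A minimal ungolfed sketch of the padding idea in the program above, assuming 7-bit ASCII and an 11-character date such as "Feb 11 2012". The names month_width and sortable_line are hypothetical. The parentheses are added only for readability: '+' already binds tighter than '^', so the value is the same as in the golfed expression.

```c
#include <stdio.h>

/* Field width used for a three-letter month name: larger for earlier
   months, so earlier months get more leading spaces and sort first. */
int month_width(const char *mon)
{
    return ((mon[1] * 29) ^ (mon[2] + 405)) % 49;
}

/* Hypothetical helper: builds the sortable text (without the newline)
   for an 11-character date. out must hold at least 64 bytes. */
int sortable_line(char *out, const char *date)
{
    /* Year first, then the month's last two letters plus the day field,
       right-justified in a field of month_width() characters. */
    return sprintf(out, "%.4s%*.6s", date + 7, month_width(date), date + 1);
}
```

With these constants the widths for Jan through Dec come out as 48, 48, 44, 43, 37, 35, 33, 27, 19, 14, 12 and 2. Jan and Feb tie, and the tie is broken by the printed letters ("an" sorts before "eb"); Dec is narrower than the six printed characters, so it gets no padding and sorts after every padded month.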

I had thought single-digit days were represented as Jan_1_2012 (_ being a space), when in fact it's Jan__1_2012 (extra space). Handling both forms made my previous versions more complicated:

#include<stdio.h>
int main(void){
    char*m=__DATE__+1,*t=m+m[4]/16;
    return!printf("%s%3d%s\n",t+3,(*m*4388^m[1]*7)%252,t);
}
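
A minimal sketch of how the m[4]/16 step in this earlier version locates the day field for both date layouts, assuming ASCII. The helper name day_field is hypothetical.

```c
/* Hypothetical helper: returns a pointer to the day field of a
   __DATE__-style string, for both "Feb  1 2012" (11 characters) and
   "Feb 1 2012" (10 characters). */
const char *day_field(const char *date)
{
    const char *m = date + 1;
    /* m[4] is a digit (48..57, so /16 gives 3) in the 11-character form
       and a space (32, so /16 gives 2) in the 10-character form. */
    return m + m[4] / 16;
}
```

In both layouts the returned pointer plus 3 lands on the first digit of the year, which is why the program can print t+3 for the year and t for the day.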

ugoren

Posted 2012-02-10T23:19:41.450

Reputation: 16 527

I hate to be nit-picky, but is the use of an unprototyped strstr function C99 compliant? Also, my original plan for the puzzle was to compute a single integer, but given the present requirements if you can save a few characters by assembling an output string you should do so. – supercat – 2012-02-12T17:51:04.287

Added #include<string.h>, sacrificing some more characters. I don't know if it's required by c99, but considering 64bit, where assuming strstr returns int can end badly, it's better like this. So now I'm barely ahead of the competition (but being ahead thanks to omitted includes isn't such a big deal). – ugoren – 2012-02-12T19:46:43.397

That sort of formula is the type of thing I was hoping to see in answers. I spent a lot of time crunching mine, but it still has an 11-character string literal in it. I think your program relies upon Undefined Behavior on a 32-bit machine when the year exceeds 8191, though. – supercat – 2012-02-14T14:26:18.317

Indeed, there's an integer overflow, which is harmless on normal compilers. y*252 keeps the numbers small, but is two characters more. Perhaps a U on one of the constants will promote everything to unsigned (but how to test it, given that it works as is in all compilers I know?) – ugoren – 2012-02-14T14:49:20.103

Adding a "u" to the constant 4388, 7, or 252 would I believe eliminate any Undefined Behavior, but compare the strings output by your program on "Jan 01 6000" with those for "Feb 14 2012". – supercat – 2012-02-14T16:01:36.930

Indeed, you want strings to sort, not numbers. y*252 fixes both issues, but costs 2 characters. U won't help, because y*y overflows, and promoting to unsigned later won't help. – ugoren – 2012-02-14T19:47:27.010

The quantity 99999999 won't overflow a 32-bit 'int' until it's shifted left, so adding "u" to the indicated constants would prevent overflow. It won't solve the sorting problem, though; nor will using "y*252". – supercat – 2012-02-14T20:12:29.253

You're right, the overflow is only after the shift. I think y*252 does solve the sorting problem, because the result is always 8 digits. – ugoren – 2012-02-14T20:14:16.467

You're right about eight digits; mea culpa. In any case, you've got some more golfing before you beat my solution, though with your nice hash function you could do it. – supercat – 2012-02-14T20:24:55.430

I figured that a sortable string, rather than a number, doesn't have to be a bad thing. So now I'm 5 characters from your solution (but a very hard 5 chars...) – ugoren – 2012-02-15T21:38:43.370

Is there any standard which permits __DATE__ to be anything other than an eleven-character string? I certainly did not intend to require programs to deal with ten-character strings of the form Feb 1 2012. Your handling of ten-character date strings is magnificent, and would be worth keeping as a note, "if support for strings of the form 'Feb 1 2012' were required...". Without that requirement, see if you can get down to 105. – supercat – 2012-02-15T23:06:29.967

@supercat, If only you published this challenge on February 9th, not 10th... Tests with __DATE__ will only reveal this next month, and tests with generated strings depend on how you generate them. I need to set the clock on a machine and see __DATE__ – ugoren – 2012-02-16T05:40:40.487

Yeah, sorry about that. I realized after I saw your answer that I was perhaps unclear. In any case, your handling of variable-length date field is amazing, and I'm glad I got to see it. BTW, I'm trying to figure out, as a separate challenge, if there's any way to abuse the preprocessor to convert __DATE__ into a static integer constant. I figured out a way to surround __DATE__ with doubled quotes, so as to expose the month name to the preprocessor. I can't manage to gobble the whitespace characters, though, nor can I figure any context where two integer literals can appear adjacent. – supercat – 2012-02-16T07:39:22.870

I checked the standard, and it says a space is added before the day number if it's a single digit. About making it a number - I think __DATE__[xxx] is the way. I don't think there's any way to get rid of the quotes (I tried to define a macro to be a single ", but failed). – ugoren – 2012-02-16T07:57:59.917

Subscripted elements of a literal are not considered to be compile-time constants. The trick would have to be something more like #define p2(x,y) x##y #define p1(x,y) p2(x,y) #define Feb February p1(/,*"*/"__DATE__"/*"*/) which is thoroughly evil, but might be workable. – supercat – 2012-02-16T08:07:10.730

You're right about constants, but I doubt if you can do better. Your code does warning: pasting "/" and "*" does not give a valid preprocessing token, and pasting results in / *. I don't think it can be done. – ugoren – 2012-02-16T08:56:53.983

Hmm... Microsoft is probably not being standards-compliant then. In any case, I think there are two separately-applicable improvements left. – supercat – 2012-02-16T13:24:48.303

When do you think I should post my original answer? – supercat – 2012-02-17T17:14:50.527

Now would be a good time. I don't see how to improve it further (I guess there's a way if you say so, but I can't find it). – ugoren – 2012-02-17T21:18:15.730

A couple hints: (1) Were it not for the requirement that there be exactly one newline, concatenating the format string to __DATE__ would save %s and a comma (three characters). Fixing the newline requires adding two characters, but still leaves a net shrinkage of one character. – supercat – 2012-02-18T00:27:40.037

(2) There exists at least one hash function which is conceptually similar to yours, but one character shorter, where months will either be in order or will 'tie', and where numerical ties will still result in proper sequencing. Incidentally, if one had a hash function which was the same length, but returned numbers in the reverse of month order, and always returned numbers larger than six (but preferably less than a few thousand), one could save another character on the formatting. – supercat – 2012-02-18T00:30:44.447

(1) I did try concatenating the format string and __DATE__, and had the newline problem. But it cost me too much, even without fixing the newline (though I didn't see the cheapest %.4s). Adding "\n" after __DATE__ was an extra pair of quotes, and m+7->m+14 was another wasted character. – ugoren – 2012-02-19T08:31:58.240

(2) I'm tired of looking for a hash function. I used code to search for constants, but played manually with operator combinations, so the search wasn't exhaustive. I also didn't allow ties, though indeed they can be broken with the month's first letter (or second, if I use __DATE__+1 and m[1]->*m). With the reverse order, I guess you mean something like %*s, though I'm not sure about the details. Anyway, I'm out, let's see your code (and maybe the merge of both). – ugoren – 2012-02-19T08:40:32.883

OK, I'm not 100% out yet. I did manage to combine the format string with __DATE__, with no newline problems. 2 characters saved. So perhaps it's possible, with an improved hash function, to reach 100? – ugoren – 2012-02-19T16:02:03.887

Also replaced the hash function by a shorter one, that ties for Jan/Feb, but I break it correctly using the second letter. – ugoren – 2012-02-19T16:23:23.930

Doesn't your %s output a newline in addition to the one in the format string? My best is 105, which was like yours but with a %.5s to avoid outputting the extra newline. I don't think I would have found the operator combination for your hash function by myself, but I did search for and find the same combination you had. I'll post my previous as an answer. – supercat – 2012-02-19T16:57:52.103

(see last comment also) I hope you liked the challenge. My "reverse order" thought would have involved using the "*" format sub-specifier as an alternative to "%3d". If the * took an argument of type unsigned rather than int, one might be able to shave a character by having the format range from e.g. INT_MAX-94 to INT_MAX. That might yield a solution which was technically correct, but which would be incredibly icky on anything larger than a 16-bit machine (on a 64-bit machine, one would have a program whose output would be correct if it could be run to completion... – supercat – 2012-02-19T17:22:16.223

...but in practice would be completely unworkable. (Un?)fortunately the C standard doesn't allow for that. – supercat – 2012-02-19T17:24:26.173

I did manage to get %*s to work, along with the format string concatenated with __DATE__. I found another hash function, which is descending (not strictly, but tie broken correctly). It does go below 6, but only for December, which is still OK. 103 again. – ugoren – 2012-02-20T10:16:52.567

Bottom line, there was certainly a surprising amount of challenge in here. It's a very simple task, with the eventual solution much shorter and a lot different from initial attempts. But I still wonder if 100 is possible... – ugoren – 2012-02-22T08:28:08.067

I'm glad you enjoyed the challenge. What do you think of my solutions? – supercat – 2012-02-24T20:44:36.547

Too bad I required the single newline after the output. Otherwise you could do 99 easily. – supercat – 2012-02-24T22:37:05.750

Yes, m+6->m and %*.6s->%*s. Better, if you didn't insist on C99 compliance, I think I could go below 80. – ugoren – 2012-02-25T12:43:51.487

1

C, 194 characters

#include <stdio.h>
#include <string.h>
d,y,n[3];
int main(void)
{
sscanf(__DATE__,"%s %d %d",n,&d,&y);
return printf("%d%02d%02d",y,13-strlen(strstr("JanFebMarAprMayJunJulAugSepOctNovDec",n))/3,d);
}

I think most of the newlines are unnecessary, but I've left them in for readability.

Not sure what your feelings are about compiler warnings - this throws a few but runs fine. Also not sure whether the declarations with no type are valid C89 or not.
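
A minimal sketch of the strstr lookup used above, written with explicit types so it compiles without warnings. The function name month_number is hypothetical, and the sketch assumes its argument is one of the twelve valid three-letter abbreviations.

```c
#include <string.h>

/* Maps "Jan".."Dec" to 1..12. strstr finds the name inside the packed
   list; the length of the remaining tail is 36 for Jan down to 3 for Dec. */
int month_number(const char *mon)
{
    static const char names[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    return 13 - (int)(strlen(strstr(names, mon)) / 3);
}
```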

Gareth

Posted 2012-02-10T23:19:41.450

Reputation: 11 678

I don't believe C99 allows for implicit declarations of variable type. Even if such declarations were permitted for variables of type int, your substitution of an array of int for an array of char is almost certainly illegitimate. I'll also suggest that trying to convert the month into a nice number from 1 to 12 isn't necessary. As a simplification, arrange the months in the other order and you could drop the "13-" and the "/3". December would yield 36 and January would yield 3. That would save you five characters right there. Also, if you append the declaration of a char*... – supercat – 2012-02-11T23:33:52.490

..with your declaration of n[], it might be profitable to define a variable for your "JanFeb"etc. string so you can subtract the strstr result from its base address. I don't think the strstr approach will get as short as my solution, no matter what you do, but it would be interesting to see how close it gets. – supercat – 2012-02-11T23:36:32.267
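
A minimal sketch of the reversed-list simplification suggested in the comment above. The function name month_rank is hypothetical, and valid three-letter abbreviations are assumed.

```c
#include <string.h>

/* With the month names listed from Dec back to Jan, the length of the
   tail returned by strstr is already an increasing key:
   3 for Jan, 6 for Feb, ... 36 for Dec. */
int month_rank(const char *mon)
{
    static const char names[] = "DecNovOctSepAugJulJunMayAprMarFebJan";
    return (int)strlen(strstr(names, mon));
}
```

Printed with a two-digit zero-padded format such as %02d, the values 03 through 36 keep the same order when compared as strings.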

1

Here are the solutions that I had come up with (for brevity, I'm just writing the #include line once--copy and paste as needed to assemble a testable program).

(my entries)
#include <stdio.h>
int main(void){char*d=__DATE__"%s%sDFCwu-vBxE-t%c"+1;return printf(d+6,d+(*d+d[1])%17+9,d,10);}
int main(void){char*d="h-elbj-ikcfdga-"__DATE__+16;return printf("%s%s\n",d+6,d-(*d+d[1])%17);}
int main(void){char*d=__DATE__+1;return printf("%s%x%s\n",d+6,3**d+"![ WT -#[ 8"[d[1]%12],d);}
int main(void){char*d=__DATE__"%x%.5s\n"+1;return printf(d+6,3**d+"![ WT -#[ 8"[d[1]%12],d);}
int main(void){char*d=__DATE__"%x%.6s\n";return printf(d+7,d[1]+"1C0EB042E0:"[d[2]%12],d);}
(ugoren's entry, golfed, for comparison, and adding .5 to the %s format specifier)
int main(void){char*m=__DATE__"%3d%.5s\n"+1;return!printf(m+6,(*m*803^m[1]*95)%94,m);}

My initial approaches made use of a month-to-number approach which can turn each month into an arbitrary character. If in a production embedded-systems environment one had to turn an alphabetic month into a number 1-12, the approach might actually be an efficient algorithm to do so (perhaps using repeated subtract-17 instead of mod, if no other divisions are required).
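
A minimal sketch of the month-to-character lookup, using the table from the second entry above ("h-elbj-ikcfdga-") and assuming ASCII. The function name month_from_hash is hypothetical; the golfed entry indexes backwards from a pointer into the date string, which the sketch rewrites as an ordinary array index.

```c
/* Maps "Jan".."Dec" to 1..12. The sum of the second and third letters,
   taken mod 17, is distinct for every month (3..16), so it can index a
   small table whose entries are 'a' for Jan through 'l' for Dec. */
int month_from_hash(const char *mon)
{
    static const char table[] = "h-elbj-ikcfdga-";
    int k = (mon[1] + mon[2]) % 17;
    return table[16 - k] - 'a' + 1;
}
```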

Allowing the computation to return values which were sortable but not consecutive made it possible to use a smaller table indexed using the third character of the month, and add to that character the second. A further savings was achieved by realizing that the multiplication by three I'd used to facilitate "tiebreaking" was in fact not needed, since the months which would otherwise yield matching hash values were correctly sortable by their first character.
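
A minimal sketch of the smaller table described above, taken from the last of my entries and assuming ASCII. The function name month_key is hypothetical.

```c
/* Sortable (but not consecutive) key for a three-letter month name:
   the second letter plus a table entry selected by the third letter. */
int month_key(const char *mon)
{
    return mon[1] + "1C0EB042E0:"[mon[2] % 12];
}
```

For Jan through Dec this gives 145, 149, 149, 164, 164, 165, 166, 167, 167, 168, 169 and 170. The tied pairs are Feb/Mar, Apr/May and Aug/Sep, and in each pair the first letters (F before M, A before M, A before S) already sort the right way.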

I tried quite a few variations, but nothing came close to the operator combination that ugoren found, which doesn't require any table. I was also impressed by his entry which would have accommodated single-digit dates without padding, which would have been a challenge in its own right, and one which I doubt that I could have handled as nicely.

supercat

Posted 2012-02-10T23:19:41.450

Reputation: 273

Certainly nice solutions. All based on converting the month to a unique number, then using a table to make it sort right. Using just one character with the table, while breaking the ties using the other is also a nice idea. – ugoren – 2012-02-24T21:54:16.583

0

C, 163 characters

A different approach from my other solution.
I can save 13 characters by making t an int array, and relying on the internal layout of struct tm. But I guess it violates the rules.

#define _XOPEN_SOURCE
#include<time.h>
#include<stdio.h>
int main(void){
    struct tm t;
    strptime(__DATE__,"%b%d%Y",&t);
    return printf("%d\n",t.tm_year*366+t.tm_yday);
}
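
A minimal sketch of the key computed above, separated from the strptime call (which is POSIX rather than standard C). The function name tm_key is hypothetical. The sketch only shows numeric ordering; since the decimal output is not fixed-width, comparing the printed values as strings is a separate question.

```c
#include <time.h>

/* tm_yday is at most 365, so multiplying the year by 366 leaves room
   for every day of the year and later dates give larger keys. */
int tm_key(const struct tm *t)
{
    return t->tm_year * 366 + t->tm_yday;
}
```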

ugoren

Posted 2012-02-10T23:19:41.450

Reputation: 16 527

I should have been clearer in my wording, as I meant to imply that code should operate correctly in any standards-conforming C89 compiler and also work correctly with any standards-conforming C99 compiler. Still, since I was unclear about that, that wouldn't be a disqualifying factor if the C99 standard in fact states that the libraries will work all the way through Dec 31, 9999. Does the spec in fact promise that? – supercat – 2012-02-14T05:01:49.970

I actually have no idea. But since my other answer is much shorter, and makes no such assumptions, I don't think I'll bother checking it. – ugoren – 2012-02-14T07:12:40.163