Compute the CRC32 table at compile-time

17

3

The reference implementation of CRC32 computes a lookup table at runtime:

/* Table of CRCs of all 8-bit messages. */
unsigned long crc_table[256];

/* Flag: has the table been computed? Initially false. */
int crc_table_computed = 0;

/* Make the table for a fast CRC. */
void make_crc_table(void)
{
    unsigned long c;

    int n, k;
    for (n = 0; n < 256; n++) {
        c = (unsigned long) n;
        for (k = 0; k < 8; k++) {
            if (c & 1) {
                c = 0xedb88320L ^ (c >> 1);
            } else {
                c = c >> 1;
            }
        }
        crc_table[n] = c;
    }
    crc_table_computed = 1;
}

Can you compute the table at compile-time, thus getting rid of the function and the status flag?

fredoverflow

Posted 2011-07-29T15:31:41.447

Reputation: 2 671

Question was closed 2014-09-20T22:35:49.107

2What is the objective primary winning criterion here? – John Dvorak – 2014-09-20T17:58:51.717

also, language specific questions are frowned upon here. you should remove the c++ tag. – proud haskeller – 2014-09-20T18:16:44.370

Answers

13

Here's a plain C solution:

crc32table.c

#if __COUNTER__ == 0

    /* Initial setup */
    #define STEP(c) ((c)>>1 ^ ((c)&1 ? 0xedb88320L : 0))
    #define CRC(n) STEP(STEP(STEP(STEP(STEP(STEP(STEP(STEP((unsigned long)(n)))))))))
    #define CRC4(n) CRC(n), CRC(n+1), CRC(n+2), CRC(n+3)

    /* Open up crc_table; subsequent iterations will fill its members. */
    const unsigned long crc_table[256] = {

    /* Include recursively for next iteration. */
    #include "crc32table.c"

#elif __COUNTER__ < 256 * 3 / 4

    /* Fill the next four CRC entries. */
    CRC4((__COUNTER__ - 3) / 3 * 4),

    /* Include recursively for next iteration. */
    #include "crc32table.c"

#else

    /* Close crc_table. */
    };

#endif

It relies on the nonstandard __COUNTER__ macro, as well as an evaluation semantics where __COUNTER__ is evaluated before it is passed as an argument to a macro.

Note that, since STEP evaluates its argument twice, and CRC uses eight nested invocations of it, there is a small combinatorial explosion in the code size:

$ cpp crc32table.c | wc -c
4563276

I tested this in GCC 4.6.0 and Clang 2.8 on 32-bit Linux, and both produce the correct table.

Joey Adams

Posted 2011-07-29T15:31:41.447

Reputation: 9 929

Awesome, I certainly wasn't expecting this. Have a +1. – R. Martinho Fernandes – 2011-07-29T20:28:25.703

9

The core loop

for (k = 0; k < 8; k++) {
    if (c & 1) {
        c = 0xedb88320L ^ (c >> 1);
    } else {
        c = c >> 1;
    }
}

can be converted into a meta-function:

template <unsigned c, int k = 8>
struct f : f<((c & 1) ? 0xedb88320 : 0) ^ (c >> 1), k - 1> {};

template <unsigned c>
struct f<c, 0>
{
    enum { value = c };
};

Then, 256 calls to this meta-function (for the array initializer) are generated by the preprocessor:

#define A(x) B(x) B(x + 128)
#define B(x) C(x) C(x +  64)
#define C(x) D(x) D(x +  32)
#define D(x) E(x) E(x +  16)
#define E(x) F(x) F(x +   8)
#define F(x) G(x) G(x +   4)
#define G(x) H(x) H(x +   2)
#define H(x) I(x) I(x +   1)
#define I(x) f<x>::value ,

unsigned crc_table[] = { A(0) };

If you have Boost installed, generating the array initializer is a bit simpler:

#include <boost/preprocessor/repetition/enum.hpp>

#define F(Z, N, _) f<N>::value

unsigned crc_table[] = { BOOST_PP_ENUM(256, F, _) };

Finally, the following test driver simply prints all array elements to the console:

#include <cstdio>

int main()
{
    for (int i = 0; i < 256; ++i)
    {
        printf("%08x  ", crc_table[i]);
    }
}

fredoverflow

Posted 2011-07-29T15:31:41.447

Reputation: 2 671

6

A C++0x solution

template<unsigned long C, int K = 0>
struct computek {
  static unsigned long const value = 
    computek<(C & 1) ? (0xedb88320L ^ (C >> 1)) : (C >> 1), K+1>::value;
};

template<unsigned long C>
struct computek<C, 8> {
  static unsigned long const value = C;
};

template<int N = 0, unsigned long ...D>
struct compute : compute<N+1, D..., computek<N>::value> 
{ };

template<unsigned long ...D>
struct compute<256, D...> {
  static unsigned long const crc_table[sizeof...(D)];
};

template<unsigned long...D>
unsigned long const compute<256, D...>::crc_table[sizeof...(D)] = { 
  D...
};

/* print it */
#include <iostream>

int main() {
  for(int i = 0; i < 256; i++)
    std::cout << compute<>::crc_table[i] << std::endl;
}

Works on GCC (4.6.1) and Clang (trunk 134121).

Johannes Schaub - litb

Posted 2011-07-29T15:31:41.447

Reputation: 211

Regarding C >> 1, Isn't shifting negative values to the right unspecified behavior? ;) – fredoverflow – 2011-07-29T16:01:34.933

Also, can you explain the part where you define the array? I'm a bit lost there... – fredoverflow – 2011-07-29T16:02:38.680

@Fred you are right. I shall also make C a unsigned long. The constant array is defined to be initialized by the pack expansion D.... D is a non-type template parameter pack. Once GCC supports it, one can also declare the array in-class with static unsigned long constexpr crc_table[] = { D... };, but GCC doesn't parse braced in-class initializers yet. The benefit will be that compute<>::crc_table[I] could be used within constant expressions later in the code. – Johannes Schaub - litb – 2011-07-29T16:05:08.277

5

C++0x with constexpr. Works on GCC4.6.1

constexpr unsigned long computek(unsigned long c, int k = 0) {
  return k < 8 ? computek((c & 1) ? (0xedb88320L ^ (c >> 1)) : (c >> 1), k+1) : c;
}

struct table {
  unsigned long data[256];
};

template<bool> struct sfinae { typedef table type; };
template<> struct sfinae<false> { };

template<typename ...T>
constexpr typename sfinae<sizeof...(T) == 256>::type compute(int n, T... t) { 
  return table {{ t... }}; 
}

template<typename ...T>
constexpr typename sfinae<sizeof...(T) <= 255>::type compute(int n, T... t) {
  return compute(n+1, t..., computek(n));
}

constexpr table crc_table = compute(0);

#include <iostream>

int main() {
  for(int i = 0; i < 256; i++)
    std::cout << crc_table.data[i] << std::endl;
}

You can then use crc_table.data[X] at compile time because crc_table is constexpr.

Johannes Schaub - litb

Posted 2011-07-29T15:31:41.447

Reputation: 211

4

This is my first metaprogram:

#include <cassert>
#include <cstddef>

template <std::size_t N, template <unsigned long> class T, unsigned long In>
struct times : public T<times<N-1,T,In>::value> {};

template <unsigned long In, template <unsigned long> class T>
struct times<1,T,In> : public T<In> {};

template <unsigned long C>
struct iter {
    enum { value = C & 1 ? 0xedb88320L ^ (C >> 1) : (C >> 1) };
};

template <std::size_t N>
struct compute : public times<8,iter,N> {};

unsigned long crc_table[] = {
    compute<0>::value,
    compute<1>::value,
    compute<2>::value,
    // .
    // .
    // .
    compute<254>::value,
    compute<255>::value,
};

/* Reference Table of CRCs of all 8-bit messages. */
unsigned long reference_table[256];

/* Flag: has the table been computed? Initially false. */
int reference_table_computed = 0;

/* Make the table for a fast CRC. */
void make_reference_table(void)
{
    unsigned long c;

    int n, k;
    for (n = 0; n < 256; n++) {
        c = (unsigned long) n;
        for (k = 0; k < 8; k++) {
            if (c & 1) {
                c = 0xedb88320L ^ (c >> 1);
            } else {
                c = c >> 1;
            }
        }
        reference_table[n] = c;
    }
    reference_table_computed = 1;
}

int main() {
    make_reference_table();
    for(int i = 0; i < 256; ++i) {
        assert(crc_table[i] == reference_table[i]);
    }
}

I "hardcoded" the calls to the template that does the computation :)

R. Martinho Fernandes

Posted 2011-07-29T15:31:41.447

Reputation: 2 135

1If you're going to do it that way, why wouldn't you just hardcode the actual values into the program? (After obtaining them with another method, of course.) Saves on compile time. – Matthew Read – 2011-07-29T16:11:00.830

+1 for the test and the times template – fredoverflow – 2011-07-29T16:11:37.163

@Matthew: with only C++03, there's little choice. You could use the preprocessor, like Fred did, but that won't shorten compilation time. And apparently, his preprocessor chokes on his solution :) – R. Martinho Fernandes – 2011-07-29T16:12:40.913

@Matthew The point is to actually compute it at compile-time, not to have them hardcoded. Fred's answer generates an array of this form: unsigned crc_table[] = { f<0>::value , f<0 + 1>::value , f<0 + 2>::value , f<0 + 2 + 1>::value , f<0 + 4>::value , f<0 + 4 + 1>::value , f<0 + 4 + 2>::value , f<0 + 4 + 2 + 1>::value , f<0 + 8>::value , using the preprocessor. It takes about as long to compile as mine. If you want, you can read that last paragraph as "I unrolled the outer loop". There's no other choice in C++03 – R. Martinho Fernandes – 2011-07-29T16:27:31.517

Ah, I wasn't paying enough attention to the requirements of the question. +1, though I'm not sure I like the question anymore. I like my code challenges practical :P – Matthew Read – 2011-07-29T16:38:21.727

3

D

import std.stdio, std.conv;

string makeCRC32Table(string name){

  string result = "immutable uint[256]"~name~" = [ ";

  for(uint n; n < 256; n++){
    uint c = n;
    for(int k; k < 8; k++)
      c = (c & 1) ? 0xedb88320L ^ (c >> 1) : c >>1;
    result ~= to!string(c) ~ ", ";
  }
  return result ~ "];";
}

void main(){

  /* fill table during compilation */
  mixin(makeCRC32Table("crc_table"));

  /* print the table */
  foreach(c; crc_table)
    writeln(c);
}

It really puts C++ to shame, doesn't it?

Arlen

Posted 2011-07-29T15:31:41.447

Reputation: 181

No need to use a string mixin for this, here is how we do it in D's standard library, genTables and call-site to store the result in const data segment.

– Martin Nowak – 2017-07-07T07:44:59.260

2Actually, I prefer the non-stringly-typed solutions. This looks a lot like compile-time eval. – R. Martinho Fernandes – 2011-09-10T22:39:58.667

3

I would modify the previous answer by replacing the last three lines with:

#define crc4( x)    crcByte(x), crcByte(x+1), crcByte(x+2), crcByte(x+3)
#define crc16( x)   crc4(x), crc4(x+4), crc4(x+8), crc4(x+12)
#define crc64( x)   crc16(x), crc16(x+16), crc16(x+32), crc16(x+48)
#define crc256( x)  crc64(x), crc64(x+64), crc64(x+128), crc64(x+192)

Where crcByte is his K macro without the trailing comma. Then build the table itself with:

static const unsigned long crc32Table[256] = { crc256( 0)};

And never leave out the size of the array as the compiler will then verify that you have the correct amount of elements.

Jay

Posted 2011-07-29T15:31:41.447

Reputation: 31

3

C/C++, 306 295 bytes

#define C(c)((c)>>1^((c)&1?0xEDB88320L:0))
#define K(c)(C(C(C(C(C(C(C(C(c))))))))),
#define F(h,l)K((h)|(l+0))K((h)|(l+1))K((h)|(l+2))K((h)|(l+3))
#define R(h)F(h<<4,0)F(h<<4,4)F(h<<4,8)F(h<<4,12)
unsigned long crc_table[]={R(0)R(1)R(2)R(3)R(4)R(5)R(6)R(7)R(8)R(9)R(10)R(11)R(12)R(13)R(14)R(15)};

Working in reverse, we wind up with an unsigned long array named crc_table. We can omit the size of the array as the macros will ensure there are exactly 256 elements in the array. We initialize the array with 16 'rows' of data by using 16 invocations of the macro R.

Each invocation of R expands into four fragments (macro F) of four constants (macro K) for a total of 16 'columns' of data.

The macro K is the unrolled loop indexed by k in the code from the original question. It updates the value c eight times by invoking the macro C.

This preprocessor based solution uses quite a bit of memory during macro expansion. I tried to make it a little shorter by having an extra level of macro expansion and my compiler puked. The code above compiles (slowly) with both Visual C++ 2012 and g++ 4.5.3 under Cygwin (Windows 7 64 bit 8GB RAM).

Edit:

The fragment above is 295 bytes including white space. After expanding all of the macros except for C it grows to 9,918 bytes. As each level of C macro is expanded the size grows quickly:

  1. 25,182
  2. 54,174
  3. 109,086
  4. 212,766
  5. 407,838
  6. 773,406
  7. 1,455,390
  8. 2,721,054

So by the time all the macros have been expanded, that little 295 byte file expands into over 2.7 megabytes of code that must be compiled to generate the original 1024 byte array (assuming 32 bit unsigned long values)!

Another edit:

I modified the C macro based on a macro from another answer to squeeze an extra 11 bytes out, and greatly reduced the full expanded macro size. While 2.7 MB isn't as bad as 54 MB (the previous final size of all macro expansion), it is still significant.

CasaDeRobison

Posted 2011-07-29T15:31:41.447

Reputation: 736

This isn't [tag:code-golf], so you don't need to minimize the character count. – Ilmari Karonen – 2012-09-30T10:00:07.920

Ah. So it is. My bad on that part. Though I do think this implementation is portable (by which I mean it conforms the the C language and preprocessor; its true portability would depend on the environment's exact limits on macro expansion). – CasaDeRobison – 2012-10-01T05:09:43.843