C++ with pthreads
This gets to n=14 in just under 1 minute on my machine. But since that's just a 2-core laptop, I hope that the 8-core test machine can finish n=15 in under 2 minutes. It takes about 4:20 minutes on my machine.
I was really hoping to come up with something more efficient. There has got to be a way to calculate the determinate of a binary matrix more efficiently. I wanted to come up with some kind of dynamic programming approach that counts the +1 and -1 terms in the determinant calculation. But it just hasn't quite come together so far.
Since the bounty is about to expire, I implemented the standard brute force approach:
- Loop over all possible Toeplitz matrices.
- Skip one of the two in each transposed matrix pair. Since the matrix is described by bitmask values, this is simple to do by skipping all values where the reverse of the bitmask is smaller than the bitmask itself.
- The determinate is calculated with a text book LR decomposition. Except for some minor performance tuning, the main improvement I made to the algorithm from my college numerical methods book is that I use a simpler pivot strategy.
- Parallelization is done with pthreads. Just using regular spacing for the values processed by each thread caused very bad load balancing, so I introduced some swizzling.
I tested this on Mac OS, but I used similar code on Ubuntu before, so I hope this will compile and run without a hitch:
- Save the code in a file with a
.cpp
extension, e.g. optim.cpp
.
- Compile with
gcc -Ofast optim.cpp -lpthread -lstdc++
.
- Run with
time ./a.out 14 8
. The first argument is the maximum n
. 14 should finish in under 2 minutes for sure, but it would be great if you could try 15 as well. The second argument is the number of threads. Using the same value as the number of cores of the machine is normally a good start, but trying some variations could potentially improve the times.
Let me know if you have any problems building or running the code.
#include <stdint.h>
#include <pthread.h>
#include <cstdlib>
#include <iostream>
static int NMax = 14;
static int ThreadCount = 4;
static pthread_mutex_t ThreadMutex;
static pthread_cond_t ThreadCond;
static int BarrierCount = 0;
static float* MaxDetA;
static uint32_t* MaxDescrA;
static inline float absVal(float val)
{
return val < 0.0f ? -val : val;
}
static uint32_t reverse(int n, uint32_t descr)
{
uint32_t descrRev = 0;
for (int iBit = 0; iBit < 2 * n - 1; ++iBit)
{
descrRev <<= 1;
descrRev |= descr & 1;
descr >>= 1;
}
return descrRev;
}
static void buildMat(int n, float mat[], uint32_t descr)
{
int iDiag;
for (iDiag = 1 - n; iDiag < 0; ++iDiag)
{
float val = static_cast<float>(descr & 1);
descr >>= 1;
for (int iRow = 0; iRow < n + iDiag; ++iRow)
{
mat[iRow * (n + 1) - iDiag] = val;
}
}
for ( ; iDiag < n; ++iDiag)
{
float val = static_cast<float>(descr & 1);
descr >>= 1;
for (int iCol = 0; iCol < n - iDiag; ++iCol)
{
mat[iCol * (n + 1) + iDiag * n] = val;
}
}
}
static float determinant(int n, float mat[])
{
float det = 1.0f;
for (int k = 0; k < n - 1; ++k)
{
float maxVal = 0.0f;
int pk = 0;
for (int i = k; i < n; ++i)
{
float q = absVal(mat[i * n + k]);
if (q > maxVal)
{
maxVal = q;
pk = i;
}
}
if (pk != k)
{
det = -det;
for (int j = 0; j < n; ++j)
{
float t = mat[k * n + j];
mat[k * n + j] = mat[pk * n + j];
mat[pk * n + j] = t;
}
}
float s = mat[k * n + k];
det *= s;
s = 1.0f / s;
for (int i = k + 1; i < n; ++i)
{
mat[i * n + k] *= s;
for (int j = k + 1; j < n; ++j)
{
mat[i * n + j] -= mat[i * n + k] * mat[k * n + j];
}
}
}
det *= mat[n * n - 1];
return det;
}
static void threadBarrier()
{
pthread_mutex_lock(&ThreadMutex);
++BarrierCount;
if (BarrierCount <= ThreadCount)
{
pthread_cond_wait(&ThreadCond, &ThreadMutex);
}
else
{
pthread_cond_broadcast(&ThreadCond);
BarrierCount = 0;
}
pthread_mutex_unlock(&ThreadMutex);
}
static void* threadFunc(void* pData)
{
int* pThreadIdx = static_cast<int*>(pData);
int threadIdx = *pThreadIdx;
float* mat = new float[NMax * NMax];
for (int n = 1; n <= NMax; ++n)
{
uint32_t descrRange(1u << (2 * n - 1));
float maxDet = 0.0f;
uint32_t maxDescr = 0;
uint32_t descrInc = threadIdx;
for (uint32_t descrBase = 0;
descrBase + descrInc < descrRange;
descrBase += ThreadCount)
{
uint32_t descr = descrBase + descrInc;
descrInc = (descrInc + 1) % ThreadCount;
if (reverse(n, descr) > descr)
{
continue;
}
buildMat(n, mat, descr);
float det = determinant(n, mat);
if (det > maxDet)
{
maxDet = det;
maxDescr = descr;
}
}
MaxDetA[threadIdx] = maxDet;
MaxDescrA[threadIdx] = maxDescr;
threadBarrier();
// Let main thread output results.
threadBarrier();
}
delete[] mat;
return 0;
}
static void printMat(int n, float mat[])
{
for (int iRow = 0; iRow < n; ++iRow)
{
for (int iCol = 0; iCol < n; ++iCol)
{
std::cout << " " << mat[iRow * n + iCol];
}
std::cout << std::endl;
}
std::cout << std::endl;
}
int main(int argc, char* argv[])
{
if (argc > 1)
{
NMax = atoi(argv[1]);
if (NMax > 16)
{
NMax = 16;
}
}
if (argc > 2)
{
ThreadCount = atoi(argv[2]);
}
MaxDetA = new float[ThreadCount];
MaxDescrA = new uint32_t[ThreadCount];
pthread_mutex_init(&ThreadMutex, 0);
pthread_cond_init(&ThreadCond, 0);
int* threadIdxA = new int[ThreadCount];
pthread_t* threadA = new pthread_t[ThreadCount];
for (int iThread = 0; iThread < ThreadCount; ++iThread)
{
threadIdxA[iThread] = iThread;
pthread_create(threadA + iThread, 0, threadFunc, threadIdxA + iThread);
}
float* mat = new float[NMax * NMax];
for (int n = 1; n <= NMax; ++n)
{
threadBarrier();
float maxDet = 0.0f;
uint32_t maxDescr = 0;
for (int iThread = 0; iThread < ThreadCount; ++iThread)
{
if (MaxDetA[iThread] > maxDet)
{
maxDet = MaxDetA[iThread];
maxDescr = MaxDescrA[iThread];
}
}
std::cout << "n = " << n << " det = " << maxDet << std::endl;
buildMat(n, mat, maxDescr);
printMat(n, mat);
threadBarrier();
}
delete[] mat;
delete[] MaxDetA;
delete[] MaxDescrA;
delete[] threadIdxA;
delete[] threadA;
return 0;
}
@AlexA. You are right and I have apologised. Luckily the two problems are very similar so he should easily be able to modify his code. – None – 2015-07-09T16:47:41.717
The solution by @Vioz comes up with a sequence that starts with 1, 1, 2, 3, 5, 9, 32. So the value for n=5 is different from what you list. Since all other values match, it looks like the solution is probably correct, and this is just a typo in the question? – Reto Koradi – 2015-07-09T18:56:51.030
@RetoKoradi Thank you. Fixed. – None – 2015-07-09T19:03:00.427
Here are 10 possible binary Toeplitz matrices with maximum determinants for
– Tyilo – 2015-07-10T03:55:07.340n = 1..10
: https://ghostbin.com/paste/axkpaDoes our code need to prove that we have the matrix with the greatest determinant (e.g. by checking all of them), or can we just have an algorithm that generates a matrix with greatest determinant without proving it's optimal? – lirtosiast – 2015-07-10T19:33:21.667
@ThomasKwa Well a proof is fine even it is cleverer than checking all possibilities. But I am not sure exactly what you meant. – None – 2015-07-10T19:36:37.460
Say I have a hill-climbing algorithm that tries to optimize for greatest determinant, which correctly finds a matrix with greatest determinant for n=1..15 but gives a matrix with suboptimal determinant for n>15. Would that be a valid submission for n=15? – lirtosiast – 2015-07-10T19:42:37.050
@ThomasKwa That's a tricky one as someone else would have to write code to verify your answers! I would love to see such a hill climbing answer but I don't think I can award a win to it sadly. – None – 2015-07-11T08:51:04.820
@Tyilo How high does it get? – None – 2015-07-11T17:03:23.923
Here's a better version of the Mathematica one:
Print["1: 1"];Do[m=Max@Table[d=IntegerDigits[c,2,2*n-1];Det[ToeplitzMatrix@@(Prepend@First@d/@Partition[Drop[d,1],n-1])],{c,0,2^(2*n-1)-1}];Print[n,": ",m],{n,2,100}]
It only gets to 9 in 2 minutes, so it's actually worse than my C version. (To actually run it you have to remove the‌̣
by stackexchange.) – Tyilo – 2015-07-11T20:24:58.090@Lembik I can get to 11 in 2 mins with:
Print["1: 1"];Quiet@Do[m=Max@Table[d=IntegerDigits[c,2,2*n-1];Tr[First@LUDecomposition[ToeplitzMatrix@@(Prepend@First@d/@Partition[Drop[d,1],n-1])],Times],{c,0,2^(2*n-1)-1}];Print[n,": ",m],{n,2,10}]
– Tyilo – 2015-07-11T22:18:27.4572As an observation that may help others but I can't verify beyond 14. It appears respective means of the top row and the first column of the Toeplitz matrix are always 0.4 <= m <= 0.6 for the maximum determinant. – MickyT – 2015-07-16T01:52:20.550
I notice that all the matrices with the highest determinants have a 1 in the main diagonal. I figure it would be considered cheating to hardwire that? – Reto Koradi – 2015-07-19T14:26:14.570
@RetoKoradi The max is sometimes reached with 0s in the main diagonal, e.g. for
n=3
there is[0,1,1;1,0,1;1,1,0]
, and the next one occurs forn=6
. I checked that forn=1..12
there is always at least one matrix with 1s in the main diagonal that reaches the max, but that's not the same as what you wrote. – Mitch Schwartz – 2015-07-19T21:26:35.597@MitchSchwartz The best solution my code got for n=17 has 0 in the main diagonal. I did not check if there are tied solutions that have a 1 diagonal. So yes, assuming that the main diagonal is 1 would not work for very long. – Reto Koradi – 2015-07-20T23:48:34.667
One thing I find fascinating when looking at the numbers is that some results are powers of 2. So far, I got powers of 2 for n=1, 2, 3, 7, 15. Except for n=2, these are all values of n that are of the form 2^k-1. I know it's risky to draw conclusions from a few values. But the value for n=15 being exactly 2^17 is almost too good to be a coincidence. I wonder if there's some kind of pattern. – Reto Koradi – 2015-07-20T23:53:12.000
@RetoKoradi "So yes, assuming that the main diagonal is 1 would not work for very long." does not follow logically from "The best solution my code got for n=17 has 0 in the main diagonal. I did not check if there are tied solutions that have a 1 diagonal." – Mitch Schwartz – 2015-07-21T05:17:06.943
@MitchSchwartz Not strictly, yes. But at least it shoots down the initial idea that only matrices with diagonal 1 are worth trying. I guess your counter-example for n=3 was sufficient. I should probably extend my code to keep track of ties if we want to know for sure. :) Well, at least for the sizes it can complete, which should be up to n=20 in a weekend run. – Reto Koradi – 2015-07-21T06:15:06.257
@RetoKoradi It is always worth comparing to https://oeis.org/A003432 . For n = 15 the max for general (0,1) matrices is also 2^17 and this is a much more widely studied question..
– None – 2015-07-21T06:33:21.423@MitchSchwartz My code output 927472 (which is smaller than the maximum) for n=17 with the main diagonal = 1. – Min_25 – 2015-07-23T03:13:55.753