Count how many distance sequences are far from all others

13

1

The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.

Let P be a binary string of length n and T be a binary string of length 2n-1. We can compute the n Hamming distances between P and every n-length substring of T in order from left to right and put them into an array (or list).

Example Hamming distance sequence

Let P = 101 and T = 01100. The sequence of Hamming distances you get from this pair is 2,2,1.

Definition of closeness

Now let's consider two such sequences of Hamming distances. Say x = (0, 2, 2, 3, 0) and y = (2, 1, 4, 4, 2) as examples. We say that x and y are close if y <= x <= 2*y or if x <= y <= 2*x. Here the scalar multiplication and inequality are taken elementwise. That is to say, for two sequences A and B, A <= B iff A[i] <= B[i] for all indices i.

Note that sequences of Hamming distances form a partial order under this way of comparing them. In other words, many pairs of sequences are neither greater than or equal nor less than or equal to each other. For example (1,2) and (2,1).

So using the example above, (0, 2, 2, 3, 0) <= 2*(2, 1, 4, 4, 2) = (4, 2, 8, 8, 4) but (0, 2, 2, 3, 0) is not bigger than (2, 1, 4, 4, 2). Also (2, 1, 4, 4, 2) is not smaller than or equal to 2*(0, 2, 2, 3, 0) = (0, 4, 4, 6, 0). As a result x and y are not close to each other.

Task

For increasing n starting at n=1, consider all possible pairs of binary strings P of length n and T of length 2n-1. There are 2^(n+2n-1) such pairs and hence that many sequences of Hamming distances. However many of those sequences will be identical . The task is to find the size of the largest set of Hamming distance sequences so that no two sequences are close to each other.

Your code should output one number per value of n.

Score

Your score is broadly speaking the highest n your code reaches on my machine in 5 minutes (but read on). The timing is for the total running time, not the time just for that n.

In order to give scores for non-optimal answers, because finding optimal answers is likely to be hard, we will need a slightly subtle scoring system. Your score is the highest value of n for which no one else has posted a higher correct answer for any size which is smaller than equal to this. For example, if you output 2, 4, 21 and someone else outputs 2, 5, 15 then you would only score 1 as someone else has a better answer for n = 2. If you output 2, 5, 21 then you would score 3 no matter what anyone else outputs because those answers are all optimal. Clearly if you have all optimum answers then you will get the score for the highest n you post. However, even if your answer is not the optimum, you can still get the score if no one else can beat it.

Example answers and worked example

(This answers are as yet unchecked. Independent verification would be gratefully received.)

Thanks to ETHproductions:

  • n = 1 gives 2.
  • n = 2 gives 5.
  • n = 3 gives 21.

Let's look at n = 2 in more detail. In this case the full list of Hamming distance sequences (represented by tuples here) is:

[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

We can see that (0,0) is not close to any other tuple. In fact if we take (0, 0), (0, 1), (1, 0), (2, 1), (1,2) then none of those tuples are close to any of the others. This gives a score of 5 for n = 2.

For n = 3 the full list of distinct Hamming distance sequences is:

 [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 2), (0, 1, 3), (0, 2, 1), (0, 2, 2), (0, 2, 3), (0, 3, 0), (0, 3, 1), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 0), (1, 2, 1), (1, 2, 2), (1, 2, 3), (1, 3, 0), (1, 3, 1), (1, 3, 2), (2, 0, 1), (2, 0, 2), (2, 0, 3), (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 1, 3), (2, 2, 0), (2, 2, 1), (2, 2, 2), (2, 2, 3), (2, 3, 1), (2, 3, 2), (2, 3, 3), (3, 0, 2), (3, 0, 3), (3, 1, 0), (3, 1, 1), (3, 1, 2), (3, 2, 0), (3, 2, 1), (3, 2, 2), (3, 3, 2), (3, 3, 3)]

Of those 48 sequences, we can pick out a set of size 21 so that no pair in that set are close to each other.

Languages and libraries

You can use any available language and libraries you like. Where feasible, it would be good to be able to run your code so please include a full explanation for how to run/compile your code in Linux if at all possible.

My Machine The timings will be run on my 64-bit machine. This is a standard ubuntu install with 8GB RAM, AMD FX-8350 Eight-Core Processor and Radeon HD 4250. This also means I need to be able to run your code.

Leading answer

  • Score of 4 for 2, 5, 21, 83, 361 by Christian Sievers. C++
  • Score of 5 for 2, 5, 21, 83, 372 by fəˈnɛtɪk. Javascript

user9206

Posted 2017-03-13T09:16:13.153

Reputation:

After looking at your question, it shows some similarities to spies, revised on hackerrank, which is a NP-complete problem

– fəˈnɛtɪk – 2017-03-15T11:40:37.850

@fəˈnɛtɪk Great! Note that my question doesn't require optimal solutions to get a good score. – None – 2017-03-15T12:45:09.493

@fəˈnɛtɪk Can you also confirm the answers for 1,2,3 in the question? – None – 2017-03-15T17:20:07.553

@fəˈnɛtɪk I very much doubt it is NP-hard. You would have to encode Set Packing or another NP-complete problem into a single integer with only a polynomial change in problem size. – None – 2017-03-15T18:27:37.283

297 unique hamming arrays for 4, 2040 unique arrays for 5 – fəˈnɛtɪk – 2017-03-16T15:54:20.247

The graphic representation of the possible sequences is a n-dimensional hypercube with side length n – fəˈnɛtɪk – 2017-03-16T16:19:11.530

bash: echo -e "2\n5\n21\n81"; yes 356 ;-) - Starting from the fourth entry, these are just lower bounds... – Christian Sievers – 2017-03-17T16:56:20.347

@ChristianSievers Nice try my friend :) – None – 2017-03-17T16:59:46.480

Not good enough. So I do s/81/83/, but there is a lot of work to do before I can change the 356. – Christian Sievers – 2017-03-17T19:59:05.120

@ChristianSievers Do you have code you could post as an answer? – None – 2017-03-17T20:00:53.533

I'll later continue to work on that – Christian Sievers – 2017-03-17T20:06:24.670

Answers

5

C++ using the igraph library

Thanks for a nice opportunity to learn a new library!

This program now computes 2, 5, 21, 83, 361 fast. You can control printing of the nodes with the PRINTNODES constant.

The graph used has extra edges between nodes corresponding to distance vectors where one is near (but not equal) to the other one reversed. That speeds up the computation, and any independent set found is of course also one of the original graph. Also, even if it is not completely enforced, the computed independent set is closed under reversion. I believe there always exists a maximal independent set with that property. At least there is one for n<=4. (I'm sure I can show 83 is optimal.)

#include<iostream>
#include<vector>
#include<set>
#include<algorithm>
#include<igraph.h>

using vect = std::vector<int>;

constexpr int MAXDIRECT=100;
constexpr int PRINTNODES=1;

std::set<int> avoid{};
igraph_t graph;
std::vector<vect> distance_vectors{};
int count;

int close_h(const vect &a, const vect &b ){
  // check one direction of the closeness condition
  for(auto i=a.begin(), j=b.begin(); i!=a.end(); i++,j++)
    if ( (*i > *j) || (*j > 2 * *i))
      return 0;
  return 1;
}

int close(const vect &a, const vect &b ){
  return close_h(a,b) || close_h(b,a);
}

vect distances(int n, int p, int t){
  vect res{};
  for (int i=0; i<n; ++i){
    int count = 0;
    for (int j=0; j<n; ++j)
      count += 1 & ((p>>j)^(t>>j));
    res.push_back(count);
    t >>= 1;
  }
  return res;
}

void print_vect( vect &v ){
  std::cout << "(";
  auto i=v.begin();
  std::cout << *i++;
  for( ; i!=v.end(); ++i)
    std::cout << "," << *i ;
  std::cout << ")\n";
}

void use_node( int n ){
  if(PRINTNODES)
    print_vect( distance_vectors[n] );
  ++count;
  avoid.insert( n );
  igraph_vector_t neighs;
  igraph_vector_init( &neighs, 0 );
  igraph_neighbors( &graph , &neighs, n, IGRAPH_OUT );
  for(int i=0; i<igraph_vector_size( &neighs ); ++i)
    avoid.insert( VECTOR(neighs)[i] );
  igraph_vector_destroy( &neighs );
}

void construct(int n){
  std::set<vect> dist_vects;
  for(int p=0; p>>n == 0; ++p)
    for(int t=0; t>>(2*n-2) == 0; ++t)   // sic! (because 0/1-symmetry)
      dist_vects.insert(distances(n,p,t));
  int nodes = dist_vects.size();
  std::cout << "distinct distance vectors: " << nodes << "\n";

  distance_vectors.clear();
  distance_vectors.reserve(nodes);
  std::copy(dist_vects.begin(), dist_vects.end(),
            back_inserter(distance_vectors));

  igraph_vector_t edges;
  igraph_vector_init( &edges, 0 );
  igraph_vector_t reversed;
  igraph_vector_init_seq( &reversed, 0, nodes-1 );
  for (int i=0; i<nodes-1; ++i){
    vect &x = distance_vectors[i];
    vect xr ( x.rbegin(), x.rend() );
    for(int j=i+1; j<nodes; ++j){
      vect &y = distance_vectors[j];
      if( xr==y ){
        VECTOR(reversed)[i] = j;
        VECTOR(reversed)[j] = i;
      }else if( close( x, y ) || close( xr, y) ){
        igraph_vector_push_back(&edges,i);
        igraph_vector_push_back(&edges,j);
      }
    }
  }
  std::cout << "edges: " << igraph_vector_size(&edges)/2 << "\n";

  igraph_create( &graph, &edges, nodes, IGRAPH_UNDIRECTED);
  igraph_vector_destroy( &edges );

  igraph_cattribute_VAN_setv( &graph, "r", &reversed );
  igraph_vector_destroy( &reversed );

  igraph_vector_t names;
  igraph_vector_init_seq( &names, 0, nodes-1 );
  igraph_cattribute_VAN_setv( &graph, "n", &names );
  igraph_vector_destroy( &names );

}

void max_independent( igraph_t *g ){
  igraph_vector_ptr_t livs;
  igraph_vector_ptr_init( &livs , 0 );
  igraph_largest_independent_vertex_sets( g, &livs );

  igraph_vector_t *nodes = (igraph_vector_t *) VECTOR(livs)[0];
  igraph_vector_t names;
  igraph_vector_init( &names, 0 );
  igraph_cattribute_VANV( g, "n", igraph_vss_vector( nodes ), &names );

  for(int i=0; i<igraph_vector_size(&names); ++i)
    use_node( VECTOR(names)[i] );
  igraph_vector_destroy( &names );
  igraph_vector_ptr_destroy_all( &livs );
}

void independent_comp( igraph_t *g );

void independent( igraph_t *g ){
  if(igraph_vcount( g ) < MAXDIRECT){
    max_independent( g );
    return;
  }
  igraph_vector_ptr_t components;
  igraph_vector_ptr_init( &components, 0 );
  igraph_decompose( g, &components, IGRAPH_WEAK, -1, 1);
  for(int i=0; i<igraph_vector_ptr_size( &components ); ++i)
    independent_comp( (igraph_t *) VECTOR(components)[i] );
  igraph_decompose_destroy( &components );
}

void independent_comp( igraph_t *g ){
  if (igraph_vcount( g ) < MAXDIRECT){
    max_independent( g );
    return;
  }
  igraph_vector_t degs;
  igraph_vector_init( &degs, 0 );
  igraph_degree( g, &degs, igraph_vss_all(), IGRAPH_OUT, 1 );
  int maxpos = igraph_vector_which_max( &degs );
  igraph_vector_destroy( &degs );  

  int name = igraph_cattribute_VAN( g, "n", maxpos );
  int revname = igraph_cattribute_VAN( g, "r", maxpos );
  int rev = -1;
  if(name!=revname){
    igraph_vector_ptr_t reversed_candidates_singleton;
    igraph_vector_ptr_init( &reversed_candidates_singleton, 0 );
    igraph_neighborhood( g, &reversed_candidates_singleton,
                         igraph_vss_1(maxpos), 2, IGRAPH_OUT );
    igraph_vector_t * reversed_candidates =
      (igraph_vector_t *) VECTOR(reversed_candidates_singleton)[0];
    igraph_vector_t names;
    igraph_vector_init( &names, 0 );
    igraph_cattribute_VANV( g, "n", igraph_vss_vector( reversed_candidates ),
                        &names );
    long int pos;
    igraph_vector_search( &names, 0, revname, &pos );
    rev = VECTOR(*reversed_candidates)[pos];
    igraph_vector_destroy( &names );
    igraph_vector_ptr_destroy( &reversed_candidates_singleton );
  }
  igraph_vs_t delnodes;
  igraph_vs_vector_small( &delnodes, maxpos, rev, -1 );
  igraph_delete_vertices( g, delnodes );
  igraph_vs_destroy( &delnodes );

  independent( g );
}

void handle(int n){
  std::cout << "n=" << n << "\n";
  avoid.clear();
  count = 0;
  construct( n );
  independent( &graph );
  // try all nodes again:
  for(int node=0; node<igraph_vcount( &graph ); ++node)
    if(avoid.count(node)==0)
      use_node(node);
  std::cout << "result: " << count << "\n\n";
  igraph_destroy( &graph );
}

int main(){
  igraph_i_set_attribute_table( &igraph_cattribute_table );
  for(int i=1; i<6; ++i)
    handle(i);
}

To compile on debian, install libigraph0-dev and do g++ -std=c++11 -Wall -O3 -I/usr/include/igraph -o ig ig.cpp -ligraph.

Old description:

The igraph library has a function to compute the maximal size of an independent vertex set of a graph. It can handle this problem up to n=3 in less than a second and does not terminate in days for n=4.

So what I do is decompose the graph into connected components, and let the library handle the small (less than MAXDIRECT nodes) components. For the other components, I select a vertex and delete it. In the best case, this splits the graph into several components, but usually it doesn't. Anyway, the components (maybe only one) are smaller, and we can use recursion.

Obviously the selection of the vertex is important. I just take one of maximal degree. I found that I get better result (but only for n=4) when I use a reversed node list. That explains the magic part of the construct function.

It might be worth while improving the selection. But it seems more important to reconsider the deleted nodes. Right now, I never look at them again. Some of them may be unconnected to any of the selected nodes. The problem is that I don't know which nodes form the independent set. For one, deleting nodes renumbers the remaining nodes. That can be handled by attaching attribtes to them. But worse, the computation of the independence number just gives this number. The best alternative the library offers is to compute all largest independent sets, which is slower (how much seems to depend on the graph size). Still, this seems the immediate way to go. Much more vaguely, I also think it might be useful to consider if we can use the special way the graph is defined.

The case n=6 might become reachable (at all, not necessarily in 5 minutes) if I replace the recursion with a loop using a queue for the remaining components.

I found it interesting to look at the components of the graphs. For n=4, their sizes are 168, 2*29, 2*28, 3, 4*2, 4*1. Only the biggest one can not be handled directly.

For n=5, the sizes are 1376, 2*128, 2*120, 119, several <=6.

I expect those double sizes to correspond to isomorphic graphs, but using this doesn't seem worth while because there is always one single dominating biggest component:

For n=6, the biggest component contains 11941 nodes (from a total of 15425), the next two biggest components have size 596.

For n=7, these numbers are 107593 (125232), 2647.

Christian Sievers

Posted 2017-03-13T09:16:13.153

Reputation: 6 366

Could you let me know what the set is for 83, I want to know why my algorithm doesn't get that high for 4 but somehow gets higher for 5 :P – fəˈnɛtɪk – 2017-03-19T23:44:10.480

It has to be g++ -std=c++11 -Wall -O3 -I/usr/include/igraph -o sievers sievers.cpp -ligraph . It matters where -ligraph is. – None – 2017-03-20T10:14:53.587

@ChristianSievers How does the generation of edges work in the code? – fəˈnɛtɪk – 2017-03-20T12:29:08.950

@ChristianSievers I was wondering how it determines what each vertex should connect to. Reversing the array might be screwing that up. – fəˈnɛtɪk – 2017-03-20T13:17:41.003

@fəˈnɛtɪk The distance vectors seem to come sorted out of the set I use to avoid duplicates, but I didn't even think about their ordering when I wrote that code. The inner loop starting at i+1 just avoids looking at a pair and also its swapped version which is not needed, and is the easiest way to avoid loops (edges (a,a)). It doesn't depend on the order in which the nodes come, I don't care if I get (a,b) or (b,a). – Christian Sievers – 2017-03-20T13:49:01.863

My greedy algorithm is suboptimal but it works well for 5 – fəˈnɛtɪk – 2017-03-22T11:37:06.637

@fəˈnɛtɪk My algorithm is more suboptimal! I think I have to be content with having pushed you to 83... – Christian Sievers – 2017-03-22T17:41:06.383

Getting 83 for 4 takes longer than getting 372 for 5 :P – fəˈnɛtɪk – 2017-03-22T17:42:24.837

@fəˈnɛtɪk Oh that's interesting! In the new version I get the 4 case without any need for special casing. – Christian Sievers – 2017-03-22T18:05:25.087

3

Javascript, Seq: 2,5,21, 81 83,372 67,349

Managed to increase the value for 4 by using random removal of elements at the start of my search. Oddly enough, removing 20 elements with more than 6 connections was faster than removing 5 elements with more than 8 connections...

This sequence is probably not optimal for 5 and might not be optimal for 4. None of the nodes are close to another in the set though.

Code:

input=4;
maxConnections=6;
numRand=20;

hammings=[];
h=(x,y)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal};
for(i=1<<(input-1);i<1<<input;i++){
  for(j=0;j<1<<(2*input);j++){
    hamming=[];
    for(k=0;k<input;k++){
      hamming.push(h((j>>k)%(1<<input),i));
    }
    equal=0;
    for(k=0;k<hammings.length;k++){
      if(hamming.join("")==hammings[k].join("")){
        equal=1;
        break;
      }
    }
    if(!equal)hammings.push(hamming);
  }
}
adjMat=[];
for(i=0;i<Math.pow(input+1,input);i++){
  row=[];
  for(j=0;j<Math.pow(input+1,input);j++){
    row.push(0);
  }
  adjMat.push(row);
}
nodes=[]
for(i=0;i<Math.pow(input+1,input);i++){
  nodes[i]=0;
}
for(i=0;i<hammings.length;i++){
  sum=0;
  chkNodes=[[]];
  for(j=0;j<input;j++){
    chkVal=[];
    t=Math.pow(input+1,j);
    sum+=t*hammings[i][j];
    tmp=[];
    for(r=0;r<chkNodes.length;r++){
      for(k=hammings[i][j];k<=Math.min(hammings[i][j]*2,input);k++){
        stor=[]
        for(s=0;s<chkNodes[r].length;s++){
          stor.push(chkNodes[r][s])
        }
        stor.push(k)
        tmp.push(stor);
      }
    }
    chkNodes=[];
    for(r=0;r<tmp.length;r++){
      chkNodes.push(tmp[r])
    }
  }
  nodes[sum]=1;
  for(j=0;j<chkNodes.length;j++){
    adjSum=0
    for(k=0;k<input;k++){
      adjSum+=Math.pow(input+1,k)*chkNodes[j][k]
    }
    if(adjSum!=sum)adjMat[sum][adjSum]=adjMat[adjSum][sum]=1
  }
}
t=nodes.length;
for(i=0;i<t;i++){
  if(!nodes[i]){
    for(k=0;k<t;k++){
      adjMat[i][k]=adjMat[k][i]=0
    }
  }
}
sum=(a,b)=>a+b;
console.log(nodes.reduce(sum))
connections=x=>x.reduce(sum)
counts=adjMat.map(connections);
stor=[];
for(i=0;i<t;i++){
  stor.push(nodes[i]);
}
maxRemainder=0;

greater=[]
for(i=0;i<t;i++){
  if(nodes[i]&&counts[i]>maxConnections){
    greater.push(i);
  }
}

if(input==4){
  for(w=0;w<greater.length*numRand;w++){
    for(i=0;i<t;i++){
      nodes[i]=stor[i];
    }
    counts=adjMat.map(connections);
    toRemove=Math.floor(Math.random()*numRand*2)
    for(i=0;i<toRemove&&i<greater.length;i++){
      rand=Math.floor(Math.random()*greater.length);
      if(nodes[greater[rand]]){
        nodes[greater[rand]]=0;
        for(j=0;j<t;j++){
          if(adjMat[rand][j]){
            counts[j]--;
          }
        }
      }
    }

    for(i=0;i<t*t;i++){
      max=0;
      maxLoc=0;
      for(j=0;j<t;j++){
        if(counts[j]>=max&&nodes[j]){
          max=counts[j];
          maxLoc=j;
        }
      }
      if(max>0){
        for(j=0;j<t;j++){
          if(adjMat[maxLoc][j]){
            counts[j]--;
            if(counts[j]<max-1&&stor[j]&&!nodes[j]){
              nodes[j]=1;
              for(k=0;k<t;k++){
                if(adjMat[j][k])counts[k]++;
              }
            }
          }
          nodes[maxLoc]=0;
        }
      }
      else{
        break;
      }
    }
    maxRemainder=Math.max(maxRemainder,nodes.reduce(sum))
    //console.log(nodes.reduce(sum));
  }
  console.log(maxRemainder);
}
else{
  for(i=0;i<t*t;i++){
    max=0;
    maxLoc=0;
    for(j=0;j<t;j++){
      if(counts[j]>=max&&nodes[j]){
        max=counts[j];
        maxLoc=j;
      }
    }
    if(max>0){
      for(j=0;j<t;j++){
        if(adjMat[maxLoc][j]){
          counts[j]--;
          if(counts[j]<max-1&&stor[j]&&!nodes[j]){
            nodes[j]=1;
            for(k=0;k<t;k++){
              if(adjMat[j][k])counts[k]++;
            }
          }
        }
        nodes[maxLoc]=0;
      }
    }
    else{
      break;
    }
  }
  console.log(nodes.reduce(sum));
}

Try it online!

Snippet that can be added the end of the program to show what Hamming distance sequences each selected Hamming distance sequence

for(i=0;i<t;i++){
  if(nodes[i]){
    tmp=[]
    for(j=0;j<input;j++){
      tmp.unshift(Math.floor(i/Math.pow(input+1,j))%(input+1))
    }
    console.log(tmp.join(""))
    output=""
    for(j=0;j<t;j++){
      if(adjMat[i][j]&&stor[j]){
        outArr=[]
        for(k=0;k<input;k++){
          outArr.unshift(Math.floor(j/Math.pow(input+1,k))%(input+1))
        }
        output+=" "+outArr.join("");
      }
    }
    console.log(output)
  }
}

Explanation:

First, the code generates all unique hamming distances from the substrings.

input=3;
hammings=[];
h=(x,y)=>{retVal=0;while(x||y){if(x%2!=y%2)retVal++;x>>=1;y>>=1}return retVal};
for(i=1<<(input-1);i<1<<input;i++){
  for(j=0;j<1<<(2*input);j++){
    hamming=[];
    for(k=0;k<input;k++){
      hamming.push(h((j>>k)%(1<<input),i));
    }
    equal=0;
    for(k=0;k<hammings.length;k++){
      if(hamming.join("")==hammings[k].join("")){
        equal=1;
        break;
      }
    }
    if(!equal)hammings.push(hamming);
  }
}

Next, the code converts this list into an undirected graph

adjMat=[];
for(i=0;i<Math.pow(input+1,input);i++){
  row=[];
  for(j=0;j<Math.pow(input+1,input);j++){
    row.push(0);
  }
  adjMat.push(row);
}
nodes=[]
for(i=0;i<Math.pow(input+1,input);i++){
  nodes[i]=0;
}
for(i=0;i<hammings.length;i++){
  sum=0;
  chkNodes=[[]];
  for(j=0;j<input;j++){
    chkVal=[];
    t=Math.pow(input+1,j);
    sum+=t*hammings[i][j];
    tmp=[];
    for(r=0;r<chkNodes.length;r++){
      for(k=hammings[i][j];k<=Math.min(hammings[i][j]*2,input);k++){
        stor=[]
        for(s=0;s<chkNodes[r].length;s++){
          stor.push(chkNodes[r][s])
        }
        stor.push(k)
        tmp.push(stor);
      }
    }
    chkNodes=[];
    for(r=0;r<tmp.length;r++){
      chkNodes.push(tmp[r])
    }
  }
  nodes[sum]=1;
  for(j=0;j<chkNodes.length;j++){
    adjSum=0
    for(k=0;k<input;k++){
      adjSum+=Math.pow(input+1,k)*chkNodes[j][k]
    }
    if(adjSum!=sum)adjMat[sum][adjSum]=adjMat[adjSum][sum]=1
  }
}

Finally, the code cycles through this graph, removing the vertex with the most connections each cycle before restoring any nodes that would now have less connections than the current maximum. When it has finished with this cycle it outputs the number of remaining nodes

t=nodes.length;
for(i=0;i<t;i++){
  if(!nodes[i]){
    for(k=0;k<t;k++){
      adjMat[i][k]=adjMat[k][i]=0
    }
  }
}
sum=(a,b)=>a+b;
counts=adjMat.map(x=>x.reduce(sum));
stor=[];
for(i=0;i<t;i++){
  stor.push(nodes[i]);
}
for(i=0;i<t*t;i++){
  max=0;
  maxLoc=0;
  for(j=0;j<t;j++){
    if(counts[j]>=max&&nodes[j]){
      max=counts[j];
      maxLoc=j;
    }
  }
  if(max>0){
    for(j=0;j<t;j++){
      if(adjMat[maxLoc][j]){
        counts[j]--;
        if(counts[j]<max-1&&stor[j]&&!nodes[j]){
          nodes[j]=1;
          for(k=0;k<t;k++){
            if(adjMat[j][k])counts[k]++;
          }
        }
      }
      nodes[maxLoc]=0;
    }
  }
  else{
    break;
  }
}
console.log(nodes.reduce(sum));

Sets:

1:

0 1

2:

00 01 10 12 21

3:

000 001 011 013 030 031 100 101 110 111 123 130 132 203 213 231 302 310 312 
321 333

4:

0000 0001 0011 0111 0124 0133 0223 0230 0232 0241 0313 0320 0322 0331 0403 
0412 1000 1001 1013 1021 1100 1102 1110 1111 1134 1201 1224 1233 1243 1304 
1314 1323 1330 1332 1342 1403 1413 1420 1422 2011 2033 2124 2133 2140 2142 
2214 2230 2241 2303 2313 2320 2331 2411 3023 3032 3040 3041 3101 3114 3123 
3130 3132 3141 3203 3213 3220 3231 3302 3310 3312 3321 3334 3343 3433 4031 
4113 4122 4131 4210 4212 4221 4311 4333

5:

00000 00001 00011 00111 00123 01112 01235 01244 01324 01343 02111 02230 
02234 02333 02342 02432 02441 02522 02530 02531 03134 03142 03220 03224 
03233 03241 03314 03323 03331 03403 03412 03421 03520 04133 04141 04214 
04223 04232 04303 04313 04322 05042 05050 05051 05132 10000 10001 10011 
10122 10212 10221 10245 11000 11001 11013 11022 11100 11112 11120 11121 
11202 11211 11345 11353 11443 12012 12111 12201 12245 12253 12335 12344 
12352 12425 12430 12434 12442 12513 12532 13033 13042 13244 13252 13325 
13330 13334 13342 13404 13424 13433 13441 13520 13522 13531 14032 14051 
14140 14152 14225 14230 14234 14241 14243 14304 14315 14324 14332 14413 
14420 14422 14431 15041 15050 15125 15133 15142 15215 15223 15232 20112 
20135 20211 20253 20334 20352 21012 21021 21102 21110 21111 21201 21245 
21344 21352 21430 21433 21442 21514 21523 22011 22101 22135 22244 22252 
22325 22334 22340 22343 22405 22415 22424 22441 22520 22522 22531 23041 
23144 23150 23152 23225 23234 23240 23243 23251 23304 23315 23324 23333 
23341 23403 23413 23420 23432 23521 24031 24050 24125 24130 24134 24142 
24151 24215 24224 24233 24303 24314 24320 24323 24331 24412 24421 25123 
25132 25141 25203 25214 25222 25231 25302 25312 25321 30234 30243 30252 
30324 30333 30340 30342 30414 30423 30430 30432 31011 31235 31244 31253 
31325 31334 31340 31343 31405 31415 31424 31432 31441 31504 31521 32025 
32034 32100 32144 32152 32225 32234 32240 32243 32251 32304 32315 32324 
32330 32333 32342 32403 32414 32423 32512 33024 33031 33033 33125 33134 
33140 33143 33151 33215 33224 33230 33233 33242 33303 33314 33320 33323 
33332 33412 33431 34124 34133 34203 34214 34223 34232 34241 34310 34313 
34322 34411 35202 35213 35221 35311 40323 40332 40341 40431 40505 40513 
41135 41144 41240 41243 41252 41324 41330 41333 41342 41403 41414 41423 
41512 42033 42134 42143 42230 42233 42242 42303 42310 42314 42323 42332 
42341 42413 42422 42431 43023 43124 43130 43133 43142 43203 43220 43223 
43232 43241 43302 43313 43322 43331 43421 44114 44123 44132 44210 44213 
44222 44231 44312 44321 50413 50422 50504 51233 51242 51251 51323 51332 
51341 51413 51422 52023 52133 52142 52151 52223 52232 52241 52313 52322 
52331 52421 53102 53114 53122 53210 53213 53321 54201 54212 54221 54311

fəˈnɛtɪk

Posted 2017-03-13T09:16:13.153

Reputation: 4 166

Thanks for giving the first answer! Could you give a step by step guide for idiots on how to run your code in Linux please? – None – 2017-03-16T13:15:33.427

Maybe fəˈnɛtɪk could turn his code into a stack snippet? – mbomb007 – 2017-03-16T13:32:56.710

@mbomb007 for some reason, making this into a snippet causes the error 0 is not a function... in the line for(j=0;j<t;j++) – fəˈnɛtɪk – 2017-03-16T13:52:56.210

Maybe you could try JSFiddle? – mbomb007 – 2017-03-16T15:46:25.067

If you have chrome you can copy paste the code into the console and run it by pressing enter. Not entirely sure about other browsers. Chrome runs the code faster than the online systems for me. Managed to get a 5th value of 349 – fəˈnɛtɪk – 2017-03-17T00:30:25.267

Firefox also has a console but I can't get it to run for an input of 5. – fəˈnɛtɪk – 2017-03-17T00:40:17.073

You may have to run this more than once to get the desired result of 83 for input of 4. Under current conditions it only randomly removes 20 elements 3684 times. As a result, it will sometimes get 82 and sometimes get 83. – fəˈnɛtɪk – 2017-03-20T15:05:22.107

Can you put the 372 list on bpaste.net ? – None – 2017-03-22T08:27:00.690

@Lembik I don't have access to filesharing right now due to firewall. I have attached all the sets of hamming distances that are not close to one another. – fəˈnɛtɪk – 2017-03-22T11:52:11.843

@fəˈnɛtɪk Very interesting, thanks! – None – 2017-03-23T09:01:43.840