
I have two PostgreSQL servers running: one publishing all tables, the other subscribing to all of them via logical replication. I'm currently testing a failover solution, and it seems that if I remove the subscription on the secondary and redirect the application to it, insertions fail because the primary key for each table tries to start at one.

For example, I see a lot of stuff like this:

Oct 25 15:48:10 icinga-master1 icinga2[29819]: Error "ERROR:  duplicate key value violates unique constraint "pk_customvariable_id"
                                                            DETAIL:  Key (customvariable_id)=(1) already exists.
                                                            " when executing query "INSERT INTO icinga_customvariables (config_type, instance_id, is_json, object_id, varname, varvalue) VALUES (E'1', 1, E'0', 2677, E'gfl_bmname', E'tomcat1_filenotfound')"

When I try to run that insert manually, I get the same error in postgres. I can, however, import the pg_dump from the secondary and everything is fine. Is there a setting I'm missing here? I've also tried throwing in a recovery.conf and using pg_ctl to promote, but that seemed to stop the subscriber from working and resulted in the same issue. Maybe I'm not subscribing to a system table I should be?
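
The replicated data itself is all there; checking the table from that log entry shows the sequence simply never advanced (assuming here the default serial sequence name):

SELECT last_value, is_called FROM icinga_customvariables_customvariable_id_seq;  -- still at the start value
SELECT max(customvariable_id) FROM icinga_customvariables;                       -- far ahead of the sequence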

I have these wal settings in my postgresql.conf in each:

wal_level = logical
hot_standby = on
hot_standby_feedback = on
max_logical_replication_workers = 8
max_sync_workers_per_subscription = 8

On the primary:

                      List of publications
    Name    |  Owner   | All tables | Inserts | Updates | Deletes
------------+----------+------------+---------+---------+---------
 icinga_pub | postgres | t          | t       | t       | t

On the secondary (I do disable this when testing failover):

               List of subscriptions
    Name    |  Owner   | Enabled | Publication
------------+----------+---------+--------------
 icinga_sub | postgres | t       | {icinga_pub}

I've dropped the databases and started over on the secondary while troubleshooting to keep things clean. Any help is appreciated.

  • So, probably n00bery on my part, found my "why" after digging around in there for a while: https://www.postgresql.org/docs/10/static/logical-replication-restrictions.html I'll add an answer or link another answer when I find a practical way to do what is described there. – Blake Hartshorn Oct 26 '18 at 17:18

2 Answers


This is a known limitation of logical replication within PostgreSQL 10.

Here's a snippet from the docs at https://www.postgresql.org/docs/10/logical-replication-restrictions.html

Sequence data is not replicated. The data in serial or identity columns backed by sequences will of course be replicated as part of the table, but the sequence itself would still show the start value on the subscriber. If the subscriber is used as a read-only database, then this should typically not be a problem. If, however, some kind of switchover or failover to the subscriber database is intended, then the sequences would need to be updated to the latest values, either by copying the current data from the publisher (perhaps using pg_dump) or by determining a sufficiently high value from the tables themselves.

In other words, what you're seeing is expected behaviour, albeit not necessarily wanted behaviour.
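
Concretely, the "copying the current data from the publisher" route means replaying the publisher's sequence values on the subscriber. A sequence-only data dump from pg_dump is just a series of statements like this (sequence name taken from the question's log; the value is hypothetical):

SELECT pg_catalog.setval('public.icinga_customvariables_customvariable_id_seq', 2677, true);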

We have two functions that reset the sequences for us. They may not be perfect, but they work in our situation.

CREATE OR REPLACE FUNCTION public.update_sequence(
    IN tabschema text,
    IN tabname text,
    OUT tschema text,
    OUT tname text,
    OUT pkname text,
    OUT seqname text,
    OUT startval bigint,
    OUT minval bigint,
    OUT maxval bigint,
    OUT incr bigint,
    OUT maxseq bigint,
    OUT lastval bigint,
    OUT newseq bigint,
    OUT prevcalled boolean)
  RETURNS record AS
$BODY$
DECLARE
  seq_offset CONSTANT bigint := 0;
  seq_range CONSTANT bigint := 9999999999999;
BEGIN
  tschema := tabschema;
  tname := tabname;

  -- protect against concurrent inserts while you update the counter
  EXECUTE format('LOCK TABLE %I.%I IN EXCLUSIVE MODE', tabschema, tabname);

  -- find the serial PK column and its backing sequence (assumes the default <table>_<column>_seq naming and a bigint column)
  SELECT column_name, table_name||'_'||column_name||'_seq' FROM information_schema.columns WHERE column_default IS NOT NULL AND data_type = 'bigint' AND column_default ilike 'nextval(%_seq''::regclass)' AND table_schema = tabschema AND table_name = tabname INTO pkname, seqname;
  -- read the sequence's definition and its current position
  SELECT start_value, min_value, max_value, increment_by FROM pg_sequences WHERE schemaname = tabschema AND sequencename = seqname INTO startval, minval, maxval, incr;
  EXECUTE format('SELECT last_value, is_called FROM %I.%I', tabschema, seqname) INTO lastval, prevcalled;
  -- highest key actually present in the table, within this node's id range
  EXECUTE format('SELECT max(%I) FROM %I.%I WHERE %I between $1 AND $2', pkname, tabschema, tabname, pkname) USING seq_offset+1, seq_offset+seq_range INTO maxseq;
  -- next value is one increment past the greater of the table max and the sequence's current position
  newseq := CASE WHEN maxseq IS NULL THEN seq_offset+incr ELSE coalesce(greatest(maxseq+incr, CASE WHEN prevcalled THEN lastval+incr ELSE lastval END), seq_offset+incr) END;

  -- move the sequence to the computed position
  EXECUTE format('ALTER SEQUENCE %I.%I MINVALUE %s START %s RESTART %s MAXVALUE %s;', tabschema, seqname, seq_offset+1, seq_offset+1, newseq, seq_offset+seq_range);
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;


CREATE OR REPLACE FUNCTION public.update_all_sequences()
  RETURNS TABLE(tabschema text, tabname text, pkname text, seqname text, startval bigint, minval bigint, maxval bigint, incr bigint, maxseq bigint, lastval bigint, newseq bigint, prevcalled boolean) AS
$BODY$
BEGIN
  RETURN QUERY WITH table_list (tschema, tname) AS (
    -- every ordinary table outside the system schemas
    SELECT n.nspname, c.relname FROM pg_class c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'r' AND n.nspname NOT IN ('information_schema', 'pg_catalog') ORDER BY n.nspname, c.relname
  )
  SELECT a.* FROM table_list t JOIN update_sequence(tschema, tname) a on t.tschema = a.tschema and t.tname = a.tname;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

Then all you need to do as part of your failover, just prior to switching your virtual IP over to the new master, is run the update_all_sequences function to make sure the sequences are at the correct point (there's always a chance of a few sequence values being skipped if the increment_by value on the sequence is >1).
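
For example, on the node being promoted (the single-table call uses the table from the question):

-- bump every sequence in one go, just before moving the VIP
SELECT * FROM public.update_all_sequences();

-- or repair a single table
SELECT * FROM public.update_sequence('public', 'icinga_customvariables');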

The seq_offset and seq_range values are there if you want to start at something other than 1, which we do: we have a geographically distributed system, and databases in other datacentres start their sequences at other offsets.

  • Thanks Malcolm. I had found that in the doc and punished myself for it in my last reply. I was going to update soon with how I fixed this (currently deep in an event handler I'm writing in Python). I'm going to upvote your answer over mine because it's a purely SQL solution. – Blake Hartshorn Nov 08 '18 at 13:19

I like Malcolm's solution as a purely postgres option. Alternatively, here are the basic statements I looped over my tables with in Python, in case you're hacking this into something else. The ones for dropping the subscription assume your primary is totally dead, so that your secondary doesn't start pulling from it if/when it wakes up.

-- stop the subscription's apply worker
ALTER SUBSCRIPTION [sub] DISABLE;
-- disassociate the replication slot so the drop won't try to reach the dead primary
ALTER SUBSCRIPTION [sub] SET (slot_name = NONE);
COMMIT;
DROP SUBSCRIPTION [sub];

A kind of hacky way I got a list of all tables using a sequence; I wasn't able to figure out how to back-reference the tables from a list of sequences instead.

SELECT table_name, column_name FROM information_schema.columns WHERE column_default LIKE 'nextval%';

Then I looped over that, first gathering the highest value in use:

SELECT MAX([column]) FROM [table];

Fetching the relevant sequence:

SELECT pg_get_serial_sequence('[table]','[column]');

And finally,

SELECT setval('[sequence]', [maxvalue]);
COMMIT;
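
Those three steps can also be collapsed into a single statement per column (same placeholders as above; the COALESCE guards against an empty table):

SELECT setval(pg_get_serial_sequence('[table]', '[column]'),
              COALESCE((SELECT max([column]) FROM [table]), 1));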

My way is ugly but it worked.