Londiste3 queue position lost error

I recently hit a londiste3 replication error (in a multi-master environment) in which the status command reported that the queue position had been lost.

[[email protected] ~]# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini status
Queue: facility1   Local node: remote_slave
master (root)
  |                           Tables: 1/0/0
  |                           Lag: 4m32s, Tick: 12080717, NOT UPTODATE
  +--: remote_slave (leaf)
                              Tables: 0/1/0
                              Lag: 1d22h50m5s, Tick: 11995419
                              ERR: remote_slave: Lost position: batch 11995443..11995444, dst has 11995419

The resolution was fairly simple. I ran the worker with the --reset option to reset the queue position on the remote site, then issued wait-sync to get the table queue moving again.

[[email protected] etc]# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini worker --reset
2015-06-17 15:54:46,307 1206 INFO Resetting queue tracking on dst side
[[email protected] etc]# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini status
Queue: facility1   Local node: remote_slave
master (root)
  |                           Tables: 1/0/0
  |                           Lag: 7m19s, Tick: 12080717, NOT UPTODATE
  +--: remote_slave (leaf)
                              Tables: 0/1/0
                              Lag: 1d22h52m51s, Tick: 11995443

[[email protected] etc]# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini wait-sync
2015-06-17 15:58:35,715 1619 INFO Waiting until all tables are in sync
2015-06-17 15:58:35,959 1619 INFO 1/1 table(s) to copy
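
To catch this condition without reading the status output by hand, the ERR line shown earlier could be picked out with a small script. This is only a sketch: the function name and the field names (prev_tick, cur_tick, dst_tick) are my own labels, not londiste3 terminology. It assumes the message keeps the exact wording seen in this post, and that the two batch numbers are the ticks bounding the batch.

```python
import re

# Matches the error line as it appears in the status output in this post.
# The group names are illustrative labels, not names used by londiste3.
LOST_POSITION = re.compile(
    r"ERR:\s*(?P<node>\S+):\s*Lost position:\s*"
    r"batch\s+(?P<prev_tick>\d+)\.\.(?P<cur_tick>\d+),\s*"
    r"dst has\s+(?P<dst_tick>\d+)"
)


def find_lost_position(status_text):
    """Return details of the first 'Lost position' error, or None if absent."""
    match = LOST_POSITION.search(status_text)
    if match is None:
        return None
    info = {"node": match.group("node")}
    for key in ("prev_tick", "cur_tick", "dst_tick"):
        info[key] = int(match.group(key))
    # Difference between the first batch number and the tick the destination holds.
    info["tick_gap"] = info["prev_tick"] - info["dst_tick"]
    return info


# Status lines copied from the output earlier in this post.
sample = """\
  +--: remote_slave (leaf)
                              Tables: 0/1/0
                              Lag: 1d22h50m5s, Tick: 11995419
                              ERR: remote_slave: Lost position: batch 11995443..11995444, dst has 11995419
"""

result = find_lost_position(sample)
print(result)
```

In practice the text would come from capturing the output of the status command rather than from a pasted string. A result other than None is the cue to consider the --reset and wait-sync steps described above.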
