export functionality: start peers and wait for peer network to stabilize before exporting #162
Open
inqrphl wants to merge 3 commits into
inqrphl (Author) commented on Jun 10, 2026
Tested with the lmd_federation_multitier_e2e scenario, which spawns three tiers of OMD instances.
ahmet@jelinek:~/repositories/thruk/t/scenarios/lmd_federation_multitier_e2e$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
84bccd3b8bfd lmd_federation_multitier_e2e-tier3b "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier3b-1
5e34d7bac3dc lmd_federation_multitier_e2e-tier1a "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 4730/tcp, 5666/tcp, 127.0.0.3:60080->80/tcp, 127.0.0.3:60443->443/tcp lmd_federation_multitier_e2e-tier1a-1
e7e42c9bbafc lmd_federation_multitier_e2e-tier3a "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier3a-1
a753e362f6f3 lmd_federation_multitier_e2e-tier1d "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 4730/tcp, 5666/tcp, 127.0.0.3:60083->80/tcp, 127.0.0.3:60446->443/tcp lmd_federation_multitier_e2e-tier1d-1
733d6722ca07 lmd_federation_multitier_e2e-tier2c "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier2c-1
f19f79fe12e1 lmd_federation_multitier_e2e-tier2a "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier2a-1
4e24c683ecbb lmd_federation_multitier_e2e-tier3c "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier3c-1
475414119bff lmd_federation_multitier_e2e-tier1b "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 4730/tcp, 5666/tcp, 127.0.0.3:60081->80/tcp, 127.0.0.3:60444->443/tcp lmd_federation_multitier_e2e-tier1b-1
c50110e0c366 lmd_federation_multitier_e2e-tier2e "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier2e-1
de9eaaa392de lmd_federation_multitier_e2e-tier1c "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 4730/tcp, 5666/tcp, 127.0.0.3:60082->80/tcp, 127.0.0.3:60445->443/tcp lmd_federation_multitier_e2e-tier1c-1
d1ee2e277e24 lmd_federation_multitier_e2e-tier2b "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier2b-1
30acc5b640b2 lmd_federation_multitier_e2e-tier2d "/root/start.sh" 19 hours ago Up 19 hours 22/tcp, 80/tcp, 443/tcp, 4730/tcp, 5666/tcp lmd_federation_multitier_e2e-tier2d-1
731c643400a4 naemon-dev-box-devbox "/bin/sh -c /box/dev..." 2 weeks ago Up 25 hours 0.0.0.0:1980->80/tcp, [::]:1980->80/tcp, 0.0.0.0:19443->443/tcp, [::]:19443->443/tcp naemon-dev-box-devbox-1
2fb85378e4fe portainer/portainer-ce:lts "/portainer" 3 months ago Up 25 hours 0.0.0.0:8000->8000/tcp, [::]:8000->8000/tcp, 0.0.0.0:9443->9443/tcp, [::]:9443->9443/tcp, 9000/tcp portainer
ahmet@jelinek:~/repositories/lmd$ ./lmd.linux.amd64 --logfile stdout -config lmd_export_test.ini --export export.tgz
[2026-06-10 12:15:22.016][Info][pid:669002][exporter:28] starting export to export.tgz
[2026-06-10 12:15:22.016][Info][pid:669002][peer:305] [tier1a] starting connection
[2026-06-10 12:15:22.016][Info][pid:669002][peer:305] [tier1b] starting connection
[2026-06-10 12:15:22.016][Info][pid:669002][peer:305] [tier1c] starting connection
[2026-06-10 12:15:22.016][Info][pid:669002][peer:305] [tier1d] starting connection
[2026-06-10 12:15:22.016][Info][pid:669002][exporter:201] waiting for all peers to connect and initialize
[2026-06-10 12:15:22.060][Info][pid:669002][peer:1519] [tier1d] remote connection MultiBackend flag set, got 2 sites
[2026-06-10 12:15:22.062][Info][pid:669002][peer:1519] [tier1a] remote connection MultiBackend flag set, got 8 sites
[2026-06-10 12:15:22.122][Info][pid:669002][peer:726] [tier1d] initial objects synchronized in: 62.120492ms
[2026-06-10 12:15:22.125][Info][pid:669002][peer:726] [tier1a] initial objects synchronized in: 63.168691ms
[2026-06-10 12:15:22.142][Info][pid:669002][peer:726] [tier1d] initial objects synchronized in: 125.336869ms
[2026-06-10 12:15:22.148][Info][pid:669002][peer:726] [tier1a] initial objects synchronized in: 131.785687ms
[2026-06-10 12:15:22.168][Info][pid:669002][peer:726] [tier1b] initial objects synchronized in: 151.876835ms
[2026-06-10 12:15:22.170][Info][pid:669002][peer:726] [tier1c] initial objects synchronized in: 153.361301ms
[2026-06-10 12:15:29.370][Info][pid:669002][peer:305] [tier1d] starting connection
[2026-06-10 12:15:29.371][Info][pid:669002][peer:305] [tier2d] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier1a] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier2b] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier2c] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier2e] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier3c] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier2a] starting connection
[2026-06-10 12:15:29.384][Info][pid:669002][peer:305] [tier3a] starting connection
[2026-06-10 12:15:29.385][Info][pid:669002][peer:305] [tier3b] starting connection
[2026-06-10 12:15:29.576][Info][pid:669002][peer:726] [tier1d] initial objects synchronized in: 204.976402ms
[2026-06-10 12:15:29.607][Info][pid:669002][peer:726] [tier2d] initial objects synchronized in: 236.232081ms
[2026-06-10 12:15:29.869][Info][pid:669002][peer:726] [tier2e] initial objects synchronized in: 484.619575ms
[2026-06-10 12:15:29.880][Info][pid:669002][peer:726] [tier1a] initial objects synchronized in: 495.708886ms
[2026-06-10 12:15:29.901][Info][pid:669002][peer:726] [tier3b] initial objects synchronized in: 516.649309ms
[2026-06-10 12:15:30.171][Info][pid:669002][peer:726] [tier2c] initial objects synchronized in: 785.288771ms
[2026-06-10 12:15:30.195][Info][pid:669002][peer:726] [tier3a] initial objects synchronized in: 810.205028ms
[2026-06-10 12:15:30.572][Info][pid:669002][peer:726] [tier3c] initial objects synchronized in: 1.187727796s
[2026-06-10 12:15:30.583][Info][pid:669002][peer:726] [tier2b] initial objects synchronized in: 1.198999243s
[2026-06-10 12:15:30.845][Info][pid:669002][peer:726] [tier2a] initial objects synchronized in: 1.460156078s
[2026-06-10 12:16:02.019][Info][pid:669002][exporter:203] all peers ready for export (14 peers)
[2026-06-10 12:16:02.022][Info][pid:669002][exporter:122] exported tier1b (tier1b), used space: 15 kb
[2026-06-10 12:16:02.023][Info][pid:669002][exporter:122] exported tier1c (tier1c), used space: 15 kb
[2026-06-10 12:16:02.025][Info][pid:669002][exporter:122] exported tier1d (8d6dc), used space: 16 kb
[2026-06-10 12:16:02.027][Info][pid:669002][exporter:122] exported tier2d (881f5), used space: 15 kb
[2026-06-10 12:16:02.029][Info][pid:669002][exporter:122] exported tier1a (2ae43), used space: 17 kb
[2026-06-10 12:16:02.032][Info][pid:669002][exporter:122] exported tier2b (c369c), used space: 15 kb
[2026-06-10 12:16:02.034][Info][pid:669002][exporter:122] exported tier2c (c784e), used space: 15 kb
[2026-06-10 12:16:02.037][Info][pid:669002][exporter:122] exported tier2e (84bd2), used space: 16 kb
[2026-06-10 12:16:02.039][Info][pid:669002][exporter:122] exported tier3c (a8dd1), used space: 35 kb
[2026-06-10 12:16:02.045][Info][pid:669002][exporter:122] exported tier2a (c21da), used space: 18 kb
[2026-06-10 12:16:02.047][Info][pid:669002][exporter:122] exported tier3a (5f0ee), used space: 17 kb
[2026-06-10 12:16:02.048][Info][pid:669002][exporter:122] exported tier3b (e984d), used space: 16 kb
[2026-06-10 12:16:02.048][Info][pid:669002][main:385] exported 12 peers successfully
...erAndDiscoverSubpeers
This function calls peer.initAllTables and peer.updateTick once and does nothing else with the peer. Use it to initialize the peers defined in the config, and any new peers that show up while waiting for the peer network to stabilize.
inqrphl (Author) commented on Jun 10, 2026
Starting a peer would have it run in the background indefinitely while the peer network was stabilizing.
Instead of starting the peer, I wrote a new function called initializePeerAndDiscoverSubpeers. It calls peer.initAllTables and then does a manual peer.updateTick, which discovers the sub-peers.
This function is called for the peers in the lmd config first. While waiting for stabilization, the code iterates through the peer list and calls it for any new peers as well.
The function calls waitGroup.Done() at the end, so instances for different peers can run concurrently as goroutines.
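The WaitGroup pattern mentioned above could look roughly like the sketch below. It is not the code from this PR: the `peer` type, `initPeer` and `initAll` are hypothetical stand-ins, and the real initializePeerAndDiscoverSubpeers does table initialization and an update tick where this sketch only sets a flag.

```go
package main

import (
	"fmt"
	"sync"
)

// peer is a hypothetical stand-in for an LMD peer.
type peer struct {
	name        string
	initialized bool
}

// initPeer stands in for a one-shot initialization function. It touches only
// its own peer and signals the WaitGroup when it returns.
func initPeer(p *peer, wg *sync.WaitGroup) {
	defer wg.Done()
	p.initialized = true
}

// initAll starts one goroutine per peer that is not initialized yet, waits
// for all of them, and returns how many it started.
func initAll(peers []*peer) int {
	var wg sync.WaitGroup
	started := 0
	for _, p := range peers {
		if p.initialized {
			continue // already handled in an earlier round
		}
		started++
		wg.Add(1)
		go initPeer(p, &wg)
	}
	wg.Wait()
	return started
}

func main() {
	peers := []*peer{{name: "tier1a"}, {name: "tier1b", initialized: true}, {name: "tier2a"}}
	fmt.Println("initialized in this round:", initAll(peers))
	fmt.Println("initialized in next round:", initAll(peers))
}
```

Calling `initAll` again on the same list starts nothing new, which is the behaviour wanted when the list is re-scanned every stabilization round.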
Exporting used to contact only the peers in the config and only call initAllTables. It did not start the peers at all and exported only the initial state of the tables, so sub-peers in the lower tiers were never discovered.
Starting a peer calls peer.tryUpdate -> peer.updateTick -> datastoreset.updateDelta -> ds.sync.UpdateDelta.
This fetches the sites table for LMD peers and the /thruk/r/v1/sites API endpoint for Thruk peers. If a new site is discovered, it calls peer.addSubPeer.
The sub-peer is added to lmd.peerMap and started independently, so it can in turn discover its own sub-peers.
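To illustrate why the lower tiers only show up once each discovered peer is itself queried, here is a minimal sketch of the discovery as a breadth-first walk. The `discover` function, the `sitesOf` callback and the topology in `main` are hypothetical; the actual code does this through peer.addSubPeer and lmd.peerMap, not through a queue.

```go
package main

import "fmt"

// discover walks outward from the configured peers. sitesOf is a hypothetical
// callback that returns the sites a peer reports. Each peer is visited once,
// and every site it reports is queued so that deeper tiers are found too.
func discover(configured []string, sitesOf func(string) []string) []string {
	seen := map[string]bool{}
	order := []string{}
	queue := append([]string{}, configured...)
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		if seen[name] {
			continue
		}
		seen[name] = true
		order = append(order, name)
		queue = append(queue, sitesOf(name)...)
	}
	return order
}

func main() {
	// Made-up three-tier topology, only for illustration.
	topology := map[string][]string{
		"tier1a": {"tier2a", "tier2b"},
		"tier2a": {"tier3a"},
	}
	sitesOf := func(name string) []string { return topology[name] }
	fmt.Println(discover([]string{"tier1a"}, sitesOf))
}
```

Querying only the configured peers corresponds to stopping after the first queue entry, which is why tier2 and tier3 were missing from earlier exports.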
The exporter function now has a network state stabilization step. It starts all peers in the configuration and lets them discover their network independently.
After that it waits for stabilization in rounds. Each round checks the peer map to see whether the total peer count went up and whether every peer has data. If the peer map looks stable for three consecutive rounds, the wait ends. There is a maximum of 30 rounds, with a 10-second sleep between rounds.
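The round logic could be sketched as below. This is an illustration under assumptions, not the code from the PR: the `snapshot` type, `waitForStablePeers` and the `observe` callback are hypothetical names, and "stable" is taken to mean "every peer has data and the count equals the previous round's count".

```go
package main

import (
	"fmt"
	"time"
)

// snapshot is a hypothetical summary of the peer map at one moment.
type snapshot struct {
	peerCount int  // peers currently known
	allReady  bool // true when every known peer has data
}

// waitForStablePeers calls observe once per round. A round counts as stable
// when every peer has data and the peer count equals the previous round's.
// It returns the round in which `needed` consecutive stable rounds were
// reached, or (maxRounds, false) if that never happened.
func waitForStablePeers(observe func() snapshot, needed, maxRounds int, pause time.Duration) (int, bool) {
	stable := 0
	last := -1 // no previous round yet, so round 1 is never stable
	for round := 1; round <= maxRounds; round++ {
		snap := observe()
		if snap.allReady && snap.peerCount == last {
			stable++
		} else {
			stable = 0 // growth or missing data restarts the count
		}
		last = snap.peerCount
		if stable >= needed {
			return round, true
		}
		if round < maxRounds {
			time.Sleep(pause)
		}
	}
	return maxRounds, false
}

func main() {
	// Simulated discovery: 4 configured peers, then 12 once sub-peers appear.
	counts := []int{4, 12, 12, 12, 12}
	i := 0
	observe := func() snapshot {
		n := counts[len(counts)-1]
		if i < len(counts) {
			n = counts[i]
		}
		i++
		return snapshot{peerCount: n, allReady: true}
	}
	// The description uses 3 stable rounds, 30 rounds max and a 10 s sleep;
	// the pause is shortened here so the demo finishes quickly.
	round, ok := waitForStablePeers(observe, 3, 30, time.Millisecond)
	fmt.Println("stable:", ok, "after round", round)
}
```

Returning the `false` case separately lets the caller still export after the round limit while logging that the network may not have settled.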
After the stabilization rounds, the peer tables are exported. The exported peer count is passed up the call chain to main.go so that the reported number matches what was actually exported. After the maximum number of rounds, more peers could still be discovered and added to peerMap, so reporting len(peerMap) as the number of exported peers could be wrong.