Port-xen archive

Re: Dom0 xvif mbuf issues



On Thu, 27 Sep 2018 13:13:27 +0200
Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
> On Wed, Sep 26, 2018 at 01:14:40PM -0700, Harry Waddell wrote:
> > 
> > I have a server where Dom0 started becoming unusable as of a few months ago
> > where previously it ran for years with few issues. 
> > 
> > netbsd-7 branch, never more than a month behind. 
> > BRIDGE_IPF is enabled and these options set in sysctl.conf: 
> > 
> > kern.sbmax=1048576
> > net.inet.tcp.recvbuf_max=1048576
> > net.inet.tcp.sendbuf_max=1048576
> > kern.mbuf.nmbclusters=300000
> > kern.maxfiles=3000
> > 
> > Xen 4.8.3 similarly updated.
> > 
> > One of the xvif devices "could not allocate a new mbuf". I enabled MBUF debugging
> > and netstat didn't seem to point to a leak on any of the devices. 
> 
> Looks like temporary memory shortage in the dom0 (this is a MGETHDR failing,
> not MCLGET, so the nmbclusters limit is not relevant).
> How many mbufs were allocated ?
> 
At the time of the hang, I have no idea. 
It's around 512 whenever I check. 
[root@xen-09:conf]> netstat -m
515 mbufs in use:
	513 mbufs allocated to data
	2 mbufs allocated to packet headers
0 calls to protocol drain routines
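
Since the count at the moment of the hang is the unknown here, one way to capture it would be to pull the total out of `netstat -m` and log it periodically. This is only a minimal sketch: the function name `mbufs_in_use`, the log path, and the 60-second interval are hypothetical choices, and it assumes the first line keeps the "N mbufs in use:" form shown above.

```sh
# Print the total from the "N mbufs in use:" line of `netstat -m` output
# read on stdin. (mbufs_in_use is an illustrative name, not an existing tool.)
mbufs_in_use() {
    awk '/mbufs in use/ { print $1; exit }'
}

# Example logging loop (path and interval are assumptions), left commented out:
# while :; do
#     printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$(netstat -m | mbufs_in_use)"
#     sleep 60
# done >> /var/log/mbufs.log
```

With a log like that, the last entries before a hang would show whether the mbuf count climbs well above the usual ~512.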
> 
> > It hung again, but with a new 
> > error scrolling on the console. "xennetback: got only 63 new mcl pages" 
> 
> This would point to a memory shortage in the hypervisor itself.
> Do you have enough free memory (xl info) ?
> 
total_memory : 131037
free_memory : 26601
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
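
To put those two numbers in proportion, a small sketch that reads `xl info` output on stdin and prints free memory as a percentage of the total. The function name `free_memory_pct` is hypothetical, and it assumes the `total_memory` and `free_memory` fields keep the "name : value" layout shown above.

```sh
# Print free_memory as a percentage of total_memory, given `xl info`
# output on stdin. (free_memory_pct is an illustrative name.)
free_memory_pct() {
    awk -F: '
        $1 ~ /^total_memory[ \t]*$/ { total = $2 + 0 }
        $1 ~ /^free_memory[ \t]*$/  { avail = $2 + 0 }
        END { if (total > 0) printf "%.1f\n", 100 * avail / total }
    '
}

# Example use on a dom0 (commented out; requires xl):
# xl info | free_memory_pct
```

For the values above this works out to roughly 20% of hypervisor memory free.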
> > 
> > My suspicion is that either one of the guests started doing a lot more NFS activity, or
> > a VM I created, which uses a FUSE filesystem to move large dump files to Azure blob
> > storage, may be what pushed this previously working system over the edge. 
> > 
> > I'm moving the azure fuse system to another server and plan to disable ipf on the bridge. 
> 
> ipf shouldn't be a problem, I'm using it extensively on bridges here.
> 
Good. Just grasping at straws. 
> > 
> > Beyond that, any suggestions? Should I just upgrade to NetBSD 8 and/or Xen 4.11
> > (even if only to make debugging easier, since that is where current work is taking place)? 
> 
> I'm not sure it would change anything.
> 
> > 
> > This is a production system with about 30 guests. I just want it to work like it used to. 
> 
> How many vifs are there in the dom0?
> 
I expect this is not an ideal way to do this but ...
(for i in `xl list | awk '{print $1}'`;do xl network-list $i | grep vif ;done) | wc -l
 57
Several of the systems are part of a cluster where hosts are multihomed on 2 of 4 networks
to test a customer setup. Most of my systems have < 30, except for one other with 42. 
The others don't hang like this one does. 
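
A slightly tidier variant of the count above, as a sketch only: it skips the header row of `xl list` so that `xl network-list` is never called with the column title "Name" as a domain. The helper name `domain_names` is hypothetical, and it assumes the first line of `xl list` output is the header.

```sh
# Print domain names from `xl list` output on stdin, skipping the
# header row. (domain_names is an illustrative name.)
domain_names() {
    awk 'NR > 1 { print $1 }'
}

# Example use on a dom0 (commented out; requires xl):
# xl list | domain_names | while read -r dom; do
#     xl network-list "$dom"
# done | grep -c vif
```

`grep -c` replaces the `grep ... | wc -l` pair and should give the same count.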
> -- 
> Manuel Bouyer <bouyer%antioche.eu.org@localhost>
> NetBSD: 26 ans d'experience feront toujours la difference
> --
Thanks for the followup. Answers inline above. 
HW

