WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

Re: [Xen-users] Xen system hang or freeze

To: Nick Anderson <nick@xxxxxxxxxxxx>
Subject: Re: [Xen-users] Xen system hang or freeze
From: Peter Booth <peter_booth@xxxxxxx>
Date: 21 April 2009 12:31:04 -0400
Cc: Paraic Gallagher <paraic.gallagher@xxxxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: 21 April 2009 09:31:51 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20090421141047.GA13853@tp >
References: <33b90e520904030756l3d2e2eb5s1b7e50535a9a44c7@xxxxxxxxxxxxxx> <20090403152333.GA20561@cmdln-laptop > <33b90e520904030859i23595d1ft32c491bfcd166389@xxxxxxxxxxxxxx> <9830130B-2801-4B39-9954-4569600AF973@xxxxxxx> <20090421141047.GA13853@tp >
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
Some thoughts:
0. Do you have the default behavior, where the guests' independent wallclocks are disabled?
1. I have observed visible performance differences in a VM when %steal goes above 1%.
It sounds like you have 8 cores.
How many VMs do you have?
What are their weights and caps?
2. The system default of collecting sar data every ten minutes is pretty unhelpful for problem diagnosis. I routinely adjust this interval to five seconds, which, at the expense of a lot of disk space, gives a historical dataset that is useful for forensics.
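To illustrate what a five-second %steal sample involves, here is a minimal Python sketch. It is not how sar itself is implemented; it assumes a Linux guest where the aggregate "cpu" line of /proc/stat lists the counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The function names (parse_cpu_line, steal_percent, watch) are made up for this example.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before, after):
    """Percentage of elapsed CPU time reported as steal between two samples."""
    # Use the first eight counters (user .. steal); pad with zeros if the
    # kernel reports fewer. Later fields (guest, guest_nice) are skipped.
    b = (list(before) + [0] * 8)[:8]
    a = (list(after) + [0] * 8)[:8]
    deltas = [new - old for old, new in zip(b, a)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_counters(path="/proc/stat"):
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def watch(interval=5.0, samples=3):
    """Print %steal every `interval` seconds, `samples` times."""
    previous = read_cpu_counters()
    for _ in range(samples):
        time.sleep(interval)
        current = read_cpu_counters()
        print(time.strftime("%H:%M:%S"),
              "%%steal = %.2f" % steal_percent(previous, current))
        previous = current
```

Calling watch() inside a guest prints one line per interval. Anything persistently above the 1% mentioned in point 1 would be worth correlating with the host's weights and caps.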
On Apr 21, 2009, at 10:10 AM, Nick Anderson wrote:
On Tue, Apr 21, 2009 at 08:30:32AM -0400, Peter Booth wrote:
It would be interesting to know whether sar data was captured during
this time. From this you could track whether there was any process
creation or destruction occurring.
I just had another lockup this weekend.
Sar (from the host)
12:35:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
12:45:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
12:55:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
01:05:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
01:15:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
Average:        all      0.00      0.00      0.00      0.00      0.01     99.98
01:25:53 PM       LINUX RESTART
01:35:02 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:45:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
01:55:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
02:05:01 PM     all      0.00      0.00      0.00      0.00      0.01     99.99
sar -b
11:55:01 AM 12.22 0.90 11.32 12.90 257.89
12:05:01 PM 13.97 0.49 13.48 7.68 331.48
12:15:01 PM 18.88 7.30 11.59 161.74 260.17
12:25:01 PM 14.34 1.10 13.23 16.53 438.73
12:35:01 PM 9.01 0.43 8.58 6.96 208.50
12:45:01 PM 8.47 0.35 8.12 5.23 186.03
12:55:01 PM 10.00 1.09 8.91 19.22 245.17
01:05:01 PM 11.89 1.82 10.06 27.76 279.90
01:15:01 PM 10.06 0.34 9.72 5.23 214.62
Average: 17.55 6.12 11.43 385.87 369.74
01:25:53 PM LINUX RESTART
01:35:02 PM tps rtps wtps bread/s bwrtn/s
01:45:01 PM 19.01 7.19 11.83 113.49 273.91
01:55:01 PM 12.23 2.44 9.79 37.42 239.82
02:05:01 PM 16.89 2.79 14.10 47.93 422.02
02:15:01 PM 17.09 1.92 15.17 26.93 495.01
02:25:01 PM 13.91 3.42 10.49 164.83 282.82
02:35:01 PM 12.47 2.05 10.42 28.45 256.32
02:45:01 PM 13.67 1.81 11.87 31.78 340.39
sar -c
12:45:01 PM 0.02
12:55:01 PM 0.02
01:05:01 PM 0.02
01:15:01 PM 0.02
Average: 0.03
01:25:53 PM LINUX RESTART
01:35:02 PM proc/s
01:45:01 PM 0.02
01:55:01 PM 0.02
sar -q
12:55:01 PM 0 147 0.00 0.00 0.00
01:05:01 PM 0 147 0.07 0.03 0.01
01:15:01 PM 0 147 0.00 0.00 0.00
Average: 0 147 0.00 0.00 0.00
01:25:53 PM LINUX RESTART
01:35:02 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
01:45:01 PM 0 147 0.00 0.00 0.00
01:55:01 PM 0 147 0.00 0.00 0.00
sar -r
01:05:01 PM   7312568   1878856     20.44    175416     66532   1044184         0      0.00         0
01:15:01 PM   7311948   1879476     20.45    175416     66544   1044184         0      0.00         0
Average:      7328126   1863298     20.27    175403     67011   1044184         0      0.00         0
01:25:53 PM       LINUX RESTART
01:35:02 PM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
01:45:01 PM   8620940    570484      6.21     64136     36012   1044184         0      0.00         0
01:55:01 PM   8619824    571600      6.22     64972     36028   1044184         0      0.00         0
02:05:01 PM   8618204    573220      6.24     65800     36040   1044184         0      0.00         0
===============================================================
Now perhaps I have missed something, but to me that all looks just
fine. I should set up something to log ps. But in my guests I see steal
pushed through the roof, and it's like that for days beforehand. I've
noticed the steal during the lockups before, but either I neglected to
look back several days or forgot what I saw. I didn't recall steal
being at 100% as far back as my logs go.
12:55:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:05:01 PM     all      0.00      0.00      0.00      0.00    100.00      0.00
01:15:01 PM     all      0.00      0.00      0.00      0.00    100.00      0.00
Average:        all      0.00      0.00      0.00      0.00    100.00      0.00
01:27:49 PM       LINUX RESTART
01:35:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:45:01 PM     all      4.04      0.00      1.80      0.64      0.02     93.50
01:55:01 PM     all      4.10      0.00      1.76      0.31      0.02     93.80
02:05:01 PM     all      5.45      0.00      2.47      0.23      0.02     91.83
02:15:01 PM     all      7.03      0.00      3.22      0.22      0.02     89.51
02:25:01 PM     all      4.82      0.00      2.31      0.18      0.01     92.6
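Since the 100% steal went unnoticed for days, it may help to scan the saved guest sar output for it. The following is a rough Python sketch under a few assumptions: each sample sits on one line in the 12-hour "HH:MM:SS AM/PM all ..." layout shown here, the six numeric columns are %user, %nice, %system, %iowait, %steal and %idle, and the 1% default threshold is simply the figure Peter mentioned. The names parse_sar_cpu and high_steal are invented for the example.

```python
SAMPLE = """\
12:55:01 PM CPU %user %nice %system %iowait %steal %idle
01:05:01 PM all 0.00 0.00 0.00 0.00 100.00 0.00
01:15:01 PM all 0.00 0.00 0.00 0.00 100.00 0.00
Average: all 0.00 0.00 0.00 0.00 100.00 0.00
01:27:49 PM LINUX RESTART
01:45:01 PM all 4.04 0.00 1.80 0.64 0.02 93.50
""".splitlines()


def parse_sar_cpu(lines):
    """Extract per-interval 'all' CPU rows from one-line-per-sample sar output."""
    rows = []
    for line in lines:
        parts = line.split()
        # Data rows: time, AM/PM, "all", then six percentages.
        # Headers, "Average:" lines and restart markers do not match and are skipped.
        if len(parts) != 9 or parts[2] != "all":
            continue
        try:
            values = [float(p) for p in parts[3:]]
        except ValueError:
            continue
        rows.append({
            "time": parts[0] + " " + parts[1],
            "steal": values[4],
            "idle": values[5],
        })
    return rows


def high_steal(rows, threshold=1.0):
    """Return the timestamps of samples whose %steal exceeds the threshold."""
    return [row["time"] for row in rows if row["steal"] > threshold]


for stamp in high_steal(parse_sar_cpu(SAMPLE)):
    print("high steal at", stamp)
```

Run against a full day of samples, a check like this would show when steal first climbed, rather than only what it looked like at the moment of the lockup.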
It might also be worth adding a cron entry to append the output of lsof to a file every N minutes (perhaps with logrotate enabled), to see if you can capture what changed in the running system when this "lockup" occurred.
It is also worth collecting ps output every minute.
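One possible shape for such a collector, as a small Python sketch rather than a finished tool: it runs a command and appends the output to a log file under a timestamped header. The function name append_snapshot and the log path in the usage note are hypothetical, and rotation is left to logrotate as suggested above.

```python
import subprocess
import time


def append_snapshot(cmd, log_path):
    """Run cmd (a list of arguments) and append its output to log_path."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error output in the same log
        universal_newlines=True,
    )
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        # One header line per snapshot so entries can be told apart later.
        log.write("=== %s  %s (exit %d) ===\n"
                  % (stamp, " ".join(cmd), result.returncode))
        log.write(result.stdout)
        if not result.stdout.endswith("\n"):
            log.write("\n")
    return result.returncode
```

A script that calls append_snapshot(["ps", "auxww"], "/var/log/ps-snapshots.log") could be run from cron once a minute, with a second call for lsof at a longer interval. Appending rather than overwriting means the last snapshots before a hang are still on disk after the restart.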
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
--
Nick Anderson <nick@xxxxxxxxxxxx>
http://www.cmdln.org