On Tue, Dec 15, 2009 at 9:57 AM, Andrew Straw <str...@as...> wrote:
>
> notch_max = med + 1.57*iq/np.sqrt(row)
> notch_min = med - 1.57*iq/np.sqrt(row)
>
> Is this code actually calculating a meaningful value? If so, what?
>

From the statistics ignoramus in the room, so take this with a grain of salt...

I'd write that code as

    notch_max = med + (iq/2) * (pi/np.sqrt(row))

and it makes more sense. The notch limits are an estimate of the interval of the median, which is (one-half, for each up/down) the q3-q1 range times a normalization factor which is pi/sqrt(n), where n==row=len(d).

The 1/sqrt(n) makes some sense, as it's the usual statistical error normalization factor. The multiplication by pi, I'm not so sure, and I can't find that exact formula in any quick stats reference, but I'm sure someone who actually knows stats can point out where it comes from.

Note that the code below does:

    if notch_max > q3:
        notch_max = q3
    if notch_min < q1:
        notch_min = q1

though matlab explicitly states in:

http://www.mathworks.com/access/helpdesk/help/toolbox/stats/boxplot.html

that

"""
Interval endpoints are the extremes of the notches or the centers of the triangular markers. When the sample size is small, notches may extend beyond the end of the box.
"""

So it seems to me that the more principled thing to do would be to leave those notch markers outside the box if they land there, because that's a warning of the robustness of the estimation. Clipping them to q1/q3 is effectively hiding a problem...

cheers,

f
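To make the clipping point concrete, here is a minimal sketch of the notch calculation with the clipping made optional. It is not the matplotlib implementation: the helper name `notch_limits` and its `clip` flag are hypothetical, and `np.percentile` stands in for `mlab.prctile`, so the quartiles may differ slightly from what boxplot() computes.

```python
import numpy as np


def notch_limits(d, clip=False):
    """Return (notch_min, notch_max, q1, q3) for a 1-D sample.

    Hypothetical helper for illustration only; np.percentile is used
    here in place of mlab.prctile.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    q1, med, q3 = np.percentile(d, [25, 50, 75])
    iq = q3 - q1
    # Same expression as in the boxplot source quoted above.
    half_width = 1.57 * iq / np.sqrt(n)
    notch_min = med - half_width
    notch_max = med + half_width
    if clip:
        # The behaviour under discussion: pull the notches back to the box.
        notch_min = max(notch_min, q1)
        notch_max = min(notch_max, q3)
    return notch_min, notch_max, q1, q3


# A five-point sample is small enough for the notches to leave the box.
small = [1.0, 2.0, 3.0, 4.0, 5.0]
lo, hi, q1, q3 = notch_limits(small)
print(lo < q1, hi > q3)
print(notch_limits(small, clip=True)[:2])
```

With clipping off, a caller can compare the notch limits against q1 and q3 and treat an overshoot as a warning about the sample size, which is the behaviour argued for above.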
Hi,

I've been reading about box plots and examining the source code for boxplot() lately. While there doesn't seem to be a convention about what the notch specifies, I can't find any justification for (or text describing) what exactly the MPL notch is. The source code is:

    # get median and quartiles
    q1, med, q3 = mlab.prctile(d,[25,50,75])
    iq = q3 - q1
    notch_max = med + 1.57*iq/np.sqrt(row)
    notch_min = med - 1.57*iq/np.sqrt(row)

Is this code actually calculating a meaningful value? If so, what?

The original commit was r1098, which doesn't offer a useful comment either (only "aaplied several sf patches" ... looking through the SF bug tracker, I couldn't find anything relevant from before the commit date of 2005-03-28).
The following (uncommitted) test currently fails. The reason is that mlab.prctile(x,50) doesn't handle even length sequences according to the numpy and wikipedia convention for the definition of median. Do we agree that it should pass? Not only would I commit the test, but I also have a fix to make it pass, derived from scipy.stats.scoreatpercentile(). This would affect boxplot, if not more.

    def test_prctile():
        # test odd lengths
        x=[1,2,3]
        assert mlab.prctile(x,50)==np.median(x)

        # test even lengths
        x=[1,2,3,4]
        assert mlab.prctile(x,50)==np.median(x)

        # derived from email sent by jason-sage to MPL-user on 20090914
        ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
        p = [75]
        expected = [5.5]

        # test vectorized
        actual = mlab.prctile(ob1,p)
        assert np.allclose( expected, actual )

        # test scalar
        for pi, expectedi in zip(p,expected):
            actuali = mlab.prctile(ob1,pi)
            assert np.allclose( expectedi, actuali )
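For reference, one way a percentile function can agree with np.median on even-length input is to interpolate linearly between the two neighbouring order statistics. The sketch below is only an illustration of that idea, not the proposed fix itself; the name `percentile_linear` is hypothetical and nothing here is taken from mlab or scipy.

```python
import numpy as np


def percentile_linear(x, p):
    """Percentile(s) of x by linear interpolation between order statistics.

    Hypothetical helper: p may be a scalar or a sequence in [0, 100].
    """
    values = np.sort(np.asarray(x, dtype=float))
    scalar = np.ndim(p) == 0
    p = np.atleast_1d(np.asarray(p, dtype=float))
    # Fractional position of each percentile in the sorted data.
    pos = p / 100.0 * (values.size - 1)
    lower = np.floor(pos).astype(int)
    upper = np.minimum(lower + 1, values.size - 1)
    frac = pos - lower
    result = values[lower] * (1.0 - frac) + values[upper] * frac
    return result[0] if scalar else result


# Even-length input: the 50th percentile matches the median.
x = [1, 2, 3, 4]
print(percentile_linear(x, 50) == np.median(x))
```

On the ob1 data from the test above, this interpolation rule gives 5.5 for the 75th percentile, for both the scalar and the vectorized call.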