Yes, I understand there are alternatives -- but I still think a simple, binned histogram is a fairly basic feature. KDEs are nice but can easily be overtweaked (if I see one I certainly want to know how the bandwidth was selected, otherwise it's no better than a histogram -- even worse, as the issue is now hidden). And while CDFs (essentially your second proposition) can be useful, some kinds of data are traditionally represented as histograms, and CDFs would only confuse readers.

Antony

2014-05-30 15:11 GMT-07:00 Mark Voorhies <mar...@uc...>:

> On 05/30/2014 08:25 AM, Antony Lee wrote:
>
>> I may still need to bin data, e.g. when the data range is "large", or at
>> least not small compared to the number of data points.
>> Antony
>
> Two alternatives to histograms that you might consider:
>
> Kernel density estimation (KDE)
>
> * This blog post has a good discussion motivating KDE from issues with bin
>   choice in histograms:
>   http://www.mglerner.com/blog/?p=28
> * And this follow up explores the various KDE implementations in the
>   "Scientific Python" stack:
>   http://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation/
>
> A rank vs. value plot, e.g.:
>
>     plot(sorted(r))
>
> This is horizontal for peaks (lots of copies of similar values) and
> vertical for tails/gaps, so it presents the same information as a
> histogram, but without requiring bin choice.
>
> --Mark
>
>> 2014-05-30 5:03 GMT-07:00 Yoshi Rokuko <yo...@ro...>:
>>
>>> On 2014-05-29 14:14:52 -0700, Antony Lee <ant...@be...> wrote:
>>>
>>>> Hi,
>>>> When histogramming integer data, is there an easy way to tell
>>>> matplotlib that I want a certain number of bins, and each bin to
>>>> cover an equal number of integers (except possibly the last one)?
>>>> (in order to avoid having some bins higher than others merely because
>>>> they cover more integers) I know I can pass in an explicit bins array
>>>> (something like list(range(min, max, (max-min)//n)) + [max]) but I was
>>>> hoping for something simpler, like hist(data, nbins=42,
>>>> equal_integer_coverage=True).
>>>> Best,
>>>> Antony
>>>
>>> Int data is discrete. For discrete variables you don't need bins; you
>>> don't estimate the frequency distribution, you know it exactly by
>>> counting.
>>>
>>> Of course you could do that with the hist function:
>>>
>>>     >>> pl.hist(r, np.arange(min(r)-0.5, max(r)+1.5), histtype='step')
>>>
>>> _______________________________________________
>>> Matplotlib-users mailing list
>>> Mat...@li...
>>> https://lists.sourceforge.net/lists/listinfo/matplotlib-users
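To illustrate the point about bandwidth selection, here is a minimal sketch of a hand-rolled Gaussian kernel density estimate in plain Python. It is not the implementation from either linked blog post or from any library; the helper names `gaussian_kde_sketch` and `count_peaks`, the sample data, and the two bandwidths are all made up for illustration. It only shows that the same data can look like many peaks or a single smooth bump depending on the bandwidth.

```python
import math


def gaussian_kde_sketch(samples, bandwidth):
    """Return a density function built from Gaussian kernels (hypothetical helper)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, each with the same width.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def count_peaks(density, lo, hi, steps=400):
    """Count strict local maxima of `density` on an evenly spaced grid."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])


# Made-up integer data with two clusters.
samples = [1, 1, 2, 2, 2, 3, 7, 8, 8, 9]

narrow = gaussian_kde_sketch(samples, bandwidth=0.3)  # resolves individual values
wide = gaussian_kde_sketch(samples, bandwidth=5.0)    # smooths the clusters together

peaks_narrow = count_peaks(narrow, -5, 15)
peaks_wide = count_peaks(wide, -5, 15)
print(peaks_narrow, peaks_wide)
```

The narrow bandwidth produces more peaks than the wide one, so a reader who is not told the bandwidth cannot tell which features come from the data and which from the smoothing.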
I may still need to bin data, e.g. when the data range is "large", or at least not small compared to the number of data points.

Antony

2014-05-30 5:03 GMT-07:00 Yoshi Rokuko <yo...@ro...>:

> On 2014-05-29 14:14:52 -0700, Antony Lee <ant...@be...> wrote:
>
> > Hi,
> > When histogramming integer data, is there an easy way to tell
> > matplotlib that I want a certain number of bins, and each bin to
> > cover an equal number of integers (except possibly the last one)?
> > (in order to avoid having some bins higher than others merely because
> > they cover more integers) I know I can pass in an explicit bins array
> > (something like list(range(min, max, (max-min)//n)) + [max]) but I was
> > hoping for something simpler, like hist(data, nbins=42,
> > equal_integer_coverage=True).
> > Best,
> > Antony
>
> Int data is discrete. For discrete variables you don't need bins; you
> don't estimate the frequency distribution, you know it exactly by
> counting.
>
> Of course you could do that with the hist function:
>
>     >>> pl.hist(r, np.arange(min(r)-0.5, max(r)+1.5), histtype='step')
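For the case where the integer range is too large for one bin per value, the explicit bins array mentioned in the quoted question can be sketched as follows. `integer_bin_edges` is a hypothetical helper, not a matplotlib function, and the choice to place edges on half-integers and clip the last bin is an assumption. Every bin covers the same number of integers except possibly the last, so the result may contain fewer than `nbins` bins.

```python
def integer_bin_edges(lo, hi, nbins):
    """Edges for at most `nbins` bins over the integers lo..hi inclusive.

    Hypothetical helper: each bin spans the same number of integers,
    except possibly the last one, which is clipped at hi + 0.5.
    """
    if hi < lo or nbins < 1:
        raise ValueError("need lo <= hi and nbins >= 1")
    span = hi - lo + 1                     # how many integers to cover
    width = (span + nbins - 1) // nbins    # integers per bin, rounded up
    edges = []
    edge = lo - 0.5                        # half-integer edges keep integers off the boundaries
    while edge < hi + 0.5:
        edges.append(edge)
        edge += width
    edges.append(hi + 0.5)                 # close the (possibly shorter) last bin
    return edges


data = [0, 1, 1, 2, 3, 3, 3, 5, 8, 9]      # made-up integer data
edges = integer_bin_edges(min(data), max(data), 4)
print(edges)
```

The resulting list can be passed as the `bins` argument, e.g. `pl.hist(data, bins=edges)`. Because the last bin may cover fewer integers, its height is not directly comparable to the others.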
On 2014-05-29 14:14:52 -0700, Antony Lee <ant...@be...> wrote:

> Hi,
> When histogramming integer data, is there an easy way to tell
> matplotlib that I want a certain number of bins, and each bin to
> cover an equal number of integers (except possibly the last one)?
> (in order to avoid having some bins higher than others merely because
> they cover more integers) I know I can pass in an explicit bins array
> (something like list(range(min, max, (max-min)//n)) + [max]) but I was
> hoping for something simpler, like hist(data, nbins=42,
> equal_integer_coverage=True).
> Best,
> Antony

Int data is discrete. For discrete variables you don't need bins; you don't estimate the frequency distribution, you know it exactly by counting.

Of course you could do that with the hist function:

    >>> pl.hist(r, np.arange(min(r)-0.5, max(r)+1.5), histtype='step')
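The counting argument above can be spelled out without any plotting library. This is only a sketch: the list `r` is made-up data, and the `edges` list is a plain-Python stand-in for the `np.arange(min(r)-0.5, max(r)+1.5)` call, giving one unit-width bin centred on each integer.

```python
from collections import Counter

r = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]   # made-up integer data

# Exact frequency of every integer between min(r) and max(r).
counts = Counter(r)
values = list(range(min(r), max(r) + 1))
frequencies = [counts[v] for v in values]   # Counter returns 0 for absent values

# Half-integer edges: one unit-width bin centred on each integer.
edges = [v - 0.5 for v in range(min(r), max(r) + 2)]

for v, f in zip(values, frequencies):
    print(v, f)
```

With these edges, the bar heights drawn by `hist` are the same numbers as `frequencies`, since each bin contains exactly one integer.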