Hi Users,

I'm looking to make a histogram that is normalized by the total number of items shown in the histogram. For example:

Let's say that I have an array 1000 items long. If I make a histogram in the normal way, hist(x,10), then I get a histogram showing the total number of items in each bin. What I want to do is take the total number in each bin, divide it by 1000, and then make the plot.

So if one of my bins has 350 objects in it, then it would be changed to 0.35.

Another way to say it would be that I want the height of the histogram to represent the fraction of the total. I am pretty sure that this is different from using the "normed=True" flag, but I couldn't find anyone talking about this when I searched.

Thanks,
Steven
Hi Steven,

Try this:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.random.randn(1000)
    h, binedg = np.histogram(x, 10)  # raw counts and the 11 bin edges

    wid = binedg[1:] - binedg[:-1]   # width of each bin
    # Divide the counts by the sample size so each bar is a fraction of the total.
    # align='edge' puts the left side of each bar on its bin edge
    # (newer Matplotlib releases center bars by default).
    plt.bar(binedg[:-1], h/float(x.size), width=wid, align='edge')
    plt.show()

On Nov 30, 2011, at 10:25 AM, Steven Boada wrote:
> Hi Users,
>
> I'm looking to make a histogram that is normalized by the total number
> of items shown in the histogram.
> [...]
> _______________________________________________
> Matplotlib-users mailing list
> Mat...@li...
> https://lists.sourceforge.net/lists/listinfo/matplotlib-users
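Another way to get fraction-of-total bar heights, sketched here as an alternative to the bar-based reply rather than as anyone's recommended method, is to pass per-sample weights of 1/N to `hist`. The seed, the sample data, and the variable names `rng`, `weights`, `fractions`, and `density` are illustrative assumptions. The last lines also show why this differs from the "normed=True" flag (called `density=True` in newer Matplotlib and NumPy releases): density heights are scaled so that the area sums to 1, while the weighted heights themselves sum to 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
x = rng.normal(size=1000)

# Each sample contributes 1/N to its bin, so every bar height is a fraction of the total.
weights = np.full(x.size, 1.0 / x.size)

fig, ax = plt.subplots()
fractions, edges, patches = ax.hist(x, bins=10, weights=weights)
ax.set_ylabel("fraction of total")

# For comparison: density-normalized heights over the same bins.
density, _ = np.histogram(x, bins=10, density=True)

print("sum of fraction heights:", fractions.sum())
print("sum of density heights times bin widths:", (density * np.diff(edges)).sum())
```

To view the figure, call `plt.show()` with an interactive backend or save it with `fig.savefig(...)`.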
On Wed, Nov 30, 2011 at 11:42 AM, Jeffrey Blackburne <jbl...@al...> wrote:
> Hi Steven,
>
> Try this:
> [...]

One option: you can plot the normal `hist` and then change the tick labels appropriately. Here's some code for accomplishing that:

#~~~
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator

N = 350            # number of samples
ytick_step = 0.05  # desired tick spacing, as a fraction of N
data = np.random.normal(size=N)

def norm_num(x, pos):
    # Label a tick located at raw count x with the fraction x / N.
    return '%g' % (x / float(N))

# Place ticks at raw counts that correspond to round fractions of N.
locator = MultipleLocator(N * ytick_step)
formatter = FuncFormatter(norm_num)

f, ax = plt.subplots()
ax.yaxis.set_major_formatter(formatter)
ax.yaxis.set_major_locator(locator)
ax.hist(data)
plt.show()
#~~~

Note that the formatter object is all you need to change the labels to the desired scale. On its own, though, the result will usually look ugly: the default ticks sit at round counts, which are not round fractions of N, so the tick labels come out as long floating-point numbers. The locator object fixes that by placing the ticks at multiples of N * ytick_step.

Best,
-Tony
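The effect of the locator can be checked without plotting anything. The sketch below reuses `N`, `ytick_step`, and `norm_num` from the code above; the list `round_count_ticks` is an assumed example of where default ticks might fall, not what Matplotlib is guaranteed to choose for any particular data set.

```python
N = 350
ytick_step = 0.05

def norm_num(x, pos=None):
    # Same formatting rule as in the reply: raw count -> fraction of N.
    return '%g' % (x / float(N))

# Assumed example of ticks at round raw counts (hypothetical default positions).
round_count_ticks = [0, 20, 40, 60, 80]
print([norm_num(t) for t in round_count_ticks])

# Ticks at multiples of N * ytick_step, which is what MultipleLocator produces.
located_ticks = [k * N * ytick_step for k in range(5)]
print([norm_num(t) for t in located_ticks])
```

The first list contains labels such as 0.0571429, while the second gives 0, 0.05, 0.1, 0.15, 0.2.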