
I just came across the following plot:

[image: the plot in question]

and wondered how it can be done in R (or in other software).

Update 10.03.11: Thank you to everyone who participated in answering this question; you gave wonderful solutions! I've compiled all the solutions presented here (as well as some others I came across online) in a post on my blog.

asked Sep 1, 2010 at 13:39
  • 5
    This is maybe a stupid comment, but what is the position of the dots supposed to mean? Commented Sep 1, 2010 at 14:52
  • 1
    It wasn't a stupid comment because the answer to how to plot that is plot(x,y). I'm sure mbq was trying to get at the idea that what you're trying to do may be something other than a simple scatter plot. Commented Sep 1, 2010 at 16:41
  • 1
    It's also something other than a simple violin plot since that's supposed to be symmetric around the vertical axis. Commented Sep 1, 2010 at 17:01
  • 2
    @Tal, @John -- I know how the standard vioplot works, but I can't figure out how those points were obtained (and, as I see, I'm not the only one, although it is crucial for producing a good answer) -- some kind of stem plot? Or maybe someone just thought that filling a vioplot with distorted polka dots was a good idea? Commented Sep 1, 2010 at 19:51
  • 2
    OK, I found what the software was. It's a "column scatter plot" made in GraphPad Prism. See for instance graphpad.com/help/prism5/… . I found some reference to those also here: originlab.com/www/products/… Commented Sep 2, 2010 at 16:38

5 Answers

9

Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.

Make.Funny.Plot <- function(x){
  unique.vals <- length(unique(x))
  N <- length(x)
  # aim for roughly 20 points per bin; cut() needs at least 2 intervals
  N.val <- max(2, min(floor(N/20), unique.vals))
  if(unique.vals > N.val){
    # bin the data and replace each value by the minimum of its bin
    x <- ave(x, cut(x, N.val), FUN=min)
    x <- signif(x, 4)
  }
  # construct the outline of the plot
  outline <- as.vector(table(x))
  outline <- outline/max(outline)
  # determine some correction to make the V shape,
  # based on the range
  y.corr <- diff(range(x))*0.05
  # Get the unique values
  yval <- sort(unique(x))
  plot(c(-1,1), c(min(yval),max(yval)),
       type="n", xaxt="n", xlab="")
  for(i in seq_along(yval)){
    # spread the n points of this level evenly across the outline width
    n <- sum(x==yval[i])
    x.plot <- seq(-outline[i], outline[i], length.out=n)
    y.plot <- yval[i]+abs(x.plot)*y.corr
    points(x.plot, y.plot, pch=19, cex=0.5)
  }
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)

EDIT : corrected so it always works.

answered Sep 2, 2010 at 12:46

8 Comments

Found one problem with it: If cut returns an empty level, you get an error.
+1 Good job! Still I think something is missing -- that original plot is asymmetric.
@mbq? Something missing? I just optimized that original plot. It's not a bug, it's a feature! ;-)
@Joris Maybe try using cut2 from Hmisc instead of cut?
@chl If I don't have to load other libraries, I'd rather avoid them. I just used the wrong number in the for-loop; that has been corrected now.
8

I recently came upon the beeswarm package, which produces a similar kind of plot.

The bee swarm plot is a one-dimensional scatter plot like "stripchart", but with closely-packed, non-overlapping points.

Here's an example:

 library(beeswarm)
 beeswarm(time_survival ~ event_survival, data = breast,
 method = 'smile',
 pch = 16, pwcol = as.numeric(ER),
 xlab = '', ylab = 'Follow-up time (months)',
 labels = c('Censored', 'Metastasis'))
 legend('topright', legend = levels(breast$ER),
 title = 'ER', pch = 16, col = 1:2)


(source: eklund at www.cbs.dtu.dk)

answered Oct 14, 2010 at 14:55


4

I have come up with code similar to Joris's, but I still think this is more than a stem plot. What I mean is that the y value in each series is the absolute value of the distance to the in-bin mean, and the x value depends on whether the value is lower or higher than that mean.
Example code:

px <- function(x, N = 40, ...) {
  x <- sort(x)
  # Cut into N bins; drop empty bins so they do not trigger warnings
  p <- droplevels(cut(x, N))
  # Mean of each bin, repeated for every value in that bin
  means <- ave(x, p, FUN = mean)
  # Minimum of each bin, repeated for every value in that bin
  mins <- ave(x, p, FUN = min)
  # Each dot is one value.
  # X is the order of a value inside its bin, shifted so that the values lower than the bin mean go below 0
  X <- numeric(length(x))
  for (e in levels(p)) {
    in.bin <- p == e
    X[in.bin] <- seq_len(sum(in.bin)) - 1 - sum(x[in.bin] < means[in.bin])
  }
  # Y is the bin minimum + the absolute value of the difference between a value and its bin mean
  plot(X, mins + abs(x - means), pch = 19, cex = 0.5, ...)
}
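
To make the coordinate arithmetic in px easier to follow, here is a minimal sketch that returns the coordinates instead of plotting them. The helper name px_coords and the toy data are my own assumptions for illustration, not part of the answer above.

```r
# Hypothetical helper (not from the answer): same arithmetic as px(),
# but it returns the plotting coordinates instead of drawing them.
px_coords <- function(x, N = 40) {
  x <- sort(x)
  p <- droplevels(cut(x, N))         # keep only bins that contain data
  means <- ave(x, p, FUN = mean)     # bin mean, repeated per observation
  mins  <- ave(x, p, FUN = min)      # bin minimum, repeated per observation
  rank.in.bin <- ave(x, p, FUN = seq_along)            # 1, 2, ... inside each bin
  n.below <- ave(as.numeric(x < means), p, FUN = sum)  # values below the bin mean
  data.frame(x = rank.in.bin - 1 - n.below,
             y = mins + abs(x - means))
}

toy <- c(1, 2, 3, 10, 11, 15)        # made-up data: two clearly separated bins
coords <- px_coords(toy, N = 2)
print(coords)
```

With two bins, the values 1, 2, 3 and 10, 11, 15 are handled separately, so each bin gets its own small "V" of points.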
answered Sep 2, 2010 at 14:05

1 Comment

Thank you mbq, I was wondering whose answer to pick. I chose Joris's, simply because he wrapped it up. Either way, both answers are great and won my +1 vote. Cheers - Tal
2

Try the vioplot package:

library(vioplot)
vioplot(rnorm(100))

(with awful default color ;-)

There is also wvioplot() in the wvioplot package, for weighted violin plots, and beanplot, which combines violin and rug plots. Violin plots are also available through the lattice package; see ?panel.violin.
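
As a rough illustration of the lattice route, here is a minimal sketch that passes panel.violin to bwplot(). The choice of the built-in iris data and the axis label are assumptions made only for this example.

```r
library(lattice)  # recommended package, shipped with standard R installations

# Violin outlines of sepal length for each species of the built-in iris data
violin <- bwplot(Sepal.Length ~ Species, data = iris,
                 panel = panel.violin,
                 ylab = "Sepal length")
print(violin)     # lattice plots are drawn when the object is printed
```

Like vioplot, this only draws the density outline; it does not add the individual points.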

answered Sep 1, 2010 at 13:54

6 Comments

That doesn't produce a scatterplot, does it?
@Shane no, it's just a variation of the boxplot with an added kernel density estimate
@Shane @Tal BTW, box-percentile plots are better (bpplot in the Hmisc package).
Hi chl. Thank you for the answer. I remember coming across that function, but as Shane said - it doesn't produce the scatter plot element. I'll +1 for the good intentions - but will keep this question open :). Cheers, Tal
@Tal Well, I'll try to figure out myself how to make it in R; I think it would not be so difficult using stripchart() or a jittering procedure.
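
For reference, the stripchart() and jittering ideas mentioned in the last comment could look roughly like this in base R. This is only a sketch with simulated data; the seed, sample size and jitter amount of 0.2 are arbitrary assumptions.

```r
set.seed(1)        # jitter is random; fix the seed for repeatability
y <- rnorm(100)    # simulated data, for illustration only

# Built-in version: stripchart() jitters the points itself
stripchart(y, method = "jitter", jitter = 0.2, vertical = TRUE, pch = 16)

# Manual version: spread the points around x = 1 by at most 0.2
x.jit <- jitter(rep(1, length(y)), amount = 0.2)
plot(x.jit, y, pch = 16, xlim = c(0, 2), xaxt = "n", xlab = "")
```

Unlike the plot in the question, plain jittering places the points at random, so they can overlap and do not follow the density of the data.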
2

Since this hasn't been mentioned yet: there is also ggbeeswarm, a relatively new R package based on ggplot2.

It adds another geom to ggplot2 that can be used instead of geom_jitter or the like.

In particular, geom_quasirandom (see the second example below) produces really good results, and I have in fact adopted it as my default plot.

Also noteworthy is the package vipor (VIolin POints in R), which produces plots using standard R graphics and is in fact used by ggbeeswarm behind the scenes.


set.seed(12345)
install.packages('ggbeeswarm')
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()
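
For the base-graphics route via vipor, a minimal sketch could look like the following. It assumes vipor is installed and relies on its offsetX() function to compute the horizontal offsets; the requireNamespace() guard and the axis labels are my own additions.

```r
# Sketch only: skipped when vipor is not installed
if (requireNamespace("vipor", quietly = TRUE)) {
  y <- iris$Sepal.Length
  g <- iris$Species
  # one horizontal offset per point, computed within each group
  offsets <- vipor::offsetX(y, g)
  plot(as.numeric(g) + offsets, y, pch = 16, xaxt = "n",
       xlab = "", ylab = "Sepal length")
  axis(1, at = seq_along(levels(g)), labels = levels(g))
}
```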
answered Jun 14, 2017 at 16:47
