determine if data is uni- or bimodal

Question 1

I have a dataset where many, but not all, of the groups seem to have a bimodal distribution.

I can fit a mixture model for each group in the dataset, but I'd like to know how to test whether a unimodal (single-component) model is a better fit.

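library(dplyr)     # filter(), select(), %>%
library(mixtools)  # normalmixEM()
dfA <- filter(df, group == "A") %>% select(samp)
mmA <- normalmixEM(as.numeric(dfA$samp))  # samp is stored as character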

Here's a histogram of each group; the first group is the one where the data looks more bimodal:

as.data.frame(cbind(group, samp)) %>%
  ggplot() +
  geom_histogram(aes(x = as.numeric(samp)), bins = 10) +
  facet_wrap(~group)


Here's a subset of my data:

structure(list(group = c("A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", 
"C", "C", "C", "C", "C", "C"), samp = c("260", "205", "244", 
"241", "218", "217", "261", "208", "238", "194", "227", "237", 
"229", "273", "210", "176", "286", "231", "196", "269", "238", 
"288", "196", "215", "192", "300", "233", "180", "200", "227", 
"192", "187", "245", "255", "180", "215", "229", "192", "219", 
"214", "215", "226", "285", "201", "199", "280", "232", "223", 
"183", "202", "217", "192", "267", "219", "187", "277", "230", 
"172", "221", "218", "196", "184", "210", "176", "295", "218", 
"193", "177", "214", "295", "187", "219", "188", "187", "219", 
"180", "175", "210", "299", "193", "212", "262", "196", "224", 
"215", "175", "224", "260", "285", "222", "203", "191", "213", 
"189", "209", "223", "188", "206", "208", "203", "290", "219", 
"249", "172", "236", "291", "287", "212", "294", "255", "230", 
"278", "190", "224", "287", "195", "224", "196", "189", "216", 
"253", "210", "220", "190", "180", "217", "193", "264", "222", 
"253", "176", "224", "270", "218", "213", "180", "263", "224", 
"182", "290", "222", "276", "176", "215", "190", "185", "223", 
"200", "179", "205", "210", "210", "229", "257", "209", "215", 
"183", "271", "217", "273", "296", "220", "206", "194", "227", 
"187", "272", "223", "212", "202", "223", "179", "168", "220", 
"179", "287", "243", "208", "265", "234", "255", "207", "234", 
"201", "180", "208", "190", "230", "224", "285", "278", "219", 
"260", "183", "212", "292", "188", "206", "294", "184", "223", 
"189", "177", "222", "259", "260", "225", "254", "267", "220", 
"295", "290", "214", "275", "188", "220", "201", "194", "213", 
"194", "290", "197", "208", "238", "239", "208", "199", "229", 
"199", "178", "224", "231", "286", "227", "169", "182", "231", 
"186", "191", "210", "260", "223", "216", "176", "195", "218", 
"172", "287", "197", "201", "190", "202", "260", "193", "204", 
"267", "192", "206", "171", "182", "200", "275", "184", "226", 
"285", "294", "216", "283", "193", "230", "226", "197", "208", 
"245", "183", "225", "167", "185", "216", "257", "272", "219", 
"286", "275", "217", "274", "185", "231", "295", "252", "231", 
"186", "271", "220", "201", "264", "222", "302", "273", "207"
)), class = "data.frame", row.names = c(NA, -300L))

Question 2

All three are either bimodal or trimodal, at least at the binning that was selected for the histogram function. Have you done any searching on fitting mixture distributions?

Question 3

@IRTFM Yes, I believe I fit a mixture model with "normalmixEM" above, but I'm trying to sort out how to determine whether the data is truly bimodal (or higher) or unimodal, and I can't find information on doing that step. The data I provided is a small subset of my full dataset, so the distributions might not be quite right.
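
One possible way to approach that step, sketched under stated assumptions rather than offered as a definitive method: fit both a single normal and a two-component normal mixture to one group's values and compare the two with BIC (lower is better). The helper names mix_loglik, fit_two_normals and compare_one_vs_two are made up for this illustration, the EM loop is a bare-bones base R version rather than the one in mixtools, and the demo data are simulated. This compares the number of mixture components, which is not quite the same as the number of modes, since two overlapping components can still produce a single peak.

```r
# Log-likelihood of a two-component normal mixture.
mix_loglik <- function(x, lambda, mu, sigma) {
  sum(log(lambda * dnorm(x, mu[1], sigma[1]) +
            (1 - lambda) * dnorm(x, mu[2], sigma[2])))
}

# Minimal EM fit of a two-component normal mixture (hypothetical helper).
fit_two_normals <- function(x, max_iter = 1000, tol = 1e-8) {
  # Crude starting values: split the data at the median.
  lower <- x[x <= median(x)]
  upper <- x[x > median(x)]
  lambda <- 0.5
  mu <- c(mean(lower), mean(upper))
  sigma <- c(sd(lower), sd(upper))
  loglik_old <- mix_loglik(x, lambda, mu, sigma)
  for (i in seq_len(max_iter)) {
    # E-step: posterior probability of each point belonging to component 1.
    d1 <- lambda * dnorm(x, mu[1], sigma[1])
    d2 <- (1 - lambda) * dnorm(x, mu[2], sigma[2])
    p1 <- d1 / (d1 + d2)
    p2 <- 1 - p1
    # M-step: weighted updates of mixing proportion, means and sds.
    lambda <- mean(p1)
    mu <- c(sum(p1 * x) / sum(p1), sum(p2 * x) / sum(p2))
    sigma <- c(sqrt(sum(p1 * (x - mu[1])^2) / sum(p1)),
               sqrt(sum(p2 * (x - mu[2])^2) / sum(p2)))
    loglik <- mix_loglik(x, lambda, mu, sigma)
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(lambda = lambda, mu = mu, sigma = sigma, loglik = loglik)
}

# BIC for one normal (2 parameters) versus two normals (5 parameters).
compare_one_vs_two <- function(x) {
  n <- length(x)
  sigma_hat <- sqrt(mean((x - mean(x))^2))  # maximum-likelihood sd
  loglik_one <- sum(dnorm(x, mean(x), sigma_hat, log = TRUE))
  fit_two <- fit_two_normals(x)
  c(one = -2 * loglik_one + 2 * log(n),
    two = -2 * fit_two$loglik + 5 * log(n))
}

# Demo on simulated (not real) data with two well-separated groups.
set.seed(1)
sim <- c(rnorm(200, 190, 8), rnorm(200, 270, 8))
print(compare_one_vs_two(sim))
```

With the data above, the call would be something like compare_one_vs_two(as.numeric(dfA$samp)). The "loglik" element returned by normalmixEM could be used in place of the hand-written EM. Hartigan's dip test (dip.test in the diptest package) is an alternative that tests unimodality directly without assuming normal components.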
