cluster analysis - R Univariate Clustering by Group


I am trying to find a method to cluster univariate data by group. For example, in the data below I have 2 failure codes (a and b) and 6 data points for each grouping. In the plot you can see that for each failure code there are 2 distinct clusters of failure time. Doing this manually isn't bad, but I can't figure out how to do it for a larger data set (~100k rows and ~30 codes). The end result should give me the medoid of each cluster and the count of codes in that cluster.

    library(ggplot2)

    # Sample data: 2 failure codes, 6 times-to-failure each
    failure <- rep(c("a", "b"), each = 6)
    ttf <- c(1, 1.5, 2, 5, 5.5, 6, 8, 8.5, 9, 14, 14.5, 15)
    data <- data.frame(failure, ttf)
    qplot(failure, ttf)

    # Desired result, built by hand
    results <- data.frame(failure = c("a", "b"), m1 = c(1.5, 8.5), m2 = c(5.5, 14.5))

[Plot of ttf against failure code]

I would like the end result to give me a table like the one below.

    failure  m1   m1count  m2    m2count
    a        1.5  3        5.5   3
    b        8.5  3        14.5  3

This does what you want, assuming 2 clusters per failure group, though you can change that in the function that tapply applies to each failure group.

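    # Run kmeans with 2 clusters separately on each failure group
    res2 <- tapply(data$ttf, INDEX = data$failure, function(x) kmeans(x, 2))

    # Collect the centers and sizes of each group's clusters
    res3 <- lapply(names(res2), function(x)
      data.frame(failure = x, centers = res2[[x]]$centers, size = res2[[x]]$size))
    res3 <- do.call(rbind, res3)

    # Cluster order within a group depends on kmeans' random start
    res3
       failure centers size
    1        a     5.5    3
    2        a     1.5    3
    11       b    14.5    3
    21       b     8.5    3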
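The result above is in long format and reports cluster means rather than medoids. If you want the wide layout from the question (m1, m1count, m2, m2count), something along these lines might work. This is only a sketch: `cluster_summary` and its `k` argument are names made up for illustration, the medoid is taken as the observed point with the smallest total absolute distance to the other points in its cluster, and clusters are sorted by center so that m1 is always the lower one. It still relies on kmeans to form the clusters, so it is not a true k-medoids fit.

```r
# Hypothetical helper (not from the original answer): cluster one group's
# values into k clusters and return a one-row data frame of medoids and counts.
cluster_summary <- function(x, k = 2) {
  # nstart > 1 reruns kmeans from several random starts and keeps the best fit
  fit <- kmeans(x, centers = k, nstart = 10)

  # Sort cluster labels by center so m1 is always the lowest cluster
  ord <- order(fit$centers[, 1])

  out <- list()
  for (i in seq_along(ord)) {
    pts <- x[fit$cluster == ord[i]]
    # Medoid: the observed point minimising total distance to its cluster mates
    total_dist <- sapply(pts, function(p) sum(abs(pts - p)))
    out[[paste0("m", i)]] <- pts[which.min(total_dist)]
    out[[paste0("m", i, "count")]] <- length(pts)
  }
  as.data.frame(out)
}

# Sample data from the question
failure <- rep(c("a", "b"), each = 6)
ttf <- c(1, 1.5, 2, 5, 5.5, 6, 8, 8.5, 9, 14, 14.5, 15)
data <- data.frame(failure, ttf)

set.seed(1)  # kmeans starts are random; fix the seed for repeatability
pieces <- lapply(split(data$ttf, data$failure), cluster_summary, k = 2)

# Stack the one-row summaries and attach the failure code as a column
wide <- do.call(rbind, pieces)
wide <- cbind(failure = names(pieces), wide)
rownames(wide) <- NULL
wide
```

On the sample data this should reproduce the table from the question. With ~30 codes the same `lapply` over `split()` applies unchanged, but a fixed `k` for every code is an assumption you would need to check against your data.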
