cluster analysis - R Univariate Clustering by Group
I am trying to find a method to cluster univariate data by group. For example, in the data below I have 2 failure codes (a and b) and 6 data points for each grouping. In the plot you can see that for each failure code there are 2 distinct clusters of failure time. Doing this manually isn't bad, but I can't figure out how to do it for a larger data set (~100k rows and ~30 codes). The end result should give me the medoid of each cluster and the count of codes in that cluster.
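    library(ggplot2)

    # Sample data: 2 failure codes, 6 time-to-failure values each
    failure <- rep(c("a", "b"), each = 6)
    ttf <- c(1, 1.5, 2, 5, 5.5, 6, 8, 8.5, 9, 14, 14.5, 15)
    data <- data.frame(failure, ttf)
    qplot(failure, ttf)

    # Medoids picked out by hand for each failure code
    results <- data.frame(failure = c("a", "b"), m1 = c(1.5, 8.5), m2 = c(5.5, 14.5))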
I would like the end result to give me the table below.

    failure    m1  m1count    m2  m2count
    a         1.5        3   5.5        3
    b         8.5        3  14.5        3
This does what you want, assuming 2 clusters per failure group, though you can change that in the tapply call, which applies kmeans to each of the failure groups.
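    # Run kmeans with 2 centers separately on each failure group
    res2 <- tapply(data$ttf, INDEX = data$failure, function(x) kmeans(x, 2))

    # Collect the centers and cluster sizes of each group into one data frame
    res3 <- lapply(names(res2), function(x) data.frame(failure = x, centers = res2[[x]]$centers, size = res2[[x]]$size))
    res3 <- do.call(rbind, res3)
    res3

       failure centers size
    1        a     5.5    3
    2        a     1.5    3
    11       b    14.5    3
    21       b     8.5    3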
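Note that kmeans reports cluster means, while the question asks for medoids and a wide table with one row per failure code. The following is a minimal sketch of one way to get that, assuming the `cluster` package is available and swapping in its `pam` function (partitioning around medoids) for kmeans. The helper name `medoid_summary` and its `k` argument are hypothetical and used only for illustration.

```r
library(cluster)  # provides pam(); assumed to be installed

# Sample data from the question
failure <- rep(c("a", "b"), each = 6)
ttf <- c(1, 1.5, 2, 5, 5.5, 6, 8, 8.5, 9, 14, 14.5, 15)
data <- data.frame(failure, ttf)

# Hypothetical helper: medoids and cluster sizes for one group's values.
# k is the assumed number of clusters per group.
medoid_summary <- function(x, k = 2) {
  fit <- pam(matrix(x, ncol = 1), k)
  med <- as.numeric(fit$medoids)
  size <- as.integer(fit$clusinfo[, "size"])
  ord <- order(med)                     # sort clusters by medoid value
  out <- c(rbind(med[ord], size[ord]))  # interleave medoid, count, medoid, count
  names(out) <- paste0("m", rep(seq_len(k), each = 2), c("", "count"))
  out
}

# Apply the helper to each failure group and bind the results row-wise
wide <- do.call(rbind, lapply(split(data$ttf, data$failure), medoid_summary))
wide <- data.frame(failure = rownames(wide), wide, row.names = NULL)
wide
```

For the sample data this should give the medoids 1.5 and 5.5 for a, and 8.5 and 14.5 for b, each with a count of 3. Because `pam` works from pairwise dissimilarities, its cost grows quickly with group size; for very large groups, `clara` from the same package is a sampling-based alternative worth considering.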