r - How to select a subset of least resembling combinations of a common vector? -


i need select 10 times different combinations of 15 items common set of 30 items, several constraints.

first constraint: each of 10 combinations, need remove 3 items should not present in specific combination. let's wf01, wf02 , wf03 first combination; wf04, wf05 , wf06 second combination , on.

second constraint: on ten combinations, each item should present same number of times (5).

third constraint: ten combinations should different possible 1 (i.e. should avoid sharing common items as possible).

i doing in excel, guess there should more efficient way in r!

here coding far:

# data: 30 items (pictures choose from)  picturenames <- c("wf01", "wf02", "wf03", "wf04", "wf05", "wf06", "wf07", "wf08", "wf09", "wf10", "wf11", "wf12", "wf13", "wf14", "wf15", "wf16", "wf17", "wf18", "wf19", "wf20", "wf21", "wf22", "wf23", "wf24", "wf25", "wf26", "wf27", "wf28", "wf29", "wf30")  set.seed(10)  sample01 <- sample(picturenames[4:30], 15) sample02 <- sample(picturenames[c(1:3, 7:30)], 15) sample03 <- sample(picturenames[c(1:6, 10:30)], 15) sample04 <- sample(picturenames[c(1:9, 13:30)], 15) sample05 <- sample(picturenames[c(1:12, 16:30)], 15) sample06 <- sample(picturenames[c(1:15, 19:30)], 15) sample07 <- sample(picturenames[c(1:18, 22:30)], 15) sample08 <- sample(picturenames[c(1:21, 25:30)], 15) sample09 <- sample(picturenames[c(1:24, 28:30)], 15) sample10 <- sample(picturenames[c(1:27)], 15) 

up now, implemented constraints manually in excel (i know…). first, ensured each item present 5 times across 10 different combinations. then, created matrix name of picture row , column. each cell counted how many times pair of pictures present in same combination across 10 combinations. value ranges 0 (pictures never together) 5 (pictures together). then, manually shifting pictures between combinations, managed avoid presence of 0s , 5s in matrix. afterwards, implemented optimisation task, each picture should associated many different pictures possible, while respecting previous constraints. so, each picture calculate standard deviation of column (or row) values. then, calculated standard deviation of these standard deviations. shift reduces value while respecting other constraints retained. managed arrive standard deviation of 0.036, have no idea whether best result can get. appreciated!!!

thanks in advance!

edit

after @roman-luštrik's remark, here new version of code. still horrible, , not close optimising matrix, getting there!

# data: 30 items (pictures choose from)  picturenames <- data.frame(names = c("wf01", "wf02", "wf03", "wf04", "wf05",               "wf06", "wf07", "wf08", "wf09", "wf10", "wf11", "wf12", "wf13",               "wf14", "wf15", "wf16", "wf17", "wf18", "wf19", "wf20", "wf21",               "wf22", "wf23", "wf24", "wf25", "wf26", "wf27", "wf28", "wf29",               "wf30"), stringsasfactors = false)  library(dplyr)  set.seed(10)  sample01 <- data.frame(names = sort(sample(picturenames$names[c(4:30)], 15))) sample02 <- anti_join(picturenames, sample01) sample03 <- data.frame(names = sort(sample(picturenames$names[c(1:6, 10:30)], 15))) sample04 <- anti_join(picturenames, sample03) sample05 <- data.frame(names = sort(sample(picturenames$names[c(1:12, 16:30)], 15))) sample06 <- anti_join(picturenames, sample05) sample07 <- data.frame(names = sort(sample(picturenames$names[c(1:18, 22:30)], 15))) sample08 <- anti_join(picturenames, sample07) sample09 <- data.frame(names = sort(sample(picturenames$names[c(1:24, 28:30)], 15))) sample10 <- anti_join(picturenames, sample09)  allsamples <- data.frame(sample01, sample02, sample03, sample04, sample05,                      sample06, sample07, sample08, sample09, sample10)  colnames(allsamples) <- c("sa", "sb", "sc", "sd", "se", "sf", "sg", "sh", "si", "sj")  cooccurmat <- matrix(nrow = length(picturenames$names), ncol = length(picturenames$names),                  dimnames = c(picturenames, picturenames))  for(j in 1:length(picturenames$names)){   for(i in 1:length(picturenames$names)){ cooccurmat[i,j] <- sum(ifelse(sapply(picturenames,                        function(x)colsums(xor(allsamples == picturenames$names[j],                        allsamples == picturenames$names[i]))) == 2, 1, 0)) colsd <- apply(cooccurmat, 2, sd) overallsd <- sd(colsd)   } }  cooccurmat colsd overallsd 


Comments

Popular posts from this blog

php - How to add and update images or image url in Volusion using Volusion API -

Laravel mail error `Swift_TransportException in StreamBuffer.php line 269: Connection could not be established with host smtp.gmail.com [ #0]` -

c# SetCompatibleTextRenderingDefault must be called before the first -