R: One Way Anova and pairwise post hoc tests (Turkey, Scheffe or other) for numerical columns -
i have 3 columns in dataframe dune (below - bottom of page) describing % cover of marram grass recorded 3 different sand dune ecosystems:
(1) restored; (2) degraded; , (3) natural;
i have performed 2 different 1 way anova tests (below) - test 1 , test 2 - establish significant differences between ecosystems. test 1 shows significant differences between ecosystems; however, test 2 shows no significant differences. box plot's (below) show stark differences in variance between ecosystems.
afterwards, melted dataframe produce factorial column (i.e, headed ecosystem.type) response variable. idea apply glm model (test 3 - below)to test 1 way anova; however, method unsuccessful (please find error message below).
problem
i confused whether code perform each 1 way anova test correct , correct procedure perform post hoc tests (turkey hsd, scheffe or others) distinguish pairs of ecosystems different. if can help, appreciative advice. many thanks....
data(dune)
test 1
dune.type.1<-aov(natural~restored+degraded, data=dune) summary.aov(dune.type.1, intercept=t) df sum sq mean sq f value pr(>f) (intercept) 1 34694 34694 138.679 1.34e-09 *** restored 1 94 94 0.375 0.548 degraded 1 486 486 1.942 0.181 residuals 17 4253 250 --- signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
post-hoc test's
posthoc<-tukeyhsd(dune.type.1, conf.level=0.95) error in tukeyhsd.aov(dune.type.1, conf.level = 0.95) : no factors in fitted model in addition: warning messages: 1: in replications(paste("~", xx), data = mf) : non-factors ignored: restored 2: in replications(paste("~", xx), data = mf) : non-factors ignored: degraded
test 2
dune1<-aov(restored~natural, data=dune) dune2<-aov(restored~degraded, data=dune) dune3<-aov(degraded~natural, data=dune) summary(dune1) df sum sq mean sq f value pr(>f) natural 1 86 85.58 0.356 0.558 residuals 18 4325 240.26 summary(dune2) df sum sq mean sq f value pr(>f) degraded 1 160 159.7 0.676 0.422 residuals 18 4250 236.1 summary(dune3) df sum sq mean sq f value pr(>f) natural 1 168.5 168.49 2.318 0.145 residuals 18 1308.5 72.69
test 3
melt.dune<-melt(dune, measure.vars=c("degraded", "restored", "natural")) colnames(melt.dune)=c("ecosystem.type", "percentage.cover") melt.dune$percentage.cover<-as.numeric(melt.dune$percentage.cover) glm.dune<-glm(ecosystem.type~percentage.cover, data=melt.dune) summary(glm.dune) error glm.dune<-glm(ecosystem.type~percentage.cover, data=melt.dune) error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, : na/nan/inf in 'y' in addition: warning messages: 1: in ops.factor(y, mu) : ‘-’ not meaningful factors 2: in ops.factor(eta, offset) : ‘-’ not meaningful factors 3: in ops.factor(y, mu) : ‘-’ not meaningful factors
melted dataframe
structure(list(ecosystem.type = structure(c(1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l), .label = c("degraded", "restored", "natural"), class = "factor"), percentage.cover = c(12, 17, 21, 11, 22, 16, 7, 9, 14, 2, 3, 15, 23, 4, 19, 36, 26, 4, 15, 23, 38, 46, 65, 35, 54, 29, 48, 13, 19, 33, 37, 55, 11, 53, 13, 24, 28, 44, 42, 39, 18, 61, 31, 46, 51, 51, 41, 44, 55, 47, 73, 43, 25, 42, 21, 13, 65, 30, 47, 29)), row.names = c(na, -60l), .names = c("ecosystem.type", "percentage.cover"), class = "data.frame")
data
structure(list(degraded = c(12l, 17l, 21l, 11l, 22l, 16l, 7l, 9l, 14l, 2l, 3l, 15l, 23l, 4l, 19l, 36l, 26l, 4l, 15l, 23l), restored = c(38l, 46l, 65l, 35l, 54l, 29l, 48l, 13l, 19l, 33l, 37l, 55l, 11l, 53l, 13l, 24l, 28l, 44l, 42l, 39l), natural = c(18l, 61l, 31l, 46l, 51l, 51l, 41l, 44l, 55l, 47l, 73l, 43l, 25l, 42l, 21l, 13l, 65l, 30l, 47l, 29l)), .names = c("degraded", "restored", "natural"), class = "data.frame", row.names = c(na, -20l))
there several things point you.
first, test 1 , test 2 produce similar results. difference selected intercept on test 1 , outcome tells if fit linear model (i come in few minutes) intercept required. hence significance see whether line force fit needs intercept or not. try using "intercept=t" other outcomes , pretty sure similar results.
the second thing should careful linear model try fit. dune.type.1 model model see how correlated different sand dune ecosystems are. in other words, assume there linear association between natural , restored , every unit increase of restored have increase on natural. if understood correctly want examine mean differences , not correlation. can 2 things:
the data prepared perform t.tests (a test compares mean between several categories). easy in r , valid since variables reasonably normal. have multiple testing issues (you perform 3 t-tests mean differences), , need use bonferroni correction.
but think want following:
first reform data
data <- data.frame(v = c(dune$degraded, dune$restored, dune$natural), labels = c(rep("degraded", times = 20), rep("restored", times = 20), rep("natural", times = 20)))
then fit linear model
mod.1 <- lm(v ~ labels, data = data) summary(mod.1) lm(formula = v ~ labels, data = data) residuals: min 1q median 3q max -28.650 -10.725 0.875 8.050 31.350 coefficients: estimate std. error t value pr(>|t|) (intercept) 14.950 3.066 4.875 9.07e-06 *** labelsnatural 26.700 4.337 6.157 7.95e-08 *** labelsrestored 21.350 4.337 4.923 7.64e-06 ***
where can see mean of baseline category (i.e degraded) smaller mean of natural category , etc. can check model assumptions, see if model fit
qqnorm(residuals(mod.1)) qqline(residuals(mod.1))
residuals reasonably normal model fine. can follow anova approach , have:
anova.model <- aov(v ~ labels, data = data)) summary(anova.model) df sum sq mean sq f value pr(>f) labels 2 7982 3991 21.22 1.29e-07 *** residuals 57 10720 188
which indicates there @ least 1 significant difference between means of sand dune ecosystems, , follow tukey pointwise intervals:
post <- tukeyhsd(aov(v ~ labels, data = data)) plot(post, ylim = c(0, 4))
already adjusted multiple testing :)
Comments
Post a Comment