##### Code for Analyzing Categorical Variables in R

### Import Data
datum=read.csv(file.choose())  ### creates a data frame named datum from imported csv file
head(datum)  ### check data was imported properly

### Since R 4.0.0, read.csv no longer converts text columns to factors, so we must tell R that 'Sex' is a categorical (or 'factor') variable
datum$Sex=as.factor(datum$Sex)  # Tell R that Sex is a 'factor' or categorical variable
levels(datum$Sex)  # Check that it worked by asking R what the groups of 'Sex' are

### Plot Data
plot(Mass~Sex,data=datum)  # Plot relationship between sex and mass
# Note that R plots a box and whiskers plot because Sex is categorical

### Run a t-test using the t.test command
help(t.test)  ### pull up help file for 't.test'
results=t.test(Mass~Sex,data=datum,var.equal=TRUE)  ### run a t-test, call results 'results'
# var.equal=TRUE assumes equal group variances (homoscedasticity), which matches the assumption of the linear model
# default is var.equal=FALSE (Welch's t-test), which is standard for t-tests; allows for unequal variance of groups
results  ### prints results - Note absence of 'summary' statement - one of few tests that doesn't use 'summary'
# Note the output lists each group's mean but not the difference between them (the estimate of effect)!

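### Sketch: getting the estimate of effect out of a t.test result
# A minimal illustration, not part of the original analysis. The data frame 'sim', its
# group labels "F" and "M", and its values are made up (hypothetical) so that these
# lines run without the csv file.
set.seed(1)
sim=data.frame(Sex=factor(rep(c("F","M"),each=10)),
               Mass=c(rnorm(10,mean=20,sd=2),rnorm(10,mean=23,sd=2)))
simres=t.test(Mass~Sex,data=sim,var.equal=TRUE)  # same call as above, on the made-up data
simres$estimate  # the two group means, in the order of levels(sim$Sex)
effect=unname(simres$estimate[2]-simres$estimate[1])  # mean of M minus mean of F
effect  # the estimate of effect that the printed output leaves out
simres$conf.int  # CI for (mean of F) - (mean of M), so its sign is the reverse of 'effect'
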
### Run a t-test using the lm command
resultslm=lm(Mass~Sex,data=datum)  ### runs a regression with mass as dependent and sex as independent variables
summary(resultslm)  ### give results from regression analysis
confint(resultslm)  ### give confidence intervals of coefficient estimates

### Run regression on dummy coded variable
resultsDum=lm(Mass~Males,data=datum)  ### runs a regression with mass as dependent and the dummy variable 'Males' as independent
# Note that Males = 1 for males and 0 otherwise
summary(resultsDum)
confint(resultsDum)  ### provide confidence intervals on estimates in 'resultsDum'

### Import data 2
datum2=read.csv(file.choose())  ### creates a data frame named datum2 from imported csv file
head(datum2)  ### check data was imported properly

### Since R 4.0.0, read.csv no longer converts text columns to factors, so we must tell R that 'Sex' is a categorical (or 'factor') variable
datum2$Sex=as.factor(datum2$Sex)  # Tell R that Sex is a 'factor' or categorical variable
levels(datum2$Sex)  # Check that it worked by asking R what the groups of 'Sex' are

### Plot Data
plot(Mass~Sex,data=datum2)  # Plot relationship between sex and mass
# Note that R plots a box and whiskers plot because Sex is categorical

### run an ANOVA using aov command
results3=aov(Mass~Sex,data=datum2)  ### generate an ANOVA for effect of sex on mass, call it results3
summary(results3)  ### give summary of ANOVA
# returns ANOVA table with degrees of freedom, sums of squares, mean squares, F statistic and p-value
# Significant p-value indicates that at least 2 group means differ

### run an ANOVA using the lm command
results3lm=lm(Mass~Sex,data=datum2)  ### run a regression with mass as dependent and sex as independent variables
summary(results3lm)  ### Give results
# note lm and aov give the same F statistic and p-value; in the lm output they are at the bottom of the summary

### run an ANOVA using dummy coded variables
results3Dum=lm(Mass~Males+Herm,data=datum2)  ### runs a regression with mass as dependent and the dummy variables for sex as independent
# Note that + in the lm equation above doesn't actually mean 'add' - it means include both variables in the model
summary(results3Dum)
# Note that results3Dum and results3lm are the same model (the coefficients match when the group without a dummy variable is the reference level of Sex)

### How to change the reference category
help(relevel)  # use the relevel command to change reference categories
results4=lm(Mass~relevel(Sex,ref="Herm"),data=datum2)  # Changes the reference group to "Herm"
summary(results4)
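
### Sketch: what dummy coding and relevel do to the coefficients
# A minimal illustration on made-up data. The data frame 'sim3', its group labels
# "F", "Herm" and "M", and its values are assumptions, not the contents of the
# imported csv file.
set.seed(2)
sim3=data.frame(Sex=factor(rep(c("F","Herm","M"),each=8)),
                Mass=rnorm(24,mean=rep(c(20,22,25),each=8),sd=2))
sim3$Males=as.numeric(sim3$Sex=="M")    # 1 for the M group, 0 otherwise
sim3$Herm=as.numeric(sim3$Sex=="Herm")  # 1 for the Herm group, 0 otherwise
fitFactor=lm(Mass~Sex,data=sim3)        # R builds the dummy variables itself; reference level is "F"
fitDummy=lm(Mass~Males+Herm,data=sim3)  # dummy variables built by hand
head(model.matrix(fitFactor))           # the 0/1 columns R created for Sex
all.equal(unname(fitted(fitFactor)),unname(fitted(fitDummy)))  # TRUE when the two fits agree
coef(fitFactor)  # intercept = mean of F; SexHerm and SexM = differences from F
fitRef=lm(Mass~relevel(Sex,ref="Herm"),data=sim3)
coef(fitRef)     # intercept = mean of Herm; other coefficients = differences from Herm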