Assignment #8 - 2019

For each of the following data sets, analyze the data using generalized linear modeling with the appropriate family structure. There are no interactions, collinearity, or other complications; just straight analysis.

As always, create a Word document with a paragraph describing your results for each data set.

For each data set, I've also included the Excel file that was used to randomly generate the data so that you can see how the data were generated and also see the true model and effect sizes. However, be sure your analysis is on the CSV file (don't save a new CSV file from the Excel data; the data will change and I won't be able to evaluate your interpretation of the results).

DataSet #1 (Excel Version) - In this example, you will be doing a habitat analysis. You have sampled 100 longleaf pine stands for red-cockaded woodpeckers. In addition to whether or not the bird was detected (Y - 'Present'; which we will assume was done without error), you collected data on the stand age (X1 - in years), the density of the trees (X2 - in trees per 100 m²), and whether the stand is burned regularly by management personnel (X3 - every 3 years) or only burned when fire occurs naturally.
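Because the response here is detected / not detected, a binomial family is the natural candidate. The block below is only a rough sketch of what such a fit might look like. It runs on simulated stand-in data, not the assignment CSV, and the column names (Age, Density, Burned), the factor labels, and the effect sizes are all assumptions made up for illustration.

```r
# Simulated stand-in for the assignment CSV. Column names, factor labels,
# and effect sizes are hypothetical, chosen only for illustration.
set.seed(1)
n <- 100
stands <- data.frame(
  Age     = runif(n, 10, 120),   # X1: stand age in years
  Density = runif(n, 1, 10),     # X2: trees per 100 m^2
  Burned  = factor(sample(c("Managed", "Natural"), n, replace = TRUE))  # X3
)

# Build a 0/1 response from an assumed linear predictor on the logit scale.
eta <- -3 + 0.04 * stands$Age - 0.1 * stands$Density +
  1 * (stands$Burned == "Managed")
stands$Present <- rbinom(n, size = 1, prob = plogis(eta))

# Binary response, so use the binomial family (logit link by default).
fit1 <- glm(Present ~ Age + Density + Burned, family = binomial, data = stands)
summary(fit1)

# Coefficients are on the log-odds scale; exponentiate to get odds ratios.
exp(coef(fit1))
```

With the real data you would replace the simulated data frame with the result of read.csv() on the CSV file and use whatever column names it actually contains.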

In addition to the paragraph, answer the following question in your Word document:

DataSet #2 (Excel Version) - In this example, you are conducting a study of bird biodiversity. You have sampled 100 randomly chosen habitat patches for birds. In addition to the number of bird species detected in the patch (Y - 'Present'), you also collected data on the habitat type (X1), and the understory density (X2 - an index from 0 to 100 generated using a density board, where 100 means so thick you can't see through it (100% obstruction) and 0 means no understory (0% obstruction)). In your paragraph, be sure you describe the differences among all three habitat types!
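A species count is usually a candidate for the Poisson family. With a three-level factor, the default summary compares each level only with the reference level, so one pairwise habitat comparison is missing. The sketch below shows one possible way to recover it by refitting with a different reference level. The data are simulated, and the habitat names (Forest, Grassland, Wetland), column names, and effect sizes are assumptions, not the contents of the assignment CSV.

```r
# Simulated stand-in for the assignment CSV. Habitat names, column names,
# and effect sizes are hypothetical, chosen only for illustration.
set.seed(2)
n <- 100
patches <- data.frame(
  Habitat    = factor(sample(c("Forest", "Grassland", "Wetland"), n,
                             replace = TRUE)),   # X1: habitat type
  Understory = runif(n, 0, 100)                  # X2: density index, 0 to 100
)

# Build a count response from an assumed linear predictor on the log scale.
log_mu <- 1.5 + 0.4 * (patches$Habitat == "Grassland") -
  0.3 * (patches$Habitat == "Wetland") + 0.005 * patches$Understory
patches$Species <- rpois(n, lambda = exp(log_mu))

# Count response, so use the Poisson family (log link by default).
fit2 <- glm(Species ~ Habitat + Understory, family = poisson, data = patches)
summary(fit2)   # Grassland vs Forest and Wetland vs Forest

# Refit with Grassland as the reference to get Wetland vs Grassland.
patches$Habitat2 <- relevel(patches$Habitat, ref = "Grassland")
fit2b <- glm(Species ~ Habitat2 + Understory, family = poisson, data = patches)
summary(fit2b)$coefficients
```

Both fits describe the same model, so the deviance is identical; only the habitat contrasts that are reported change. Exponentiating a coefficient gives the multiplicative change in the expected species count.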

In addition to the paragraph, answer the following question in your comments:

DataSet #3 (Excel Version) - In this example, you are studying factors that influence the risk of failure to conceive after artificial insemination in rhinos. In addition to whether or not individuals conceived from the insemination (Y - 'Conceived'), you collected information on the amount of sperm the individual received (X1 - in millions of sperm, ranging from 5 to 60), and whether the sperm sample was fresh or had been cryo-frozen (X2). Note that each individual was monitored during 10 insemination events. Thus, there is a random effect of individual. To account for this random effect, you will need to use the glmer function in the lme4 package. Unlike lme, lmer and glmer embed the random statement in the regular formula like this: +(1|individual). Don't forget your family statement!
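To show where the random statement and the family statement go in a glmer call, here is a minimal sketch on simulated data. It assumes the lme4 package is installed. The number of individuals (30), the column names (Sperm, Storage), and the effect sizes are made up for illustration and are not taken from the assignment CSV.

```r
library(lme4)

# Simulated stand-in for the assignment CSV. The number of individuals,
# column names, and effect sizes are hypothetical.
set.seed(3)
n_ind    <- 30   # assumed number of rhinos
n_events <- 10   # insemination events per individual, as in the assignment
rhinos <- data.frame(
  individual = factor(rep(seq_len(n_ind), each = n_events)),
  Sperm      = runif(n_ind * n_events, 5, 60),   # X1: millions of sperm
  Storage    = factor(sample(c("Fresh", "Frozen"), n_ind * n_events,
                             replace = TRUE))    # X2: fresh or cryo-frozen
)

# One random intercept per individual, added on the logit scale.
u   <- rnorm(n_ind, mean = 0, sd = 0.8)
eta <- -1.5 + 0.05 * rhinos$Sperm - 0.7 * (rhinos$Storage == "Frozen") +
  u[as.integer(rhinos$individual)]
rhinos$Conceived <- rbinom(nrow(rhinos), size = 1, prob = plogis(eta))

# The random intercept goes inside the formula; the family is still required.
fit3 <- glmer(Conceived ~ Sperm + Storage + (1 | individual),
              family = binomial, data = rhinos)
summary(fit3)
```

Note that this models the probability of conceiving. If you would rather talk about the risk of failure, modeling 1 - Conceived instead flips the sign of every fixed effect on the logit scale and leaves the rest of the fit unchanged.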

In addition to the paragraph, answer the following question in your comments: