Follow-up Notes on R Tutorial

From our R expert (R-xpert?) Kevin Miklasz. Thanks Kevin!

Here is a link to an expanded sample file that shows some common functions I use. It focuses mostly on graphing parameters I use a lot (at the bottom of the script). None of the graphing functions will run, because they rely on data sets that I can’t share, but students can at least see some common graphing parameters that can be adjusted and reuse bits of the code in their own visualizations. I’m also including a list of preloaded R colors, and this image shows how the pch values in plots correspond to different shapes.
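If you want to look at the colors and point shapes directly in R, here is a minimal sketch that uses only base R. It is not taken from the sample file; the grid layout and the two color names used here ("steelblue" and "gold") are just choices made for illustration.

```r
# List the color names that ship with R (colors() is in base R's grDevices).
all_colors <- colors()
print(head(all_colors, 10))

# The standard plotting symbols are pch = 0 through 25.
pch_values <- 0:25

# Lay the symbols out on a 9-column grid so each one is easy to read.
x <- pch_values %% 9
y <- 2 - pch_values %/% 9

plot(x, y,
     pch = pch_values,
     col = "steelblue",  # outline color (illustrative choice)
     bg = "gold",        # fill color, only used by pch 21 to 25
     cex = 2,
     xlim = c(-0.5, 8.5), ylim = c(-0.7, 2.5),
     axes = FALSE, xlab = "", ylab = "",
     main = "pch values 0 to 25")

# Label each symbol with its pch number, just below the point.
text(x, y, labels = pch_values, pos = 1, offset = 1)
```

Swapping `col`, `bg`, and `cex` for other values is a quick way to see what each graphing parameter does.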

I might have come down a little too hard on p-values yesterday. They are still useful, and you should definitely test for them. But your process should look a bit like this:

Is there a significant p-value? If yes, go on.
Do you have a large sample size? If no, you’re good with the p-value! If yes, go on, because with a very large sample even a tiny difference can come out significant.
So the difference is significant, but is it meaningful? Is there an external measure of meaningfulness that you can use? Or can you shift to a data mining approach that can use cross-validation?

It’s definitely OK to explore your data initially with correlation tests and t-tests, but before making a strong claim that there’s some real pattern in your data, you probably want to do some analysis beyond just finding a significant p-value. P-values should be a “necessary but not sufficient” kind of thing.
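To make the "significant but is it meaningful?" question concrete, here is a rough sketch on simulated data. Everything in it is an assumption made for illustration: the data are made up, the 0.05 cutoff is just the conventional one, and Cohen's d is only one possible measure of meaningfulness (the notes above don't prescribe a specific one).

```r
# Simulated data (an assumption for illustration): two very large groups
# whose true means differ by only half a point on a scale with sd = 15.
set.seed(42)
n <- 100000
group_a <- rnorm(n, mean = 100.0, sd = 15)
group_b <- rnorm(n, mean = 100.5, sd = 15)

# Step 1: is there a significant p-value?
test <- t.test(group_a, group_b)
p_value <- test$p.value

# Step 3: is the difference meaningful? Cohen's d (difference in means
# divided by the pooled standard deviation) is one common effect size.
# This pooled-sd formula assumes the two groups have equal sizes.
pooled_sd <- sqrt((var(group_a) + var(group_b)) / 2)
cohens_d <- (mean(group_b) - mean(group_a)) / pooled_sd

cat("p-value:  ", format.pval(p_value), "\n")
cat("Cohen's d:", round(cohens_d, 3), "\n")
```

With a sample this large, the p-value comes out well under 0.05 even though the effect size is very small, which is the situation where you want more than the p-value before making a strong claim.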