VizWhiz 1: plotting variables
Now that you have a reasonably tidy dataset and know how to tidy data in general, the fun part begins: visualizing and analyzing.
Data visualization has two very important goals:
-
Gain insight into the data you have and its quality. A plot makes it easy to see whether your data contain improbable or impossible values and whether the data are normally distributed.
-
Present the results you have found in a clear, orderly, and insightful way.
R offers very extensive plotting options: almost any kind of plot is possible, and the output is of professional quality.
Let's plot!
#Loading libraries
library(tidyverse)
library(here)
library(readr)
library(ggbeeswarm) #provides geom_quasirandom()
#Importing data (long format) ----
data_combined <- read_csv("data/data_RYouReady.csv")
#Create scatter plot (column names are case-sensitive: geslacht, not Geslacht)
data_combined %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1)) +
  geom_point()
#Create geom_jitter plot
data_combined %>%
  ggplot(aes(x = geslacht, y = stap_om_week_aantal)) +
  geom_jitter()
#Create geom_quasirandom plot
data_combined %>%
  ggplot(aes(x = geslacht, y = stap_om_week_aantal)) +
  geom_quasirandom()
The first step in your data analysis, sometimes even before cleaning and then again after cleaning, is to plot your raw data. As you could see in the video, R handles this well with, for example, geom_point and geom_jitter. We call this plotting raw data because every individual observation is shown, rather than a summary such as a mean or standard deviation. Plotting the raw data properly and looking critically at the result is an essential first step in cleaning and analyzing data.
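To illustrate why geom_jitter can show raw data more faithfully than geom_point, here is a minimal sketch on made-up data. The data frame `toy` and its column `vas` are hypothetical and only stand in for the course dataset; the jitter width of 0.2 is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical toy data: 200 patients whose VAS scores are multiples of 10,
# so many patients share exactly the same (x, y) position.
set.seed(1)
toy <- data.frame(
  geslacht = sample(c("Man", "Vrouw"), 200, replace = TRUE),
  vas      = sample(seq(0, 100, by = 10), 200, replace = TRUE)
)

# Number of distinct positions, i.e. the number of dots geom_point() can show
n_visible <- nrow(unique(toy))

# geom_point() draws identical observations on top of each other
p_point <- ggplot(toy, aes(x = geslacht, y = vas)) +
  geom_point()

# geom_jitter() adds a little random horizontal noise; height = 0 keeps the
# VAS values themselves unchanged, so each observation gets its own dot
p_jitter <- ggplot(toy, aes(x = geslacht, y = vas)) +
  geom_jitter(width = 0.2, height = 0)

print(p_point)
print(p_jitter)
```

With only two genders and eleven possible scores, geom_point can show at most 22 distinct dots for 200 patients, which is why the jittered version gives a better impression of how many observations sit at each value.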
Assignment: plotting raw data
-
Create a new script named RYouReady_visualization in which you clear your global environment with rm(list = ls()).
-
Now load the same libraries as you have used before.
-
And load the example dataset called Example_LongFormatHashed.RData.
-
Now use the ggplot function to create a scatter plot with geom_point that shows the distribution of average VAS pain (y-axis) for men and women (x-axis).
-
And what about the same plot if you use geom_jitter and geom_quasirandom?
Using colours and changing x and y axis
Assignment: use of colors and more
-
Now plot the distribution of mean VAS pain for men and women again with geom_point, but use na.omit() to remove missing data.
-
You can see how quickly you can now create new plots. For example, make exactly the same plot, but with the measuring point (Intake, 3 months, 12 months) on the x-axis instead of gender. Add this code to your script.
-
Add coord_flip() and see what happens.
-
In your last plot, give the points a different color based on gender.
#Plot the distribution of mean VAS pain between men and women and remove missing data
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1)) +
  geom_jitter()
#Plot VAS pain for different measuring points
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = rounddescription, y = vasPijnGemiddeld_1)) +
  geom_jitter()
#Add coordinate flip
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = rounddescription, y = vasPijnGemiddeld_1)) +
  geom_point() +
  coord_flip()
#Add colour to the measuring points in the plot based on gender
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = rounddescription, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter() +
  coord_flip()
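One thing to keep in mind with na.omit(): it removes every row that has a missing value in any column, not only in the columns you plot. The sketch below shows the difference on made-up data, using tidyr's drop_na() as one possible alternative that looks at a single column. The data frame `toy` and the column `other_score` are hypothetical and not part of the course dataset.

```r
library(tidyr)

# Hypothetical toy data: one missing VAS value, two missing values in an
# unrelated column that is not used in the plot
toy <- data.frame(
  geslacht           = c("Man", "Vrouw", "Vrouw", "Man"),
  vasPijnGemiddeld_1 = c(40, NA, 65, 20),
  other_score        = c(NA, 10, 12, NA)
)

# na.omit() keeps only rows that are complete in ALL columns
all_complete <- na.omit(toy)

# drop_na() with a column name only drops rows where THAT column is missing
vas_complete <- drop_na(toy, vasPijnGemiddeld_1)

nrow(all_complete)  # 1 row left
nrow(vas_complete)  # 3 rows left
```

Whether this matters depends on how many other columns in your own dataset contain missing values, so it is worth comparing the number of rows before and after na.omit().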
Next step: subplots (facets)
Assignment: Facet Wraps
-
Now plot VAS pain average for men and women again with geom_jitter and use facet_wrap to create separate subplots for the different measurement points.
-
Also give the points a different color based on gender, so that men and women are easy to tell apart.
#Plot VAS pain average for men and women again with geom_jitter and use facet_wrap to create separate subplots
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1)) +
  geom_jitter() +
  facet_wrap(~ rounddescription)
#Colour the points by gender to distinguish men and women from each other
data_combined %>%
  na.omit() %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter() +
  facet_wrap(~ rounddescription)
Combine cleaning and plotting with %>%
Assignment: %>% the plot
-
Check whether the last plot you made contains impossible values for average VAS pain, for example a value greater than 100 or lower than 0.
-
Create a new version of the same plot in which both the too-high and the too-low values are removed (you can do this in 2 steps with the filter command).
#Remove impossible VAS values (above 100 or below 0) and re-create the plot
data_combined %>%
  na.omit() %>%
  filter(vasPijnGemiddeld_1 <= 100) %>% #keep values of at most 100
  filter(vasPijnGemiddeld_1 >= 0) %>% #keep values of at least 0
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter() +
  facet_wrap(~ rounddescription)
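The two filter steps can also be written as a single condition. The following is a minimal sketch on made-up data, assuming a valid VAS score lies between 0 and 100 inclusive. The data frame `toy` is hypothetical, and dplyr's between() is just one way to express the range.

```r
library(dplyr)

# Hypothetical toy data with one value below 0 and one above 100
toy <- data.frame(
  geslacht           = c("Man", "Vrouw", "Vrouw", "Man", "Vrouw"),
  vasPijnGemiddeld_1 = c(-5, 0, 55, 100, 120)
)

# First look at which rows are impossible, before throwing them away
impossible <- toy %>%
  filter(vasPijnGemiddeld_1 < 0 | vasPijnGemiddeld_1 > 100)

# Two separate filter steps, each keeping the valid side of one boundary
two_steps <- toy %>%
  filter(vasPijnGemiddeld_1 <= 100) %>%
  filter(vasPijnGemiddeld_1 >= 0)

# The same selection in one step; between() includes both boundaries
one_step <- toy %>%
  filter(between(vasPijnGemiddeld_1, 0, 100))

nrow(impossible)  # 2 rows flagged
nrow(one_step)    # 3 rows kept
```

Inspecting the flagged rows first is a deliberate choice here: an impossible value may point to a data-entry problem that is worth reporting rather than silently removing.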
And now save your plots for use elsewhere
Assignment: save your plots
-
Now save one of the created plots, once via the export function and once via ggsave.
-
Find the picture you saved and open it on your computer (so outside of R).
#Save the most recently displayed plot with ggsave (the file is written to your working directory)
ggsave("VASpain.png")
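By default ggsave saves the last plot that was displayed. As a minimal sketch of a more explicit variant, you can store a plot in an object and pass it to ggsave together with a size. The toy data, the file name VASpain_example.png, and the chosen dimensions are assumptions for illustration only; the file is written to a temporary folder so that nothing in your project is overwritten.

```r
library(ggplot2)

# Hypothetical toy data standing in for the course dataset
toy <- data.frame(
  geslacht           = c("Man", "Vrouw", "Vrouw", "Man"),
  vasPijnGemiddeld_1 = c(40, 55, 65, 20)
)

# Store the plot in an object instead of only printing it
p <- ggplot(toy, aes(x = geslacht, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter(width = 0.2, height = 0)

# Save exactly this plot, with an explicit size and resolution
out_file <- file.path(tempdir(), "VASpain_example.png")
ggsave(out_file, plot = p, width = 16, height = 10, units = "cm", dpi = 300)

# Confirm that the file was written
file.exists(out_file)
```

Naming the plot explicitly avoids accidentally saving a different plot that happened to be displayed last, and a fixed width and height make the saved figure reproducible regardless of the size of your plot window.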
Continue plotting
There are many different types of plots you can make. For now, we will skip the VizWhiz 2 module from R Ladies. Come back to it later, depending on what you need for your project. VizWhiz 2 covers the following:
-
The first video gives examples of all kinds of other plot types you can make.
-
The second video shows how to make a histogram. This is an important tool to see how your data is distributed. Come back to this again after the RYouReady course.
-
The third video shows how you can combine different types of plots in "layers", for example a box plot with the individual points plotted over it. We skip this for now as well, but come back to it if you want to use it for your own analyses.
For now, we move on to VizWhiz 3.
VizWhiz3