
Gemstracker and HandWristStudyGroup data

R-course: RYouReady

Applying R to HandWristStudyGroup data

VizWhiz 1: plotting variables

Now that you have a reasonably tidy dataset and know how to tidy data in general, the fun part begins: visualizing and analyzing.

 

Data visualization has two very important goals:

  1. Give yourself insight into the data you have and its quality. A plot makes it easier to see whether there are improbable or impossible values in your data and whether the data are normally distributed.

  2. Present the results you have found in a clear, orderly, and insightful way.

 

R offers very extensive options for creating plots: almost any kind of plot is possible, and the quality is of a professional level.

Let's plot!

#Loading libraries

library(tidyverse)
library(here)
library(readr)
library(ggbeeswarm)
 

#Importing the dataset as data_combined ----
data_combined <- read_csv("data/data_RYouReady.csv")
 

#Create scatter plot (column names are case-sensitive: geslacht, not Geslacht)
data_combined %>% 
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1)) +
  geom_point()

#Create geom_jitter plot
data_combined %>% 
  ggplot(aes(x = geslacht, y = stap_om_week_aantal)) +
  geom_jitter()

#Create geom_quasirandom plot (from the ggbeeswarm package)
data_combined %>% 
  ggplot(aes(x = geslacht, y = stap_om_week_aantal)) +
  geom_quasirandom()

 

Answers

The first step in your data analysis, sometimes even before cleaning and then again after cleaning, is to plot your raw data. As you could see in the video, R does this well with, for example, geom_point and geom_jitter. We call this plotting raw data because every observation is shown as it is, rather than a summary such as a mean or standard deviation. Displaying the raw data properly and looking critically at the result is an essential first step in cleaning and analyzing data.

 

Assignment: plotting raw data

 

  • Create a new script named RYouReady_visualization and, in it, clear your global environment with rm(list = ls())

  • Now load the same libraries as you have used before.

  • And load the example dataset called Example_LongFormatHashed.RData

  • Now use the ggplot function to create a scatter plot with geom_point to show the distribution of VAS pain on average (y-axis) for both men and women (x-axis)

  • And what about the same plot if you use geom_jitter and geom_quasirandom?

 

 

 

 

 

 

Using colours and changing x and y axis

Assignment: use of colors and more

 

  • Now plot the distribution of mean VAS pain between men and women again with geom_point, but use na.omit() to remove missing data.

  • Notice how quickly you can now create new plots. For example, make exactly the same plot, but put the measuring point (Intake, 3 months, 12 months) on the x-axis instead of gender, so that it shows the VAS pain average per measuring point. Add this code to your script.

  • Add coord_flip() and see what happens

  • In your last plot, give the points a different color based on gender

#plot the distribution of mean VAS pain between men and women and remove missing data

data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x=geslacht, y = vasPijnGemiddeld_1)) +
  geom_jitter() 

#plot VAS pain for different measuring points 
data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x= rounddescription, y = vasPijnGemiddeld_1)) +
  geom_jitter() 

#Add coordinate flip
data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x = rounddescription, y = vasPijnGemiddeld_1)) +
  geom_point() + 
  coord_flip()

#Add colour to the measuring points in the plot based on gender

data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x = rounddescription, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter() + 
  coord_flip()
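
A note on na.omit(): it drops every row that has a missing value in any column, not only in the columns you plot, so it can remove more data than you intended. Below is a minimal sketch of a narrower alternative, assuming drop_na() from tidyr (loaded with tidyverse). The data frame toy and its column other_column are made up for illustration and are not part of the course dataset.

```r
library(tidyverse)

# Made-up toy data for illustration only; not the course dataset
toy <- tibble(
  geslacht = c("Man", "Vrouw", "Man", NA),
  vasPijnGemiddeld_1 = c(40, NA, 65, 20),
  other_column = c(NA, 1, 2, 3)  # hypothetical extra column with a gap
)

# na.omit() removes rows with a missing value in ANY column
n_after_na_omit <- toy %>% na.omit() %>% nrow()

# drop_na() with column names only checks the columns you name
toy_plot <- toy %>% drop_na(geslacht, vasPijnGemiddeld_1)
n_after_drop_na <- nrow(toy_plot)

# Plot only the rows that are complete for the plotted variables
p <- toy_plot %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1)) +
  geom_jitter()
p
```

Comparing n_after_na_omit with n_after_drop_na shows how many rows each approach keeps. Which one is appropriate depends on your analysis.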

Answers

Next step: subplots (facets)

Assignment: Facet Wraps

  • Now plot VAS pain average for men and women again with geom_jitter and use facet_wrap to create separate subplots for the different measurement points.

  • Also assign the different measuring points a different color based on gender to distinguish them from each other

#Plot VAS pain average for men and women again with geom_jitter and use facet_wrap to create separate subplots

data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x=geslacht, y = vasPijnGemiddeld_1)) +
  geom_jitter() +
  facet_wrap( ~ rounddescription)

#Assign the different measuring points a different color based on gender to distinguish them from each other

data_combined %>% 
  na.omit() %>% 
  ggplot(aes(x=geslacht, y = vasPijnGemiddeld_1, colour = geslacht )) +
  geom_jitter() +
  facet_wrap( ~ rounddescription)

Answers

Combine cleaning and plotting with %>%

Assignment: %>% the plot

 

  • Check whether the last plot you made contains impossible values for VAS pain average, for example a value greater than 100 or lower than 0.

  • Create a new version of the same plot with both the too-high and the too-low values removed (you can do this in 2 steps with the filter command)

#Remove impossible values for VAS pain average (greater than 100 or lower than 0) and plot again

data_combined %>% 
  na.omit() %>% 
  filter(vasPijnGemiddeld_1 <= 100) %>%   # keep values of at most 100
  filter(vasPijnGemiddeld_1 >= 0) %>%     # keep values of at least 0
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter() +
  facet_wrap( ~ rounddescription)
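
Before removing anything, it can help to count how many impossible values there are and at which measuring point they occur. Below is a minimal sketch of that check on made-up data. The column names follow the course dataset, but the values are invented for illustration; between() is a dplyr helper, used here as one possible way to write both filter steps as one.

```r
library(tidyverse)

# Made-up toy data for illustration only; not the course dataset
toy <- tibble(
  rounddescription = c("Intake", "Intake", "3 months", "12 months"),
  vasPijnGemiddeld_1 = c(35, 120, -5, 80)
)

# Rows with an impossible VAS score (above 100 or below 0)
impossible <- toy %>%
  filter(vasPijnGemiddeld_1 > 100 | vasPijnGemiddeld_1 < 0)

# How many impossible values per measuring point?
impossible %>% count(rounddescription)

# Keep only the valid rows; between() includes both boundaries
valid <- toy %>%
  filter(between(vasPijnGemiddeld_1, 0, 100))
```

Looking at the impossible rows first tells you whether you are dealing with a handful of typing errors or a larger data problem.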

Answers

And now save your plots for use elsewhere

Assignment: save your plots

  • Now save one of the plots you created, both via the export function and via ggsave.

  • Find the picture you saved and open it on your computer (so outside of R)

#Now save 1 of the created plots via the export function and via ggsave.

ggsave("VASpain.png")
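
Called with only a file name, ggsave saves the last plot that was displayed, at the current size of the plot window. Below is a minimal sketch of a more explicit call, assuming you first store the plot in an object. The object name vas_plot, the toy data, and the chosen size are illustrative assumptions, and the file is written to a temporary folder so the example does not touch your project.

```r
library(tidyverse)

# Made-up toy data for illustration only; not the course dataset
toy <- tibble(
  geslacht = c("Man", "Vrouw", "Man", "Vrouw"),
  vasPijnGemiddeld_1 = c(40, 55, 65, 20)
)

# Store the plot in an object so you can save exactly this plot
vas_plot <- toy %>%
  ggplot(aes(x = geslacht, y = vasPijnGemiddeld_1, colour = geslacht)) +
  geom_jitter()

# Temporary location for this example; use your own path in practice
out_file <- file.path(tempdir(), "VASpain.png")

# Save with an explicit plot object, size, and resolution
ggsave(out_file, plot = vas_plot, width = 15, height = 10, units = "cm", dpi = 300)
```

Setting width, height, and dpi explicitly gives the same picture every time, regardless of how large the plot window happens to be.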

Answers

Continue plotting

There are many different types of plots to make. For now, we will skip the module VizWhiz 2 from R Ladies. Come back to it later, depending on what you need for your project. VizWhiz 2:

  • The first video gives examples of all kinds of other types of plots you can make.

  • The second video shows how to make a histogram. This is an important tool to see how your data is distributed. Come back to this after the RYouReady course.

  • The third video shows how you can combine different types of plots in "layers", for example a box plot with the individual points plotted over it. For now, we also skip this, but come back to it if you want to use this for your own analyses.

 

For now we move on to VizWhiz 3

VizWhiz3
