SISBID Data Visualization – Making a mess again

Warmup

Turn the french_fries data from wide format into a long format with variables type and rating.

# A tibble: 6 × 9
  time  treatment subject   rep potato buttery grassy rancid painty
  <fct> <fct>     <fct>   <dbl>  <dbl>   <dbl>  <dbl>  <dbl>  <dbl>
1 1     1         3           1    2.9     0      0      0      5.5
2 1     1         3           2   14       0      0      1.1    0  
3 1     1         10          1   11       6.4    0      0      0  
4 1     1         10          2    9.9     5.9    2.9    2.2    0  
5 1     1         15          1    1.2     0.1    0      1.1    5.1
6 1     1         15          2    8.8     3      3.6    1.5    2.3

What would you like to find out about the french fries data set?

Put your questions in the chat!

What would we like to know?

Is the design complete?
Are replicates like each other?
How do the ratings on the different scales differ?
Are raters giving different scores on average?
Do ratings change over the weeks?

Each of these questions requires a different summary of the data.
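As one illustration, the first question ("Is the design complete?") can be approached by counting records per subject and week. This is a minimal sketch, assuming the french_fries data is loaded from the reshape2 package; the names design, expected, and incomplete are introduced here for illustration only.

```r
# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

# Number of records for every subject-by-week combination
design <- as.data.frame(
  table(subject = french_fries$subject, time = french_fries$time)
)

# In a complete design each cell holds (number of treatments) x (number of reps) records
expected <- nlevels(french_fries$treatment) * length(unique(french_fries$rep))

# Cells with fewer (or more) records than expected
incomplete <- subset(design, Freq != expected)
incomplete
```

Any rows in incomplete point to subject and week combinations where tastings are missing.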

Pivot french fries to long

# A tibble: 6 × 6
  time  treatment subject   rep type    rating
  <fct> <fct>     <fct>   <dbl> <chr>    <dbl>
1 1     1         3           1 potato     2.9
2 1     1         3           1 buttery    0  
3 1     1         3           1 grassy     0  
4 1     1         3           1 rancid     0  
5 1     1         3           1 painty     5.5
6 1     1         3           2 potato    14
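Output of this shape can be produced with tidyr's pivot_longer. The call below is a sketch rather than the code from the original slide; it assumes french_fries comes from the reshape2 package, and ff_long is a name chosen here.

```r
library(tidyr)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

# Stack the five rating scales into name/value pairs
ff_long <- pivot_longer(
  french_fries,
  cols = potato:painty,   # the five scale columns
  names_to = "type",      # new column holding the scale name
  values_to = "rating"    # new column holding the score
)

head(ff_long)
```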

Pivot long to wide

Examples:

Are replicates like each other?
- compare rep 1 to rep 2 values
How do the ratings on the different scales differ?
- compare ratings across scales
Are raters giving different scores on average?
- compare ratings across raters
Do ratings change over the weeks?
- compare ratings week-by-week

Pivot to wide form

tidyr’s pivot_wider function creates variables with comparable values

Long form:

time	treatment	subject	rep	type	rating
1	1	3	1	potato	2.9
1	1	3	1	buttery	0.0
1	1	3	1	grassy	0.0
1	1	3	1	rancid	0.0

Wide form:

treatment	subject	rep	type	1	2	3	4	5	6	7	8	9	10
1	3	1	potato	2.9	9.0	11.8	13.6	14.0	0.4	2.9	3.5	1.1	NA
1	3	1	buttery	0.0	0.3	0.2	0.1	0.3	1.2	0.0	0.5	0.4	NA
1	3	1	grassy	0.0	0.1	0.0	0.0	0.0	0.0	0.0	1.3	0.0	NA

Pivot to wide form

# A tibble: 6 × 14
  treatment subject   rep type     `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`
  <fct>     <fct>   <dbl> <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1         3           1 potato   2.9   9    11.8  13.6  14     0.4   2.9   3.5
2 1         3           1 butte…   0     0.3   0.2   0.1   0.3   1.2   0     0.5
3 1         3           1 grassy   0     0.1   0     0     0     0     0     1.3
4 1         3           1 rancid   0     5.8   6     1.7   0     0     0     0  
5 1         3           1 painty   5.5   0.3   0     0     1.7   9.5   5.5   3.8
6 1         3           2 potato  14     5.5   7.8   5.3  12.9   3.3   0.8   0.6
# ℹ 2 more variables: `9` <dbl>, `10` <dbl>

pivot_wider:

creates a new column for each value of the variable in names_from
fills values in using the variable in values_from
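The two arguments can be seen in a call that would give a wide table like the one above, with one column per week. This is a sketch under the assumption that french_fries comes from reshape2; ff_long and ff_weeks are names chosen here.

```r
library(tidyr)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# One new column per value of `time`, filled with the matching `rating`
ff_weeks <- pivot_wider(ff_long, names_from = time, values_from = rating)

head(ff_weeks)
```

Weeks without a recorded rating for a given treatment, subject, rep, and type combination show up as NA.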

Comparing ratings: different weeks

Note the use of the backtick for variable names with special characters or numbers.
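A possible version of such a comparison is sketched below. The choice of weeks 1 and 9 is arbitrary and not taken from the original slide; french_fries is assumed to come from reshape2, and the object names are made up for this example.

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")
ff_weeks <- pivot_wider(ff_long, names_from = time, values_from = rating)

# Column names that start with a digit must be quoted with backticks
p_weeks <- ggplot(ff_weeks, aes(x = `1`, y = `9`)) +
  geom_point(na.rm = TRUE) +
  geom_abline(slope = 1, intercept = 0) +  # identity line: same rating in both weeks
  facet_wrap(~type)

p_weeks
```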

Your turn: Are the replicates similar?

Goal: Plot the replicates against each other using a scatterplot.

Convert the data into long form
Get the replicates spread into separate columns by replicate.
Make the plot.
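One way to carry out these three steps is sketched here; it is not the official solution. It assumes french_fries comes from reshape2 and uses names_prefix so that the new columns are called rep_1 and rep_2 and need no backticks.

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

# Step 1: long form
ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# Step 2: one column per replicate
ff_reps <- pivot_wider(ff_long, names_from = rep, values_from = rating,
                       names_prefix = "rep_")

# Step 3: replicate 1 against replicate 2, one panel per scale
p_reps <- ggplot(ff_reps, aes(x = rep_1, y = rep_2)) +
  geom_point(na.rm = TRUE) +
  geom_abline(slope = 1, intercept = 0) +  # points on this line are identical replicates
  facet_wrap(~type)

p_reps
```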

Are ratings similar across scales?

Scales: potato-y, buttery, grassy, rancid, and painty.
Pivot into long form, plot with facet_...(~scale).

# A tibble: 6 × 6
  time  treatment subject   rep type    rating
  <fct> <fct>     <fct>   <dbl> <chr>    <dbl>
1 1     1         3           1 potato     2.9
2 1     1         3           1 buttery    0  
3 1     1         3           1 grassy     0  
4 1     1         3           1 rancid     0  
5 1     1         3           1 painty     5.5
6 1     1         3           2 potato    14

Are ratings similar across scales?

Side-by-side boxplots
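A minimal sketch of side-by-side boxplots built from the long form, assuming french_fries comes from reshape2 (ff_long and p_box are names chosen here):

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# One box per scale; missing ratings are dropped silently
p_box <- ggplot(ff_long, aes(x = type, y = rating)) +
  geom_boxplot(na.rm = TRUE)

p_box
```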

Your turn: Correlation between scales?

Create a wide form of the data by type of scale.
cor allows you to create a correlation matrix.
Run ?cor to look up how to get rid of NA values in the result.
Draw a scatterplot of two scales with the highest (positive or negative) correlation value.
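A possible approach is sketched below, not the official solution. It assumes french_fries comes from reshape2 and uses use = "pairwise.complete.obs", one of the options documented in ?cor for handling missing values. The helper objects (scales, cor_mat, pair) are names chosen for this example.

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# Wide form by type of scale: one column per scale
ff_scales <- pivot_wider(ff_long, names_from = type, values_from = rating)

scales <- c("potato", "buttery", "grassy", "rancid", "painty")

# Correlation matrix; each pair uses all rows where both scales are observed
cor_mat <- cor(ff_scales[, scales], use = "pairwise.complete.obs")

# Find the pair of distinct scales with the largest absolute correlation
off_diag <- cor_mat
diag(off_diag) <- 0
idx <- which(abs(off_diag) == max(abs(off_diag)), arr.ind = TRUE)[1, ]
pair <- scales[idx]

p_cor <- ggplot(ff_scales, aes(x = .data[[pair[1]]], y = .data[[pair[2]]])) +
  geom_point(na.rm = TRUE) +
  labs(x = pair[1], y = pair[2])

round(cor_mat, 2)
p_cor
```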

Ratings by week

Use the long form of the data and plot:
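The original plot is not reproduced here. One plausible version, given as a sketch only, shows the distribution of ratings per week with one panel per scale. It assumes french_fries comes from reshape2; the choice of boxplots is an assumption of this example.

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# time is a factor, so each week gets its own box
p_week <- ggplot(ff_long, aes(x = time, y = rating)) +
  geom_boxplot(na.rm = TRUE) +
  facet_wrap(~type)

p_week
```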

Your turn: Ratings by time & scale?

Find a linear model describing the average rating by week (time) and type of scale.
Which form of the dataset should we use?
Challenge: can you plot the fitted lines from the model?
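One possible answer, sketched under assumptions: the long form is used, week is treated as a number, and the model allows a separate line per scale through a week-by-type interaction. The data is assumed to come from reshape2, and week, ff_avg, and fit are names introduced for this example.

```r
library(tidyr)
library(ggplot2)

# Assumption: french_fries is taken from the reshape2 package
data("french_fries", package = "reshape2")

ff_long <- pivot_longer(french_fries, cols = potato:painty,
                        names_to = "type", values_to = "rating")

# time is a factor with levels "1".."10"; convert via character to get the week number
ff_long$week <- as.numeric(as.character(ff_long$time))

# Average rating per week and scale (rows with missing ratings are dropped)
ff_avg <- aggregate(rating ~ week + type, data = ff_long, FUN = mean)

# Separate intercept and slope for each scale
fit <- lm(rating ~ week * type, data = ff_avg)
ff_avg$fitted <- fitted(fit)

# Challenge: averages as points, fitted lines on top
p_fit <- ggplot(ff_avg, aes(x = week, y = rating, colour = type)) +
  geom_point() +
  geom_line(aes(y = fitted))

summary(fit)
p_fit
```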

Resources

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
