Session 2: Entering the tidyverse

This session introduces the basic functions in dplyr and other packages within the tidyverse.

Simon J. Brandl, PhD https://www.fishandfunctions.com/ (The University of Texas at Austin)
2025-04-17

a. 2: Demo



b. 2: Slides

You can access the full slideshow used in the 2-Tidyverse narration here.

The dataset called ‘fishtibble.csv’ can be downloaded here.

The dataset called ‘coralreefherbivores.csv’ can be downloaded here.

The dataset called ‘fish_abundance.csv’ can be downloaded here.

c. 2: Exercises

Read in the coralreefherbivores.csv dataset (if you haven’t already done so).

a herbivorous parrotfish

The “coralreefherbivores.csv” dataset contains trait information on 93 species of herbivorous coral reef fishes from the Great Barrier Reef, Australia. The first four columns provide taxonomic information (family, genus, species, and a combined genus + species column called “gen.spe”). The next column, “sl”, gives the standard length of the fish. It is followed by three columns with morphological measurements: body depth, snout length, and eye diameter. The last two columns give the maximum body size (categories from XS to XL) and the schooling behavior of each species.
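As a minimal sketch of the read-in step, the example below writes a tiny stand-in file to a temporary location and reads it back with both base R and readr. The file contents, the path `toy_path`, and the object names are assumptions made for illustration; they are not the real dataset.

```r
library(readr)

# Hypothetical miniature stand-in for coralreefherbivores.csv (values are illustrative only)
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "family,genus,species,sl,size,schooling",
  "Acanthuridae,Acanthurus,achilles,163.7,S,Solitary",
  "Labridae,Scarus,niger,210.0,M,SmallGroups"
), toy_path)

# base R reader: returns a plain data.frame
toy_base <- read.csv(file = toy_path)

# readr reader: returns a tibble and guesses column types
toy_tidy <- read_csv(file = toy_path, show_col_types = FALSE)

class(toy_base)
class(toy_tidy)
str(toy_base) # quick look at column names and types
```

Either reader works with the dplyr verbs used in the solutions; the main visible difference is how the object prints.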

Part I


Part II

The “fish_abundance.csv” dataset contains reef fish abundance information, which was collected by the Reef Life Survey around Lizard Island, Australia. It includes information on the respective survey ID and its metadata (e.g. site, latitude, longitude, date, and depth). It also includes a row for each observed species on a given visual survey, including taxonomic information (family, genus, species, and “genspe”), as well as the number of individuals observed on that survey (‘total’).

d. 2: Solutions

Part I

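library(tidyverse) # provides %>%, filter(), summarize(), unite(), gather(), and the other functions used below

herbivores <- read.csv(file = "data/coralreefherbivores.csv")
# mean and standard deviation of sl for three families
# use %in% to match several values; == recycles the vector and can silently drop rows,
# so the summary printed below (generated with ==) may differ from what %in% returns
herbivores.filter <- herbivores %>%
  filter(family %in% c("Acanthuridae", "Labridae", "Siganidae")) %>%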
  group_by(family) %>%
  summarize(mean.sl = mean(sl), sd.sl = sd(sl))
head(herbivores.filter)
# A tibble: 3 × 3
  family       mean.sl sd.sl
  <chr>          <dbl> <dbl>
1 Acanthuridae    186.  89.7
2 Labridae        269.  73.5
3 Siganidae       168.  54.7
herbivores.size_schooling <- herbivores %>%
  mutate(size_school = paste(size, schooling, sep = "."))
head(herbivores.size_schooling)
        family      genus        species                   gen.spe
1 Acanthuridae Acanthurus       achilles       Acanthurus.achilles
2 Acanthuridae Acanthurus albipectoralis Acanthurus.albipectoralis
3 Acanthuridae Acanthurus   auranticavus   Acanthurus.auranticavus
4 Acanthuridae Acanthurus        blochii        Acanthurus.blochii
5 Acanthuridae Acanthurus     dussumieri     Acanthurus.dussumieri
6 Acanthuridae Acanthurus        fowleri        Acanthurus.fowleri
        sl bodydepth snoutlength eyediameter size    schooling
1 163.6667 0.5543625   0.4877797   0.3507191    S     Solitary
2 212.7300 0.4405350   0.4402623   0.2560593    M  SmallGroups
3 216.0000 0.4726556   0.5386490   0.2451253    M MediumGroups
4  82.9000 0.5586486   0.4782217   0.3196155    M  SmallGroups
5 193.7033 0.5457248   0.5661867   0.2807218    L     Solitary
6 266.0000 0.4669521   0.5950563   0.2217376    M     Solitary
     size_school
1     S.Solitary
2  M.SmallGroups
3 M.MediumGroups
4  M.SmallGroups
5     L.Solitary
6     M.Solitary
herbivores.size_schooling2 <- unite(herbivores, col = "size_and_schooling", c("size", "schooling"), sep = ".") 
head(herbivores.size_schooling2)
        family      genus        species                   gen.spe
1 Acanthuridae Acanthurus       achilles       Acanthurus.achilles
2 Acanthuridae Acanthurus albipectoralis Acanthurus.albipectoralis
3 Acanthuridae Acanthurus   auranticavus   Acanthurus.auranticavus
4 Acanthuridae Acanthurus        blochii        Acanthurus.blochii
5 Acanthuridae Acanthurus     dussumieri     Acanthurus.dussumieri
6 Acanthuridae Acanthurus        fowleri        Acanthurus.fowleri
        sl bodydepth snoutlength eyediameter size_and_schooling
1 163.6667 0.5543625   0.4877797   0.3507191         S.Solitary
2 212.7300 0.4405350   0.4402623   0.2560593      M.SmallGroups
3 216.0000 0.4726556   0.5386490   0.2451253     M.MediumGroups
4  82.9000 0.5586486   0.4782217   0.3196155      M.SmallGroups
5 193.7033 0.5457248   0.5661867   0.2807218         L.Solitary
6 266.0000 0.4669521   0.5950563   0.2217376         M.Solitary
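The two approaches above differ in one visible way: `mutate()` with `paste()` keeps the original columns, while `unite()` drops them unless told otherwise. A minimal sketch on a hypothetical three-column tibble (`toy` and its values are assumptions for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the herbivores table
toy <- tibble(
  species = c("a", "b"),
  size = c("S", "M"),
  schooling = c("Solitary", "SmallGroups")
)

# mutate() + paste(): adds a new column and keeps size and schooling
with_paste <- toy %>%
  mutate(size_school = paste(size, schooling, sep = "."))

# unite(): builds the same strings but removes the input columns by default
with_unite <- toy %>%
  unite(col = "size_school", size, schooling, sep = ".")

# unite() with remove = FALSE keeps the input columns as well
with_unite_keep <- toy %>%
  unite(col = "size_school", size, schooling, sep = ".", remove = FALSE)

names(with_paste)
names(with_unite)
names(with_unite_keep)
```

Which one to pick depends on whether the separate size and schooling columns are still needed later in the pipeline.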
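# unite() is an alternative solution to mutate() + paste()

# mean eye diameter for Zebrasoma and Scarus
# caution: == compares row by row against a recycled vector, so this version can silently drop matching rows
zebrasoma.scarus <- herbivores %>%
  filter(genus == c("Zebrasoma", "Scarus")) %>%
  group_by(genus) %>%
  summarize(mean.eye = mean(eyediameter))

# preferred: %in% keeps every row whose genus is Zebrasoma or Scarus
zebrasoma.scarus <- herbivores %>%
  filter(genus %in% c("Zebrasoma", "Scarus")) %>%
  group_by(genus) %>%
  summarize(mean.eye = mean(eyediameter))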
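To see why `==` and `%in%` can give different answers inside `filter()`, here is a small sketch on hypothetical data (`toy`, `targets`, and all values are assumptions for illustration). With `==`, R recycles the two-element vector down the column and compares position by position; with `%in%`, each row is checked against the whole set.

```r
library(dplyr)

# Hypothetical toy data: five of the six rows belong to the target genera
toy <- tibble(
  genus = c("Scarus", "Zebrasoma", "Scarus", "Zebrasoma", "Naso", "Scarus"),
  eyediameter = c(0.20, 0.30, 0.22, 0.31, 0.27, 0.18)
)

targets <- c("Zebrasoma", "Scarus")

# == recycles targets as Zebrasoma, Scarus, Zebrasoma, Scarus, Zebrasoma, Scarus
# and only keeps rows that happen to line up with that pattern
kept_equal <- toy %>% filter(genus == targets)

# %in% asks, for every row, whether genus is any of the targets
kept_in <- toy %>% filter(genus %in% targets)

nrow(kept_equal) # 1 row survives
nrow(kept_in)    # 5 rows survive
```

Because the column length here is a multiple of the vector length, R does not even warn about the recycling, which is what makes the `==` version easy to miss.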

number.species <- herbivores %>% # solution 1
  select(size) %>%
  group_by(size) %>%
  count(size)
number.species
# A tibble: 5 × 2
# Groups:   size [5]
  size      n
  <chr> <int>
1 L        18
2 M        43
3 S        28
4 XL        5
5 XS        2
crh.d <- herbivores %>% # solution 2: count distinct species names per size class
  group_by(size) %>% 
  summarize(n_species = n_distinct(gen.spe))
crh.d
# A tibble: 5 × 2
  size  n_species
  <chr>     <int>
1 L            18
2 M            43
3 S            28
4 XL            5
5 XS            2
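The two solutions agree here because each species appears on exactly one row. A minimal sketch on hypothetical data with repeated species shows where they would diverge (`toy` and its values are assumptions for illustration):

```r
library(dplyr)

# Hypothetical toy data in which some species appear on more than one row
toy <- tibble(
  size = c("S", "S", "M", "M", "M"),
  gen.spe = c("A.a", "A.a", "B.b", "C.c", "C.c")
)

# count() tallies rows per size class
rows_per_size <- toy %>%
  count(size)

# n_distinct() tallies unique species names per size class
species_per_size <- toy %>%
  group_by(size) %>%
  summarize(n_species = n_distinct(gen.spe))

rows_per_size
species_per_size
```

If a table can contain duplicate rows per species, `n_distinct()` is the safer way to answer "how many species".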
ratio <- herbivores %>%
  mutate(ratio.snout.sl = snoutlength / sl) %>% # the division is vectorized over rows, so no grouping is needed here
  group_by(genus) %>%
  summarize(mean.ratio = mean(ratio.snout.sl)) %>% # mean ratio per genus
  arrange(desc(mean.ratio)) # largest mean ratio first
head(ratio)
# A tibble: 6 × 2
  genus         mean.ratio
  <chr>              <dbl>
1 Zebrasoma        0.00519
2 Paracanthurus    0.00509
3 Ctenochaetus     0.00492
4 Acanthurus       0.00320
5 Siganus          0.00239
6 Naso             0.00216

Part II

fish.abu <- read.csv(file = "data/fish_abundance.csv")


# how many surveys
surveys <- fish.abu %>%
  select(surveyid) %>% 
  distinct() # distinct() keeps one row per unique surveyid; nrow() below counts those rows
nrow(surveys)
[1] 62
# replace Scaridae with Labridae
fish.abu.recode <- fish.abu %>%
  mutate(family = recode(family, "Scaridae" = "Labridae")) # use recode within mutate to change the family
head(fish.abu.recode)
  surveyid   country              site sitelat sitelong    surveydate
1  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
2  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
3  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
4  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
5  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
6  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
  depth       family        genus      species block total
1     3 Acanthuridae   Acanthurus grammoptilus     2     4
2     3 Acanthuridae   Acanthurus   nigricauda     2     3
3     3 Acanthuridae   Acanthurus    olivaceus     2     2
4     3 Acanthuridae Ctenochaetus    binotatus     1    15
5     3 Acanthuridae Ctenochaetus    binotatus     2    10
6     3 Acanthuridae    Zebrasoma       scopas     1     2
                   genspe
1 Acanthurus.grammoptilus
2   Acanthurus.nigricauda
3    Acanthurus.olivaceus
4  Ctenochaetus.binotatus
5  Ctenochaetus.binotatus
6        Zebrasoma.scopas
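The recoding step can be checked on a small example. The sketch below uses a hypothetical one-column tibble (`toy` is an assumption for illustration) and compares `recode()` with an equivalent `if_else()` call, which is one possible alternative rather than the course's prescribed method.

```r
library(dplyr)

# Hypothetical toy data with one family name to be replaced
toy <- tibble(family = c("Scaridae", "Labridae", "Siganidae"))

# recode(): old value on the left, new value on the right; other values pass through unchanged
recoded <- toy %>%
  mutate(family = recode(family, "Scaridae" = "Labridae"))

# if_else(): replace where the condition is TRUE, otherwise keep the existing value
recoded_ifelse <- toy %>%
  mutate(family = if_else(family == "Scaridae", "Labridae", family))

recoded$family
recoded_ifelse$family
```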
# keep only the four families of interest
fish.abu.filtered <- fish.abu.recode %>%
  filter(family %in% c("Acanthuridae", "Siganidae", "Labridae", "Kyphosidae")) # use %in% here; family == c(...) recycles the vector and can silently drop rows
head(fish.abu.filtered)
  surveyid   country              site sitelat sitelong    surveydate
1  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
2  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
3  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
4  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
5  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
6  4000720 Australia Watsons Bay north  -14.66   145.45 6/27/10 14:00
  depth       family        genus      species block total
1     3 Acanthuridae   Acanthurus grammoptilus     2     4
2     3 Acanthuridae   Acanthurus   nigricauda     2     3
3     3 Acanthuridae   Acanthurus    olivaceus     2     2
4     3 Acanthuridae Ctenochaetus    binotatus     1    15
5     3 Acanthuridae Ctenochaetus    binotatus     2    10
6     3 Acanthuridae    Zebrasoma       scopas     1     2
                   genspe
1 Acanthurus.grammoptilus
2   Acanthurus.nigricauda
3    Acanthurus.olivaceus
4  Ctenochaetus.binotatus
5  Ctenochaetus.binotatus
6        Zebrasoma.scopas
# count species in each genus
fish.species.counts <- fish.abu.filtered %>%
  group_by(genus) %>% # group by genus
  summarize(number.species = n_distinct(species)) # n_distinct() counts the unique species names in each group, not the rows
head(fish.species.counts)
# A tibble: 6 × 2
  genus      number.species
  <chr>               <int>
1 Acanthurid              1
2 Acanthurus              9
3 Anampses                2
4 Bodianus                3
5 Calotomus               1
6 Cetoscarus              1
# mean abundances and number of species
fish.genus.abundance <- fish.abu.filtered %>%
  group_by(genus) %>% # group by genus, the column used later to join with the trait data
  summarize(mean.abu = mean(total), number.species = n_distinct(species)) # mean and n_distinct
head(fish.genus.abundance)
# A tibble: 6 × 3
  genus      mean.abu number.species
  <chr>         <dbl>          <int>
1 Acanthurid     1                 1
2 Acanthurus     4.01              9
3 Anampses       3.33              2
4 Bodianus       1.32              3
5 Calotomus      1                 1
6 Cetoscarus     1                 1
# eye diameter
herbivore.eyes <- herbivores %>%
  ungroup() %>%
  group_by(schooling) %>% # group by schooling variable
  summarize(median.eye = median(eyediameter)) #summarize()
herbivore.eyes
# A tibble: 5 × 2
  schooling    median.eye
  <chr>             <dbl>
1 LargeGroups       0.275
2 MediumGroups      0.230
3 Pairs             0.303
4 SmallGroups       0.261
5 Solitary          0.272
herbivore.pruned <- herbivores %>%
  select(-sl, -size, -schooling) # use select to remove columns
head(herbivore.pruned)
        family      genus        species                   gen.spe
1 Acanthuridae Acanthurus       achilles       Acanthurus.achilles
2 Acanthuridae Acanthurus albipectoralis Acanthurus.albipectoralis
3 Acanthuridae Acanthurus   auranticavus   Acanthurus.auranticavus
4 Acanthuridae Acanthurus        blochii        Acanthurus.blochii
5 Acanthuridae Acanthurus     dussumieri     Acanthurus.dussumieri
6 Acanthuridae Acanthurus        fowleri        Acanthurus.fowleri
  bodydepth snoutlength eyediameter
1 0.5543625   0.4877797   0.3507191
2 0.4405350   0.4402623   0.2560593
3 0.4726556   0.5386490   0.2451253
4 0.5586486   0.4782217   0.3196155
5 0.5457248   0.5661867   0.2807218
6 0.4669521   0.5950563   0.2217376
# mean body depth, snout length, and eye diameter
herbivore.means <- herbivore.pruned %>%
  gather(5:7, key = "metric", value = "value") %>% # gather the three morphometric columns into one
  group_by(genus, metric) %>% # group by
  summarize(mean.val = mean(value)) # get mean
head(herbivore.means)
# A tibble: 6 × 3
# Groups:   genus [2]
  genus        metric      mean.val
  <chr>        <chr>          <dbl>
1 Acanthurus   bodydepth      0.508
2 Acanthurus   eyediameter    0.298
3 Acanthurus   snoutlength    0.498
4 Bolbometopon bodydepth      0.437
5 Bolbometopon eyediameter    0.233
6 Bolbometopon snoutlength    0.312
# turn traits back into separate columns
herbivore.spread <- herbivore.means %>%
  spread(key = "metric", value = "mean.val") # use spread function
herbivore.spread
# A tibble: 14 × 4
# Groups:   genus [14]
   genus         bodydepth eyediameter snoutlength
   <chr>             <dbl>       <dbl>       <dbl>
 1 Acanthurus        0.508       0.298       0.498
 2 Bolbometopon      0.437       0.233       0.312
 3 Calotomus         0.385       0.246       0.306
 4 Cetoscarus        0.387       0.139       0.428
 5 Chlorurus         0.400       0.187       0.366
 6 Ctenochaetus      0.532       0.295       0.535
 7 Hipposcarus       0.404       0.150       0.386
 8 Kyphosus          0.479       0.181       0.164
 9 Leptoscarus       0.334       0.209       0.297
10 Naso              0.386       0.274       0.547
11 Paracanthurus     0.490       0.295       0.533
12 Scarus            0.391       0.195       0.341
13 Siganus           0.443       0.329       0.352
14 Zebrasoma         0.575       0.305       0.657
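The tidyr documentation lists `gather()` and `spread()` as superseded by `pivot_longer()` and `pivot_wider()`. As a sketch of the same long-then-wide round trip with the newer functions, using a hypothetical toy table (`toy` and its values are assumptions for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two morphometric columns for two genera
toy <- tibble(
  genus = c("A", "A", "B"),
  bodydepth = c(0.50, 0.52, 0.40),
  eyediameter = c(0.30, 0.28, 0.20)
)

# long format: one row per genus-metric observation (counterpart of gather())
long <- toy %>%
  pivot_longer(cols = c(bodydepth, eyediameter),
               names_to = "metric", values_to = "value")

# mean per genus and metric; .groups = "drop" returns an ungrouped tibble
means <- long %>%
  group_by(genus, metric) %>%
  summarize(mean.val = mean(value), .groups = "drop")

# wide format: one column per metric again (counterpart of spread())
wide <- means %>%
  pivot_wider(names_from = metric, values_from = mean.val)

wide
```

Selecting columns by name rather than by position (`5:7`) also keeps the code working if the column order changes.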
herbivores.joined <- herbivore.spread %>%
  left_join(fish.genus.abundance, by = "genus") %>% # use left_join to retain the ones in the traits dataset
  drop_na(mean.abu) # remove NA values 
herbivores.joined
# A tibble: 11 × 6
# Groups:   genus [11]
   genus     bodydepth eyediameter snoutlength mean.abu number.species
   <chr>         <dbl>       <dbl>       <dbl>    <dbl>          <int>
 1 Acanthur…     0.508       0.298       0.498     4.01              9
 2 Calotomus     0.385       0.246       0.306     1                 1
 3 Cetoscar…     0.387       0.139       0.428     1                 1
 4 Chlorurus     0.400       0.187       0.366     5.39              2
 5 Ctenocha…     0.532       0.295       0.535     5.58              3
 6 Kyphosus      0.479       0.181       0.164     1.5               1
 7 Naso          0.386       0.274       0.547     1.54              6
 8 Paracant…     0.490       0.295       0.533     2                 1
 9 Scarus        0.391       0.195       0.341     3.42             15
10 Siganus       0.443       0.329       0.352     2.21             11
11 Zebrasoma     0.575       0.305       0.657     2.80              2
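A `left_join()` followed by `drop_na()` on a column from the right-hand table keeps only genera present in both tables. The sketch below, on hypothetical toy tables (`traits`, `abundance`, and their values are assumptions for illustration), compares that with `inner_join()`, which gives the same rows as long as the abundance column itself has no missing values.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy tables: genus B has traits but no abundance, genus D the reverse
traits <- tibble(genus = c("A", "B", "C"), bodydepth = c(0.5, 0.4, 0.3))
abundance <- tibble(genus = c("A", "C", "D"), mean.abu = c(4, 1.5, 2))

# left_join keeps all trait rows; unmatched genera get NA in mean.abu and are then dropped
via_left <- traits %>%
  left_join(abundance, by = "genus") %>%
  drop_na(mean.abu)

# inner_join keeps only genera found in both tables
via_inner <- traits %>%
  inner_join(abundance, by = "genus")

via_left
via_inner
```

The two-step version makes the unmatched genera visible before they are removed, which can be useful when checking why rows disappear.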
herbivores.joined.2 <- herbivores.joined %>%
  mutate(highvslow = case_when(mean.abu >=2 ~ "high", # use case_when function
                               TRUE ~ "low"))
herbivores.joined.2
# A tibble: 11 × 7
# Groups:   genus [11]
   genus     bodydepth eyediameter snoutlength mean.abu number.species
   <chr>         <dbl>       <dbl>       <dbl>    <dbl>          <int>
 1 Acanthur…     0.508       0.298       0.498     4.01              9
 2 Calotomus     0.385       0.246       0.306     1                 1
 3 Cetoscar…     0.387       0.139       0.428     1                 1
 4 Chlorurus     0.400       0.187       0.366     5.39              2
 5 Ctenocha…     0.532       0.295       0.535     5.58              3
 6 Kyphosus      0.479       0.181       0.164     1.5               1
 7 Naso          0.386       0.274       0.547     1.54              6
 8 Paracant…     0.490       0.295       0.533     2                 1
 9 Scarus        0.391       0.195       0.341     3.42             15
10 Siganus       0.443       0.329       0.352     2.21             11
11 Zebrasoma     0.575       0.305       0.657     2.80              2
# ℹ 1 more variable: highvslow <chr>
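Because the printed tibble hides the new column, a small sketch can show what `case_when()` produces. The data below are hypothetical (`toy` and its values are assumptions for illustration); conditions are evaluated in order and `TRUE ~ "low"` acts as the fallback for rows that match nothing earlier.

```r
library(dplyr)

# Hypothetical toy data with one value above, one at, and one below the threshold
toy <- tibble(genus = c("A", "B", "C"), mean.abu = c(4.01, 2, 1.5))

labelled <- toy %>%
  mutate(highvslow = case_when(mean.abu >= 2 ~ "high", # threshold is inclusive
                               TRUE ~ "low"))          # everything else

# with only two outcomes, if_else() gives the same labels
labelled_ifelse <- toy %>%
  mutate(highvslow = if_else(mean.abu >= 2, "high", "low"))

labelled
```

`case_when()` becomes more useful than `if_else()` once there are three or more categories to assign.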
herbivores.highlow <- herbivores.joined.2 %>%
  gather(2:5, key = "metric", value = "value") %>% # gather columns 2 to 5 (three traits plus mean.abu) into long format so one summarize() covers them all
  group_by(metric, highvslow) %>% # grouping
  summarize(mean.val = mean(value), # summarizing for mean and sd
            sd.val = sd(value))
herbivores.highlow
# A tibble: 8 × 4
# Groups:   metric [4]
  metric      highvslow mean.val sd.val
  <chr>       <chr>        <dbl>  <dbl>
1 bodydepth   high         0.477 0.0688
2 bodydepth   low          0.409 0.0466
3 eyediameter high         0.272 0.0567
4 eyediameter low          0.210 0.0613
5 mean.abu    high         3.63  1.44  
6 mean.abu    low          1.26  0.299 
7 snoutlength high         0.469 0.119 
8 snoutlength low          0.361 0.164 

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. Source code is available at https://github.com/simonjbrandl/marinecommunityecology, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".