Madagascar Report Card

Benthic Cover and Fish Biomass

Authors

Iain R. Caldwell

James P.W. Robinson

Published

May 19, 2025


This example is based on a “Madagascar Report Card” comparing and examining patterns between benthic cover and fish biomass across sites in Madagascar, adapted to use data from the MERMAID platform (https://datamermaid.org). Visualizing fish and benthic cover data together can be helpful for identifying if fish biomass and benthic cover are related and whether some sites are doing much better than others. This could be a first step in examining possible drivers of benthic and fish communities.


Getting summary sample event data from MERMAID

The first step is to download summary sample event data from MERMAID, using the mermaidr package (for which documentation can be found at https://data-mermaid.github.io/mermaidr/). The summary sample events contain all surveys (i.e. sample events) that have permissions of “public summary” or “public”.

Show the code
rm(list = ls()) #remove past stored objects
options(scipen = 999) #turn off scientific notation

####  Load packages and libraries ####
## If this is the first time using mermaidr, install the package through "remotes"
# install.packages("remotes")
# remotes::install_github("data-mermaid/mermaidr")

library(mermaidr) #package to download data from datamermaid.org
library(tidyverse) #package that makes it easier to work with data
library(plotly) #for interactive plotting
library(htmlwidgets) #for saving plots at html files
library(DT) #package for interactive tables
library(ggplot2)
library(ggpubr)

#### Get data from MERMAID for creating aggregate visualizations ####
allMermaidSampEventsTBL <- mermaidr::mermaid_get_summary_sampleevents()

Filter to an individual project and get relevant data by site

The next step is to filter the data to an individual project and get data for the relevant benthic cover and fish biomass data for each site.

In this case, I will focus on the project called “Madagascar NW 2020”, as it has data policies of “public summary” for both benthic cover and fish biomass.

Since the report card shows patterns in benthic cover and fish biomass by site, I will also get just the most recent surveys for each site and select only the benthic cover and fish biomass fields of interest. In the case of this project (at least at the time of writing this code) there is only one survey (i.e. sample event) per site so the step of getting the most recent survey is not strictly necessary. However, I have included this step here so it can be applied to other projects that may have multiple surveys per site.

Although there are multiple protocols that could collect the benthic data of interest, this project only includes one (benthic PIT) so I am focusing on that protocol. However, this could be applied to other protocols by changing the associated field names (i.e. column titles). The report card has benthic cover figures seperated into broad benthic categories. In the MERMAID summary sample events there are 11 of these broad categories that I will use: Hard coral, Soft coral, Macroalgae, Turf algae, Crustose coralline algae, Rubble, Sand, Seagrass, Cyanobacteria, Other invertebrates, and Bare substrate. I also calculate an “Other” category in case those 11 categories do not encompass all of the benthic cover.

Show the code
projectName = "Madagascar NW 2020"

madagascarProjSampleEventsTBL <- allMermaidSampEventsTBL %>% 
  filter(project == projectName &
           !is.na(benthicpit_sample_unit_count) &
1           !is.na(beltfish_sample_unit_count)) %>%
2  group_by(site) %>%
  filter(sample_date == max(sample_date)) %>%
  ungroup() %>%
3  select(site, management_rules,
4         `benthicpit_percent_cover_benthic_category_avg_Hard coral`,
         `benthicpit_percent_cover_benthic_category_avg_Soft coral`,
         benthicpit_percent_cover_benthic_category_avg_Macroalgae,
         `benthicpit_percent_cover_benthic_category_avg_Turf algae`,
         `benthicpit_percent_cover_benthic_category_avg_Crustose coralline algae`,
         benthicpit_percent_cover_benthic_category_avg_Rubble,
         benthicpit_percent_cover_benthic_category_avg_Sand,
         benthicpit_percent_cover_benthic_category_avg_Seagrass,
         benthicpit_percent_cover_benthic_category_avg_Cyanobacteria,
         `benthicpit_percent_cover_benthic_category_avg_Other invertebrates`,
         `benthicpit_percent_cover_benthic_category_avg_Bare substrate`,
5         beltfish_biomass_kgha_trophic_group_avg_planktivore,
         `beltfish_biomass_kgha_trophic_group_avg_herbivore-macroalgae`,
         `beltfish_biomass_kgha_trophic_group_avg_herbivore-detritivore`,
         `beltfish_biomass_kgha_trophic_group_avg_invertivore-sessile`,
         `beltfish_biomass_kgha_trophic_group_avg_invertivore-mobile`,
         beltfish_biomass_kgha_trophic_group_avg_omnivore,
         beltfish_biomass_kgha_trophic_group_avg_piscivore,
         beltfish_biomass_kgha_trophic_group_avg_other) %>%
6  rename_with(~ gsub("benthicpit_percent_cover_benthic_category_avg_",
                     "percent_cover_",
                     .x, fixed = TRUE)) %>% 
  rename_with(~ gsub("beltfish_biomass_kgha_trophic_group_avg_",
                     "biomass_",
                     .x, fixed = TRUE)) %>% 
  replace(is.na(.), 0) %>% 
7  mutate(TotalBiomass = biomass_planktivore + `biomass_herbivore-macroalgae` +
           `biomass_herbivore-detritivore` +
           `biomass_invertivore-sessile` + `biomass_invertivore-mobile` + 
           biomass_omnivore + biomass_piscivore + biomass_other,
8         percent_cover_Other = 100 - (`percent_cover_Hard coral` +
                                        `percent_cover_Soft coral` +
                                        percent_cover_Macroalgae +
                                        `percent_cover_Turf algae` +
                                        `percent_cover_Crustose coralline algae` +
                                        percent_cover_Rubble +
                                        percent_cover_Sand +
                                        percent_cover_Seagrass +
                                        percent_cover_Cyanobacteria +
                                        `percent_cover_Other invertebrates` +
                                        `percent_cover_Bare substrate`)) %>% 
  arrange(`percent_cover_Hard coral`)
1
Filter to an individual project and remove any sample events that don’t have any benthic or fish data
2
Get the most recent data for each site
3
Select relevant site data and…
4
…relevant benthic cover fields and…
5
…relevant fish biomass fields
6
Shorten columns to make them easier to reference
7
Calculate a total biomass across the trophic groups
8
Assign the remaining % cover to “Other”

Take a look at the data

One way to view the data after the filtering and selecting is as an interactive table using the package “DT” (https://rstudio.github.io/DT/https://rstudio.github.io/DT/).

Show the code
datatable(madagascarProjSampleEventsTBL %>%
            select(site, management_rules,
                   `percent_cover_Hard coral`, TotalBiomass))

In the table you can sort, filter, or search.


Ecosystem status text block

In the report card there is a text block with information about the status of the ecosystem for each region in Madagascar. This includes averages for fish biomass, hard coral % cover, macroalgae % cover, and the number of sites. Also included is the number of fish species but I have not included that here as it is not currently something that is reported for summary sample events. However, this could be calculated from the observations within a project if the fishbelt data policy is set to “public” or if you are a user within a project. The following is a recreation of that text block, which can be then combined with the plots into a single document.

Show the code
## arrange summary stats for a top-left data display
datadisplay <- data.frame(
  x = 0,
  y = seq(1,0, length.out = 4),
  value = c(paste(round(mean(madagascarProjSampleEventsTBL$TotalBiomass),
                        0),
                  'kg/ha'),
            paste0(round(
              mean(madagascarProjSampleEventsTBL$`percent_cover_Hard coral`),
              0),'%'),
            paste0(round(
              mean(madagascarProjSampleEventsTBL$percent_cover_Macroalgae),
              0), '%'),
            length(unique(madagascarProjSampleEventsTBL$site))),
  text = c('Biomass (mean):', 'Hard coral (mean):',
           'Macroalgae (mean):', '# sites'))

gtext <- ggplot(datadisplay) +
  geom_text(aes(x, y, label = text), fontface = 'bold', hjust = 1) +
  geom_text(aes(x, y, label = value), nudge_x = .01, hjust = 0) + 
  annotate(x = 0, y = 1.6, geom = 'text',
           label = projectName,
           fontface = 'bold', size =5.5, hjust = 0.5) +
  annotate(x = 0, y = 1.3, geom = 'text',
           label = 'Ecosystem status', fontface = 'plain',
           size = 4.5, hjust = 0.5) +
  theme_void() +
  theme(plot.margin=unit(c(0, 0, 0, 0), 'cm')) +
  scale_y_continuous(limits = c(-.1, 1.8), expand=c(0,0)) +
  scale_x_continuous(limits = c(-0.2, 0.2), expand=c(0,0))

gtext


Benthic cover by site and category

Stacked horizontal bar plot

The following recreates the first part of the report - a plot showing the % cover of all high level benthic categories for each site as stacked horizontal bars. The x-axis showing the % benthic cover is reversed and the site names are on a y-axis on the right of the plot as this plot will later be combined with another plot beside it with fish biomass. Sites are ordered from highest to lowest % hard coral, with the values for % hard coral shown on the left.

Show the code
### Reshape the data so that hard coral categories are in rows
benthicTBL <- madagascarProjSampleEventsTBL %>%
  select(site, `percent_cover_Hard coral`:`percent_cover_Bare substrate`,
         TotalBiomass, percent_cover_Other) %>% 
  pivot_longer(-c(site, TotalBiomass),
               names_prefix = "percent_cover_",
               names_to = 'benthic',
               values_to = 'cover') %>%
  mutate(cover = ifelse(is.na(cover), 0, cover),
         #Make site into a factor with levels that sort by hard coral cover
         site = factor(site, levels = madagascarProjSampleEventsTBL$site))

benthicTBL <- benthicTBL %>% 
  mutate(benthic = factor(benthic, levels = unique(benthicTBL$benthic)))

### Assign colors to the benthic types to match the dashboard
benthic_colors <- c("Hard coral" = "#498fc9",
                    "Soft coral" = "#9ce5fa",
                    "Macroalgae" = "#b2b000",
                    "Turf algae" = "#d9eea8",
                    "Crustose coralline algae" = "#fbd7d5",
                    "Rubble" = "#f5f6af",
                    "Sand" = "#c1b180",
                    "Seagrass" = "#4d4d4d",
                    "Cyanobacteria" = "#870e00",
                    "Other invertebrates" = "#4e4e4e",
                    "Bare substrate" = "#f2f3f3")
  
benthicPlot <- ggplot(data = benthicTBL,
                      aes(x = as.numeric(site), y = cover, fill = benthic)) +
  geom_bar(position = 'stack', stat = 'identity', alpha = 0.9,
           color = "black", linewidth = 0.25) +
  labs(x = '', y = 'Cover (%)', fill = 'Benthic type',
       title = 'Benthos') +
  coord_flip() +
  scale_fill_manual(values = benthic_colors) +
  scale_y_reverse(expand = c(0,0)) + 
  scale_x_continuous(expand = c(0,0),
                     breaks = 1:length(unique(benthicTBL$site)),
                     labels = paste0(
                       round(benthicTBL$cover[benthicTBL$benthic == "Hard coral"]),
                       '%'),
                     sec.axis = sec_axis(~.,
                                         breaks = 1:length(unique(benthicTBL$site)),
                                         labels = unique(benthicTBL$site))) +
  theme_classic() +
  theme(legend.position = 'left',
        axis.text = element_text(size = 11, colour = 'black'),
        axis.line = element_line(colour = 'black'),
        axis.ticks = element_line(colour = 'black'),
        axis.title = element_text(size = 12, colour = 'black'),
        plot.subtitle = element_text(colour = 'black', size = 11, hjust = 0.5),
        legend.background = element_rect(fill = 'white', color = NA),
        legend.box.background = element_blank(),
        legend.key = element_rect(color = "black", linewidth = 0.25),
        legend.title = element_text(colour = 'black', face = 'bold'),
        plot.title = element_text(colour = 'black', size = 14, hjust = 0.5,
                                  face = 'bold'),
        plot.margin = unit(c(0.2, 0.1, 0, 0), 'cm'),
        axis.text.y.left = element_text(size = 9, color = '#498FC9'),
        axis.text.y.right = element_text(size = 9, color = 'black', hjust = 0.5),
        axis.line.y = element_blank(),
        axis.ticks.x = element_line(color = 'black'),
        axis.ticks.y = element_blank(),
        panel.border = element_blank())

#Save the plot to share in an analysis hub article
ggsave("../plots/madagascar_report_card_benthicPlot.svg",
       plot = benthicPlot)
Saving 7 x 5 in image
Show the code
benthicPlot


Fish biomass by trophic group

Stacked horizontal bar plot

The following recreates the top right figure in the report - a plot showing fish biomass for each trophic group of fishes as stacked horizontal bars for each site. The x-axis has biomass in kg/ha and the site names are on a y-axis on the left of the plot. Sites are ordered the same as in the benthic plot, from highest to lowest % hard coral. Also included at the end of each stacked bar is text with the total fish biomass for each site and the management regime name.

Show the code
### Reshape the data so that trophic groups are in rows
fishTBL <- madagascarProjSampleEventsTBL %>%
  select(site, management_rules, `percent_cover_Hard coral`,
         biomass_planktivore:TotalBiomass) %>% 
  pivot_longer(-c(site, management_rules,
                  `percent_cover_Hard coral`, TotalBiomass),
               names_prefix = "biomass_",
               names_to = 'TG',
               values_to = 'biomass') %>%
  mutate(biomass = ifelse(is.na(biomass), 0, biomass),
         #Make site into a factor with levels that sort by hard coral cover
         site = factor(site, levels = madagascarProjSampleEventsTBL$site),
         TG = case_when(TG == "planktivore" ~ "Planktivore",
                        TG == "herbivore-macroalgae" ~ "Herbivore (macroalgae)",
                        TG == "herbivore-detritivore" ~ "Herbivore (detritivore)",
                        TG == "invertivore-sessile" ~ "Invertivore (sessile)",
                        TG == "invertivore-mobile" ~ "Invertivore (mobile)",
                        TG == "omnivore" ~ "Omnivore",
                        TG == "piscivore" ~ "Piscivore",
                        TG == "other" ~ "Other",
                        .default = TG)) %>% 
  mutate(TG = factor(TG, levels = c("Planktivore",
                                    "Herbivore (macroalgae)",
                                    "Herbivore (detritivore)",
                                    "Invertivore (sessile)",
                                    "Invertivore (mobile)",
                                    "Omnivore",
                                    "Piscivore",
                                    "Other")))

### Assign colors to the trophic groups to match the dashboard
tg_colors <- c("Planktivore" = "#bebad8",
               "Herbivore (macroalgae)" = "#659034",
               "Herbivore (detritivore)" = "#ddee96",
               "Invertivore (sessile)" = "#f1da54",
               "Invertivore (mobile)" = "#e7b16d",
               "Omnivore" = "#9ccbc1",
               "Piscivore" = "#597db4",
               "Other" = "grey")

#### Recreate the stacked barplot with fish biomass by trophic group
fishBiomassPlot <- ggplot(data = fishTBL,
                          aes(x = site, y = biomass, fill = TG)) +
  geom_bar(position = 'stack', stat = 'identity', alpha = 0.9,
           color = "black", linewidth = 0.25) +
  labs(x = '', y = 'Biomass (kg/ha)', fill = 'Trophic group',
       title = 'Fish') +
  coord_flip() +
  scale_fill_manual(values = tg_colors) +
  geom_text(data = fishTBL,
            aes(x = site, y = TotalBiomass + 50,
                label = paste0(round(TotalBiomass, 0))),
            size = 2, hjust = 0, vjust = -.25) +
  geom_text(data = fishTBL,
            aes(x = site, y = TotalBiomass + 50,
                label = management_rules),
            size = 2, hjust = 0, vjust = 1.2) +
  scale_y_continuous(expand = c(0, 0),
                     limits = c(0, max(fishTBL$TotalBiomass) + 600)) +
  theme_classic() +
  theme(axis.text = element_text(size = 11, colour = 'black'),
        axis.line = element_line(colour = 'black'),
        axis.ticks = element_line(colour = 'black'),
        axis.title = element_text(size = 12, colour = 'black'),
        plot.subtitle = element_text(colour = 'black', size = 11, hjust = 0.5),
        legend.background = element_rect(fill = 'white', color = NA),
        legend.position = "right",
        plot.title = element_text(colour = 'black', size = 14,
                                hjust = 0.5, face = 'bold'),
        legend.box.background = element_blank(),
        legend.key = element_rect(color = "black", linewidth = 0.25),
        legend.title = element_text(colour = 'black', face = 'bold'),
        plot.margin = unit(c(0.2, 1, 0, 0.2), 'cm'),
        axis.ticks.y = element_blank(),
        axis.line.x = element_line(color = 'black'),
        axis.line.y = element_blank(),
        axis.ticks.x = element_line(color = 'black'),
        axis.text.y = element_text(hjust = 0.5, size = 8),
        panel.border = element_blank())

#Save the plot to share in an analysis hub article
ggsave("../plots/madagascar_report_card_fishBiomassPlot.svg",
       plot = fishBiomassPlot)
Saving 7 x 5 in image
Show the code
fishBiomassPlot


Fish biomass vs. benthic cover categories

Scatter plots with fitted GAMs

In the bottom left corner of the report are scatterplots that show the spread of fish biomass (on the y-axis) and % cover (on the x-axis) for each of 8 high level benthic categories (as facets) - resulting in 8 plots. Within each scatterplot, the data is fitted to a generalized additive model (GAM) using the geom_smooth function (https://ggplot2.tidyverse.org/reference/geom_smooth.html). Below is a recreation of these plots. Although there are 11 high level benthic attributes returned by in MERMAID’s summary sample events (as can be seen in the benthic plot above), only a subset of 8 are featured here to allow more room for each plot. Those 8 are the following: Hard coral, Soft coral, Macroalgae, Turf algae, Crustose coralline algae, Rubble, Sand, and Bare substrate.

Show the code
biomassVsCoverScatterplots <-
  ggplot(benthicTBL %>% 
           filter(!benthic %in% c("Cyanobacteria",
                                  "Other invertebrates",
                                  "Seagrass",
                                  "Other")),
         aes(cover, TotalBiomass)) +
  geom_smooth(method = 'gam', col = 'grey', fill = 'grey90') +
  labs(x = 'Benthic cover (%)',
       y = "Fish biomass (kg/ha)",
       title = '') +
  geom_point(aes(fill = benthic), shape = 21, alpha = 0.8, size = 2) +
  scale_color_manual(values = benthic_colors) +
  facet_wrap(~benthic, nrow = 4, scales = 'free') +
  theme_classic() +
  theme(axis.text = element_text(size = 9, colour = 'black'),
        axis.line = element_line(colour = 'black'),
        axis.ticks = element_line(colour = 'black'),
        axis.title = element_text(size = 12, colour = 'black'),
        plot.subtitle = element_text(colour = 'black', size = 11, hjust = 0.5),
        legend.background = element_rect(fill = 'white', color = NA),
        legend.box.background = element_blank(),
        legend.title = element_blank(),
        legend.position = 'none',
        strip.text = element_text(hjust = 0, face = "bold"),
        strip.background = element_blank(),
        plot.margin = unit(c(0, .5, 0, .5), 'cm'),
        panel.border = element_blank())

#Save the plot to share in an analysis hub article
ggsave("../plots/madagascar_report_card_biomassVsCoverPlots.svg",
       plot = biomassVsCoverScatterplots)

biomassVsCoverScatterplots

In this case, there are too many zeroes to fit a GAM to the “Crustose coralline algae” category.


Trophic group biomass vs. hard coral cover

Scatter plots with fitted GAMS

In the bottom right corner of the report are more scatterplots that show fish biomass for each of 7 trophic group (on the y-axis) and % hard coral cover (on the x-axis), with each trophic group as a separate facet/plot. Within each plot, the data is again fitted to a generalized additive model (GAM) as above. Below is a recreation of these plots.

Show the code
trophicBiomassVsHarcCoralScatterplots <-
  ggplot(fishTBL %>% 
           filter(TG != "Other"),
         aes(x = `percent_cover_Hard coral`, y = biomass)) +
  facet_wrap(~TG, ncol = 2, scales = 'free_y') +
  geom_smooth(method = 'gam', col = 'grey', fill = 'grey90') +
  geom_point(aes(fill = TG), shape = 21, alpha = 0.8, size = 2) +
  scale_fill_manual(values = tg_colors) +
  scale_y_continuous(limits = c(0, NA), expand = c(0.1,0.1)) +
  labs(x = 'Hard coral cover (%)',
       y = "Fish biomass (kg/ha)",
       title = '') +
  theme_classic() +
  theme(axis.text = element_text(size = 9, colour = 'black'),
        axis.line = element_line(colour = 'black'),
        axis.ticks = element_line(colour = 'black'),
        axis.title = element_text(size = 12, colour = 'black'),
        plot.subtitle = element_text(colour = 'black', size = 11, hjust = 0.5),
        legend.background = element_rect(fill = 'white', color = NA),
        legend.box.background = element_blank(),
        legend.title = element_blank(),
        legend.position = 'none',
        strip.text = element_text(hjust = 0, face = "bold"),
        strip.background = element_blank(),
        plot.margin = unit(c(0, .5, 0, .5), 'cm'),
        panel.border = element_blank())

trophicBiomassVsHarcCoralScatterplots

 

Powered by

Logo