This example is based on a “Madagascar Report Card” comparing and examining patterns between benthic cover and fish biomass across sites in Madagascar, adapted to use data from the MERMAID platform (https://datamermaid.org). Visualizing fish and benthic cover data together can be helpful for identifying if fish biomass and benthic cover are related and whether some sites are doing much better than others. This could be a first step in examining possible drivers of benthic and fish communities.
Getting summary sample event data from MERMAID
The first step is to download summary sample event data from MERMAID, using the mermaidr package (for which documentation can be found at https://data-mermaid.github.io/mermaidr/). The summary sample events contain all surveys (i.e. sample events) that have permissions of “public summary” or “public”.
Show the code
rm(list =ls()) #remove past stored objectsoptions(scipen =999) #turn off scientific notation#### Load packages and libraries ###### If this is the first time using mermaidr, install the package through "remotes"# install.packages("remotes")# remotes::install_github("data-mermaid/mermaidr")library(mermaidr) #package to download data from datamermaid.orglibrary(tidyverse) #package that makes it easier to work with datalibrary(plotly) #for interactive plottinglibrary(htmlwidgets) #for saving plots at html fileslibrary(DT) #package for interactive tableslibrary(ggplot2)library(ggpubr)#### Get data from MERMAID for creating aggregate visualizations ####allMermaidSampEventsTBL <- mermaidr::mermaid_get_summary_sampleevents()
Filter to an individual project and get relevant data by site
The next step is to filter the data to an individual project and get data for the relevant benthic cover and fish biomass data for each site.
In this case, I will focus on the project called “Madagascar NW 2020”, as it has data policies of “public summary” for both benthic cover and fish biomass.
Since the report card shows patterns in benthic cover and fish biomass by site, I will also get just the most recent surveys for each site and select only the benthic cover and fish biomass fields of interest. In the case of this project (at least at the time of writing this code) there is only one survey (i.e. sample event) per site so the step of getting the most recent survey is not strictly necessary. However, I have included this step here so it can be applied to other projects that may have multiple surveys per site.
Although there are multiple protocols that could collect the benthic data of interest, this project only includes one (benthic PIT) so I am focusing on that protocol. However, this could be applied to other protocols by changing the associated field names (i.e. column titles). The report card has benthic cover figures seperated into broad benthic categories. In the MERMAID summary sample events there are 11 of these broad categories that I will use: Hard coral, Soft coral, Macroalgae, Turf algae, Crustose coralline algae, Rubble, Sand, Seagrass, Cyanobacteria, Other invertebrates, and Bare substrate. I also calculate an “Other” category in case those 11 categories do not encompass all of the benthic cover.
In the report card there is a text block with information about the status of the ecosystem for each region in Madagascar. This includes averages for fish biomass, hard coral % cover, macroalgae % cover, and the number of sites. Also included is the number of fish species but I have not included that here as it is not currently something that is reported for summary sample events. However, this could be calculated from the observations within a project if the fishbelt data policy is set to “public” or if you are a user within a project. The following is a recreation of that text block, which can be then combined with the plots into a single document.
The following recreates the first part of the report - a plot showing the % cover of all high level benthic categories for each site as stacked horizontal bars. The x-axis showing the % benthic cover is reversed and the site names are on a y-axis on the right of the plot as this plot will later be combined with another plot beside it with fish biomass. Sites are ordered from highest to lowest % hard coral, with the values for % hard coral shown on the left.
Show the code
### Reshape the data so that hard coral categories are in rowsbenthicTBL <- madagascarProjSampleEventsTBL %>%select(site, `percent_cover_Hard coral`:`percent_cover_Bare substrate`, TotalBiomass, percent_cover_Other) %>%pivot_longer(-c(site, TotalBiomass),names_prefix ="percent_cover_",names_to ='benthic',values_to ='cover') %>%mutate(cover =ifelse(is.na(cover), 0, cover),#Make site into a factor with levels that sort by hard coral coversite =factor(site, levels = madagascarProjSampleEventsTBL$site))benthicTBL <- benthicTBL %>%mutate(benthic =factor(benthic, levels =unique(benthicTBL$benthic)))### Assign colors to the benthic types to match the dashboardbenthic_colors <-c("Hard coral"="#498fc9","Soft coral"="#9ce5fa","Macroalgae"="#b2b000","Turf algae"="#d9eea8","Crustose coralline algae"="#fbd7d5","Rubble"="#f5f6af","Sand"="#c1b180","Seagrass"="#4d4d4d","Cyanobacteria"="#870e00","Other invertebrates"="#4e4e4e","Bare substrate"="#f2f3f3")benthicPlot <-ggplot(data = benthicTBL,aes(x =as.numeric(site), y = cover, fill = benthic)) +geom_bar(position ='stack', stat ='identity', alpha =0.9,color ="black", linewidth =0.25) +labs(x ='', y ='Cover (%)', fill ='Benthic type',title ='Benthos') +coord_flip() +scale_fill_manual(values = benthic_colors) +scale_y_reverse(expand =c(0,0)) +scale_x_continuous(expand =c(0,0),breaks =1:length(unique(benthicTBL$site)),labels =paste0(round(benthicTBL$cover[benthicTBL$benthic =="Hard coral"]),'%'),sec.axis =sec_axis(~.,breaks =1:length(unique(benthicTBL$site)),labels =unique(benthicTBL$site))) +theme_classic() +theme(legend.position ='left',axis.text =element_text(size =11, colour ='black'),axis.line =element_line(colour ='black'),axis.ticks =element_line(colour ='black'),axis.title =element_text(size =12, colour ='black'),plot.subtitle =element_text(colour ='black', size =11, hjust =0.5),legend.background =element_rect(fill ='white', color =NA),legend.box.background =element_blank(),legend.key =element_rect(color ="black", linewidth =0.25),legend.title =element_text(colour ='black', face ='bold'),plot.title =element_text(colour ='black', size =14, hjust =0.5,face ='bold'),plot.margin =unit(c(0.2, 0.1, 0, 0), 'cm'),axis.text.y.left =element_text(size =9, color ='#498FC9'),axis.text.y.right =element_text(size =9, color ='black', hjust =0.5),axis.line.y =element_blank(),axis.ticks.x =element_line(color ='black'),axis.ticks.y =element_blank(),panel.border =element_blank())#Save the plot to share in an analysis hub articleggsave("../plots/madagascar_report_card_benthicPlot.svg",plot = benthicPlot)
Saving 7 x 5 in image
Show the code
benthicPlot
Fish biomass by trophic group
Stacked horizontal bar plot
The following recreates the top right figure in the report - a plot showing fish biomass for each trophic group of fishes as stacked horizontal bars for each site. The x-axis has biomass in kg/ha and the site names are on a y-axis on the left of the plot. Sites are ordered the same as in the benthic plot, from highest to lowest % hard coral. Also included at the end of each stacked bar is text with the total fish biomass for each site and the management regime name.
Show the code
### Reshape the data so that trophic groups are in rowsfishTBL <- madagascarProjSampleEventsTBL %>%select(site, management_rules, `percent_cover_Hard coral`, biomass_planktivore:TotalBiomass) %>%pivot_longer(-c(site, management_rules,`percent_cover_Hard coral`, TotalBiomass),names_prefix ="biomass_",names_to ='TG',values_to ='biomass') %>%mutate(biomass =ifelse(is.na(biomass), 0, biomass),#Make site into a factor with levels that sort by hard coral coversite =factor(site, levels = madagascarProjSampleEventsTBL$site),TG =case_when(TG =="planktivore"~"Planktivore", TG =="herbivore-macroalgae"~"Herbivore (macroalgae)", TG =="herbivore-detritivore"~"Herbivore (detritivore)", TG =="invertivore-sessile"~"Invertivore (sessile)", TG =="invertivore-mobile"~"Invertivore (mobile)", TG =="omnivore"~"Omnivore", TG =="piscivore"~"Piscivore", TG =="other"~"Other",.default = TG)) %>%mutate(TG =factor(TG, levels =c("Planktivore","Herbivore (macroalgae)","Herbivore (detritivore)","Invertivore (sessile)","Invertivore (mobile)","Omnivore","Piscivore","Other")))### Assign colors to the trophic groups to match the dashboardtg_colors <-c("Planktivore"="#bebad8","Herbivore (macroalgae)"="#659034","Herbivore (detritivore)"="#ddee96","Invertivore (sessile)"="#f1da54","Invertivore (mobile)"="#e7b16d","Omnivore"="#9ccbc1","Piscivore"="#597db4","Other"="grey")#### Recreate the stacked barplot with fish biomass by trophic groupfishBiomassPlot <-ggplot(data = fishTBL,aes(x = site, y = biomass, fill = TG)) +geom_bar(position ='stack', stat ='identity', alpha =0.9,color ="black", linewidth =0.25) +labs(x ='', y ='Biomass (kg/ha)', fill ='Trophic group',title ='Fish') +coord_flip() +scale_fill_manual(values = tg_colors) +geom_text(data = fishTBL,aes(x = site, y = TotalBiomass +50,label =paste0(round(TotalBiomass, 0))),size =2, hjust =0, vjust =-.25) +geom_text(data = fishTBL,aes(x = site, y = TotalBiomass +50,label = management_rules),size =2, hjust =0, vjust =1.2) +scale_y_continuous(expand =c(0, 0),limits =c(0, max(fishTBL$TotalBiomass) +600)) +theme_classic() +theme(axis.text =element_text(size =11, colour ='black'),axis.line =element_line(colour ='black'),axis.ticks =element_line(colour ='black'),axis.title =element_text(size =12, colour ='black'),plot.subtitle =element_text(colour ='black', size =11, hjust =0.5),legend.background =element_rect(fill ='white', color =NA),legend.position ="right",plot.title =element_text(colour ='black', size =14,hjust =0.5, face ='bold'),legend.box.background =element_blank(),legend.key =element_rect(color ="black", linewidth =0.25),legend.title =element_text(colour ='black', face ='bold'),plot.margin =unit(c(0.2, 1, 0, 0.2), 'cm'),axis.ticks.y =element_blank(),axis.line.x =element_line(color ='black'),axis.line.y =element_blank(),axis.ticks.x =element_line(color ='black'),axis.text.y =element_text(hjust =0.5, size =8),panel.border =element_blank())#Save the plot to share in an analysis hub articleggsave("../plots/madagascar_report_card_fishBiomassPlot.svg",plot = fishBiomassPlot)
Saving 7 x 5 in image
Show the code
fishBiomassPlot
Fish biomass vs. benthic cover categories
Scatter plots with fitted GAMs
In the bottom left corner of the report are scatterplots that show the spread of fish biomass (on the y-axis) and % cover (on the x-axis) for each of 8 high level benthic categories (as facets) - resulting in 8 plots. Within each scatterplot, the data is fitted to a generalized additive model (GAM) using the geom_smooth function (https://ggplot2.tidyverse.org/reference/geom_smooth.html). Below is a recreation of these plots. Although there are 11 high level benthic attributes returned by in MERMAID’s summary sample events (as can be seen in the benthic plot above), only a subset of 8 are featured here to allow more room for each plot. Those 8 are the following: Hard coral, Soft coral, Macroalgae, Turf algae, Crustose coralline algae, Rubble, Sand, and Bare substrate.
In this case, there are too many zeroes to fit a GAM to the “Crustose coralline algae” category.
Trophic group biomass vs. hard coral cover
Scatter plots with fitted GAMS
In the bottom right corner of the report are more scatterplots that show fish biomass for each of 7 trophic group (on the y-axis) and % hard coral cover (on the x-axis), with each trophic group as a separate facet/plot. Within each plot, the data is again fitted to a generalized additive model (GAM) as above. Below is a recreation of these plots.