Title: | Calculate and Plot Pollutant Load Duration Curves |
---|---|
Description: | Load duration curves are a method for visualizing pollutant loads in freshwater streams based on the assumed relationship between streamflow and load. Functions are provided for calculating exceedance probabilities, pollutant loading, and plotting load duration curves. |
Authors: | Michael Schramm [aut, cre], Texas Water Resources Institute [cph] |
Maintainer: | Michael Schramm <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2025-01-08 04:28:32 UTC |
Source: | https://github.com/TxWRI/ldc |
Calculates the median annual ldc with confidence intervals.
calc_annual_ldc( .tbl, Q = NULL, C = NULL, Date = NULL, allowable_concentration = NULL, breaks = c(1, 0.8, 0.4, 0), labels = c("High Flows", "Medium Flows", "Low Flows"), conf_level = 0.9, estimator = 6, n = 500 )
.tbl | data frame with at least three columns: Q (discharge or flow), C (associated pollutant concentration), and Date. |
Q | variable name in .tbl for discharge or flow. This must be of class 'units', typically with a units value of "ft^3/s". |
C | variable name in .tbl for associated pollutant concentration at a given flow value. This must be of class 'units', typically with a units value of "mg/L" or "cfu/100mL". |
Date | variable name in .tbl for the event Date. This variable must be of class 'Date'. |
allowable_concentration | an object of class 'units' specifying the allowable pollutant concentration. |
breaks | a numeric vector of break points for flow categories. Must have length equal to the number of labels + 1. Defaults to c(1, 0.8, 0.4, 0). |
labels | labels for the categories specified by breaks. |
conf_level | numeric, confidence level (default is 0.9) of the median interval at a given exceedance probability. |
estimator | the estimator type used for the plotting position. Defaults to 6, which corresponds to the Weibull plotting position. |
n | numeric, the length of generated probability points. Larger n may result in a slightly smoother curve at a cost of increased processing time. The probability points are used to generate the continuous sample quantiles, types 5 to 9 (see the type argument of stats::quantile). |
The median annual ldc is calculated by computing the flow duration curve for each individual year in the dataset. Exceedance probabilities are calculated from the descending order of Daily Flows. By default, the Weibull plotting position is used:
$$p = P(Q > q_i) = \frac{i}{n + 1}$$
where $q_i$, $i = 1, 2, \ldots, n$, is the i-th sorted streamflow value and $n$ is the number of streamflow values (not the n argument).
The median streamflow and its confidence interval at the chosen conf_level are calculated at each exceedance probability. The load duration curve is calculated by multiplying the median streamflow by the allowable concentration and applying the appropriate unit conversions.
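The per-year calculation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the flows are synthetic, the names `flows`, `n_points`, `p_exceed`, `annual_fdc`, and `median_fdc` are hypothetical, and the confidence interval step is omitted.

```r
# Synthetic daily flows (cfs) for three years; values are made up for illustration.
set.seed(1)
flows <- data.frame(
  Year = rep(2001:2003, each = 365),
  Q    = rlnorm(3 * 365, meanlog = 3, sdlog = 1)
)

# Exceedance probabilities at which every annual curve is evaluated
# (this plays the role of the `n` argument; the end points 0 and 1 are dropped).
n_points <- 100
p_exceed <- seq(0, 1, length.out = n_points + 2)[-c(1, n_points + 2)]

# One flow duration curve per year. The flow exceeded with probability p is the
# (1 - p) quantile; type = 6 selects the Weibull plotting position.
annual_fdc <- sapply(
  split(flows$Q, flows$Year),
  function(q) quantile(q, probs = 1 - p_exceed, type = 6, names = FALSE)
)

# Median across years at each exceedance probability.
median_fdc <- data.frame(
  P_Exceedance = p_exceed,
  Median_Q     = apply(annual_fdc, 1, median)
)
head(median_fdc)
```

Each column of `annual_fdc` is one year's flow duration curve, so the row-wise median gives one median flow per exceedance probability.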
A list of two tibbles (Q and C). Includes the variables in .tbl plus Daily_Flow_Volume (discharge volume), Daily_Load (pollutant sample volume), P_Exceedance (exceedance probability), and Flow_Category (as defined by breaks and labels).
Vogel, Richard M., and Neil M. Fennessey. "Flow-duration curves. I: New interpretation and confidence intervals." Journal of Water Resources Planning and Management 120, no. 4 (1994): 485-504. doi:10.1061/(ASCE)0733-9496(1994)120:4(485)
# Basic example using built in Tres Palacios data
library(ldc)
library(dplyr)
library(units)

# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))

# Calculate LDC
## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_annual_ldc(df,
                          Q = Flow,
                          C = Indicator_Bacteria,
                          Date = Date,
                          allowable_concentration = allowable_concentration,
                          estimator = 5,
                          n = 1000)
df_ldc$Q

## cleanup
remove_unit("cfu")
Calculates the period of record load duration curve from a data frame that includes mean daily flow and associated point measurements of pollutant concentration.
calc_ldc( .tbl, Q = NULL, C = NULL, allowable_concentration = NULL, breaks = c(1, 0.8, 0.4, 0), labels = c("High Flows", "Medium Flows", "Low Flows"), estimator = 6 )
.tbl | data frame with at least two columns: Q (discharge or flow) and C (associated pollutant concentration). |
Q | variable name in .tbl for discharge or flow. This must have a unit set, typically "ft^3/s". |
C | variable name in .tbl for associated pollutant concentration at a given flow value. This must have a unit set, typically "mg/L" or "cfu/100mL". |
allowable_concentration | an object of class 'units' specifying the allowable pollutant concentration. |
breaks | a numeric vector of break points for flow categories. Must have length equal to the number of labels + 1. Defaults to c(1, 0.8, 0.4, 0). |
labels | labels for the categories specified by breaks. |
estimator | numeric, the estimator type used for the plotting position. Defaults to 6, which corresponds to the Weibull plotting position. |
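The exceedance probability is calculated from the descending order of Daily Flows. By default, the Weibull plotting position is used:
$$p = P(Q > q_i) = \frac{i}{n + 1}$$
where $q_i$, $i = 1, 2, \ldots, n$, is the i-th sorted streamflow value and $n$ is the number of streamflow values.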
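As a rough illustration of the plotting position and the flow categories, the following base R sketch ranks a handful of made-up flows and bins the resulting probabilities. It assumes that the first label applies to the lowest exceedance probabilities (the highest flows); the vector `q` and the other variable names are hypothetical and are not taken from the package.

```r
# Hypothetical mean daily flows in cfs.
q <- c(12, 150, 3, 48, 0.5, 27, 9, 75)

n <- length(q)
# Rank 1 goes to the largest flow (descending order).
i <- rank(-q, ties.method = "first")
# Weibull plotting position: p = i / (n + 1).
p_exceedance <- i / (n + 1)

# Bin the probabilities with the default breaks and labels.
# Assumption: labels are applied in order of increasing exceedance probability,
# so "High Flows" covers the interval from 0 to 0.4.
breaks <- c(1, 0.8, 0.4, 0)
labels <- c("High Flows", "Medium Flows", "Low Flows")
flow_category <- cut(p_exceedance,
                     breaks = sort(breaks),
                     labels = labels,
                     include.lowest = TRUE)

data.frame(q, p_exceedance, flow_category)
```

Because the denominator is n + 1, no observation is assigned a probability of exactly 0 or 1.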
An object of class tibble. Includes the variables in .tbl plus Daily_Flow_Volume (discharge volume), Daily_Load (pollutant sample volume), P_Exceedance (exceedance probability), and Flow_Category (as defined by breaks and labels).
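The unit conversions behind a daily volume and a daily load can be written out by hand. The sketch below uses plain numerics rather than the units package, and assumes flow in ft^3/s and concentration in cfu/100mL; the output units (mL/day and cfu/day) and all variable names are chosen for illustration and may differ from what the package returns.

```r
# Conversion constants.
seconds_per_day <- 86400
ml_per_ft3      <- 28316.846592  # 1 ft^3 = 28.316846592 L

# Hypothetical samples: flow in cfs, concentration in cfu/100mL.
flow_cfs                <- c(10, 250, 1.5)
conc_cfu_per_100ml      <- c(80, 900, 40)
allowable_cfu_per_100ml <- 126

# Daily discharge volume in mL/day.
daily_flow_volume_ml <- flow_cfs * seconds_per_day * ml_per_ft3

# Measured and allowable loads in cfu/day (concentrations are per 100 mL).
daily_load_cfu     <- daily_flow_volume_ml * conc_cfu_per_100ml / 100
allowable_load_cfu <- daily_flow_volume_ml * allowable_cfu_per_100ml / 100

data.frame(flow_cfs,
           daily_flow_volume_ml,
           daily_load_cfu,
           allowable_load_cfu,
           exceeds = daily_load_cfu > allowable_load_cfu)
```

Since both loads share the same flow volume, a sample exceeds the allowable load exactly when its concentration exceeds the allowable concentration.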
# Basic example using built in Tres Palacios data
library(ldc)
library(dplyr)
library(units)

# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))

# Calculate LDC
## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)
df_ldc

## cleanup
remove_unit("cfu")
Creates a load duration curve visualization from the outputs of calc_ldc and summ_ldc as a ggplot object.
draw_ldc( .tbl_calc, .tbl_summ, y_lab = NULL, ldc_legend_name = "Allowable Load at State Water Quality Standard", measurement_name = "Measurement Value", measurement_shape = 21, measurement_color = "dodgerblue", measurement_alpha = 1, summary_name = "Summarized Measured Load", summary_stat_shape = 12, summary_stat_color = "red", label_nudge_y = 0, label_font_family = "Arial", label_font_size = 3, label_break = TRUE )
.tbl_calc | data frame object created by calc_ldc. |
.tbl_summ | data frame object created by summ_ldc. |
y_lab | optional string for the y-axis label; units are appended automatically. Default is NULL. |
ldc_legend_name | string, the name used for the allowable pollutant load line in the legend. Required. |
measurement_name | string, the name used for measured load values in the legend. Required. |
measurement_shape | aesthetic value passed to the layer plotting measured load values. Defaults to 21. |
measurement_color | aesthetic value passed to the layer plotting measured load values. Defaults to "dodgerblue". |
measurement_alpha | aesthetic value passed to the layer plotting measured load values. Defaults to 1. |
summary_name | string, the name used for summary statistic values in the legend. Required. |
summary_stat_shape | aesthetic value passed to the layer plotting summary statistic values. Defaults to 12. |
summary_stat_color | aesthetic value passed to the layer plotting summary statistic values. Defaults to "red". |
label_nudge_y | numeric value to vertically nudge flow category labels. If a log10 transformed scale is being used, a log value is probably appropriate, for example log10(1000). |
label_font_family | string specifying the font family to use in flow category labels. |
label_font_size | numeric value specifying the font size to use in flow category labels. |
label_break | logical, add line breaks to flow category labels. Labels will break at spaces. |
ggplot object
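Two of the label arguments are easier to picture with a small base R sketch. It assumes that label_break replaces each space with a newline and that, on a log10 axis, the nudge is applied in log10 units; both are readings of the argument descriptions rather than confirmed package internals, and the variable names are hypothetical.

```r
labels <- c("High Flows", "Medium Flows", "Low Flows")

# Breaking labels at spaces (assumed behaviour of label_break = TRUE).
broken <- gsub(" ", "\n", labels, fixed = TRUE)
cat(broken[1], "\n")

# On a log10 scale, a nudge of k log units multiplies the plotted value by 10^k.
y        <- 5e9            # hypothetical load where a label would sit
nudge    <- log10(1000)    # the value used for label_nudge_y in the example
y_nudged <- 10^(log10(y) + nudge)
y_nudged / y
```

Under these assumptions, a nudge of log10(1000) places the label three orders of magnitude above its anchor, which is why a raw value such as 1000 would push it far off the plot.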
# Basic example using built in Tres Palacios data
library(ldc)
library(dplyr)
library(units)
library(ggplot2)

# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))

# Calculate LDC
## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)

# Summarize LDC
df_sum <- summ_ldc(df_ldc,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   Exceedance = P_Exceedance,
                   groups = Flow_Category,
                   method = "geomean")

# Create ggplot object
draw_ldc(df_ldc,
         df_sum,
         y_lab = expression(paste(italic("E. coli"))),
         label_nudge_y = log10(1000)) +
  scale_y_log10() +
  theme(legend.title = element_blank(),
        legend.direction = "vertical",
        legend.position = "bottom")

## cleanup
remove_unit("cfu")
Calculates summary statistics for flow and pollutant concentrations for desired flow categories. Estimates "average" pollutant load per category based on average concentration times the median flow.
summ_ldc(.tbl, Q, C, Exceedance, groups, method = "geomean")
.tbl | data frame, preferably the output from calc_ldc. |
Q | variable name in .tbl for discharge or flow. This must have a unit set, typically "ft^3/s". |
C | variable name in .tbl for associated pollutant concentration at a given flow value. This must have a unit set, typically "mg/L" or "cfu/100mL". |
Exceedance | variable name in .tbl with flow/load exceedance probabilities. |
groups | variable name in .tbl with categorized flow names. |
method | string that describes the summary statistic used for the pollutant concentration. Must be one of "geomean" (the default), "mean", or "median". |
object of class tibble. Includes Flow Category grouping variable, median flow and exceedance values, geometric mean/mean/median pollutant concentration, and estimated average load based on median flow times the average pollutant concentration per flow category.
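The per-category summary can be approximated in base R as follows. This is a sketch under assumptions, not the package's code: the data are made up, the helper `geomean` and the column names in `ldc_tbl` are hypothetical, and the load is converted to cfu/day by hand, assuming flow in ft^3/s and concentration in cfu/100mL.

```r
# Geometric mean; assumes strictly positive concentrations.
geomean <- function(x) exp(mean(log(x)))

# Hypothetical categorized samples.
ldc_tbl <- data.frame(
  Flow_Category = c("High Flows", "High Flows",
                    "Medium Flows", "Medium Flows",
                    "Low Flows", "Low Flows"),
  Flow = c(400, 100, 30, 10, 2, 8),     # cfs
  Conc = c(1000, 10, 200, 50, 20, 5)    # cfu/100mL
)

# Median flow and geometric mean concentration per flow category.
median_flow  <- tapply(ldc_tbl$Flow, ldc_tbl$Flow_Category, median)
geomean_conc <- tapply(ldc_tbl$Conc, ldc_tbl$Flow_Category, geomean)

# Estimated load per category in cfu/day:
# median flow (ft^3/s) * s/day * mL/ft^3 * concentration (cfu per 100 mL).
cfu_per_day_factor <- 86400 * 28316.846592 / 100
est_load <- median_flow * geomean_conc * cfu_per_day_factor

data.frame(median_flow, geomean_conc, est_load)
```

Swapping `geomean` for `mean` or `median` in the second `tapply` call mirrors the choice offered by the method argument.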
# Basic example using built in Tres Palacios data
library(ldc)
library(dplyr)
library(units)

# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))

# Calculate LDC
## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)

# Summarize LDC
df_sum <- summ_ldc(df_ldc,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   Exceedance = P_Exceedance,
                   groups = Flow_Category,
                   method = "geomean")
df_sum

## cleanup
remove_unit("cfu")
A dataset containing the mean daily flow and E. coli bacteria concentrations on the Tres Palacios River from 2000 through 2020.
tres_palacios
A data frame with 7671 rows and 4 variables:
USGS gage number
Observation Date
Mean Daily Flow in cfs
Bacteria concentration measured on the given day in MPN/100mL
USGS NWIS https://waterdata.usgs.gov/nwis and TCEQ SWQM https://www.tceq.texas.gov/waterquality/monitoring