Package 'ldc' reference manual

Title:	Calculate and Plot Pollutant Load Duration Curves
Description:	Load duration curves are a method for visualizing pollutant loads in freshwater streams based on the assumed relationship between streamflow and load. Functions are provided for calculating exceedance probabilities, pollutant loading, and plotting load duration curves.
Authors:	Michael Schramm [aut, cre], Texas Water Resources Institute [cph]
Maintainer:	Michael Schramm <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2025-03-09 04:12:00 UTC
Source:	https://github.com/TxWRI/ldc

Calculate annualized load duration curve

Description

Calculates the median annual load duration curve (LDC) with confidence intervals.

Usage

calc_annual_ldc(
  .tbl,
  Q = NULL,
  C = NULL,
  Date = NULL,
  allowable_concentration = NULL,
  breaks = c(1, 0.8, 0.4, 0),
  labels = c("High Flows", "Medium Flows", "Low Flows"),
  conf_level = 0.9,
  estimator = 6,
  n = 500
)

Arguments

`.tbl`	data frame with at least three columns Q (discharge or flow), C (associated pollutant concentration), and Date.
`Q`	variable name in .tbl for discharge or flow. This must be of class 'units', typically with a units value of "ft^3/s".
`C`	variable name in .tbl for associated pollutant concentration at a given flow value. This must be of class 'units', typically with a units value of "mg/L" or "cfu/100mL".
`Date`	variable name in .tbl for the event Date. This variable must be of class 'Date'.
`allowable_concentration`	an object of class `units` specifying the allowable pollutant concentration.
`breaks`	a numeric vector of break points for flow categories. Its length must equal the length of labels + 1. Defaults to `c(1, 0.8, 0.4, 0)`.
`labels`	labels for the categories specified by breaks.
`conf_level`	numeric, confidence level (default is 0.9) of the median interval at a given exceedance probability.
`estimator`	one of `c(5,6,7,8,9,"hd")`. `6` is the default method corresponding to the Weibull plotting position. Further details are provided in `quantile`. `"hd"` uses the Harrell-Davis Distribution-Free Quantile Estimator (see: `hdquantile`).
`n`	numeric, the number of generated probability points. A larger n may result in a slightly smoother curve at the cost of increased processing time. The probability points are used to generate the continuous sample quantiles of types 5 to 9 (see `quantile`).

Details

The median annual LDC is calculated by first computing the flow duration curve for each individual year in the dataset. Exceedance probabilities are calculated from daily flows ranked in descending order. By default, the Weibull plotting position is used:

$p = P(Q > q_i) = \frac{i}{n+1}$

where $q_i$, $i = 1, 2, \ldots, n$, is the $i$-th streamflow value sorted in descending order and $n$ is the number of streamflow observations.
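The plotting position above can be sketched in base R. This is only an illustration of the formula, not the package's internal code; the flow values are made up and the variable names are hypothetical.

```r
# Hypothetical daily flows (cfs); values are invented for illustration
flow <- c(12, 150, 3, 47, 8)

n <- length(flow)

# Rank flows in descending order: rank 1 is the largest flow
i <- rank(-flow, ties.method = "first")

# Weibull plotting position: p = i / (n + 1)
p_exceedance <- i / (n + 1)

data.frame(flow, rank = i, p_exceedance)
```

The largest flow receives the smallest exceedance probability, and no observation is assigned a probability of exactly 0 or 1.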

The median streamflow and its confidence interval at the chosen confidence level are calculated at each exceedance probability. The load duration curve is then calculated by multiplying the median streamflow by the allowable concentration and applying the appropriate unit conversions.
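A rough base R sketch of this procedure follows, assuming simulated flows and hand-written conversion factors. It omits the confidence interval, and the object names (`flows`, `annual_fdc`, `median_q`) are hypothetical; the package's own implementation may differ.

```r
set.seed(1)

# Hypothetical three years of daily flows (cfs), simulated for illustration
flows <- data.frame(
  year = rep(2001:2003, each = 365),
  Q = rlnorm(3 * 365, meanlog = 3, sdlog = 1)
)

# Exceedance probabilities at which each annual curve is evaluated
p <- seq(0.01, 0.99, length.out = 50)

# One flow duration curve per year: the flow exceeded with probability p
# is the (1 - p) quantile; type = 6 corresponds to the Weibull position
annual_fdc <- sapply(
  split(flows$Q, flows$year),
  function(q) quantile(q, probs = 1 - p, type = 6, names = FALSE)
)

# Median across years at each exceedance probability
median_q <- apply(annual_fdc, 1, median)

# Allowable load = median flow x allowable concentration x conversions
# 1 ft^3 = 28.316846592 L; 86400 s per day; 10 x 100 mL per L
allowable_conc <- 126  # cfu/100mL
allowable_load <- median_q * 28.316846592 * 86400 * allowable_conc * 10  # cfu/day
```

Each column of `annual_fdc` is one year's flow duration curve, so the row-wise median gives the median annual curve at each probability point.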

Value

list of two tibbles (Q and C). Includes variables in .tbl and Daily_Flow_Volume (discharge volume), Daily_Load (pollutant sample volume), P_Exceedance (exceedance probability), Flow_Category (as defined by breaks and labels).

References

Vogel, Richard M., and Neil M. Fennessey. "Flow-duration curves. I: New interpretation and confidence intervals." Journal of Water Resources Planning and Management 120, no. 4 (1994): 485-504. doi:10.1061/(ASCE)0733-9496(1994)120:4(485)

Examples

# Basic example using built in Tres Palacios data
library(dplyr)
library(units)
# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))
# Calculate LDC

## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_annual_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   Date = Date,
                   allowable_concentration = allowable_concentration,
                   estimator = 5,
                   n = 1000)
df_ldc$Q

## cleanup
remove_unit("cfu")

Calculate load duration curve

Description

Calculates the period of record load duration curve from a data frame that includes mean daily flow and associated point measurements of pollutant concentration.

Usage

calc_ldc(
  .tbl,
  Q = NULL,
  C = NULL,
  allowable_concentration = NULL,
  breaks = c(1, 0.8, 0.4, 0),
  labels = c("High Flows", "Medium Flows", "Low Flows"),
  estimator = 6
)

Arguments

`.tbl`	data frame with at least two columns Q (discharge or flow) and C (associated pollutant concentration).
`Q`	variable name in .tbl for discharge or flow. This must have a unit set, typically "ft^3/s".
`C`	variable name in .tbl for associated pollutant concentration at a given flow value. This must have a unit set, typically "mg/L" or "cfu/100mL".
`allowable_concentration`	an object of class `units` specifying the allowable pollutant concentration.
`breaks`	a numeric vector of break points for flow categories. Its length must equal the length of labels + 1. Defaults to `c(1, 0.8, 0.4, 0)`.
`labels`	labels for the categories specified by breaks.
`estimator`	numeric, one of `c(5,6,7,8,9)`. `6` is the default method corresponding to the Weibull plotting position. Further details are provided in `stats::quantile()`.
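As an illustration of how `breaks` and `labels` relate, the sketch below bins exceedance probabilities with base R's `cut()`. It assumes that low exceedance probabilities correspond to high flows; the package may categorize flows differently internally, and the probability values are invented.

```r
breaks <- c(1, 0.8, 0.4, 0)
labels <- c("High Flows", "Medium Flows", "Low Flows")

# breaks must have exactly one more element than labels
stopifnot(length(breaks) == length(labels) + 1)

# Hypothetical exceedance probabilities
p_exceedance <- c(0.05, 0.40, 0.65, 0.95)

# Sort breaks ascending so labels map to [0, 0.4], (0.4, 0.8], (0.8, 1]
flow_category <- cut(p_exceedance,
                     breaks = sort(breaks),
                     labels = labels,
                     include.lowest = TRUE)

data.frame(p_exceedance, flow_category)
```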

Details

The exceedance probability is calculated from daily flows ranked in descending order. By default, the Weibull plotting position is used:

$p = P(Q > q_i) = \frac{i}{n+1}$

where $q_i$, $i = 1, 2, \ldots, n$, is the $i$-th streamflow value sorted in descending order and $n$ is the number of streamflow observations.

Value

object of class tibble. Includes variables in .tbl and Daily_Flow_Volume (discharge volume), Daily_Load (pollutant sample volume), P_Exceedance (exceedance probability), Flow_Category (as defined by breaks and labels).

Examples

# Basic example using built in Tres Palacios data
library(dplyr)
library(units)
# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
  ## filter data so this runs quicker
  filter(!is.na(Indicator_Bacteria)) %>%
  ## flow must have units, here it is in cfs
  mutate(Flow = set_units(Flow, "ft^3/s")) %>%
  ## pollutant concentration must have units
  mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))
# Calculate LDC

## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)
df_ldc

## cleanup
remove_unit("cfu")

Draw a load duration curve

Description

Creates a load duration curve visualization from the outputs of calc_ldc and summ_ldc as a ggplot object.

Usage

draw_ldc(
  .tbl_calc,
  .tbl_summ,
  y_lab = NULL,
  ldc_legend_name = "Allowable Load at State Water Quality Standard",
  measurement_name = "Measurement Value",
  measurement_shape = 21,
  measurement_color = "dodgerblue",
  measurement_alpha = 1,
  summary_name = "Summarized Measured Load",
  summary_stat_shape = 12,
  summary_stat_color = "red",
  label_nudge_y = 0,
  label_font_family = "Arial",
  label_font_size = 3,
  label_break = TRUE
)

Arguments

`.tbl_calc`	data frame object created by `calc_ldc`
`.tbl_summ`	data frame object created by `summ_ldc`
`y_lab`	optional string for y-axis label name, will be appended with units automatically. default is NULL.
`ldc_legend_name`	string, provides the name used for the allowable pollutant load line in the legend. required.
`measurement_name`	string, provides the name used for measured load values in the legend. required.
`measurement_shape`	aesthetic value passed to the layer plotting measured load values. defaults to `21`.
`measurement_color`	aesthetic value passed to the layer plotting measured load values. defaults to `"dodgerblue"`.
`measurement_alpha`	aesthetic value passed to the layer plotting measured load values. defaults to `1`.
`summary_name`	string, provides the name used for summary statistic values in the legend. required.
`summary_stat_shape`	aesthetic value passed to the layer plotting summary statistic values. defaults to `12`.
`summary_stat_color`	aesthetic value passed to the layer plotting summary statistic values. defaults to `"red"`.
`label_nudge_y`	numeric value to vertically nudge flow category labels. If a log10 transformed scale is being used, a log value is probably appropriate, for example `log10(1000)`.
`label_font_family`	string specifying font family to use in flow category labels.
`label_font_size`	numeric value specifying font size to use in flow category labels.
`label_break`	logical, add line breaks to flow category labels. Labels will break at spaces.

Value

ggplot object

Examples

# Basic example using built in Tres Palacios data
library(dplyr)
library(units)
library(ggplot2)
# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
        ## filter data so this runs quicker
        filter(!is.na(Indicator_Bacteria)) %>%
        ## flow must have units, here it is in cfs
        mutate(Flow = set_units(Flow, "ft^3/s")) %>%
        ## pollutant concentration must have units
        mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))
# Calculate LDC

## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)

# Summarize LDC
df_sum <- summ_ldc(df_ldc,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   Exceedance = P_Exceedance,
                   groups = Flow_Category,
                   method = "geomean")

# Create ggplot object
draw_ldc(df_ldc,
         df_sum,
         y_lab = expression(paste(italic("E. coli"))),
         label_nudge_y = log10(1000)) +
         scale_y_log10() +
         theme(legend.title = element_blank(),
               legend.direction = "vertical",
               legend.position = "bottom")

## cleanup
remove_unit("cfu")

Summarize load duration curve

Description

Calculates summary statistics for flow and pollutant concentrations for desired flow categories. Estimates "average" pollutant load per category based on average concentration times the median flow.
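The per-category estimate described above can be sketched in base R as follows. The helper `summarise_conc()` and the sample values are hypothetical and only illustrate the three summary statistics; unit conversion to a daily load is left out.

```r
# Hypothetical helper: summarize concentration with one of three statistics
summarise_conc <- function(x, method = c("geomean", "mean", "median")) {
  method <- match.arg(method)
  switch(method,
         geomean = exp(mean(log(x))),  # geometric mean, requires x > 0
         mean = mean(x),
         median = median(x))
}

# Hypothetical samples within a single flow category
conc <- c(10, 100, 1000)  # cfu/100mL
flow <- c(20, 35, 50)     # cfs

geomean_conc <- summarise_conc(conc, "geomean")
median_flow <- median(flow)

# "Average" load for the category, before any unit conversion
category_load_raw <- median_flow * geomean_conc
```

For skewed data such as bacteria counts, the geometric mean is far less sensitive to a single large value than the arithmetic mean, which is one reason it might be preferred as the default.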

Usage

summ_ldc(.tbl, Q, C, Exceedance, groups, method = "geomean")

Arguments

`.tbl`	data frame, preferably the output from `calc_ldc()`.
`Q`	variable name in .tbl for discharge or flow. This must have a unit set, typically "ft^3/s".
`C`	variable name in .tbl for associated pollutant concentration at a given flow value. This must have a unit set, typically "mg/L" or "cfu/100mL".
`Exceedance`	variable name in .tbl with flow/load exceedance probabilities.
`groups`	variable name in .tbl with categorized flow names.
`method`	string that describes the summary statistic used for the pollutant concentration. Must be one of `c('geomean', 'mean', 'median')`.

Value

object of class tibble. Includes Flow Category grouping variable, median flow and exceedance values, geometric mean/mean/median pollutant concentration, and estimated average load based on median flow times the average pollutant concentration per flow category.

Examples

# Basic example using built in Tres Palacios data
library(dplyr)
library(units)
# Format data
install_unit("cfu")
df <- as_tibble(tres_palacios) %>%
        ## filter data so this runs quicker
        filter(!is.na(Indicator_Bacteria)) %>%
        ## flow must have units, here it is in cfs
        mutate(Flow = set_units(Flow, "ft^3/s")) %>%
        ## pollutant concentration must have units
        mutate(Indicator_Bacteria = set_units(Indicator_Bacteria, "cfu/100mL"))
# Calculate LDC

## specify the allowable concentration
allowable_concentration <- 126
## set the units
units(allowable_concentration) <- "cfu/100mL"
df_ldc <- calc_ldc(df,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   allowable_concentration = allowable_concentration)

# Summarize LDC
df_sum <- summ_ldc(df_ldc,
                   Q = Flow,
                   C = Indicator_Bacteria,
                   Exceedance = P_Exceedance,
                   groups = Flow_Category,
                   method = "geomean")
df_sum

## cleanup
remove_unit("cfu")

Mean daily flow and point E. coli bacteria measurements.

Description

A dataset containing the mean daily flow and E. coli bacteria concentrations on the Tres Palacios River from 2000 through 2020.

Usage

tres_palacios

Format

A data frame with 7671 rows and 4 variables:

site_no: USGS gage number
Date: Observation Date
Flow: Mean Daily Flow in cfs
Indicator_Bacteria: Bacteria concentration measured on the given day in MPN/100mL

Source

USGS NWIS https://waterdata.usgs.gov/nwis and TCEQ SWQM https://www.tceq.texas.gov/waterquality/monitoring
