FIA Demystified

Despite being publicly available, FIA data are notoriosly difficult for most non-FIA users to interpret and understand, and there is potential to greatly increase the utilization of these data among industry professionals, academic scientists, and the general public. We created rFIA to solve this problem, offering a flexible, user-friendly tool which simplifies the process of working with FIA data and helps ensure this valuable public resource is, well… public.

In the examples that follow, we will show you the methods that underpin the estimation capacity of rFIA. We do not attempt to expain every line of code in the rFIA package here, but rather we present a simplified version to highlight our implementation of the major estimation procedures. You can find the complete source code for rFIA at the development repository.

Estimation at the plot-level

In the examples below, we highlight the basic estimation procedures used to compute plot-level estimates of forest attributes from Forest Inventory and Analysis (FIA) data. We will demonstrate these procedures for two plots contained in the fiaRI dataset (included in the rFIA package) so that you can follow along with a small dataset.

The source code for rFIA will vary slightly from that presented below as we designed rFIA to be as flexible and computationally efficient as possible. Despite the differences in syntax and structure, the estimation procedures presented below are identical to those in rFIA.

Two example plots

First, let’s load some packages and the fiaRI dataset:

# Load some packages
library(rFIA)
library(dplyr)

# Load the fiaRI dataset (included in rFIA)
data(fiaRI)

To compute estimates of tree biomass/carbon at the plot-level, we really only need the tree, condition, and plot tables. In the code below, we will subset the rows of these tables which pertain to our two plots of interest (a & b below):

# Some unique identifiers for two plots in fiaRI database object
# For example use below, both measured in 2014
a <- 168263223020004  # One forested condition
b <- 168263219020004  # Two forested, one non-forested condition

# Subset the PLOT, COND, and TREE tables for plot A
plot_a <- filter(fiaRI$PLOT, CN == a)
cond_a <- filter(fiaRI$COND, PLT_CN == a)
tree_a <- filter(fiaRI$TREE, PLT_CN == a)
# Subset the PLOT, COND, and TREE tables for plot B
plot_b <- filter(fiaRI$PLOT, CN == b)
cond_b <- filter(fiaRI$COND, PLT_CN == b)
tree_b <- filter(fiaRI$TREE, PLT_CN == b)

Now that we have the tables we need for our plots of interest, let’s take a look at their COND tables and see how these plots are different:

# COND_STATUS_CD indicates the basic land classification of a
# condition, whether it is forested, non-forest, water, etc...
# COND_STATUS_CD = 1 indicates forested
# Plot A
cond_a$COND_STATUS_CD # One forested condition

## [1] 1

# Plot B
cond_b$COND_STATUS_CD # Two forested, and one non-forest

## [1] 1 1 2

The COND table lists the conditions, or land classes present on an FIA plot. Conditions may be obvious, such as when a plot intersects a forest edge (the forested area and non-forested area would be seperate conditions). However, more subtle differences between forested area on a plot, such as differences in reserved status, owner group, forest type, stand-size class, regeneration status, and stand density can further define conditions.

Since there are two forested conditions on Plot B, what is the basis for the distinction?

# FORTYPCD indicates the forest type of the condition
# PHYSCLCD indicates the Physiographic class (e.g. mesic moist slope)
cond_b$FORTYPCD

## [1] 505 708  NA

cond_b$PHYSCLCD

## [1] 21 31 NA

Looks like we have one forested condition in the Northern red oak forest type (FORTYPCD = 505) occuring on mesic flatwoods (PHYSCLCD = 21), and a second forested condition in the Red maple/lowland forest type (FORTYPCD = 708) on a hydric swamp/bog (PHYSCLCD = 31). The NA values relate to the non-forested condition on the plot (COND_STATUS_CD > 1). Hence it appears Plot B straddles an upland, wetland, and non-forested boundary!

Basic Estimation Procedures

Now that we have the data for our two example plots, let’s put them to work estimating tree biomass and carbon. First, we will join the plot, condition, and tree tables for each plot:

# Plot A
tbl_a <- plot_a %>%
  # Rename the CN column in plot, PLT_CN for simple joining
  mutate(PLT_CN = CN) %>%
  # Join tables
  left_join(cond_a, by = 'PLT_CN') %>%
  left_join(tree_a, by = c('PLT_CN', 'CONDID'))
# Plot B
tbl_b <- plot_b %>%
  # Rename the CN column in plot, PLT_CN for simple joining
  mutate(PLT_CN = CN) %>%
  # Join tables
  left_join(cond_b, by = 'PLT_CN') %>%
  left_join(tree_b, by = c('PLT_CN', 'CONDID'))

Joining tables is important when we want to produce estimates grouped by fields in the plot or condition table. Otherwise, it would be possible to only use the TREE table to compute estimates below.

To produce an estimate of the aboveground biomass and carbon per acre represented by all trees on the plot, we can simply sum the aboveground biomass (DRYBIO_AG) and carbon (CARBON_AG) contained in each tree multiplied by the trees per acre each tree represents (TPA_UNADJ). Notice that we also divide by 200 here to convert estimates from lbs/acre to short tons/acre, matching the units of rFIA output.

# Plot A
all_a <- tbl_a %>%
  group_by(PLT_CN) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ / 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ / 2000, na.rm = TRUE))
# Plot B
all_b <- tbl_b %>%
  group_by(PLT_CN) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ / 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ / 2000, na.rm = TRUE))

If you have been following along in your own R session, let’s check our estimates against rFIA. We’ve got a match!

# Producing biomass estimates on all plots in RI, for all trees on forestland
rFIA_all <- biomass(fiaRI, byPlot = TRUE, treeType = 'all')
# Plot A
filter(rFIA_all, PLT_CN == a)

## # A tibble: 1 × 6
##    YEAR pltID       PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>        <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_7_288 1.68e14     93.0      44.6           1

all_a

## # A tibble: 1 × 3
##    PLT_CN BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>       <dbl>        <dbl>
## 1 1.68e14        93.0         44.6

# Plot B
filter(rFIA_all, PLT_CN == b)

## # A tibble: 1 × 6
##    YEAR pltID      PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>       <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_3_52 1.68e14     118.      56.3       0.970

all_b

## # A tibble: 1 × 3
##    PLT_CN BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>       <dbl>        <dbl>
## 1 1.68e14        118.         56.3

Unique domains of interest

But what if we want to produce estimates for a specific kind of tree? Say Northern Red Oak (SPCD = 833) which is greater than 12 inches DBH (DIA > 12). We accomplish this using what Bechtold and Patterson (2005) call a ‘domain indicator’. This is essentially just a vector which indicates whether a tree (or plot, condition, etc.) is within our domain of interest (red oak > 12).

To construct the domain indicator, we just need a vector which is the same length as our joined table, and takes a value of 1 if the stem is in the domain and 0 otherwise:

# Plot A
tbl_a <- tbl_a %>%
  mutate(tDI = if_else(SPCD == 833 & DIA > 12, 1, 0))
# The domain indicator
tbl_a$tDI

##  [1] 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# How many trees meet the criteria?
sum(tbl_a$tDI, na.rm = TRUE)

## [1] 3

# Plot B
tbl_b <- tbl_b %>%
  mutate(tDI = if_else(SPCD == 833 & DIA > 12, 1, 0))
# The domain indicator
tbl_b$tDI

##  [1]  0  1  1  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
## [26]  1  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## [51]  0  0  0  0 NA

# How many trees meet the criteria?
sum(tbl_b$tDI, na.rm = TRUE)

## [1] 6

Now we can use our new domain indicator (vector of 0s and 1s, tDI) to produce estimates for any type of tree we specify! By adding tDI to the basic estimation procedures below, we force any tree which is not in our domain of interest to take a value of zero. Therefore, only trees which are within the domain of interest contribute to the plot-level estimate.

# Plot A
ro12_a <- tbl_a %>%
  group_by(PLT_CN) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ * tDI / 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ * tDI/ 2000, na.rm = TRUE))
# Plot B
ro12_b <- tbl_b %>%
  group_by(PLT_CN) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ * tDI / 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ * tDI / 2000, na.rm = TRUE))

# Now let's check our estimates against rFIA --> We've got a match!
rFIA_ro12 <- biomass(fiaRI, byPlot = TRUE, treeType = 'all', treeDomain = SPCD == 833 & DIA > 12)
# Plot A
filter(rFIA_ro12, PLT_CN == a)

## # A tibble: 1 × 6
##    YEAR pltID       PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>        <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_7_288 1.68e14     22.3      10.7           1

ro12_a

## # A tibble: 1 × 3
##    PLT_CN BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>       <dbl>        <dbl>
## 1 1.68e14        22.3         10.7

# Plot B
filter(rFIA_ro12, PLT_CN == b)

## # A tibble: 1 × 6
##    YEAR pltID      PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>       <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_3_52 1.68e14     45.3      21.7       0.970

ro12_b

## # A tibble: 1 × 3
##    PLT_CN BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>       <dbl>        <dbl>
## 1 1.68e14        45.3         21.7

Grouped estimation procedures

What if we want to produce estimates grouped by some attribute contained in the FIA Database, like forest type? We can accomplish by simply adding the attribute you want to group by to the group_by call in the estimation procedures above. This will then sum the biomass and carbon per acre of stems occuring on each forest type seperately.

# Plot A
for_a <- tbl_a %>%
  # Adding FORTYPCD here
  group_by(PLT_CN, FORTYPCD) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ / 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ/ 2000, na.rm = TRUE),
            .groups = 'drop')
# Plot B
for_b <- tbl_b %>%
  # Adding FORTYPCD here
  group_by(PLT_CN, FORTYPCD) %>%
  summarize(BIO_AG_ACRE = sum(DRYBIO_AG * TPA_UNADJ/ 2000, na.rm = TRUE),
            CARB_AG_ACRE = sum(CARBON_AG * TPA_UNADJ / 2000, na.rm = TRUE), 
            .groups = 'drop')

# Now let's check our estimates against rFIA --> We've got a match!
rFIA_for <- biomass(fiaRI, byPlot = TRUE, treeType = 'all', grpBy = FORTYPCD)
# Plot A
filter(rFIA_for, PLT_CN == a)

## # A tibble: 1 × 7
##    YEAR pltID      FORTYPCD  PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>         <int>   <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_7_288      503 1.68e14     93.0      44.6           1

for_a # One forest type here, so only one row in the output

## # A tibble: 1 × 4
##    PLT_CN FORTYPCD BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>    <int>       <dbl>        <dbl>
## 1 1.68e14      503        93.0         44.6

# Plot B
filter(rFIA_for, PLT_CN == b)

## # A tibble: 2 × 7
##    YEAR pltID     FORTYPCD  PLT_CN BIO_ACRE CARB_ACRE PROP_FOREST
##   <int> <chr>        <int>   <dbl>    <dbl>     <dbl>       <dbl>
## 1  2014 1_44_3_52      505 1.68e14    102.      48.4        0.75 
## 2  2014 1_44_3_52      708 1.68e14     16.2      7.86       0.220

for_b # Two forest types here, so two rows in output, plus the non-forested row

## # A tibble: 3 × 4
##    PLT_CN FORTYPCD BIO_AG_ACRE CARB_AG_ACRE
##     <dbl>    <int>       <dbl>        <dbl>
## 1 1.68e14      505       102.         48.4 
## 2 1.68e14      708        16.2         7.86
## 3 1.68e14       NA         0           0

Estimation at the population-level

In the code below, we highlight the basic estimation procedures used to compute population estimates of forest attributes from Forest Inventory and Analysis (FIA) data. We will demonstrate these procedures with the fiaRI dataset (included in the rFIA package) so that you can follow along.

Our goal here is to estimate total tree biomass, total tree carbon (aboveground) and total forested area in the state of Rhode Island for the year 2018. From these totals, we will compute ratios of average tree biomass/ forested acre and average tree carbon/ forested acre. We will do this with and without sampling errors (without being much simpler), and show you how we handle grouped estimates in both cases. All estimates will be computed for live trees.

Data Preparation

First, let’s load some packages and the fiaRI dataset:

# Load some packages
library(rFIA)
library(dplyr)

# Load the fiaRI dataset (included in rFIA)
data(fiaRI)
db <- fiaRI

To compute population estimates of current area and current biomass/carbon from the FIADB, we need to identify and subset the necessary portions. To do this, we will use what FIA calls an EVALID, hence the name ‘EVALIDator’.

ids <- db$POP_EVAL %>%
  select('CN', 'END_INVYR', 'EVALID') %>%
  inner_join(select(db$POP_EVAL_TYP, c('EVAL_CN', 'EVAL_TYP')), by = c('CN' = 'EVAL_CN')) %>%
  # Now we filter out everything except current area (EXPCURR) and
  # current volume ids (EXPVOL), and only grab those from 2018
  filter(EVAL_TYP %in% c('EXPCURR', 'EXPVOL'), END_INVYR == 2018)


# Now that we have those EVALIDs, let's use clipFIA to subset
db <- clipFIA(db, mostRecent = FALSE, evalid = ids$EVALID)

Since we need some information stored in each of these tables to compute estimates, we will join them into one big dataframe (let’s call that data) that we can operate on.

# Select only the columns we need from each table, to keep things slim
PLOT <- select(db$PLOT, CN, MACRO_BREAKPOINT_DIA)
COND <- select(db$COND, PLT_CN, CONDID, CONDPROP_UNADJ, PROP_BASIS, 
               COND_STATUS_CD, OWNGRPCD)
TREE <- select(db$TREE, PLT_CN, CONDID, SUBP, TREE, STATUSCD, 
               DRYBIO_AG, CARBON_AG, TPA_UNADJ, DIA, SPCD)
POP_ESTN_UNIT <- select(db$POP_ESTN_UNIT, CN, EVAL_CN, AREA_USED, 
                        P1PNTCNT_EU)
POP_EVAL <- select(db$POP_EVAL, EVALID, EVAL_GRP_CN, ESTN_METHOD, CN, 
                   END_INVYR, REPORT_YEAR_NM)
POP_EVAL_TYP <- select(db$POP_EVAL_TYP, EVAL_TYP, EVAL_CN)
POP_PLOT_STRATUM_ASSGN <- select(db$POP_PLOT_STRATUM_ASSGN, STRATUM_CN, 
                                 PLT_CN)
POP_STRATUM <- select(db$POP_STRATUM, ESTN_UNIT_CN, EXPNS, P2POINTCNT,
                      ADJ_FACTOR_MICR, ADJ_FACTOR_SUBP, ADJ_FACTOR_MACR, 
                      CN, P1POINTCNT)
# Join the tables
data <- PLOT %>%
  # Add a PLT_CN column for easy joining
  mutate(PLT_CN = CN) %>%
  # Join COND & TREE
  left_join(COND, by = 'PLT_CN') %>%
  left_join(TREE, by = c('PLT_CN', 'CONDID')) %>%
  # Population tables
  left_join(POP_PLOT_STRATUM_ASSGN, by = 'PLT_CN') %>%
  left_join(POP_STRATUM, by = c('STRATUM_CN' = 'CN')) %>%
  left_join(POP_ESTN_UNIT, by = c('ESTN_UNIT_CN' = 'CN')) %>%
  left_join(POP_EVAL, by = c('EVAL_CN' = 'CN')) %>%
  left_join(POP_EVAL_TYP, by = 'EVAL_CN', relationship = 'many-to-many')

Now let’s make a column that will adjust for non-response in our sample (See Bechtold and Patterson (2005), 3.4.3 ‘Nonsampled Plots and Plot Replacement’). Since we know there are no macroplots in RI, we don’t really need to worry about that here, but we will show you anyways.

# Make some adjustment factors
data <- data %>%
  mutate(
    # AREA
    aAdj = case_when(
           # When NA, stay NA
           is.na(PROP_BASIS) ~ NA_real_,
           # If the proportion was measured for a macroplot,
           # use the macroplot value
           PROP_BASIS == 'MACR' ~ as.numeric(ADJ_FACTOR_MACR),
           # Otherwise, use the subpplot value
           PROP_BASIS == 'SUBP' ~ ADJ_FACTOR_SUBP),
    # TREE
    tAdj = case_when(
           # When DIA is na, adjustment is NA
           is.na(DIA) ~ ADJ_FACTOR_SUBP,
           # When DIA is less than 5", use microplot value
           DIA < 5 ~ ADJ_FACTOR_MICR,
           # When DIA is greater than 5", use subplot value
           DIA >= 5 ~ ADJ_FACTOR_SUBP
         )
  )

Next, we need to construct what Bechtold and Patterson (2005) called a ‘domain indicator function’. (see Eq. 4.1, pg. 47 of the publication). This is essentially just a vector which indicates whether a tree (or plot, condition, etc.) is within our domain of interest (live trees on forest land).

To construct the domain indicator, we just need a vector which is the same length as our joined table (data), and takes a value of 1 if the stem (or condition) is in the domain and 0 otherwise. We build separate domain indicators for estimating tree totals and area totals, because we can specify different domains of interest for both. For example, if we used our tree domain (live trees on forest land) to estimate area, then we would not actually be estimating the full forested area in RI. Instead we would estimate the forested area ONLY where live trees are currently present.

# Build a domain indicator for land type and live trees
# Land type (all forested area)
data$aDI <- if_else(data$COND_STATUS_CD == 1, 1, 0)
# Live trees only (on forested area)
data$tDI <- if_else(data$STATUSCD == 1, 1, 0) * data$aDI

# Now, let's just rename the END_INVYR column to 'YEAR'
# for a pretty output like rFIA
data <- data %>%
  mutate(YEAR = END_INVYR) %>%
  # remove any NAs
  filter(!is.na(YEAR))

Without Sampling Errors

Now we are ready to start computing estimates. If we don’t care aboute sampling errors, we can use the EXPNS column in the POP_STRATUM table to make our lives easier. EXPNS is an expansion factor which descibes the area, in acres, that a stratum represents divided by the number of sampled plots in that stratum (see Bechtold and Patterson (2005) section 4.2 for more information on FIA stratification procedures). When summed across all plots in the population of interest, EXPNS allows us to easily obtain estimates of population totals, without worrying about fancy stratifaction procedures and variance estimators.

No grouping variables

First we compute totals for biomass, carbon, and forested area:

# Estimate Tree totals
tre_bio <- data %>%
  filter(EVAL_TYP == 'EXPVOL') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, SUBP, TREE, .keep_all = TRUE) %>%
  # Plot-level estimates first (multiplying by EXPNS here)
  group_by(YEAR, ESTN_UNIT_CN, ESTN_METHOD, STRATUM_CN, PLT_CN) %>%
  summarize(bioPlot = sum(DRYBIO_AG * TPA_UNADJ * tAdj * tDI * EXPNS  / 2000, na.rm = TRUE),
            carbPlot = sum(CARBON_AG * TPA_UNADJ * tAdj * tDI * EXPNS  / 2000, na.rm = TRUE), 
            .groups = 'drop') %>%
  # Now we simply sum the values of each plot (expanded w/ EXPNS)
  # to obtain population totals
  group_by(YEAR) %>%
  summarize(BIO_AG_TOTAL = sum(bioPlot, na.rm = TRUE),
            CARB_AG_TOTAL = sum(carbPlot, na.rm = TRUE))

# Estimate Area totals
area_bio <- data %>%
  filter(EVAL_TYP == 'EXPCURR') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, .keep_all = TRUE) %>%
  # Plot-level estimates first (multiplying by EXPNS here)
  group_by(YEAR, ESTN_UNIT_CN, ESTN_METHOD, STRATUM_CN, PLT_CN) %>%
  summarize(forArea = sum(CONDPROP_UNADJ * aAdj * aDI * EXPNS, na.rm = TRUE), 
            .groups = 'drop') %>%
  # Now we simply sum the values of each plot (expanded w/ EXPNS)
  # to obtain population totals
  group_by(YEAR) %>%
  summarize(AREA_TOTAL = sum(forArea, na.rm = TRUE))

Then we can join these tables up, and produce ratio estimates:

bio <- left_join(tre_bio, area_bio) %>%
  mutate(BIO_AG_ACRE = BIO_AG_TOTAL / AREA_TOTAL,
         CARB_AG_ACRE = CARB_AG_TOTAL / AREA_TOTAL) %>%
  # Reordering the columns
  select(YEAR, BIO_AG_ACRE, CARB_AG_ACRE, BIO_AG_TOTAL, CARB_AG_TOTAL, AREA_TOTAL)

## Joining with `by = join_by(YEAR)`

Comparing with rFIA, we get a match!

biomass(clipFIA(fiaRI), totals = TRUE)

## # A tibble: 1 × 14
##    YEAR BIO_ACRE CARB_ACRE BIO_TOTAL CARB_TOTAL AREA_TOTAL BIO_ACRE_SE
##   <dbl>    <dbl>     <dbl>     <dbl>      <dbl>      <dbl>       <dbl>
## 1  2018     75.7      36.6 27762772.  13427961.    366959.        3.87
## # ℹ 7 more variables: CARB_ACRE_SE <dbl>, BIO_TOTAL_SE <dbl>,
## #   CARB_TOTAL_SE <dbl>, AREA_TOTAL_SE <dbl>, nPlots_TREE <int>,
## #   nPlots_AREA <int>, N <int>

bio

## # A tibble: 1 × 6
##    YEAR BIO_AG_ACRE CARB_AG_ACRE BIO_AG_TOTAL CARB_AG_TOTAL AREA_TOTAL
##   <dbl>       <dbl>        <dbl>        <dbl>         <dbl>      <dbl>
## 1  2018        75.7         36.6    27762772.     13427961.    366959.

Adding grouping variables

To add grouping variables to the above procedures, we can simply add the names of the variables we wish to group by to the group_by call:

# Grouping by Ownership group (OWNGRPCD)
# Estimate Tree totals
tre_bioGrp <- data %>%
  filter(EVAL_TYP == 'EXPVOL') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, SUBP, TREE, .keep_all = TRUE) %>%
  # Plot-level estimates first (multiplying by EXPNS here)
  group_by(YEAR, OWNGRPCD, ESTN_UNIT_CN, ESTN_METHOD, STRATUM_CN, PLT_CN) %>%
  summarize(bioPlot = sum(DRYBIO_AG * TPA_UNADJ * tAdj * tDI * EXPNS  / 2000, na.rm = TRUE),
            carbPlot = sum(CARBON_AG * TPA_UNADJ * tAdj * tDI * EXPNS  / 2000, na.rm = TRUE), 
            .groups = 'drop') %>%
  # Now we simply sum the values of each plot (expanded w/ EXPNS)
  # to obtain population totals
  group_by(YEAR, OWNGRPCD) %>%
  summarize(BIO_AG_TOTAL = sum(bioPlot, na.rm = TRUE),
            CARB_AG_TOTAL = sum(carbPlot, na.rm = TRUE), 
            .groups = 'drop')

# Estimate Area totals
area_bioGrp <- data %>%
  filter(EVAL_TYP == 'EXPCURR') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, .keep_all = TRUE) %>%
  # Plot-level estimates first (multiplying by EXPNS here)
  group_by(YEAR, OWNGRPCD, ESTN_UNIT_CN, ESTN_METHOD, STRATUM_CN, PLT_CN) %>%
  summarize(forArea = sum(CONDPROP_UNADJ * aAdj * aDI * EXPNS, na.rm = TRUE), 
            .groups = 'drop') %>%
  # Now we simply sum the values of each plot (expanded w/ EXPNS)
  # to obtain population totals
  group_by(YEAR, OWNGRPCD) %>%
  summarize(AREA_TOTAL = sum(forArea, na.rm = TRUE), 
            .groups = 'drop')

# Now we can simply join these two up, and produce ratio estimates
bioGrp <- left_join(tre_bioGrp, area_bioGrp) %>%
  mutate(BIO_AG_ACRE = BIO_AG_TOTAL / AREA_TOTAL,
         CARB_AG_ACRE = CARB_AG_TOTAL / AREA_TOTAL) %>%
  # Reordering the columns
  select(YEAR, OWNGRPCD, BIO_AG_ACRE, CARB_AG_ACRE, BIO_AG_TOTAL, CARB_AG_TOTAL, AREA_TOTAL)

## Joining with `by = join_by(YEAR, OWNGRPCD)`

# Now let's compare with rFIA.... looks like a match!
biomass(clipFIA(fiaRI), totals = TRUE, grpBy = OWNGRPCD)

## # A tibble: 2 × 15
##    YEAR OWNGRPCD BIO_ACRE CARB_ACRE BIO_TOTAL CARB_TOTAL AREA_TOTAL BIO_ACRE_SE
##   <dbl>    <int>    <dbl>     <dbl>     <dbl>      <dbl>      <dbl>       <dbl>
## 1  2018       30     77.2      37.5  8622383.   4186531.    111642.        8.53
## 2  2018       40     75.0      36.2 19140389.   9241429.    255317.        4.12
## # ℹ 7 more variables: CARB_ACRE_SE <dbl>, BIO_TOTAL_SE <dbl>,
## #   CARB_TOTAL_SE <dbl>, AREA_TOTAL_SE <dbl>, nPlots_TREE <int>,
## #   nPlots_AREA <int>, N <int>

bioGrp

## # A tibble: 3 × 7
##    YEAR OWNGRPCD BIO_AG_ACRE CARB_AG_ACRE BIO_AG_TOTAL CARB_AG_TOTAL AREA_TOTAL
##   <dbl>    <int>       <dbl>        <dbl>        <dbl>         <dbl>      <dbl>
## 1  2018       30        77.2         37.5     8622383.      4186531.    111642.
## 2  2018       40        75.0         36.2    19140389.      9241429.    255317.
## 3  2018       NA       NaN          NaN             0             0          0

If adapting this code for your own use, make sure that your grouping variables are included in the select calls in Data Prepration, otherwise the variable will not be found in data.

With Sampling Errors

Computing estimates with associated sampling errors is a bit more complex than what we saw above, as we can no longer rely on EXPNS to do the heavy lifting for us. In short, we will add a few additional steps when computing tree totals and area totals, summarizing at the strata and estimation unit level along the way. When adding grouping variables, we will need to modify our code further, treating each group as a unique population and summarizing these populations individually. We will follow the procedures outlined by Bechtold and Patterson (2005) (see section 4) for our computations.

Before we get started, Let’s check out what type of estimation methods were used to determine values in the POP tables, as this will influence which variance estimators we use.

unique(db$POP_EVAL$ESTN_METHOD)

## [1] "Post-Stratification"

Looks like just post-stratification, so we will use the stratified random sampling estimation approach (known weights)

No grouping variables

First we estimate both tree and area attributes at the plot level, and then join these estimates before proceeding to the strata level so that we can compute the covariance between area and tree attributes

# Estimate Tree attributes -- plot level
tre_pop <- data %>%
  filter(EVAL_TYP == 'EXPVOL') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, SUBP, TREE, .keep_all = TRUE) %>%
  # Plot-level estimates first (note we are NOT using EXPNS here)
  group_by(YEAR, ESTN_UNIT_CN, STRATUM_CN, PLT_CN) %>%
  summarize(bioPlot = sum(DRYBIO_AG * TPA_UNADJ * tAdj * tDI  / 2000, na.rm = TRUE),
            carbPlot = sum(CARBON_AG * TPA_UNADJ * tAdj * tDI  / 2000, na.rm = TRUE), 
            .groups = 'drop')

# Estimate Area attributes -- plot level
area_pop <- data %>%
  filter(EVAL_TYP == 'EXPCURR') %>%
  # Make sure we only have unique observations of plots, trees, etc.
  distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, .keep_all = TRUE) %>%
  # Plot-level estimates first (multiplying by EXPNS here)
  # Extra grouping variables are only here so they are carried through
  group_by(YEAR, P1POINTCNT, P1PNTCNT_EU, P2POINTCNT, AREA_USED, ESTN_UNIT_CN, STRATUM_CN, PLT_CN) %>%
  summarize(forArea = sum(CONDPROP_UNADJ * aAdj * aDI, na.rm = TRUE), 
            .groups = 'drop')

# Joining the two tables
bio_pop_plot <- left_join(tre_pop, area_pop)

## Joining with `by = join_by(YEAR, ESTN_UNIT_CN, STRATUM_CN, PLT_CN)`

Now that we have both area and tree attributes in the same table, we can follow through with the remaining estimation procedures at the strata and estimation unit levels:

# Strata level
bio_pop_strat <- bio_pop_plot %>%
  group_by(YEAR, ESTN_UNIT_CN, STRATUM_CN) %>%
  summarize(aStrat = mean(forArea, na.rm = TRUE), # Area mean
            bioStrat = mean(bioPlot, na.rm = TRUE), # Biomass mean
            carbStrat = mean(carbPlot, na.rm = TRUE), # Carbon mean
            # We don't want a vector of these values, since they are repeated
            P2POINTCNT = first(P2POINTCNT),
            AREA_USED = first(AREA_USED),
            # Strata weight, relative to estimation unit
            w = first(P1POINTCNT) / first(P1PNTCNT_EU),
            # Strata level variances
            aVar = (sum(forArea^2) - sum(P2POINTCNT * aStrat^2)) / 
                   (P2POINTCNT * (P2POINTCNT-1)),
            bioVar = (sum(bioPlot^2) - sum(P2POINTCNT * bioStrat^2)) / 
                     (P2POINTCNT * (P2POINTCNT-1)),
            carbVar = (sum(carbPlot^2) - sum(P2POINTCNT * carbStrat^2)) / 
                      (P2POINTCNT * (P2POINTCNT-1)),
            # Strata level co-varainces
            bioCV = (sum(forArea*bioPlot) - sum(P2POINTCNT * aStrat *bioStrat)) / 
                    (P2POINTCNT * (P2POINTCNT-1)),
            carbCV = (sum(forArea*carbPlot) - sum(P2POINTCNT * aStrat *carbStrat)) / 
                     (P2POINTCNT * (P2POINTCNT-1)), 
            .groups = 'drop')

# Moving on to the estimation unit
bio_pop_eu <- bio_pop_strat %>%
  group_by(YEAR, ESTN_UNIT_CN) %>%
  summarize(aEst = sum(aStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean Area
            bioEst = sum(bioStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean biomass
            carbEst = sum(carbStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean carbon
            # Estimation unit variances
            aVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
              (sum(P2POINTCNT*w*aVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*aVar)),
            bioVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
              (sum(P2POINTCNT*w*bioVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*bioVar)),
            carbVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
              (sum(P2POINTCNT*w*carbVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*carbVar)),
            ## Estimation unit covariances
            bioCV = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
              (sum(P2POINTCNT*w*bioCV) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*bioCV)),
            carbCV = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
              (sum(P2POINTCNT*w*carbCV) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*carbCV)), 
            .groups = 'drop')

Finally, we can sum attributes across estimation units to obtain totals for our region:

# sum across Estimation Units for totals
bio_pop <- bio_pop_eu %>%
  group_by(YEAR) %>%
  summarize(AREA_TOTAL = sum(aEst, na.rm = TRUE),
            BIO_AG_TOTAL = sum(bioEst, na.rm = TRUE),
            CARB_AG_TOTAL = sum(carbEst, na.rm = TRUE),
            # Ratios
            BIO_AG_ACRE = BIO_AG_TOTAL / AREA_TOTAL,
            CARB_AG_ACRE = CARB_AG_TOTAL / AREA_TOTAL,
            # Total samping errors
            AREA_TOTAL_SE = sqrt(sum(aVar, na.rm = TRUE)) / AREA_TOTAL * 100,
            BIO_AG_TOTAL_SE = sqrt(sum(bioVar, na.rm = TRUE)) / BIO_AG_TOTAL * 100,
            CARB_AG_TOTAL_SE = sqrt(sum(carbVar, na.rm = TRUE)) / CARB_AG_TOTAL * 100,
            # Ratio variances
            bioAcreVar = (1 / AREA_TOTAL^2) * (sum(bioVar) + 
                         (BIO_AG_ACRE^2)*sum(aVar) - 
                         2 * BIO_AG_ACRE * sum(bioCV)),
            carbAcreVar = (1 / AREA_TOTAL^2) * (sum(carbVar) + 
                          (CARB_AG_ACRE^2) * sum(aVar) - 
                          2 * CARB_AG_ACRE * sum(carbCV)),
            BIO_AG_ACRE_SE = sqrt(sum(bioAcreVar, na.rm = TRUE)) / 
                                  BIO_AG_ACRE * 100,
            CARB_AG_ACRE_SE = sqrt(sum(carbAcreVar, na.rm = TRUE)) / 
                                   CARB_AG_ACRE * 100) %>%
  # Re ordering, dropping variance
  select(YEAR, BIO_AG_ACRE, CARB_AG_ACRE, BIO_AG_TOTAL, CARB_AG_TOTAL, 
         AREA_TOTAL, BIO_AG_ACRE_SE, CARB_AG_ACRE_SE, BIO_AG_TOTAL_SE, 
         CARB_AG_TOTAL_SE, AREA_TOTAL_SE)

Comparing with rFIA, we get a match!

glimpse(biomass(clipFIA(fiaRI), totals = TRUE))

## Rows: 1
## Columns: 14
## $ YEAR          <dbl> 2018
## $ BIO_ACRE      <dbl> 75.65639
## $ CARB_ACRE     <dbl> 36.59257
## $ BIO_TOTAL     <dbl> 27762772
## $ CARB_TOTAL    <dbl> 13427961
## $ AREA_TOTAL    <dbl> 366958.7
## $ BIO_ACRE_SE   <dbl> 3.874787
## $ CARB_ACRE_SE  <dbl> 3.901188
## $ BIO_TOTAL_SE  <dbl> 4.990582
## $ CARB_TOTAL_SE <dbl> 5.006351
## $ AREA_TOTAL_SE <dbl> 3.531998
## $ nPlots_TREE   <int> 126
## $ nPlots_AREA   <int> 127
## $ N             <int> 199

glimpse(bio_pop)

## Rows: 1
## Columns: 11
## $ YEAR             <dbl> 2018
## $ BIO_AG_ACRE      <dbl> 75.65639
## $ CARB_AG_ACRE     <dbl> 36.59257
## $ BIO_AG_TOTAL     <dbl> 27762772
## $ CARB_AG_TOTAL    <dbl> 13427961
## $ AREA_TOTAL       <dbl> 366958.7
## $ BIO_AG_ACRE_SE   <dbl> 3.874787
## $ CARB_AG_ACRE_SE  <dbl> 3.901188
## $ BIO_AG_TOTAL_SE  <dbl> 4.990582
## $ CARB_AG_TOTAL_SE <dbl> 5.006351
## $ AREA_TOTAL_SE    <dbl> 3.531998

Adding grouping variables

Unlike estimation without sampling errors, we can NOT just add grouping variables to the above procedures in our group_by call. Rather, we will need to account for absence points here (or zero-length outputs) or our estimates will be artificially inflated if groups are not mutually exclusive at the plot level. Example: The presence of red oak on a plot does not negate the potential presence of white ash on the same plot. Therefore, there should be a zero listed for each species not found on the plot. We accomplish this by treating each group as a separate population, computing each individually, and then rejoining the groups at the end of the operation.

First let’s make a dataframe where each row defines a group that we want to estimate:

# All groups we want to estimate
grps <- data %>%
  group_by(YEAR, SPCD) %>%
  summarize()

## `summarise()` has grouped output by 'YEAR'. You can override using the
## `.groups` argument.

Each row in grps now defines an individual population that we would like to produce estimates for, thus our end product should have the same number of rows. Thus, we can loop over each of the rows in grps and use the estimation procedures above to estimate attributes for each group.

Before we get started with the loop we need to find a way to define the population of interest for each iteration (row). To do that, we will modify our domain indicators from above to reflect whether or not an observation falls within the population defined by grps[i,]. Saving the original domain indicators as variables so they are not overwritten in the loop:

tDI <- data$tDI
aDI <- data$aDI

Now we can start our loop:

# Let's store each output in a list
bio_pop <- list()
# Looping
for (i in 1:nrow(grps)){

  # Tree domain indicator
  data$tDI <- tDI * if_else(data$SPCD == grps$SPCD[i], 1, 0) * 
              if_else(data$YEAR == grps$YEAR[i], 1, 0)
  # No need to modify the area domain indicator for SPCD, because
  # we want to estimate all forested area within the year
  data$aDI <- aDI * if_else(data$YEAR == grps$YEAR[i], 1, 0)

  # SAME AS ABOVE, JUST IN A LOOP NOW --> SIMILAR TO USING GROUP_BY
  # Estimate Area attributes -- plot level
  tre_pop <- data %>%
    filter(EVAL_TYP == 'EXPVOL') %>%
    # Make sure we only have unique observations of plots, trees, etc.
    distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, SUBP, TREE, .keep_all = TRUE) %>%
    # Plot-level estimates first (note we are NOT using EXPNS here)
    group_by(ESTN_UNIT_CN, STRATUM_CN, PLT_CN) %>%
    summarize(bioPlot = sum(DRYBIO_AG * TPA_UNADJ * tAdj * tDI  / 2000, na.rm = TRUE),
              carbPlot = sum(CARBON_AG * TPA_UNADJ * tAdj * tDI  / 2000, na.rm = TRUE), 
              .groups = 'drop')

  # Estimate Area attributes -- plot level
  area_pop <- data %>%
    filter(EVAL_TYP == 'EXPCURR') %>%
    # Make sure we only have unique observations of plots, trees, etc.
    distinct(ESTN_UNIT_CN, STRATUM_CN, PLT_CN, CONDID, .keep_all = TRUE) %>%
    # Plot-level estimates first (multiplying by EXPNS here)
    # Extra grouping variables are only here so they are carried through
    group_by(P1POINTCNT, P1PNTCNT_EU, P2POINTCNT, AREA_USED, ESTN_UNIT_CN, STRATUM_CN, PLT_CN) %>%
    summarize(forArea = sum(CONDPROP_UNADJ * aAdj * aDI, na.rm = TRUE), 
              .groups = 'drop')

  # Joining the two tables
  bio_pop_plot <- left_join(tre_pop, area_pop)

  # Now we can follow through with the remaining estimation procedures
  # Strata level
  bio_pop_strat <- bio_pop_plot %>%
    group_by(ESTN_UNIT_CN, STRATUM_CN) %>%
    summarize(aStrat = mean(forArea, na.rm = TRUE), # Area mean
              bioStrat = mean(bioPlot, na.rm = TRUE), # Biomass mean
              carbStrat = mean(carbPlot, na.rm = TRUE), # Carbon mean
              # We don't want a vector of these values, since they are repeated
              P2POINTCNT = first(P2POINTCNT),
              AREA_USED = first(AREA_USED),
              # Strata weight, relative to estimation unit
              w = first(P1POINTCNT) / first(P1PNTCNT_EU),
              # Strata level variances
              aVar = (sum(forArea^2) - sum(P2POINTCNT * aStrat^2)) / 
                     (P2POINTCNT * (P2POINTCNT-1)),
              bioVar = (sum(bioPlot^2) - sum(P2POINTCNT * bioStrat^2)) / 
                       (P2POINTCNT * (P2POINTCNT-1)),
              carbVar = (sum(carbPlot^2) - sum(P2POINTCNT * carbStrat^2)) / 
                        (P2POINTCNT * (P2POINTCNT-1)),
              # Strata level co-varainces
              bioCV = (sum(forArea*bioPlot) - sum(P2POINTCNT * aStrat *bioStrat)) / 
                      (P2POINTCNT * (P2POINTCNT-1)),
              carbCV = (sum(forArea*carbPlot) - sum(P2POINTCNT * aStrat *carbStrat)) / 
                       (P2POINTCNT * (P2POINTCNT-1)), 
              .groups = 'drop')

  # Moving on to the estimation unit
  bio_pop_eu <- bio_pop_strat %>%
    group_by(ESTN_UNIT_CN) %>%
    summarize(aEst = sum(aStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean Area
              bioEst = sum(bioStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean biomass
              carbEst = sum(carbStrat * w, na.rm = TRUE) * first(AREA_USED), # Mean carbon
              # Estimation unit variances
              aVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
                (sum(P2POINTCNT*w*aVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*aVar)),
              bioVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
                (sum(P2POINTCNT*w*bioVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*bioVar)),
              carbVar = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
                (sum(P2POINTCNT*w*carbVar) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*carbVar)),
              ## Estimation unit covariances
              bioCV = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
                (sum(P2POINTCNT*w*bioCV) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*bioCV)),
              carbCV = (first(AREA_USED)^2 / sum(P2POINTCNT)) *
                (sum(P2POINTCNT*w*carbCV) + sum((1-w)*(P2POINTCNT/sum(P2POINTCNT))*carbCV)), 
              .groups = 'drop')

  # Finally, sum across Estimation Units for totals
  bio_pop_total <- bio_pop_eu %>%
    summarize(AREA_TOTAL = sum(aEst, na.rm = TRUE),
              BIO_AG_TOTAL = sum(bioEst, na.rm = TRUE),
              CARB_AG_TOTAL = sum(carbEst, na.rm = TRUE),
              # Ratios
              BIO_AG_ACRE = BIO_AG_TOTAL / AREA_TOTAL,
              CARB_AG_ACRE = CARB_AG_TOTAL / AREA_TOTAL,
              # Total samping errors
              AREA_TOTAL_SE = sqrt(sum(aVar, na.rm = TRUE)) / AREA_TOTAL * 100,
              BIO_AG_TOTAL_SE = sqrt(sum(bioVar, na.rm = TRUE)) / BIO_AG_TOTAL * 100,
              CARB_AG_TOTAL_SE = sqrt(sum(carbVar, na.rm = TRUE)) / CARB_AG_TOTAL * 100,
              # Ratio variances
              bioAcreVar = (1 / AREA_TOTAL^2) * (sum(bioVar) + 
                           (BIO_AG_ACRE^2) * sum(aVar) -
                           2 * BIO_AG_ACRE * sum(bioCV)),
              carbAcreVar = (1 / AREA_TOTAL^2) * (sum(carbVar) + 
                            (CARB_AG_ACRE^2)*sum(aVar) -  
                            2 * CARB_AG_ACRE * sum(carbCV)),
              BIO_AG_ACRE_SE = sqrt(sum(bioAcreVar, na.rm = TRUE)) / 
                               BIO_AG_ACRE * 100,
              CARB_AG_ACRE_SE = sqrt(sum(carbAcreVar, na.rm = TRUE)) / 
                                CARB_AG_ACRE * 100) %>%
    # Re ordering, dropping variance
    select(BIO_AG_ACRE, CARB_AG_ACRE, BIO_AG_TOTAL, CARB_AG_TOTAL, 
           AREA_TOTAL, BIO_AG_ACRE_SE, CARB_AG_ACRE_SE, BIO_AG_TOTAL_SE, 
           CARB_AG_TOTAL_SE, AREA_TOTAL_SE)

  # Saving the output in our list...
  bio_pop[[i]] <- bio_pop_total
}

Great, we have our estimates! Except they are locked up in a list. Let’s convert back to a data.frame and rejoining with grps:

bio_pop <- bind_rows(bio_pop)
bio_pop_sp <- bind_cols(grps, bio_pop)

Comparing with rFIA, we have a match!

glimpse(biomass(clipFIA(fiaRI), totals = TRUE, bySpecies = TRUE))

## Rows: 45
## Columns: 17
## $ YEAR            <dbl> 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, …
## $ SPCD            <dbl> 12, 43, 68, 126, 129, 130, 261, 313, 316, 317, 318, 32…
## $ COMMON_NAME     <chr> "balsam fir", "Atlantic white-cedar", "eastern redceda…
## $ SCIENTIFIC_NAME <chr> "Abies balsamea", "Chamaecyparis thyoides", "Juniperus…
## $ BIO_ACRE        <dbl> 9.019379e-03, 4.327762e-02, 4.191399e-02, 1.665285e+00…
## $ CARB_ACRE       <dbl> 0.0045638059, 0.0205568687, 0.0218371876, 0.7943407536…
## $ BIO_TOTAL       <dbl> 3309.7397, 15881.0985, 15380.7024, 611090.6699, 378204…
## $ CARB_TOTAL      <dbl> 1674.7283, 7543.5218, 8013.3459, 291490.2495, 1917499.…
## $ AREA_TOTAL      <dbl> 366958.7, 366958.7, 366958.7, 366958.7, 366958.7, 3669…
## $ BIO_ACRE_SE     <dbl> 114.028966, 56.662929, 68.992421, 49.802399, 18.886660…
## $ CARB_ACRE_SE    <dbl> 114.028966, 56.662929, 68.992421, 49.802399, 18.886660…
## $ BIO_TOTAL_SE    <dbl> 114.043537, 56.722024, 68.649804, 50.248846, 19.200896…
## $ CARB_TOTAL_SE   <dbl> 114.043537, 56.722024, 68.649804, 50.248846, 19.200896…
## $ AREA_TOTAL_SE   <dbl> 3.531998, 3.531998, 3.531998, 3.531998, 3.531998, 3.53…
## $ nPlots_TREE     <int> 1, 3, 5, 10, 63, 1, 8, 1, 107, 1, 8, 1, 22, 40, 1, 8, …
## $ nPlots_AREA     <int> 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127,…
## $ N               <int> 167, 199, 199, 199, 199, 167, 167, 167, 199, 167, 199,…

glimpse(bio_pop_sp)

## Rows: 50
## Columns: 12
## Groups: YEAR [1]
## $ YEAR             <dbl> 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,…
## $ SPCD             <dbl> 12, 43, 68, 96, 126, 129, 130, 261, 313, 316, 317, 31…
## $ BIO_AG_ACRE      <dbl> 9.019379e-03, 4.327762e-02, 4.191399e-02, 0.000000e+0…
## $ CARB_AG_ACRE     <dbl> 0.0045638059, 0.0205568687, 0.0218371875, 0.000000000…
## $ BIO_AG_TOTAL     <dbl> 3309.7397, 15881.0985, 15380.7024, 0.0000, 611090.669…
## $ CARB_AG_TOTAL    <dbl> 1674.7283, 7543.5218, 8013.3459, 0.0000, 291490.2495,…
## $ AREA_TOTAL       <dbl> 366958.7, 366958.7, 366958.7, 366958.7, 366958.7, 366…
## $ BIO_AG_ACRE_SE   <dbl> 114.028966, 56.662929, 68.992421, NaN, 49.802399, 18.…
## $ CARB_AG_ACRE_SE  <dbl> 114.028966, 56.662929, 68.992421, NaN, 49.802399, 18.…
## $ BIO_AG_TOTAL_SE  <dbl> 114.043537, 56.722024, 68.649804, NaN, 50.248846, 19.…
## $ CARB_AG_TOTAL_SE <dbl> 114.043537, 56.722024, 68.649804, NaN, 50.248846, 19.…
## $ AREA_TOTAL_SE    <dbl> 3.531998, 3.531998, 3.531998, 3.531998, 3.531998, 3.5…

Hunter Stanke, Jeffrey W. Doser

2020 (last updated February 7, 2025)

Estimation at the plot-level

Two example plots

Basic Estimation Procedures

Unique domains of interest

Grouped estimation procedures

Estimation at the population-level

Data Preparation

Without Sampling Errors

No grouping variables

Adding grouping variables

With Sampling Errors

No grouping variables

Adding grouping variables