Simulated Test Data

Introduction

Many packages provide their own worked examples as part of their user help. However, issues can arise when users confront packages with their own data. We have developed our simulated dataset (using Stock Synthesis) for two purposes:

  1. An independently generated dataset increases the likelihood that the user will encounter pitfalls or difficulties related to formatting issues, missing data, or the nature of the data itself (e.g. whether there is adequate “contrast” in the data). We can therefore provide advice from our own experience of confronting packages with independently derived data.
  2. Having a common dataset applied across all packages enables direct comparison of the outcomes of alternative assessments.

Our dataset was developed by Prof. Andre Punt for a “codoid-like” finfish stock.

Background

The population dynamics are governed by the sex- and age-structured population dynamics model underlying Stock Synthesis. Stock Synthesis also provides the expected values for the proportion of the catch in each age-class and length-class, as well as expected values for the index of abundance (data_exp.txt). The model is run for 42 years (the first 11 years have zero catches to simulate an unexploited stock). Such a short time series is typical of data-moderate fisheries, such as those in Australia. Table 1 lists the values for the biological parameters of the population dynamics model, and Figure 1 (shown in the attached Word document) plots the time-series of catches and the selectivity patterns of the fishery and survey. The mathematics of the data generation process is described in the downloadable Description of Data Generation Process Word™ document. All the files mentioned below, including the original Stock Synthesis input and output files, are in the downloadable ZIP file for your use.

Data generation

Catch data

The catches (in numbers and by mass; files CatchNum.csv and Catches.csv, respectively) are assumed to be known without error.

Composition data

The simulated length-composition data (17 size-classes; 10-15 cm to 90 cm+) and age-composition data (ages 1-25; age-0 and age-1 animals are combined and the last age-class is a plus-group) are assumed to be Dirichlet distributed, with effective sample sizes of 100 (length-frequency), 75 (age-composition) and 10 (conditional age-at-length by length-class). The length-composition and age-composition data are stored in the files LengthData.csv and AgeData.csv, respectively. The code also reports the true catch-at-age (no observation error) in the file TrueCAA.csv. These values will differ from the information in AgeData.csv, which is expressed as proportions multiplied by the effective sample size.

The reported catch-at-age data aggregated over the stock (TotalCatchAge.csv) are computed by summing the generated sex-specific catch-at-age over sex to give sex-aggregated catch-at-age data. The weight-at-age by year is computed from the weight of animals of sex s and age a in the middle of the year and the catch (in numbers) of animals of sex s and age a during year y.
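
As an illustration of how a Dirichlet composition sample of this kind could be generated, the following Python sketch draws proportions around a set of expected proportions and scales them by the effective sample size, which is how the values in AgeData.csv are expressed. It assumes that the Dirichlet concentration parameters equal the effective sample size multiplied by the expected proportions; the parameterisation actually used is given in the Description of Data Generation Process document and may differ. The function name and the expected proportions are hypothetical.

```python
import random


def simulate_composition(expected_props, n_eff, rng):
    """Draw one Dirichlet composition and scale it by the effective sample size.

    Assumption (not confirmed by the dataset documentation): the Dirichlet
    concentration parameters are n_eff * expected proportion.
    """
    # A Dirichlet draw can be built from independent gamma variates
    # that are then normalised to sum to one.
    gammas = [rng.gammavariate(n_eff * p, 1.0) for p in expected_props]
    total = sum(gammas)
    # Express the result as proportions multiplied by the effective sample size.
    return [n_eff * g / total for g in gammas]


rng = random.Random(1)                  # fixed seed so the sketch is repeatable
expected = [0.1, 0.3, 0.4, 0.2]         # hypothetical expected proportions
observed = simulate_composition(expected, 100, rng)
print(observed)
```

Each simulated row sums to the effective sample size (here 100), so dividing by that value recovers the simulated proportions.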

Index data

The index data (file CPUE.csv) are lognormally distributed with a CV of 0.3.
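
A minimal Python sketch of lognormal observation error with a CV of 0.3 is given below. It assumes that the CV is converted to a log-scale standard deviation using sigma = sqrt(ln(1 + CV^2)) and that the error is multiplicative with no bias correction, so that the expected index is the median of the simulated values. Neither assumption is confirmed here; some implementations use the CV directly as the log-scale standard deviation. The function names and expected index values are hypothetical.

```python
import math
import random


def lognormal_sigma(cv):
    """Log-scale standard deviation implied by a CV (assumed conversion)."""
    return math.sqrt(math.log(1.0 + cv ** 2))


def simulate_index(expected, cv, rng):
    """Apply multiplicative lognormal error to each expected index value."""
    sigma = lognormal_sigma(cv)
    # Assumption: no bias correction, so each expected value is the median.
    return [e * math.exp(rng.gauss(0.0, sigma)) for e in expected]


rng = random.Random(1)                  # fixed seed so the sketch is repeatable
expected_index = [1.00, 0.85, 0.70]     # hypothetical expected index values
print(simulate_index(expected_index, 0.3, rng))
```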

Proportion of females data

This quantity (which is measured without error and reported in PropFem.csv) is the proportion of the population that is female by age and year.

Proportion mature data

This quantity (which is measured without error and reported in PropMature.csv) is female fecundity by age (proportion mature x weight-at-age) divided by the maximum fecundity over age.
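
The calculation described above can be sketched in Python as follows. The function name and the input values are hypothetical and are not taken from PropMature.csv.

```python
def relative_fecundity(prop_mature, weight_at_age):
    """Fecundity by age (proportion mature x weight-at-age) divided by its maximum."""
    fecundity = [m * w for m, w in zip(prop_mature, weight_at_age)]
    max_fecundity = max(fecundity)
    return [f / max_fecundity for f in fecundity]


# Hypothetical values for three age-classes.
prop_mature = [0.0, 0.5, 1.0]
weight_at_age = [1.0, 2.0, 3.0]
print(relative_fecundity(prop_mature, weight_at_age))
```

By construction the largest value is 1, so the quantity is a relative rather than an absolute measure.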

Catch weight data

This quantity (which is measured without error and reported in CatchWght.csv) is the weight of animals by sex and age in the middle of the year.

Selectivity data

The parameters (50% and 95% points) defining selectivity by age and length are provided (without error) in the files AgeSel.csv and LenSel.csv.
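
To show how a selectivity curve can be built from 50% and 95% points, the Python sketch below assumes a logistic curve, which is a common form for this parameterisation but is not confirmed here as the form used to generate the data. The function name is hypothetical; the example values are the length-at-50% and length-at-95% selectivity listed in Table 1.

```python
import math


def logistic_selectivity(x, x50, x95):
    """Selectivity at length (or age) x for a logistic curve (assumed form).

    The curve passes through 0.5 at x50 and 0.95 at x95.
    """
    slope = math.log(19.0) / (x95 - x50)
    return 1.0 / (1.0 + math.exp(-slope * (x - x50)))


# Length-based selectivity using the Table 1 values (50 cm and 70 cm).
for length_cm in (30, 50, 70, 90):
    print(length_cm, logistic_selectivity(length_cm, 50.0, 70.0))
```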

Stock weight data

This quantity (which is measured without error and reported in StockWght.csv) is the weight of animals by age and sex at the start of the year in the stock (not the catch) and is computed using the weight of animals of sex s and age a at the start of the year and the number of animals of sex s and age a during year y.

Simulated biological parameters

These parameters are the “real” values assumed for the creation of the simulated dataset.

Table 1. Values for the biological parameters

Parameter                                     Females         Males
Ages considered (years)                       0 – 40+         0 – 40+
Growth (length-at-age) (Growth.csv)
  L∞ (cm)                                     81.44           67.35
  k (yr⁻¹)                                    0.15            0.20
  L(age 0) (cm)                               20              16
  CV of length-at-age                         0.1             0.1
Length-weight relationship (LenW.csv)
  a                                                   2.44 × 10⁻⁶
  b                                                   3.34694
Maturity-at-length (Maturity.csv)
  Length-at-50% maturity (cm)                         45
  Length-at-95% maturity (cm)                         56.78
Natural mortality (Natural_Mortality.csv)     0.2             0.25
Selectivity (LenSel.csv)
  Length-at-50% selectivity (cm)                      50
  Length-at-95% selectivity (cm)                      70
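
As a rough guide to how the growth and length-weight parameters in Table 1 translate into length and weight at age, the Python sketch below assumes a von Bertalanffy growth curve parameterised by an asymptotic length (taken here to be the first growth parameter in the table), the growth rate k and the length at age 0, together with a power-law length-weight relationship W = a × L^b. These functional forms are assumptions and may not match the parameterisation used by Stock Synthesis; the function names are hypothetical.

```python
import math


def length_at_age(age, l_inf, k, l_zero):
    """Mean length at a given age under an assumed von Bertalanffy curve."""
    return l_inf + (l_zero - l_inf) * math.exp(-k * age)


def weight_at_length(length, a, b):
    """Weight at a given length under an assumed power-law relationship."""
    return a * length ** b


# Table 1 values for females and males: asymptotic length, k, length at age 0.
growth = {"female": (81.44, 0.15, 20.0), "male": (67.35, 0.20, 16.0)}

for sex, (l_inf, k, l_zero) in growth.items():
    for age in (0, 5, 10, 20):
        length = length_at_age(age, l_inf, k, l_zero)
        weight = weight_at_length(length, 2.44e-6, 3.34694)
        print(sex, age, round(length, 2), round(weight, 3))
```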

 

Input files

The file Data.ss is produced by Stock Synthesis and contains a single stochastic replicate, as well as the specifications for the number of years. The file Data_exp.txt is also produced by Stock Synthesis and contains the expected values for the index and composition data; these values form the basis for simulating the dataset.