18 LTabundR vs. ABUND9
We have tried to develop LTabundR with the flexibility either to replicate ABUND results or to produce customizable results that could potentially vary from ABUND quite significantly (e.g., formatted for habitat modeling). However, even when we use LTabundR settings intended to replicate ABUND results, there are likely to be some small differences. These are detailed below:
Differences in total effort
- After loading the data, LTabundR removes rows with invalid Cruise numbers, invalid times, and invalid coordinates. As far as we can tell, ABUND does not remove such rows. This is a relatively minor point; in processing the 1986-2020 data (623,640 rows), 287 rows are missing Cruise info; 1,430 are missing valid times; and 556 are missing valid coordinates, for a total of 2,273 rows removed out of more than 625,000 (roughly 0.4% of rows). Many of these rows with missing data have the same coordinates as complete rows nearby (since WinCruz can sometimes produce multiple lines at the same time when setting up metadata for the research day).
- In ABUND, the custom functions used to determine whether DAS coordinates fall within geostrata are difficult to validate, and it is possible that they differ from the functions used in R for the same purpose. LTabundR uses functions within the well-established sf package to do these same calculations.
- Both ABUND and LTabundR calculate the distance surveyed based on the sum of distances between adjacent rows in the DAS file. They do this differently (see below), based on the way they loop through the data, which may yield minor differences in segment track lengths.
  - ABUND loops through the data one row at a time, calculating distance traveled at the same time as allocating effort to segments and processing sightings. It calculates the distance between each new row and the beginning of a segment of effort. That beginning location (object BEGTIME in the Fortran code) is reset by various triggers (including a new date), and the distance traveled is calculated using a subroutine (DISTRAV). For surveys occurring after 1991, the distance between a new coordinate and the BEGTIME coordinate is calculated using a subroutine named GRCIRC (great-circle distance). Prior to 1991, the ship speed and the time since BEGTIME are used to estimate distance traveled. For all years, the distance calculation only happens if the time gap is at least 1.2 minutes (line 405 in ABUND9.FOR); otherwise the distance is returned as 0 km. This subroutine also seems to allow for large gaps between subsequent rows within a single day of effort: it prints a warning message when the gap is greater than 30 km, but does not modify its estimate of distance traveled. This allows for the possibility that, in rare cases, estimates of distance surveyed will be spuriously large.
  - LTabundR processes data using a modular approach rather than a single large loop. Prior to the segmentizing stage, it calculates the distance between each row and its subsequent row, using the swfscDAS function distance_greatcircle(), which is a nearly exact recode of the ABUND subroutine GRCIRC for R.
  - There are two important differences that LTabundR applies: (1) In anticipation of WinCruz surveys that operate on much smaller scales with more frequent position updates, we calculate distances for time gaps as small as 30 seconds, not 1.2 minutes. This may generate minor differences in the total length of tracks; (2) If the distance between rows is greater than 30 km, it is assumed that effort has stopped and the distance is changed to 0 km (that threshold can be modified by the user; see the LTabundR function load_survey_settings()). This approach should avoid the misinterpretation of large gaps in effort as long stretches of effort.
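The two LTabundR rules described above can be illustrated with a short sketch. It is written in Python for illustration only and is not LTabundR or swfscDAS code: the haversine formula stands in for distance_greatcircle(), and the function names, the row layout (time in seconds, latitude, longitude), and the parameters min_gap_s and max_km are all hypothetical. It also assumes that time gaps shorter than the minimum contribute 0 km, mirroring the ABUND behavior described above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km (a stand-in, not GRCIRC itself)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def row_distances(rows, min_gap_s=30, max_km=30):
    """Distance (km) from each row to the next; rows are (time_s, lat, lon).

    Hypothetical rules, following the text: a pair of rows contributes 0 km
    if the time gap is shorter than min_gap_s, or if the rows are more than
    max_km apart (treated as a break in effort).
    """
    out = []
    for (t1, la1, lo1), (t2, la2, lo2) in zip(rows, rows[1:]):
        d = great_circle_km(la1, lo1, la2, lo2)
        if (t2 - t1) < min_gap_s or d > max_km:
            d = 0.0
        out.append(d)
    return out


# Made-up positions: a normal step, a 10-second step, then a jump of about 1 degree.
rows = [(0, 30.0, -120.0), (60, 30.01, -120.0), (70, 30.011, -120.0), (130, 31.0, -120.0)]
distances = row_distances(rows)
# First pair is kept (about 1.1 km); second is zeroed (10 s gap); third is zeroed (> 30 km).
```

Summing the returned list gives the track length under these rules, which is why a single unflagged gap of more than 30 km would otherwise inflate the total.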
Differences in on-effort distance
- LTabundR works with DAS data that are loaded and formatted using the swfscDAS functions das_read() and das_process(). It is possible that these functions categorize events as On- or Off-Effort slightly differently than ABUND, or apply other differences that would be difficult for us to know or track.
- While ABUND uses a minimum length threshold to create segments, such that full-length segments are never less than that threshold and small remainder segments always occur at the end of a continuous period of effort, LTabundR uses an approach more similar to the effort-chopping functions in swfscDAS: it looks at continuous blocs of effort, determines how many full-length segments can be defined in each bloc, then randomly places the remainder within that bloc according to a set of user-defined settings (see load_survey_settings()). This process produces full-length segments whose distribution of exact lengths is centered about the target length, rather than always being greater than the target length.
- To control the particularities of segmentizing, LTabundR uses settings such as segment_max_interval, which controls how discontinuous effort can be and still be pooled into the same segment. These rules may produce slight differences in segment lengths.
- Note that, since ABUND is a loop-based routine while LTabundR is modular, segments identified by the two programs will never be exactly identical, and a 1:1 comparison of segments produced by the two programs is not possible.
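The bloc-chopping idea described above can be sketched roughly as follows. This Python illustration is not LTabundR's implementation: chop_bloc and its parameters are hypothetical, and it shows only one simple option, in which the remainder is kept as its own short segment and dropped into a random position within the bloc. How LTabundR actually handles the remainder depends on the settings in load_survey_settings().

```python
import random


def chop_bloc(bloc_km, target_km, rng=None):
    """Split one continuous bloc of effort into segment lengths (km).

    Hypothetical sketch: fit as many full-length segments as possible,
    then insert the leftover distance at a random position in the bloc.
    """
    rng = rng or random.Random()
    n_full = int(bloc_km // target_km)          # full-length segments that fit
    remainder = bloc_km - n_full * target_km    # leftover effort in the bloc
    segments = [target_km] * n_full
    if remainder > 1e-9:                        # ignore floating-point dust
        # randint is inclusive at both ends, so the remainder may land first, last, or between
        segments.insert(rng.randint(0, n_full), remainder)
    return segments


# A 23 km bloc with a 5 km target: four full segments plus a 3 km remainder.
segments = chop_bloc(23.0, 5.0, rng=random.Random(1))
```

Because the remainder's position is random, two runs over the same effort can yield different segment boundaries unless the random seed is fixed, which is one more reason a 1:1 comparison with ABUND segments is not meaningful.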
Differences in total sightings
- In ABUND9, only sightings that occur while OnEffort == TRUE are returned; in contrast, LTabundR does not remove any sightings (it just flags them, using the included column). However, LTabundR sightings can easily be filtered to emulate ABUND9 output.
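As a rough illustration of that filtering step, written in Python rather than R: the records below are made-up stand-ins for a sightings table. Only the included flag comes from the text above; the function name, the other field name, and the values are hypothetical.

```python
def keep_included(sightings):
    """Return only the sightings flagged for inclusion (hypothetical helper)."""
    return [s for s in sightings if s["included"]]


# Made-up records standing in for a sightings table.
sightings = [
    {"sighting_id": 1, "included": True},
    {"sighting_id": 2, "included": False},
    {"sighting_id": 3, "included": True},
]
abund_like = keep_included(sightings)
```

The unfiltered table stays available, so the same processed data can serve both an ABUND-style analysis and other uses that need every sighting.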
Differences in on-effort sightings
- LTabundR includes an additional criterion for inclusion in analysis: the sighting must occur at or forward of the beam (this can be deactivated in load_survey_settings()).
- Since geostratum handling is different in the two programs, it is possible that sightings occurring near stratum margins may be included/excluded differently.
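The beam criterion can be sketched as a check on the sighting's bearing relative to the bow. This Python sketch assumes bearings are recorded in degrees from 0 to 360, measured clockwise from the bow, so that 90 and 270 are the starboard and port beams. That convention and the function name are assumptions made for illustration, not taken from LTabundR.

```python
def at_or_forward_of_beam(bearing_deg):
    """True if a bow-relative bearing (degrees) is at or forward of the beam.

    Assumes 0 = dead ahead, 90 = starboard beam, 270 = port beam.
    """
    b = bearing_deg % 360
    return b <= 90 or b >= 270


# Made-up bearings: ahead, on each beam, just aft of the beam, and astern.
checks = {b: at_or_forward_of_beam(b) for b in (0, 90, 270, 91, 180)}
```

Under this convention, a sighting at 91 degrees is just aft of the starboard beam and would be excluded when the criterion is active.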
Differences in school size estimation
- If an observer is not included in the Group Size Calibration Coefficients .DAT file, ABUND applies a default coefficient (0.8625) to scale group size estimates; however, it applies this calibration to groups of all sizes, including solo animals or small groups of 2-3. In LTabundR, users can choose to restrict calibrations for unknown observers to group size estimates at or above a size of their choosing (see load_cohort_settings()).
- Note that ABUND9 calibrates school sizes slightly differently than ABUND7. The ABUND9 release notes mention a bug in previous versions that incorrectly calibrated school size. LTabundR corresponds perfectly with ABUND9 school size calibrations, but not with ABUND8 or earlier.