This function returns basic sampling summaries of an occurrence dataset.

Usage

binstat(
  x,
  tax = "genus",
  bin = "stg",
  coll = NULL,
  ref = NULL,
  noNAStart = FALSE,
  duplicates = NULL,
  xexp = NULL,
  indices = FALSE
)

Arguments

x

(data.frame): The occurrence dataset.

tax

(character): The column name of taxon names.

bin

(character): The column name of bin names.

coll

(character): The column name of collection numbers. (optional)

ref

(character): The column name of reference numbers. (optional)

noNAStart

(logical): Useful when the dataset does not start from bin no. 1 but the bins are identified by positive integers. With noNAStart=TRUE, the leading part of the resulting table is cut, so that the first row contains the estimates for the lowest bin number. With positive integer bin identifiers and noNAStart=FALSE, the row index equals the bin number.

duplicates

(logical): The function checks whether the data contain duplicate occurrences (multiple species/genera). If set to NULL, the data are left unchanged, but the function notifies you when duplicates are present. If set to TRUE, the duplicates are left in place; if set to FALSE, the duplicates are omitted.

xexp

(numeric): Argument of the OxW subsampling type (subtrialOXW). Setting this parameter to a valid numeric value makes the function return the maximum quota for that xexp value.

indices

(logical): If set to TRUE, the function calculates all indices implemented in the indices function.
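The effect of noNAStart on the shape of the result can be illustrated with a minimal base R sketch. It does not call binstat; the toy table occ, the helper objects full and trimmed, and the use of NA for empty leading rows are assumptions made for illustration only.

```r
# Toy occurrence table (hypothetical data): bins are positive integers,
# but the lowest bin with data is bin 3.
occ <- data.frame(
  genus = c("A", "B", "A", "C", "B"),
  stg   = c(3, 3, 4, 5, 5)
)

# Count occurrences per bin so that row i of the table describes bin i.
binLevels <- seq_len(max(occ$stg))
counts <- table(factor(occ$stg, levels = binLevels))
full <- data.frame(bin = binLevels, occs = as.integer(counts))

# Assumption: bins without any data are reported as NA.
full$occs[full$occs == 0] <- NA

# Analogue of noNAStart=FALSE: rows 1 and 2 are kept as empty padding.
print(full)

# Analogue of noNAStart=TRUE: the rows before the lowest bin are cut.
trimmed <- full[full$bin >= min(occ$stg), ]
print(trimmed)
```

Keeping the padded form is convenient when results from several datasets have to be aligned by bin number; the trimmed form is more compact when only the sampled interval matters.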

Value

A data.frame with rows corresponding to bin entries.

Details

Secondary function of the package that calculates a number of sampling-related variables and diversity estimators for each bin. In contrast to the divDyn function, this function treats the bins independently. The function also returns the maximum subsampling quota for OxW subsampling (subtrialOXW) with a given xexp value.
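The maximum OxW quota mentioned above is described in the xQuota entry of the output list: occurrences are counted per collection, each count is raised to the power of xexp, and the results are summed within each bin. A minimal base R sketch of that arithmetic follows. It is not the package's implementation, and the toy table occ and the helper perColl are hypothetical.

```r
# Toy occurrence table (hypothetical data): one row per occurrence.
occ <- data.frame(
  stg           = c(1, 1, 1, 1, 2, 2, 2),
  collection_no = c(10, 10, 10, 11, 20, 21, 21)
)
xexp <- 1.4

# Number of occurrences in each collection of each bin.
perColl <- aggregate(n ~ stg + collection_no,
                     data = cbind(occ, n = 1), FUN = sum)

# Raise every collection's occurrence count to the power of xexp ...
perColl$weighted <- perColl$n^xexp

# ... and sum the weighted counts within each bin.
xQuota <- tapply(perColl$weighted, perColl$stg, sum)
print(xQuota)
```

In this toy case bin 1 holds collections with 3 and 1 occurrences, so its value is 3^1.4 + 1^1.4, and bin 2 holds collections with 1 and 2 occurrences, giving 1^1.4 + 2^1.4.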

With indices set to FALSE (default), the following results are output:

occs: The number of occurrences in each time bin.

colls: The number of collections in each time bin.

xQuota: The maximum quota for OxW subsampling (subtrialOXW) with the given xexp value. The number of occurrences in each collection is tabulated, and is raised to the power of xexp. The xQuota value is the sum of these values across all collections in a time slice.

refs: The number of references in each time bin.

SIBs: The number of Sampled-In-Bin taxa in each time bin.

occ1: The number of taxa in each time bin that occur in exactly 1 collection.

ref1: The number of taxa in each time bin that occur in exactly 1 reference.

occ2: The number of taxa in each time bin that occur in exactly 2 collections.

ref2: The number of taxa in each time bin that occur in exactly 2 references.

u: Good's u, coverage estimator based on the number of single-collection taxa (occ1).

uPrime: Good's u, coverage estimator based on the number of single-reference taxa (ref1).

chao1occ: Chao1 extrapolation estimator, based on the number of single-collection and two-collection taxa (occ1 and occ2).

chao1ref: Chao1 extrapolation estimator, based on the number of single-reference and two-reference taxa (ref1 and ref2).
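To make the collection-based entries concrete, the following base R sketch derives occs, SIBs, occ1, occ2, u and chao1occ for a single toy bin. It assumes the textbook forms u = 1 - occ1/occs and Chao1 = SIBs + occ1^2 / (2 * occ2); this page does not state which exact variants binstat uses, so treat the formulas and the toy data as illustrative assumptions. The reference-based columns would follow the same pattern with reference numbers in place of collection numbers.

```r
# Toy data for one bin (hypothetical): one row per occurrence.
bin <- data.frame(
  genus         = c("A", "A", "A", "B", "B", "C", "D", "D", "E"),
  collection_no = c(1, 2, 3, 1, 2, 1, 2, 3, 3)
)

# Number of distinct collections in which each genus occurs.
collPerTaxon <- tapply(bin$collection_no, bin$genus,
                       function(v) length(unique(v)))

occs <- nrow(bin)               # occurrences in the bin
SIBs <- length(collPerTaxon)    # sampled-in-bin taxa
occ1 <- sum(collPerTaxon == 1)  # taxa found in exactly 1 collection
occ2 <- sum(collPerTaxon == 2)  # taxa found in exactly 2 collections

# Good's u (assumed form): share of occurrences not from singletons.
u <- 1 - occ1 / occs

# Chao1 (assumed classic form): observed richness plus a correction
# based on the singleton and doubleton counts.
chao1occ <- SIBs + occ1^2 / (2 * occ2)

print(c(occs = occs, SIBs = SIBs, occ1 = occ1, occ2 = occ2,
        u = u, chao1occ = chao1occ))
```

Note that the classic Chao1 form is undefined when occ2 is zero, which is worth checking before interpreting the estimate for sparsely sampled bins.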

Examples

data(corals)
# slice-specific sampling
basic <- binstat(corals, tax="genus", bin="stg")

# subsampling diagnostic
subStats <- subsample(corals, method="cr", tax="genus", FUN=binstat,
  bin="stg", q=100, noNAStart=FALSE)
#> Warning: The argument(s) 'method' is/are not used with current configuration.
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50


# maximum quota with xexp
more <- binstat(corals, tax="genus", bin="stg", coll="collection_no", xexp=1.4)
#> The database contains duplicate occurrences (multiple species/genus).
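
The message above reports duplicate occurrences. A hedged base R sketch of how such rows could be located and dropped is given below. It assumes that a duplicate means the same genus recorded more than once in the same collection (for example, two species of one genus); the toy table occ and the objects isDup and dedup are hypothetical, and the package may define duplicates differently.

```r
# Toy occurrence table (hypothetical data).
occ <- data.frame(
  genus         = c("A", "A", "B", "B"),
  species       = c("a1", "a2", "b1", "b1"),
  stg           = c(90, 90, 90, 91),
  collection_no = c(1, 1, 1, 2)
)

# Flag rows that repeat an earlier genus-collection combination.
isDup <- duplicated(occ[, c("genus", "collection_no")])
print(sum(isDup))

# Analogue of omitting the duplicates: keep the first row of each
# genus-collection combination.
dedup <- occ[!isDup, ]
print(dedup)
```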