This function will return the preferred environment of the taxa, given the distribution of occurrences.
Usage
affinity(
  x,
  tax,
  bin,
  env,
  coll = NULL,
  method = "binom",
  alpha = 1,
  reldat = NULL,
  na.rm = FALSE,
  bycoll = FALSE,
  output = "levels"
)
Arguments
- x
(data.frame)
The occurrence dataset containing the taxa with unknown environmental affinities.
- tax
(character)
The column name of taxon names.
- bin
(character)
The column name of bin names.
- env
(character)
The column name of the environmental variable of the occurrences.
- coll
(character)
The column name of collection identifiers (optional). If provided, multiple entries of a taxon within a collection are treated as one (recommended).
- method
(character)
The method used for affinity calculations. Can be either "binom" or "majority".
- alpha
(numeric)
The alpha value of the binomial tests. By default, binomial testing is off (alpha=1) and the method returns the environment with the highest likelihood (odds ratio) as the preferred one.
- reldat
(data.frame)
Database with the same structure as x. x is typically a subset of reldat. If given, the occurrence distribution of reldat is used as the null model of sampling. Defaults to NULL, which means that x itself will be used as reldat.
- na.rm
(logical)
Should the NA entries in the relevant columns of x be omitted automatically?
- bycoll
(logical)
If set to TRUE, the number of collections (or samples, given in coll) will be used rather than the number of occurrences.
- output
(character)
The type of output. Defaults to "levels", which returns the affinities of the taxa. Can also be "counts", which returns the original counts calculated from the data. The third option, "proportions", calculates the proportions from the counts.
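As a hedged illustration of how these arguments fit together, the call below runs affinity() on a small, made-up occurrence table. The data frame occ and its column names (genus, stg, bath) are hypothetical; only the argument names are taken from the usage above. The call is wrapped in a requireNamespace() check and is skipped when the divDyn package, assumed here to be the package that provides affinity(), is not installed.

```r
# Hypothetical toy occurrence table (not real data): one row per occurrence.
occ <- data.frame(
  genus = c("A", "A", "A", "B", "B", "B", "C", "C"),
  stg   = c(1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L),            # time bin of each occurrence
  bath  = c("shal", "shal", "shal", "deep", "deep", "shal", "deep", "deep"),
  stringsAsFactors = FALSE
)

# Assumption: affinity() is provided by the divDyn package.
if (requireNamespace("divDyn", quietly = TRUE)) {
  # Preferred environment per taxon (default: method = "binom", alpha = 1).
  aff <- divDyn::affinity(occ, tax = "genus", bin = "stg", env = "bath")

  # The underlying counts instead of the assigned levels.
  cnt <- divDyn::affinity(occ, tax = "genus", bin = "stg", env = "bath",
                          output = "counts")
}
```

With real data, NA entries in the tax, bin, or env columns need to be removed first or handled with na.rm = TRUE.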
Details
Sampling patterns have an overprinting effect on the frequency of taxon occurrences in different environments. The environmental affinity (Foote, 2006; Kiessling and Aberhan, 2007; Kiessling and Kocsis, 2015) expresses whether the taxa are more likely to occur in an environment, given the sampling patterns of the dataset at hand. The function returns the likely preferred environment for each taxon as a vector. NA outputs indicate that the environmental affinity is equivocal based on the selected method.
The following methods are implemented:
'majority': Environmental affinity will be assigned based on the number of occurrences of the taxon in the different environments, without taking sampling of the entire dataset into account. If the taxon has more occurrences in environment 1, the function will return environment 1 as the preferred habitat.
'binom': The proportion of occurrences of a taxon in environment 1 and environment 2 will be compared to a null model, which is based on the distribution of all occurrences from the stratigraphic range of the taxon (in x or, if provided, in reldat). Then a binomial test is run on the counts of the most likely preference (against all else). The alpha value sets the significance level of the binomial tests. Setting alpha to 1 effectively switches the testing off: if the ratio of occurrences for the taxon is different from the ratio observed in the dataset, an affinity will be assigned. This is the default method. If an environment is not sampled at all in the dataset to which the taxon's occurrences are compared, the binomial method returns NA for the taxon's affinity.
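The difference between the two methods can be sketched in base R for a single taxon. This is a minimal illustration under stated assumptions, not the function's actual implementation: the toy data occ, the helper sketch_affinity(), and the choice of a one-sided binom.test() are all hypothetical.

```r
# Hypothetical toy occurrences (not real data): 40 rows, three taxa, four bins.
occ <- data.frame(
  genus = c(rep("A", 10), rep("B", 10), rep("C", 20)),
  stg   = c(rep(1:2, each = 5), rep(2:3, each = 5), rep(1:4, each = 5)),
  env   = c(rep("reef", 8), rep("offreef", 2),
            rep("reef", 3), rep("offreef", 7),
            rep("offreef", 20)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: affinity of one taxon under either method.
sketch_affinity <- function(dat, taxon, method = "binom", alpha = 1) {
  envs <- sort(unique(dat$env))
  own  <- dat[dat$genus == taxon, ]
  # Occurrences of the taxon in each environment.
  k <- as.vector(table(factor(own$env, levels = envs)))

  if (method == "majority") {
    # Most frequent environment wins; a tie is equivocal.
    top <- which(k == max(k))
    return(if (length(top) == 1) envs[top] else NA_character_)
  }

  # Null model: all occurrences within the stratigraphic range of the taxon.
  inRange <- dat$stg >= min(own$stg) & dat$stg <= max(own$stg)
  n <- as.vector(table(factor(dat$env[inRange], levels = envs)))
  if (any(n == 0)) return(NA_character_)  # an environment is not sampled at all

  p0    <- n / sum(n)                     # expected proportions under the null
  ratio <- (k / sum(k)) / p0              # observed relative to expected
  best  <- which(ratio == max(ratio))
  if (length(best) != 1) return(NA_character_)

  # Assumed test: best environment against all others pooled, one-sided.
  p <- binom.test(k[best], sum(k), p = p0[best],
                  alternative = "greater")$p.value
  if (p <= alpha) envs[best] else NA_character_
}

sketch_affinity(occ, "B", method = "majority")            # "offreef"
sketch_affinity(occ, "B", method = "binom", alpha = 1)    # "reef"
sketch_affinity(occ, "B", method = "binom", alpha = 0.05) # NA
```

In this toy example taxon B has most of its occurrences in "offreef", so the majority rule returns "offreef". Because "offreef" dominates the sampling within the range of B, the ratio-based rule with alpha = 1 points to "reef" instead. The binomial test does not support that preference at alpha = 0.05, so the result there is NA.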
References
Foote, M. (2006). Substrate affinity and diversity dynamics of Paleozoic marine animals. Paleobiology, 32(3), 345-366.
Kiessling, W., & Aberhan, M. (2007). Environmental determinants of marine benthic biodiversity dynamics through Triassic–Jurassic time. Paleobiology, 33(3), 414-434.
Kiessling, W., & Kocsis, Á. T. (2015). Biodiversity dynamics and environmental occupancy of fossil azooxanthellate and zooxanthellate scleractinian corals. Paleobiology, 41(3), 402-414.