seqduplicated()
Determines which elements of a vector duplicate the element immediately before them (similar to duplicated(), but restricted to consecutive positions).
collapse()
Omits duplicates similarly to unique(), but only among consecutive elements, so the sequence of state changes is preserved without repeats.
Usage
seqduplicated(x, na.rm = FALSE, na.breaks = TRUE)
collapse(x, na.rm = FALSE, na.breaks = TRUE)
Arguments
- x (vector): input object.
- na.rm (logical): Should NA entries be treated as duplicates (TRUE) or like any other value (FALSE)?
- na.breaks (logical): If na.rm=TRUE and NA values are surrounded by the same value, should the streak be treated as broken? Running seqduplicated(, na.rm=TRUE) on (2, 1, NA, 1) with na.breaks set to TRUE returns (FALSE, FALSE, TRUE, FALSE), and with FALSE it returns (FALSE, FALSE, TRUE, TRUE). With the same arguments, collapse() returns (2, 1, 1) and (2, 1), respectively.
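The interaction of na.rm=TRUE with na.breaks can be illustrated with a plain R loop. This is only a sketch under stated assumptions, not the package's Rcpp code: the helper name seqdup_na_sketch is hypothetical, it covers only the na.rm=TRUE case, and flagging a leading NA as a duplicate is an assumption that the text does not address.

```r
# Hypothetical helper: emulates the na.rm = TRUE behaviour in base R.
seqdup_na_sketch <- function(x, na.breaks = TRUE) {
  dup <- logical(length(x))
  last <- NULL       # last non-NA value seen so far
  after_na <- FALSE  # TRUE if an NA occurred since `last`
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      dup[i] <- TRUE   # NA entries are treated as duplicates
      after_na <- TRUE
    } else {
      same <- !is.null(last) && x[i] == last
      # an intervening NA breaks the streak only if na.breaks is TRUE
      dup[i] <- same && !(after_na && na.breaks)
      last <- x[i]
      after_na <- FALSE
    }
  }
  dup
}

x <- c(2, 1, NA, 1)
seqdup_na_sketch(x, na.breaks = TRUE)       # FALSE FALSE TRUE FALSE
seqdup_na_sketch(x, na.breaks = FALSE)      # FALSE FALSE TRUE TRUE
x[!seqdup_na_sketch(x, na.breaks = TRUE)]   # 2 1 1
x[!seqdup_na_sketch(x, na.breaks = FALSE)]  # 2 1
```

Subsetting with the negated result mirrors what collapse() is described as doing: it keeps only the elements that are not flagged as consecutive duplicates.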
Details
These functions essentially check whether the value at a given index of a vector is the same as the value at the previous index. This seemingly primitive task had to be rewritten with Rcpp for speed and for the appropriate handling of NA values.
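The comparison with the previous index can be sketched in vectorised base R. This is a minimal illustration, not the Rcpp implementation: the name seqdup_sketch is hypothetical, and the sketch covers only the default case (na.rm=FALSE), where NA is compared like an ordinary value.

```r
# Hypothetical helper: flags elements equal to their immediate predecessor.
seqdup_sketch <- function(x) {
  n <- length(x)
  if (n < 2) return(rep(FALSE, n))
  prev <- x[-n]  # elements 1 .. n-1
  curr <- x[-1]  # elements 2 .. n
  # equal if both are NA, or both are non-NA and have the same value
  same <- (is.na(prev) & is.na(curr)) |
    (!is.na(prev) & !is.na(curr) & prev == curr)
  c(FALSE, same)  # the first element has no predecessor
}

examp <- c(4, 3, 3, 3, 2, 2, 1, NA, 3, 3, 1, NA, NA, 5, NA, 5)
flags <- seqdup_sketch(examp)
kept <- examp[!flags]  # 4 3 2 1 NA 3 1 NA 5 NA 5
```

The explicit is.na() checks are needed because a plain prev == curr returns NA whenever either side is missing.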
Examples
# example vector
examp <- c(4,3,3,3,2,2,1,NA,3,3,1,NA,NA,5, NA, 5)
# seqduplicated()
seqduplicated(examp)
#> [1] FALSE FALSE TRUE TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
#> [13] TRUE FALSE FALSE FALSE
# contrast with
duplicated(examp)
#> [1] FALSE FALSE TRUE TRUE FALSE TRUE FALSE FALSE TRUE TRUE TRUE TRUE
#> [13] TRUE FALSE TRUE TRUE
# with NA removal
seqduplicated(examp, na.rm=TRUE)
#> [1] FALSE FALSE TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
#> [13] TRUE FALSE TRUE FALSE
# the same with collapse()
collapse(examp)
#> [1] 4 3 2 1 NA 3 1 NA 5 NA 5
# contrast with
unique(examp)
#> [1] 4 3 2 1 NA 5
# with NA removal
collapse(examp, na.rm=TRUE)
#> [1] 4 3 2 1 3 1 5 5
# with NA removal, no breaking
collapse(examp, na.rm=TRUE, na.breaks=FALSE)
#> [1] 4 3 2 1 3 1 5