
seqduplicated() Determines which elements of a vector are duplicates of the element immediately before them (similar to duplicated(), but only consecutive elements are compared).

collapse() Omits duplicates similarly to unique(), but only among consecutive elements, so the sequence of state changes is kept without repeated values.

Usage

seqduplicated(x, na.rm = FALSE, na.breaks = TRUE)

collapse(x, na.rm = FALSE, na.breaks = TRUE)

Arguments

x

(vector): input object.

na.rm

(logical): Should NA entries be flagged as duplicates (TRUE) or treated like any other value (FALSE)? With collapse(), na.rm=TRUE drops the NA entries from the result.

na.breaks

(logical): If na.rm=TRUE and the NA values are surrounded by the same values, should the streak be treated as broken? Running seqduplicated(, na.rm=TRUE) on (2, 1, NA, 1) with na.breaks set to TRUE will return (FALSE, FALSE, TRUE, FALSE), and with FALSE it will return (FALSE, FALSE, TRUE, TRUE). The results of collapse() with the same arguments will be (2, 1, 1) and (2, 1), respectively.
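
The interaction of na.rm=TRUE and na.breaks can be illustrated with a small base-R sketch. This is not the package's Rcpp implementation: the function name seqdup_na_sketch is hypothetical, and the rules it encodes (every NA is flagged TRUE, and an NA optionally resets the streak) are assumptions inferred from the description above and from the outputs shown in the Examples.

```r
# Hypothetical base-R sketch of the na.rm = TRUE behaviour (not the package code).
seqdup_na_sketch <- function(x, na.breaks = TRUE) {
  out <- logical(length(x))
  last <- NULL  # last non-NA value of the current streak, NULL if none
  for (i in seq_along(x)) {
    if (is.na(x[i])) {
      out[i] <- TRUE               # assumption: NA is always flagged as a duplicate
      if (na.breaks) last <- NULL  # the NA ends the current streak
    } else {
      out[i] <- !is.null(last) && x[i] == last
      last <- x[i]
    }
  }
  out
}

x <- c(2, 1, NA, 1)
seqdup_na_sketch(x, na.breaks = TRUE)    # FALSE FALSE  TRUE FALSE
seqdup_na_sketch(x, na.breaks = FALSE)   # FALSE FALSE  TRUE  TRUE

# collapse()-like result: keep the elements that are not flagged
x[!seqdup_na_sketch(x, na.breaks = TRUE)]    # 2 1 1
x[!seqdup_na_sketch(x, na.breaks = FALSE)]   # 2 1
```

With na.breaks=TRUE the second 1 starts a new streak after the NA, so it survives in the collapsed result; with na.breaks=FALSE the NA is skipped over and the two 1 values are treated as one streak.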

Value

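seqduplicated() returns a logical vector of the same length as x. collapse() returns a vector of the values of x with consecutive duplicates omitted.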

Details

These functions essentially check whether the value at a given index of a vector is the same as the value at the previous index. This seemingly primitive task had to be rewritten with Rcpp for speed and for the appropriate handling of NA values.
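
The core comparison described above can be sketched in plain R. This is only an illustration under stated assumptions, not the package's Rcpp code: the names seqdup_sketch and collapse_sketch are hypothetical, and the sketch covers only the default na.rm=FALSE case, in which NA is treated like an ordinary value.

```r
# Hypothetical base-R sketch (not the package's Rcpp implementation).
# Covers only na.rm = FALSE: NA is compared like any other value.
seqdup_sketch <- function(x) {
  n <- length(x)
  if (n < 2) return(rep(FALSE, n))
  prev <- x[-n]  # elements 1 .. n-1
  curr <- x[-1]  # elements 2 .. n
  # Neighbours match if both are NA, or both are non-NA and equal.
  same <- (is.na(prev) & is.na(curr)) |
    (!is.na(prev) & !is.na(curr) & prev == curr)
  c(FALSE, same)  # the first element has no predecessor
}

# Keep only the elements that do not repeat their predecessor.
collapse_sketch <- function(x) x[!seqdup_sketch(x)]

examp <- c(4, 3, 3, 3, 2, 2, 1, NA, 3, 3, 1, NA, NA, 5, NA, 5)
seqdup_sketch(examp)
collapse_sketch(examp)
```

A plain `prev == curr` would return NA wherever either side is missing, which is why the NA cases are handled explicitly in the comparison.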

Examples

  
# example vector
  examp <- c(4,3,3,3,2,2,1,NA,3,3,1,NA,NA,5, NA, 5)

# seqduplicated()
  seqduplicated(examp)
#>  [1] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
#> [13]  TRUE FALSE FALSE FALSE

  # contrast with 
  duplicated(examp)
#>  [1] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
#> [13]  TRUE FALSE  TRUE  TRUE

  # with NA removal
  seqduplicated(examp, na.rm=TRUE)
#>  [1] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
#> [13]  TRUE FALSE  TRUE FALSE
 
# the same with collapse()
  collapse(examp)
#>  [1]  4  3  2  1 NA  3  1 NA  5 NA  5

  # contrast with 
  unique(examp)
#> [1]  4  3  2  1 NA  5

  # with NA removal
  collapse(examp, na.rm=TRUE)
#> [1] 4 3 2 1 3 1 5 5

  # with NA removal, no breaking
  collapse(examp, na.rm=TRUE, na.breaks=FALSE)
#> [1] 4 3 2 1 3 1 5