vec_locate_sorted_groups() returns a data frame containing a key column with sorted unique groups, and a loc column with the locations of each group in x. It is similar to vec_group_loc(), except the groups are returned sorted rather than by first appearance.
Usage
vec_locate_sorted_groups(
  x,
  ...,
  direction = "asc",
  na_value = "largest",
  nan_distinct = FALSE,
  chr_proxy_collate = NULL
)
Arguments
- x
A vector
- ...
These dots are for future extensions and must be empty.
- direction
Direction to sort in.
A single "asc" or "desc" for ascending or descending order respectively.
For data frames, a length 1 or ncol(x) character vector containing only "asc" or "desc", specifying the direction for each column.
- na_value
Ordering of missing values.
A single "largest" or "smallest" for ordering missing values as the largest or smallest values respectively.
For data frames, a length 1 or ncol(x) character vector containing only "largest" or "smallest", specifying how missing values should be ordered within each column.
- nan_distinct
A single logical specifying whether or not NaN should be considered distinct from NA for double and complex vectors. If TRUE, NaN will always be ordered between NA and non-missing numbers.
- chr_proxy_collate
A function generating an alternate representation of character vectors to use for collation, often used for locale-aware ordering.
If NULL, no transformation is done.
Otherwise, this must be a function of one argument. If the input contains a character vector, it will be passed to this function after it has been translated to UTF-8. This function should return a character vector with the same length as the input. The result should sort as expected in the C-locale, regardless of encoding.
For data frames, chr_proxy_collate will be applied to all character columns.
Common transformation functions include: tolower() for case-insensitive ordering and stringi::stri_sort_key() for locale-aware ordering.
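As a rough illustration of how direction and na_value change the order of the returned keys, here is a minimal sketch. The input vector x and the result names (asc, na_first, desc) are made up for this example, and it assumes a version of vctrs that exports vec_locate_sorted_groups().

```r
library(vctrs)

# Hypothetical input with one missing value and two repeated values
x <- c(2L, NA, 1L, 2L, 1L)

# Default: ascending keys, with the missing value ordered as the largest value
asc <- vec_locate_sorted_groups(x)

# Missing value ordered as the smallest value instead
na_first <- vec_locate_sorted_groups(x, na_value = "smallest")

# Descending keys; where the missing value lands depends on `na_value`
desc <- vec_locate_sorted_groups(x, direction = "desc")

asc$key
na_first$key
desc$key
```

Only the ordering of the rows differs between the three results; the loc elements still refer to positions in the original x.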
Value
A two column data frame with size equal to vec_size(vec_unique(x)).
- A key column of type vec_ptype(x).
- A loc column of type list, with elements of type integer.
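The shape of the return value described above can be inspected directly. The following is a small sketch on a made-up character vector; the names x and out are illustrative only.

```r
library(vctrs)

# Hypothetical input with three unique values
x <- c("b", "a", "b", "c", "a")
out <- vec_locate_sorted_groups(x)

# One row per unique value of `x`
nrow(out) == vec_size(vec_unique(x))

# `key` has the same prototype as `x`; `loc` is a list of integer vectors
identical(vec_ptype(out$key), vec_ptype(x))
is.list(out$loc)
vapply(out$loc, is.integer, logical(1))

# Slicing `x` with one `loc` element recovers the elements of that group
vec_slice(x, out$loc[[1]])
```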
Details
vec_locate_sorted_groups(x) is equivalent to, but faster than:
info <- vec_group_loc(x)
vec_slice(info, vec_order(info$key))
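The equivalence can be checked on a small input. This is a sketch with a made-up integer vector and default arguments; the names sorted and manual are illustrative, and it makes no claim about relative speed.

```r
library(vctrs)

# Hypothetical input with repeated values and one missing value
x <- c(3L, 1L, NA, 3L, 2L, 1L)

# Direct approach
sorted <- vec_locate_sorted_groups(x)

# Two-step approach: group by first appearance, then reorder rows by key
info <- vec_group_loc(x)
manual <- vec_slice(info, vec_order(info$key))

# Both approaches should give the same keys and the same locations
identical(sorted$key, manual$key)
identical(lapply(sorted$loc, as.integer), lapply(manual$loc, as.integer))
```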
Examples
df <- data.frame(
  g = sample(2, 10, replace = TRUE),
  x = c(NA, sample(5, 9, replace = TRUE))
)
# `vec_locate_sorted_groups()` is similar to `vec_group_loc()`, except keys
# are returned ordered rather than by first appearance.
vec_locate_sorted_groups(df)
#>   key.g key.x   loc
#> 1     1     2     8
#> 2     1     3     9
#> 3     1     4 3, 10
#> 4     1     5     5
#> 5     1    NA     1
#> 6     2     1  6, 7
#> 7     2     2     4
#> 8     2     5     2
vec_group_loc(df)
#>   key.g key.x   loc
#> 1     1    NA     1
#> 2     2     5     2
#> 3     1     4 3, 10
#> 4     2     2     4
#> 5     1     5     5
#> 6     2     1  6, 7
#> 7     1     2     8
#> 8     1     3     9
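A further sketch, not part of the original examples, shows the effect of chr_proxy_collate on a made-up character vector. It uses tolower(), one of the transformation functions named in the argument description; the names chr, default_keys and nocase_keys are illustrative.

```r
library(vctrs)

# Hypothetical input mixing upper and lower case
chr <- c("b", "A", "a", "B")

# Default: keys are ordered in the C-locale, so uppercase sorts before lowercase
default_keys <- vec_locate_sorted_groups(chr)$key

# With tolower() as the proxy, ordering ignores case.
# The keys are still the original values; only their order changes.
nocase_keys <- vec_locate_sorted_groups(chr, chr_proxy_collate = tolower)$key

default_keys
nocase_keys
```

The proxy only affects how keys are ordered, so "A" and "a" remain separate groups in both results.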