Compute per-column summaries and return as a data.frame. Warning: can be an expensive operation.
rsummary(
db,
tableName,
...,
countUniqueNum = FALSE,
quartiles = FALSE,
cols = NULL,
qualifiers = NULL
)
db: database connection.
tableName: name of table.
...: force additional arguments to be bound by name.
countUniqueNum: logical, if TRUE include unique non-NA counts for numeric columns.
quartiles: logical, if TRUE add Q1 (25%), median (50%), Q3 (75%) quartiles.
cols: if not NULL, the set of columns to restrict to.
qualifiers: optional named ordered vector of strings carrying additional db hierarchy terms, such as schema.
data.frame summary of columns.
For numeric columns, NaN is included in the nna count (as is typical for R, e.g., is.na(NaN)).
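The NaN note mirrors base R, where is.na() is TRUE for both NA and NaN. A minimal base-R sketch of that counting rule follows; the vector x is illustrative only and is not part of rsummary().

```r
# Illustrative vector: one ordinary value, one NaN, one NA.
x <- c(1, NaN, NA)

# is.na() treats NaN as missing, so NaN and NA are both counted.
n_missing <- sum(is.na(x))

# is.nan() is TRUE only for NaN, which lets the two cases be told apart.
n_nan <- sum(is.nan(x))

print(c(n_missing = n_missing, n_nan = n_nan))
```

If NaN values need to be reported separately from NA, they would have to be counted on their own, as is.nan() does here.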
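# rq_copy_to() and rsummary() are assumed to be attached already
# (for example via library(rquery)).
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Small local data.frame covering logical, all-NA, integer,
  # numeric, factor, and character columns.
  d <- data.frame(p = c(TRUE, FALSE, NA),
                  s = NA,
                  w = 1:3,
                  x = c(NA, 2, 3),
                  y = factor(c(3, 5, NA)),
                  z = c('a', NA, 'a'),
                  stringsAsFactors = FALSE)
  # In-memory SQLite database with the RSQLite extension functions loaded.
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  # Copy the data to the database, then summarize the remote table.
  rq_copy_to(db, "dRemote", d,
             overwrite = TRUE, temporary = TRUE)
  print(rsummary(db, "dRemote"))
  DBI::dbDisconnect(db)
}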
#>   column index     class nrows nna nunique min max mean        sd lexmin lexmax
#> 1      p     1   integer     3   1      NA   0   1  0.5 0.7071068   <NA>   <NA>
#> 2      s     2   integer     3   3       0  NA  NA   NA        NA   <NA>   <NA>
#> 3      w     3   integer     3   0      NA   1   3  2.0 1.0000000   <NA>   <NA>
#> 4      x     4   numeric     3   1      NA   2   3  2.5 0.7071068   <NA>   <NA>
#> 5      y     5 character     3   1       2  NA  NA   NA        NA      3      5
#> 6      z     6 character     3   1       1  NA  NA   NA        NA      a      a
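Because the summary can be expensive, the optional arguments can be used to limit it to the columns of interest. The following is a sketch only: it assumes rsummary() and rq_copy_to() come from the rquery package and that DBI and RSQLite are installed. The table name "dExample" and the data are made up for illustration, and no output is shown because this variant was not run for this page.

```r
res <- NULL
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE) &&
    requireNamespace("rquery", quietly = TRUE)) {
  # Hypothetical example data: two numeric columns and one character column.
  d <- data.frame(w = 1:3,
                  x = c(NA, 2, 3),
                  z = c("a", NA, "a"),
                  stringsAsFactors = FALSE)
  db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(db)
  rquery::rq_copy_to(db, "dExample", d,
                     overwrite = TRUE, temporary = TRUE)
  # Restrict the summary to w and x, and request unique non-NA
  # counts for numeric columns. Options are passed by name.
  res <- rquery::rsummary(db, "dExample",
                          countUniqueNum = TRUE,
                          cols = c("w", "x"))
  print(res)
  DBI::dbDisconnect(db)
}
```

The options are passed by name, as the ... argument in the usage requires.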