How to export data from grouped extract_stats efficiently? #917

Closed
lxsteiner opened this issue Mar 14, 2024 · 3 comments · Fixed by #955

@lxsteiner

I'd like to export the individual statistics from a grouped_ggbetweenstats analysis as a .csv or .xlsx file, or any other format really. Ideally you'd get the individual $caption_data, $subtitle_data, $pairwise_comparisons_data and other sections in a single file/sheet, but no matter what solution I come up with, I hit an issue with the "expression" columns in the tibbles, which are list columns holding the statistical expressions.

e.g.

library(ggstatsplot)
library(PMCMRplus)
p <- grouped_ggbetweenstats(data = mtcars, x = gear, y = mpg, grouping.var = am)
p

All columns inside the individual tibbles are fine except for "expression":

extract_stats(p[[1]])$caption_data
# A tibble: 1 × 16
  term       effectsize      estimate conf.level conf.low conf.high    pd prior.distribution prior.location
  <chr>      <chr>              <dbl>      <dbl>    <dbl>     <dbl> <dbl> <chr>                       <dbl>
1 Difference Bayesian t-test    -3.69       0.95    -7.63    0.0293 0.974 cauchy                          0
  prior.scale  bf10 method          conf.method log_e_bf10 n.obs expression
        <dbl> <dbl> <chr>           <chr>            <dbl> <int> <list>    
1       0.707  3.43 Bayesian t-test ETI               1.23    19 <language>

For example, trying to write an .xlsx file with one sheet per group containing all of that group's statistics:

library(do)
for (i in 1:length(p)) {
  subplot <- extract_stats(p[[i]])
  sheetname <- paste0("group", i)
  do::write_xlsx(subplot$subtitle_data, file = "stats.xlsx", sheet = sheetname)
  do::write_xlsx(subplot$caption_data, file = "stats.xlsx", sheet = sheetname, append = TRUE)
}

but there's always an error in most functions that export the data objects (also with e.g. openxlsx::writeData):

Error in FUN(X[[i]], ...) : 
  argument `...` should be a character vector (or an object coercible to)
In addition: Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'language'

Any suggestions or recommendations for better practices on getting all the statistics into some type of delimited file? I guess one option could be to omit the "expression" column beforehand, but I don't think all the tibbles returned by extract_stats have that column, which could be an issue.
Also, is there an easy way to access the label of each group, so it can be printed along with the exported data (in the above example it comes from grouping.var = am, which is just "0" or "1"), other than relying on how the plots are indexed and ordered?

Any ideas or suggestions would really be welcome. Thank you.

@oranwutang

oranwutang commented Jul 25, 2024

Hi, @lxsteiner!

I ran into exactly the same problem while trying to do exactly what you're trying to do!

It turns out that the 'expression' column is a list column containing language objects, so it is not an atomic vector. You can do something like:

subplot$subtitle_data$expression <- as.character(subplot$subtitle_data$expression)

Your code should look like:

library(do)
for (i in 1:length(p)) {
  subplot <- extract_stats(p[[i]])
  sheetname <- paste0("group", i)
  # Convert the list column of language objects to plain character in both tables
  subplot$subtitle_data$expression <- as.character(subplot$subtitle_data$expression)
  subplot$caption_data$expression <- as.character(subplot$caption_data$expression)
  do::write_xlsx(subplot$subtitle_data, file = "stats.xlsx", sheet = sheetname)
  do::write_xlsx(subplot$caption_data, file = "stats.xlsx", sheet = sheetname, append = TRUE)
}

This converts the expression column to an atomic character vector. The resulting data frame can then be safely passed to the exporting function, in my case openxlsx::write.xlsx().
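Since not every table returned by extract_stats() has an "expression" column, and some entries are NULL, the same idea can be generalized. Below is a minimal base R sketch, not an official ggstatsplot feature: `flatten_list_cols()` and `flatten_stats()` are hypothetical helper names, and the demo data frame is only a stand-in for a real extract_stats() result so the example runs without the package.

```r
# Hypothetical helper (not part of ggstatsplot): turn every list column of a
# data frame or tibble into a character column by deparsing each element.
flatten_list_cols <- function(df) {
  list_cols <- names(df)[vapply(df, is.list, logical(1))]
  for (nm in list_cols) {
    df[[nm]] <- vapply(
      df[[nm]],
      function(x) paste(deparse(x), collapse = " "),
      character(1)
    )
  }
  df
}

# Apply the helper to every table in one extract_stats() result,
# dropping the entries that are NULL.
flatten_stats <- function(stats) {
  lapply(Filter(is.data.frame, unclass(stats)), flatten_list_cols)
}

# Stand-in for one extract_stats() result, so the sketch runs without ggstatsplot.
demo_tbl <- data.frame(statistic = 7.68, p.value = 0.0214)
demo_tbl$expression <- list(quote(italic(p) == "0.02"))
demo_stats <- list(subtitle_data = demo_tbl, pairwise_comparisons_data = NULL)

flat <- flatten_stats(demo_stats)
str(flat$subtitle_data)
```

With the real package loaded, something like `flatten_stats(extract_stats(p[[1]]))` should give a named list of export-ready tables, which can then be handed to whichever writer you prefer.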

@IndrajeetPatil
Owner

library(ggstatsplot)

p <- grouped_ggpiestats(mtcars, x = cyl, grouping.var = am)
extract_stats(p)
#> [[1]]
#> $subtitle_data
#> # A tibble: 1 × 13
#>   statistic    df p.value method                                   effectsize 
#>       <dbl> <dbl>   <dbl> <chr>                                    <chr>      
#> 1      7.68     2  0.0214 Chi-squared test for given probabilities Pearson's C
#>   estimate conf.level conf.low conf.high conf.method conf.distribution n.obs
#>      <dbl>      <dbl>    <dbl>     <dbl> <chr>       <chr>             <int>
#> 1    0.537       0.95   0.0666     0.725 ncp         chisq                19
#>   expression
#>   <list>    
#> 1 <language>
#> 
#> $caption_data
#> # A tibble: 1 × 4
#>    bf10 prior.scale method                                      expression
#>   <dbl>       <dbl> <chr>                                       <list>    
#> 1  1.15           1 Bayesian one-way contingency table analysis <language>
#> 
#> $pairwise_comparisons_data
#> NULL
#> 
#> $descriptive_data
#> # A tibble: 3 × 4
#>   cyl   counts  perc .label
#>   <fct>  <int> <dbl> <chr> 
#> 1 8         12  63.2 63%   
#> 2 6          4  21.1 21%   
#> 3 4          3  15.8 16%   
#> 
#> $one_sample_data
#> NULL
#> 
#> $tidy_data
#> NULL
#> 
#> $glance_data
#> NULL
#> 
#> attr(,"class")
#> [1] "ggstatsplot_stats" "list"             
#> 
#> [[2]]
#> $subtitle_data
#> # A tibble: 1 × 13
#>   statistic    df p.value method                                   effectsize 
#>       <dbl> <dbl>   <dbl> <chr>                                    <chr>      
#> 1      4.77     2  0.0921 Chi-squared test for given probabilities Pearson's C
#>   estimate conf.level conf.low conf.high conf.method conf.distribution n.obs
#>      <dbl>      <dbl>    <dbl>     <dbl> <chr>       <chr>             <int>
#> 1    0.518       0.95        0     0.741 ncp         chisq                13
#>   expression
#>   <list>    
#> 1 <language>
#> 
#> $caption_data
#> # A tibble: 1 × 4
#>    bf10 prior.scale method                                      expression
#>   <dbl>       <dbl> <chr>                                       <list>    
#> 1 0.434           1 Bayesian one-way contingency table analysis <language>
#> 
#> $pairwise_comparisons_data
#> NULL
#> 
#> $descriptive_data
#> # A tibble: 3 × 4
#>   cyl   counts  perc .label
#>   <fct>  <int> <dbl> <chr> 
#> 1 8          2  15.4 15%   
#> 2 6          3  23.1 23%   
#> 3 4          8  61.5 62%   
#> 
#> $one_sample_data
#> NULL
#> 
#> $tidy_data
#> NULL
#> 
#> $glance_data
#> NULL
#> 
#> attr(,"class")
#> [1] "ggstatsplot_stats" "list"
extract_subtitle(p)
#> [[1]]
#> list(chi["gof"]^2 * "(" * 2 * ")" == "7.68", italic(p) == "0.02", 
#>     widehat(italic("C"))["Pearson"] == "0.54", CI["95%"] ~ "[" * 
#>         "0.07", "0.73" * "]", italic("n")["obs"] == "19")
#> 
#> [[2]]
#> list(chi["gof"]^2 * "(" * 2 * ")" == "4.77", italic(p) == "0.09", 
#>     widehat(italic("C"))["Pearson"] == "0.52", CI["95%"] ~ "[" * 
#>         "0.00", "0.74" * "]", italic("n")["obs"] == "13")
extract_caption(p)
#> [[1]]
#> list(log[e] * (BF["01"]) == "-0.14", italic("a")["Gunel-Dickey"] == 
#>     "1.00")
#> 
#> [[2]]
#> list(log[e] * (BF["01"]) == "0.83", italic("a")["Gunel-Dickey"] == 
#>     "1.00")

Created on 2024-07-27 with reprex v2.1.1
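Because extract_stats() on the grouped plot returns one list of tables per group, one way to get a single delimited file is to stack the same table across groups and tag each row with a group label. The following is only a sketch under stated assumptions: `stack_table()` is a hypothetical helper, the mock list stands in for the real output (using the statistic and p-value printed above), and the labels "0" and "1" assume that the list order follows the sorted levels of grouping.var, which should be checked against the plot titles.

```r
# Hypothetical helper: stack one named table (e.g. "subtitle_data") across
# groups, adding a `group` column with a caller-supplied label.
stack_table <- function(all_stats, table_name, group_labels) {
  stopifnot(length(all_stats) == length(group_labels))
  pieces <- Map(function(stats, label) {
    tbl <- stats[[table_name]]
    if (is.null(tbl)) return(NULL)
    tbl <- as.data.frame(tbl)
    # Drop list columns such as `expression`, which CSV writers cannot handle
    tbl <- tbl[!vapply(tbl, is.list, logical(1))]
    cbind(group = label, tbl)
  }, all_stats, group_labels)
  do.call(rbind, Filter(Negate(is.null), pieces))
}

# Mock of the grouped extract_stats() output, for illustration only.
mock_stats <- list(
  list(subtitle_data = data.frame(statistic = 7.68, p.value = 0.0214)),
  list(subtitle_data = data.frame(statistic = 4.77, p.value = 0.0921))
)

# Assumed labels: the levels of `am`, in the order the groups appear.
combined <- stack_table(mock_stats, "subtitle_data", group_labels = c("0", "1"))

# Write the stacked table to a CSV file (a temporary file in this sketch).
out_file <- tempfile(fileext = ".csv")
write.csv(combined, out_file, row.names = FALSE)
```

With the real objects, the first argument would be `extract_stats(p)`, and the call can be repeated for "caption_data", "descriptive_data", and so on to produce one file or sheet per table type.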

@oranwutang

So cool!
