-
Notifications
You must be signed in to change notification settings - Fork 24
/
04-modify-columns.qmd
400 lines (320 loc) · 11.7 KB
/
04-modify-columns.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
---
title: "Modifying Columns"
format: html
html-table-processing: none
---
```{r}
library(gt)
library(tidyverse)
```
## Intro
The `cols_*()` functions allow for modifications that act on entire columns. This includes alignment of the data in columns (`cols_align()`), hiding columns from view (`cols_hide()`), re-labeling the column labels (`cols_label()`), merging two columns together (`cols_merge*()`), moving columns around (`cols_move*()`), and using a column name delimiter to create labels in the column spanner (`cols_split_delim()`).
------
### Functions in this module
- `cols_align()`
- `cols_label()`
- `cols_width()`
- `cols_move()`
- `cols_hide()`
- `cols_merge_range()`
- `cols_merge_uncert()`
- `cols_merge_n_pct()`
- `cols_merge()`
------
### `cols_align()`: Set the alignment of columns
```r
cols_align(
data,
align = c("auto", "left", "center", "right"),
columns = everything()
)
```
The individual alignments of columns (which includes the column labels and all of their data cells) can be modified. We have the option to align text to the `left`, the `center`, and the `right`. In a less explicit manner, we can allow **gt** to automatically choose the alignment of each column based on the data type (with the `auto` option).
##### EXAMPLE
Use `countrypops` to create a **gt** table. Align the `population` column data to the left.
```{r}
countrypops |>
dplyr::select(-contains("code")) |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_align(
align = "left",
columns = population
)
```
------
### `cols_label()`: Relabel one or more columns
```r
cols_label(
.data,
...,
.list = list2(...),
.fn = NULL
)
```
Column labels can be modified from their default values (the names of the columns from the input table data). When you create a **gt** table object using `gt()`, column names effectively become the column labels. While this serves as a good first approximation, column names aren't often appealing as column labels in a **gt** output table. The `cols_label()` function provides the flexibility to relabel one or more columns and we even have the option to use the `md()` of `html()` helper functions for rendering column labels from Markdown or using HTML.
##### EXAMPLES
Use `countrypops` to create a **gt** table; label all the table's columns to present better.
```{r}
countrypops |>
dplyr::select(-contains("code")) |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_label(
country_name = "Name",
year = "Year",
population = "Population"
)
```
Use `countrypops` to create a **gt** table; label columns as before but make them bold with markdown formatting.
```{r}
countrypops |>
dplyr::select(-contains("code")) |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_label(
country_name = "**Name**",
year = "**Year**",
population = "**Population**",
.fn = md
)
```
------
### `cols_width()`: Set the widths of columns
```r
cols_width(
.data,
...,
.list = list2(...)
)
```
Manual specifications of column widths can be performed using the
`cols_width()` function. We choose which columns get specific widths (in
pixels, usually by use of the `px()` helper function). Width assignments are supplied in `...` through two-sided formulas, where the left-hand side
defines the target columns and the right-hand side is a single width value in pixels.
##### EXAMPLES
Use `exibble` to create a **gt** table. With named arguments in `...`, we can specify the exact widths for table columns (using `everything()` will capture all remaining columns).
```{r}
exibble |>
dplyr::select(num, char, date, datetime, row) |>
gt() |>
cols_width(
num ~ px(150),
ends_with("r") ~ px(100),
starts_with("date") ~ px(200),
everything() ~ px(60)
)
```
------
### `cols_move()`: Move one or more columns
```r
cols_move(
data,
columns,
after
)
```
On those occasions where you need to move columns this way or that way, we can make use of the `cols_move()` function. While it's true that the movement of columns can be done upstream of **gt**'s API, it is much easier and less error prone to use the function provided here. The movement procedure here takes one or more specified columns (in the `columns` argument) and places them to the right of a different column (the `after` argument). The ordering of the columns to be moved is preserved, as is the ordering of all other columns in the table.
##### EXAMPLE
Use `countrypops` to create a **gt** table; With the remaining columns, position `population` after `country_name`.
```{r}
countrypops |>
dplyr::select(-contains("code")) |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_move(
columns = population,
after = country_name
)
```
------
### `cols_hide()`: Hide one or more columns
```r
cols_hide(
data,
columns
)
```
The `cols_hide()` function allows us to hide one or more columns from appearing in the final output table. While it's possible and often desirable to omit columns from the input table data before introduction to the `gt()` function, there can be cases where the data in certain columns is useful (as a column reference during formatting of other columns) but the final display of those columns is not necessary.
##### EXAMPLES
Use `countrypops` to create a **gt** table; Hide the columns `country_code_2` and `country_code_3`.
```{r}
countrypops |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_hide(columns = c(country_code_2, country_code_3))
```
Use `countrypops` to create a **gt** table and use the `population` column to provide the conditional placement of footnotes, then hide that column and one other.
```{r}
countrypops |>
dplyr::filter(country_name == "Mongolia") |>
tail(5) |>
gt() |>
cols_hide(columns = c(country_code_3, population)) |>
tab_footnote(
footnote = "Population above 3,200,000.",
locations = cells_body(
columns = year,
rows = population > 3200000
),
placement = "right"
)
```
------
### `cols_merge_range()`: Merge two columns to a value range column
```r
cols_merge_range(
data,
col_begin,
col_end,
rows = everything(),
sep = "--",
autohide = TRUE
)
```
The `cols_merge_range()` function is a specialized variant of the `cols_merge()` function. It operates by taking a two columns that constitute a range of values (`col_begin` and `col_end`) and merges them into a single column. What results is a column containing both values separated by a long dash (e.g., `12.0 — 20.0`). The column specified in `col_end` is dropped from the output table
##### EXAMPLE
Use `gtcars` to create a **gt** table, keeping only the `model`, `mpg_c`, and `mpg_h` columns; merge the mpg columns together as a single range column (which is labeled as `MPG`, in italics).
```{r}
gtcars |>
dplyr::select(model, starts_with("mpg")) |>
dplyr::slice(1:8) |>
gt() |>
cols_merge_range(
col_begin = mpg_c,
col_end = mpg_h
) |>
cols_label(
mpg_c = md("*MPG*")
)
```
------
### `cols_merge_uncert()`: Merge two columns to a value & uncertainty column
```r
cols_merge_uncert(
data,
col_val,
col_uncert,
rows = everything(),
sep = " +/- ",
autohide = TRUE
)
```
The `cols_merge_uncert()` function is a specialized variant of the `cols_merge()` function. It operates by taking a base value column (`col_val`) and an uncertainty column (`col_uncert`) and merges them into a single column. What results is a column with values and associated uncertainties (e.g., `12.0 ± 0.1`), and, the column specified in `col_uncert` is dropped from the output table.
##### EXAMPLE
Use `exibble` to create a **gt** table, keeping only the `currency` and `num` columns; merge columns into one with a base value and uncertainty (after formatting the `num` column).
```{r}
exibble |>
dplyr::select(currency, num) |>
dplyr::slice(1:7) |>
gt() |>
fmt_number(
columns = num,
decimals = 3,
use_seps = FALSE
) |>
cols_merge_uncert(
col_val = currency,
col_uncert = num
) |>
cols_label(currency = "value + uncert.")
```
------
### `cols_merge_n_pct()`: Merge two columns to combine counts and percentages
```r
cols_merge_uncert(
data,
col_n,
col_pct,
rows = everything(),
autohide = TRUE
)
```
The `cols_merge_n_pct()` function is a specialized variant of the `cols_merge()` function. It operates by taking two columns that constitute both a count (`col_n`) and a fraction of the total population (`col_pct`) and merges them into a single column. What results is a column containing both counts and their associated percentages (e.g., `12 (23.2%)`). The column specified in `col_pct` is dropped from the output table.
##### EXAMPLE
Use `pizzaplace` to create a **gt** table that displays the counts and percentages of the top 3 pizzas sold by pizza category in 2015. The `cols_merge_n_pct()` function is used to merge the `n` and `frac` columns (and the `frac` column is formatted using `fmt_percent()`).
```{r}
pizzaplace |>
dplyr::group_by(name, type, price) |>
dplyr::summarize(
n = dplyr::n(),
frac = n / nrow(pizzaplace),
.groups = "drop"
) |>
dplyr::arrange(type, dplyr::desc(n)) |>
dplyr::group_by(type) |>
dplyr::slice_head(n = 3) |>
gt(
rowname_col = "name",
groupname_col = "type"
) |>
fmt_integer(n) |>
fmt_percent(frac) |>
cols_merge_n_pct(
col_n = n,
col_pct = frac
) |>
fmt_currency(price) |>
cols_label(
n = "*N* \n(%)",
price = "Price \n*(USD)*",
.fn = md
) |>
tab_style(
style = cell_text(font = "monospace"),
locations = cells_stub()
) |>
tab_stubhead(md("Cat. and \nPizza Code")) |>
tab_header(title = "Top 3 Pizzas Sold by Category in 2015") |>
tab_options(table.width = px(512))
```
------
### `cols_merge()`: Merge data from two or more columns to a single column
```r
cols_merge(
data,
columns,
hide_columns = columns[-1],
rows = everything(),
pattern = NULL
)
```
This function takes input from two or more columns and allows the contents to be merged them into a single column, using a pattern that specifies the formatting. We can specify which columns to merge together in the `columns` argument. The string-combining pattern is given in the `pattern` argument. The first column in the `columns` series operates as the target column (i.e., will undergo mutation) whereas all following `columns` will be untouched.
##### EXAMPLE
Use `sp500` to create a **gt** table; merge the `open` & `close` columns together, and, the `low` & `high` columns (putting an em dash between both); rename the columns.
```{r}
sp500 |>
dplyr::slice(50:55) |>
dplyr::select(-volume, -adj_close) |>
gt() |>
cols_merge(
columns = c(open, close),
hide_columns = close,
pattern = "{1} - {2}"
) |>
cols_merge(
columns = c(low, high),
hide_columns = high,
rows = 1:3,
pattern = "{1} - {2}"
) |>
cols_label(
open = "open/close",
low = "low/high"
)
```
------
### SUMMARY
1. Values in columns can be aligned with the `cols_align()` function.
2. Column labels can be set with `cols_label()` (they otherwise inherit column names from the input dataset).
3. The widths of columns can precisely sized with `cols_width()` (HTML output only, for now); use the `px()` and `pct()` helpers to make this easier.
4. Didn't arrange columns in the original dataset? It's possible with `cols_move()`.
5. Columns you don't want to show can be 'hidden' (not shown) with `cols_hide()`; this is good when you a column for something but don't want to display it in the end.
6. Create ranged values in a column using two different columns and `cols_merge_range()`.
7. Create values with uncertainties with two columns and `cols_merge_uncert()`.
8. Merge multiple columns in an arbitrary way with `cols_merge()` and a pattern (`{1} + {2}...`)