Creating a custom gt function for aligning first-row text and testing it with testthat

Creating and testing your own functions is fun!

gt
functions
testing
Author

Thomas Mock

Published

March 7, 2021

Creating and testing gt functions

It’s no secret, but I love the gt package and tables in general. I’ve been on a big table kick for almost a year at this point!

While I love all the amazing features built into gt, sometimes I also want to create my own functions to wrap or extend features.

For example, I’ve done:

  • Custom gt-themes and functions Blogpost
  • Embedding custom HTML Blogpost
  • Created repeatable beautiful table reporting Gist and Gist
  • Using patchwork to combine ggplot2 + gt Gist

This blogpost will cover how to solve a fairly common ask, how to add a symbol/character to the end of ONLY the first row of a column and maintain the alignment of the entire column. We’ll walk through how to accomplish this with gt only, creating our own function to do it more succinctly, and then how to further test our gt outputs with testthat!

No repeats

I’ve always been a fan of not having to repeat symbols/prefixes/suffixes inside tables. There’s some ongoing work here in gt to add this as a feature, but in the meantime I wanted to play around with a few ways to accomplish this with gt as it is, and/or a custom function as of today.

You can imagine a situation like below, where we want to label cells within a column as a percent, and want to indicate that it’s a percent ONLY on the first row.

head(gtcars) %>%
  mutate(hp_pct = (hp/max(hp) * 100)) %>% 
  dplyr::select(mfr, model, year, trim, hp, hp_pct) %>%
  gt() %>% 
  fmt_percent(columns = vars(hp_pct), rows = 1, scale_values = FALSE) %>% 
  fmt_number(columns = vars(hp_pct), rows = 2:6) %>% 
  tab_style(
    style = cell_text(color = "red"), 
    locations = cells_body(vars(hp_pct), rows = 1)
    )
Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead
mfr model year trim hp hp_pct
Ford GT 2017 Base Coupe 647 97.88%
Ferrari 458 Speciale 2015 Base Coupe 597 90.32
Ferrari 458 Spider 2015 Base 562 85.02
Ferrari 458 Italia 2014 Base Coupe 562 85.02
Ferrari 488 GTB 2016 Base Coupe 661 100.00
Ferrari California 2015 Base Convertible 553 83.66

However, you can quickly see that this misaligned the first row from the remaining rows.


No repeats in gt

An alternative would be to convert those rows to text and apply specific changes.

There’s quite a bit going on here:

  • Must use a mono space font for the column of interest
    • Must be mono-spaced so that everything aligns properly
  • Align the now text column to be right-aligned
    • Align to right, so again the decimal places align (text default aligns to left otherwise)
  • Use gt::text_transform() to add percent to the first row
    • use base::format() to round and “force” a specific number of decimal places
  • Use gt::text_transform() to add non-breaking space "&nbsp" to remaining rows
    • Must use "&nbsp", which is the HTML code for nonbreaking space, as a raw space (eg " ") will not work

I want to pause here and say with the code below, we have officially accomplished our goal. However, this was fairly manual and can be repetitive for adding several of these transformations in a single table.

head(gtcars) %>%
  mutate(hp_pct = (hp/max(hp) * 100)) %>% 
  dplyr::select(mfr, model, year, trim, hp, hp_pct) %>%
  gt() %>%
  # use a mono-spaced font
  tab_style(
    style = cell_text(font = google_font("Fira Mono")),
    locations = cells_body(columns = vars(hp_pct))
    ) %>% 
  # align the column of interst to right
  cols_align(align = "right", columns = vars(hp_pct)) %>% 
  # round and transform the first row to percent
  text_transform(
    locations = cells_body(vars(hp_pct), rows = 1),
    fn = function(x){ 
      fmt_val <- format(as.double(x), nsmall = 1, digits = 1)
      paste0(fmt_val, "%") %>% gt::html()}
  ) %>% 
  text_transform(
    locations = cells_body(vars(hp_pct), rows = 2:6),
    fn = function(x){ 
      # round remaining rows, add a non-breaking space
     fmt_val <- format(as.double(x), nsmall = 1, digits = 1)
     lapply(fmt_val, function(x) paste0(x, '&nbsp') %>% gt::html())
  })
Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead
mfr model year trim hp hp_pct
Ford GT 2017 Base Coupe 647 97.9%
Ferrari 458 Speciale 2015 Base Coupe 597 90.3&nbsp
Ferrari 458 Spider 2015 Base 562 85.0&nbsp
Ferrari 458 Italia 2014 Base Coupe 562 85.0&nbsp
Ferrari 488 GTB 2016 Base Coupe 661 100.0&nbsp
Ferrari California 2015 Base Convertible 553 83.7&nbsp

Format symbol first Function

We can try to wrap some of the gt code into a function and apply these transformations in bulk at the location of our choosing! This is especially important for making it generally apply to other types of inputs instead of JUST %. The function of interest is actually two custom functions, some gt functions, and a good chunk of logic.

I’ve commented the individual sections as to their purpose, and included quite a bit of error-handling or protecting against various user inputs.

fmt_symbol_first <- function(
  gt_data,
  column = NULL,        # column of interest to apply to
  symbol = NULL,        # symbol to add, optionally
  suffix = "",          # suffix to add, optionally
  decimals = NULL,      # number of decimal places to round to
  last_row_n,           # what's the last row in data?
  symbol_first = FALSE  # symbol before or after suffix?
) {
  
  # Test and error out if mandatory columns are missing
  stopifnot("`symbol_first` argument must be a logical" = is.logical(symbol_first))
  stopifnot("`last_row_n` argument must be specified and numeric" = is.numeric(last_row_n))
  stopifnot("Input must be a gt table" = class(gt_data)[[1]] == "gt_tbl")

  # needs to type convert to double to play nicely with decimals and rounding
  # as it's converted to character by gt::text_transform
  add_to_first <- function(x, suff = suffix, symb = symbol) {
    if (!is.null(decimals)) {
      x <- suppressWarnings(as.double(x))
      fmt_val <- format(x = x, nsmall = decimals, digits = decimals)
    } else {
      fmt_val <- x
    }

    # combine the value, passed suffix, symbol -> html
    if (isTRUE(symbol_first)) {
      paste0(fmt_val, symb, suff) %>% gt::html()
    } else {
      paste0(fmt_val, suff, symb) %>% gt::html()
    }
  }

  # repeat non-breaking space for combined length of suffix + symbol
  # logic is based on is a NULL passed or not
  if (!is.null(symbol) | !identical(as.character(symbol), character(0))) {
    suffix <- ifelse(identical(as.character(suffix), character(0)), "", suffix)
    length_nbsp <- c("&nbsp", rep("&nbsp", nchar(suffix))) %>%
      paste0(collapse = "")
  } else {
    suffix <- ifelse(identical(as.character(suffix), character(0)), "", suffix)
    length_nbsp <- rep("&nbsp", nchar(suffix)) %>%
      paste0(collapse = "")
  }

  # affect rows OTHER than the first row
  add_to_remainder <- function(x, length = length_nbsp) {
    if (!is.null(decimals)) {
      # if decimal not null, convert to double
      x <- suppressWarnings(as.double(x))
      # then round and format ALL to force specific decimals
      fmt_val <- format(x = x, nsmall = decimals, digits = decimals)
    } else {
      fmt_val <- x
    }
    paste0(fmt_val, length) %>% lapply(FUN = gt::html)
  }

  # pass gt object
  # align right to make sure the spacing is meaningful
  gt_data %>%
    cols_align(align = "right", columns = vars({{ column }})) %>%
    # convert to mono-font for column of interest
    tab_style(
      style = cell_text(font = google_font("Fira Mono")),
      locations = cells_body(columns = vars({{ column }}))
    ) %>%
    # transform first rows
    text_transform(
      locations = cells_body(vars({{ column }}), rows = 1),
      fn = add_to_first
    ) %>%
    # transform remaining rows
    text_transform(
      locations = cells_body(vars({{ column }}), rows = 2:last_row_n),
      fn = add_to_remainder
    )
}

Use the function

We can now use that fmt_symbol_first() function, note that I’m testing a few different combinations of suffix/symbols, decimals, etc that may be a bit nonsensical in the table itself but are interactively testing that the results are what I expect. Specifically, I’m making sure that symbols/suffixes are added, and that the spacing is correct. While this is useful for sanity checking quickly, we can also take another step to apply some proper unit-testing in the next section.

gtcars %>% 
  head() %>% 
  dplyr::select(mfr, year, bdy_style, mpg_h, hp) %>% 
  dplyr::mutate(mpg_h = rnorm(n = dplyr::n(), mean = 22, sd = 1)) %>% 
  gt() %>% 
  opt_table_lines() %>% 
  fmt_symbol_first(column = mfr, symbol = "&#x24;", suffix = " ", last_row_n = 6) %>%
  fmt_symbol_first(column = year, symbol = NULL, suffix = "%", last_row_n = 6) %>%
  fmt_symbol_first(column = mpg_h, symbol = "&#37;", suffix = NULL, last_row_n = 6, decimals = 1) %>% 
  fmt_symbol_first(column = hp, symbol = "&#176;", suffix = "F", last_row_n = 6, decimals = NULL, symbol_first = TRUE)
Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead
mfr year bdy_style mpg_h hp
Ford $ 2017% coupe 23.4% 647°F
Ferrari&nbsp&nbsp 2015&nbsp coupe 22.1&nbsp 597&nbsp&nbsp
Ferrari&nbsp&nbsp 2015&nbsp convertible 21.4&nbsp 562&nbsp&nbsp
Ferrari&nbsp&nbsp 2014&nbsp coupe 23.7&nbsp 562&nbsp&nbsp
Ferrari&nbsp&nbsp 2016&nbsp coupe 23.1&nbsp 661&nbsp&nbsp
Ferrari&nbsp&nbsp 2015&nbsp convertible 22.3&nbsp 553&nbsp&nbsp

Unit testing

At this point, we’ve created a custom gt wrapper function, added some relatively robust checks into the function, but are still manually checking the output confirms to our expectations. We can perform proper unit testing with the {testthat} package.

Testing your code can be painful and tedious, but it greatly increases the quality of your code. testthat tries to make testing as fun as possible, so that you get a visceral satisfaction from writing tests.

While an in-depth run through of testhat is beyond the scope of this post, I have included an expandable section with a minimal example below, expanded from the “R Packages” book chapter on testing:

testthat Example
[1] 1
str_length("ab")  # 2
[1] 2
str_length("abc") # 3
[1] 3

So str_length() counts the length of a string, fairly straightforward!

We can convert this to a logical confirmation, which means that a computer can understand if the output was as expected, rather than just printing and reading which is mainly for our interactive use. I have included one FALSE output just as an example.

str_length("a")   == 1 # 1 TRUE
[1] TRUE
str_length("ab")  == 2 # 2 TRUE
[1] TRUE
str_length("abc") == 3 # 3 TRUE
[1] TRUE
str_length("abc") == 1 # 3 FALSE
[1] FALSE

While this testing is useful, we can make it even easier with testhat, by using expect_equal(). Now, these functions will not return anything if they pass. If they fail, then they will print an error, and a helpful statement saying what the failure was.

### All TRUE
testthat::expect_equal(str_length("a"),   1) # TRUE
testthat::expect_equal(str_length("ab"),  2) # TRUE
testthat::expect_equal(str_length("abc"), 3) # TRUE

Just to show you, here’s one where we get a FALSE, the match is off by 2.

testthat::expect_equal(str_length("a"),  3) # FALSE
Error: str_length("a") not equal to 3.
1/1 mismatches
[1] 1 - 3 == -2

The last step, is wrapping our various tests into test_that structure. Here, while the individual tests return no visible output, we can get a friendly message saying they have all passed!

test_that(
  desc = "str_length is number of characters",
  code = {
    expect_equal(str_length("a"), 1)
    expect_equal(str_length("ab"), 2)
    expect_equal(str_length("abc"), 3)
  }
)
Test passed 🥳

We can also see what happens if there is a failure (abcd is not 3 characters, but 4).

test_that(
  desc = "str_length is number of characters",
  code = {
    expect_equal(str_length("a"), 1)
    expect_equal(str_length("ab"), 2)
    expect_equal(str_length("abc"), 3)
    expect_equal(str_length("abcd"), 3)
  }
)
── Failure (<text>:7:5): str_length is number of characters ────────────────────
str_length("abcd") not equal to 3.
1/1 mismatches
[1] 4 - 3 == 1
Error:
! Test failed

These tests can be used interactively, but ultimately are even more useful when rolled into an R package. For that next step, I recommend reading through the “R Packages” book, specifically the Packages Chapter.

Testing gt

Now you may say, well those minimal example tests were easy, it’s just counting?! How do I test gt? We can treat gt exactly like what it is, a HTML table. Quick example below using our custom function (fmt_symbol_first()).

ex_gt <- gtcars %>% 
  head() %>% 
  dplyr::select(mfr, year, bdy_style, mpg_h, hp) %>% 
  dplyr::mutate(mpg_h = c(20.2, 22.0, 20.8, 21.2, 22.8, 22.7)) %>% 
  gt() %>% 
  opt_table_font(font = google_font("Roboto Mono")) %>%
  opt_table_lines() %>% 
  fmt_symbol_first(column = mfr, symbol = "&#x24;", suffix = " ", last_row_n = 6) %>%
  fmt_symbol_first(column = year, symbol = NULL, suffix = "%", last_row_n = 6) %>%
  fmt_symbol_first(column = mpg_h, symbol = "&#37;", suffix = NULL, last_row_n = 6, decimals = 1) %>% 
  fmt_symbol_first(column = hp, symbol = "&#176;", suffix = "F", last_row_n = 6, decimals = NULL, symbol_first = TRUE)
Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead

Warning: `columns = vars(...)` has been deprecated in gt 0.3.0:
* please use `columns = c(...)` instead
# example table
ex_gt