CRAN release
tidyxl 1.0.0
This is a major release with some breaking changes. The previous release can be installed as follows.
devtools::install_version("tidyxl", version = "0.2.3", repos = "http://cran.us.r-project.org")
New features
xlsx_cells()
andxlsx_formats()
replacetidy_xlsx()
, which has been
deprecated.xlsx_cells()
returns a single data frame of all the cells in
scope (the whole workbook, or chosen sheets), rather than a list of separate
data frames for each sheet.xlsx_formats()
performs orders of magnitude
faster.xlsx_validation()
imports validation rules from cells that restrict data
input, such as cells that require a selection from a drop-down list. See
the
vignette
vignette("data-validation-rules", package = "tidyxl")
.xlsx_names()
imports defined names (aka named ranges/formulas), which can
be used to filter for particular ranges of cells by name. Useis_range()
to filter for ones that are named ranges, and then read
joining rules to cells
for how to join cell ranges to cell addresses. This will become easier in a
future release.is_range()
checks whether a formula is simply ranges of cells.xlex()
tokenises formulas. This is useful for detecting
spreadsheet smells like embedded constants and deep nesting. There is a
demo Shiny app, and a
vignette
vignette("smells", package = "tidyxl")
. A vector of Excel function names
excel_functions
can be used to separated built-in functions from custom
functions. More experimental features will be implemented in the off-CRAN
package lexl before becoming part of
tidyxl.xlsx_cells()$character_formatted
is a new column for the in-cell formatting
of text (#5). This is for when different parts of text in a single cell
have been formatted differently from one another.is_date_format()
checks whether a number format string is a date format.
This is useful if a cell formula contains a number formatting string (e.g.
TEXT(45678,"yyyy")
), and you need to know that the constant 45678 is a
date in order to recover it at full resolution (rather than parsing the
character output "2025" as a year).xlsx_color_theme()
and it's British aliasxlsx_colour_theme()
returns the
theme colour palette used in a file. This is useful to monitor use of a
corporate standard theme.xlsx_color_standard
and it's British aliasxlsx_colour_standard
are data
frames of the standard Excel palette (red
,blue
, etc.).- Shared formulas are propogated to all the cells that use the same formula
definition. Relative cell references are handled, so that the formula
=A1*2
in cellB1
becomes=A2*2
in cellB2
(for more details see
issue #7). - Formatting of alignment and cell protection is returned (#20).
Breaking changes and deprecations
tidy_xlsx()
has been deprecated in favour ofxlsx_cells()
,
which returns a data frame of all the cells in the workbook (or in the
requested sheets), andxlsx_formats()
, which returns a lookup list of cell
formats.- In
tidy_xlsx()
and one of it's replacmentsxlsx_cells()
- the column
content
has been replaced byis_blank
, a logical value
indicating whether the cell contains data. Please replace!is.na(content)
with!is_blank
to filter out blank cells (ones with formatting but no
value). - the column
formula_type
has been replaced byis_array
, a logical value
indicating whether the cell's formula is an array formula or not. In Excel
array formulas are represented visually by being surrounded by curly braces
{}
. - The order of columns has been changed so that the more useful columns are
visible in narrow consoles.
- the column
- in
xlsx_formats()
andtidy_xlsx()
, theme colours are given by name rather
than by number, e.g."accent6"
instead of4
.
Minor fixes and improvements
- Certain unusual custom number formats that specify colours (e.g.
"[Cyan]0%"
)
are no longer mis-detect as dates (#21).is_date_format()
tests whether a
number format is a date format. xlsx_formats()
is now thoroughly tested, and several relatively minor bugs
fixed. For example,xlsx_formats(path)$local$fill$patternFill$patternType
consistently returnsNA
and never"none"
when a pattern fill has not
been set, and escape-backslashes are consistently omitted from numFmts.
New dependency
xlex()
, is_range()
and the handling of relative references in shared
formulas requires a dependency on the
piton package, which wraps the
PEGTL C++ parser generator.