Splits a string into sections, given a delimiter for the beginning of each section and, optionally, a different delimiter for the end. Returns a character vector of the split pieces.

split_sandwiches(.string, start_rx, end_rx = NULL)

Arguments

.string

A single string (a character vector of length one).

start_rx

A regular expression denoting the beginning of a section. Use fixed() for literals.

end_rx

A regular expression denoting the end of a section. If none supplied, sections end when the next start_rx is encountered.

Value

A vector of strings

Details

The main use case for split_sandwiches() is HTML editing: you might want to separate the original text from the HTML tags, make certain edits to the text only, and then re-wrap the tags.
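The split, edit, re-wrap workflow might look roughly like the sketch below. To keep it self-contained, the pieces are copied from the output shown in the Examples section rather than produced by a call to split_sandwiches(). The tag test (`is_tag`) and the choice of `toupper()` as the edit are illustrative assumptions, not part of this function.

```r
# Pieces copied from the output in the Examples section, hard-coded so this
# sketch runs without calling split_sandwiches() itself.
pieces <- c(
  "<span style='text-color:blue'>", " I am blue and ",
  "<b>", "bold", "</b>", ", yay! ", "</span>"
)

# Treat an element as a tag if it is exactly one <...> block
# (an assumption that holds for this example).
is_tag <- grepl("^<[^<>]*>$", pieces)

# Edit only the text elements; here, convert them to upper case.
pieces[!is_tag] <- toupper(pieces[!is_tag])

# Re-wrap: paste everything back together in the original order.
edited <- paste0(pieces, collapse = "")
edited
#> [1] "<span style='text-color:blue'> I AM BLUE AND <b>BOLD</b>, YAY! </span>"
```

Because the tags are left untouched and the order of the pieces is preserved, pasting them back together gives valid markup again.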

This is different from str_split() or similar, because the delimiters are preserved and remain attached to a section.
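For comparison, here is a small sketch of what stringr's str_split() does with the string from the Examples section. It assumes the stringr package is installed; the pattern is a simplified equivalent of the one used in the Examples.

```r
library(stringr)

my_string <- "<span style='text-color:blue'> I am blue and <b>bold</b>, yay! </span>"

# str_split() returns a list with one element per input string; take the first.
pieces <- str_split(my_string, "<[^<>]*>")[[1]]

# The tags are consumed by the split: only the text between them remains,
# plus an empty string at each end where the outer <span> tags were.
pieces
```

The HTML tags are gone from the result, so the original markup cannot be rebuilt from these pieces alone.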

Note that split_sandwiches() is not vectorized (sorry): it accepts only a single string, not a character vector of several strings.
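One possible workaround for several strings is to apply the function to each element in turn. This sketch assumes split_sandwiches() is already available in your session; the input vector `my_strings` is made up for illustration.

```r
# Assumes split_sandwiches() has already been loaded from its package.
my_strings <- c("<b>bold</b> text", "<i>italic</i> text")

# lapply() calls split_sandwiches() once per string and collects the
# results in a list, one character vector per input string.
sections <- lapply(my_strings, split_sandwiches, start_rx = "\\<[^\\>\\<]*\\>")
```

A list is used rather than a matrix because different strings can split into different numbers of pieces.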

Examples

my_string <- "<span style='text-color:blue'> I am blue and <b>bold</b>, yay! </span>"
split_sandwiches(my_string, "\\<[^\\>\\<]*\\>")
#> [1] "<span style='text-color:blue'>" " I am blue and "
#> [3] "<b>"                            "bold"
#> [5] "</b>"                           ", yay! "
#> [7] "</span>"