class: center, middle, inverse, title-slide

# Advanced R - Hadley Wickham
Chapter 18
Metaprogramming: Expressions

### Alejandra Hernandez
### 27/10/2020

---

<style>
.col2 {
  columns: 2 200px;         /* number of columns and width in pixels */
  -webkit-columns: 2 200px; /* chrome, safari */
  -moz-columns: 2 200px;    /* firefox */
}
</style>

# Welcome!

- This book club is a joint effort between RLadies Nijmegen, Rotterdam, 's-Hertogenbosch (Den Bosch), Amsterdam and Utrecht
- We meet every 2 weeks to go through a chapter
- Use the [HackMD](https://hackmd.io/RTwJmhNKRPas0JUrHjAGqw) to introduce yourself, ask questions and see your breakout room
- There are still opportunities to present a chapter :) Sign up at [rladiesnl.github.io/book_club](https://rladiesnl.github.io/book_club/)

---
class: middle, inverse

# .fancy[Let's start!]

<img src="Begin.gif" style="display: block; margin: auto;" />

---

# Metaprogramming

- Code is data
- Code is a tree
- Code can generate code

---

# Expressions

> "Expressions, an object that captures the structure of the code without evaluating it (i.e. running it)"

--

Here is some code:

```r
y <- x * 10
```

```
Error in eval(expr, envir, enclos): object 'x' not found
```

--

Here is an expression:

```r
z <- rlang::expr(y <- x * 10)
z
```

```
y <- x * 10
```

--

Here is how to evaluate an expression:

```r
x <- 4
eval(z)
y
```

```
[1] 40
```

---

# Objectives

- Understand expressions!
- Learn how to inspect and modify captured code
- Eventually be able to generate code with code

--

We will use two packages for that:

```r
library(rlang)
library("lobstr")
```

---

# Abstract Syntax Trees (AST)

- Useful to inspect and modify expressions
  + Lets us see the "hierarchy" of the code

--

```r
lobstr::ast(f(x, "y", 1))
```

```
o-f 
+-x 
+-"y" 
\-1 
```

<div class="col2">
<img src="fig1_1.png" style="display: block; margin: auto;" />
<img src="fig1.png" width="80%" style="display: block; margin: auto;" />
</div>

---

# Understanding the tree

- Leaves: symbols or constants (comments not included)
- Branches: call objects (function calls)
  + The first child (f) is the function that gets called
  + The second and subsequent children (x, "y", and 1) are the arguments to that function

.panelset[
.panel[.panel-name[lobstr AST]

```r
lobstr::ast(f(g(1, 2), h(3, 4, i())))
```

```
o-f 
+-o-g 
| +-1 
| \-2 
\-o-h 
  +-3 
  +-4 
  \-o-i 
```

]

.panel[.panel-name[Graphical AST]

<img src="fig2.png" width="50%" style="display: block; margin: auto;" />

]
]

---

# Infix calls

These two are the same:

```r
y <- x * 10
`<-`(y, `*`(x, 10))
```

--

```r
expr(`<-`(y, `*`(x, 10)))
```

```
y <- x * 10
```

--

.panelset[
.panel[.panel-name[lobstr AST]

```r
lobstr::ast(y <- x * 10)
```

```
o-`<-` 
+-y 
\-o-`*` 
  +-x 
  \-10 
```

]

.panel[.panel-name[Graphical AST]

<img src="fig3.png" width="30%" style="display: block; margin: auto;" />

]
]

---

# Expressions

- Constant
- Symbol
- Call

| Expression type |         Content         |    Creation     |       Testing function        |
|:---------------:|:-----------------------:|:---------------:|:-----------------------------:|
|    Constant     | NULL or length-1 vector |  self-quoting   | rlang::is_syntactic_literal() |
|     Symbol      |    Name of an object    | expr() or sym() |          is.symbol()          |
|      Call       | Captured function call  |     expr()      |           is.call()           |

---

# Expressions - Examples

**Constant**

```r
x <- "y"
x
```

```
[1] "y"
```

```r
rlang::is_syntactic_literal(2L)
```

```
[1] TRUE
```

**Symbol**

```r
# Symbol
x <- sym("y")
x
```

```
y
```

```r
is.symbol(x)
```

```
[1] TRUE
```

---

**Call**

```r
lobstr::ast(read.table("important.csv", row.names = FALSE))
```

```
o-read.table 
+-"important.csv" 
\-row.names = FALSE 
```

```r
x <- expr(read.table("important.csv", row.names = FALSE))
x
```

```
read.table("important.csv", row.names = FALSE)
```

```r
is.call(x)
```

```
[1] TRUE
```

---

# Subsetting calls

- Calls behave like lists
  + First element is the function being called
  + The other elements are the arguments

```r
x[[1]]
```

```
read.table
```

```r
is.symbol(x[[1]])
```

```
[1] TRUE
```

```r
as.list(x[-1])
```

```
[[1]]
[1] "important.csv"

$row.names
[1] FALSE
```

```r
x$row.names  # Note: this only works if the argument is named in your call!
```

```
[1] FALSE
```

---

- What if you forgot to name your arguments? How do you find their position inside the call?

--

ANSWER: You don't need to!

```r
x <- rlang::call_standardise(x)
as.list(x[-1])
```

```
$file
[1] "important.csv"

$row.names
[1] FALSE
```

```r
x$file
```

```
[1] "important.csv"
```

--

- Still, `rlang::call_standardise()` has trouble with arguments passed through `...`

--

- Are you starting to see applications?

---

# Example from a friend

Note of caution! There are other (probably better) ways to do this! But my friend had reasons to want to do it this way:

```r
# VERY simplified version of her data:
df_1 <- matrix(1:12, ncol = 3)
df_2 <- matrix(letters, ncol = 2)
df_3 <- matrix(21:200, ncol = 10)  # 180 values, so they fill 10 columns exactly

all_my_matrices <- c("df_1", "df_2", "df_3")

for (mx in all_my_matrices) {
  # Whatever changes she wanted to make
  write.csv(eval(as.name(mx)), paste0(mx, ".csv"))
}
```

---

# Function position

- First position in the call object

```r
lobstr::ast(foo())
```

```
o-foo 
```

- What about functions that do not exist in the current environment?

.panelset[
.panel[.panel-name[lobstr AST]

```r
lobstr::ast(pkg::foo(1)) ## Function belongs to a different package
```

```
o-o-`::` 
| +-pkg 
| \-foo 
\-1 
```

```r
lobstr::ast(obj$foo(1)) ## Function is a method of an R6 object
```

```
o-o-`$` 
| +-obj 
| \-foo 
\-1 
```

]

.panel[.panel-name[Graphical AST]

<img src="fig4.png" width="50%" style="display: block; margin: auto;" />

]
]

---

# Constructing calls

- Create a call from its components using `rlang::call2()`

```r
# Note the use of "" or expr() when calling existing objects
call2("mean", x = expr(x), na.rm = TRUE)
```

```
mean(x = x, na.rm = TRUE)
```

```r
call2(expr(base::mean), x = expr(x), na.rm = TRUE)
```

```
base::mean(x = x, na.rm = TRUE)
```

--

- And now... do you see the application?

---

# Parsing

Some definitions:

> **Parsing:** process by which a computer language takes a string and constructs an expression

> **Grammar:** rules that govern parsing

--

Important to consider:

- *Operator precedence:* In the expression `1 + 2 * 3`, which function would be evaluated first? (`+` or `*`)
  + What about the function `!`?

--

- *Associativity:* In the expression `1 + 2 + 3`, which function would be evaluated first? (the first `+` or the second `+`)
  + And when you have two `^` like in `2^3^2`?
  + And when you have two `<-` like in `x <- y <- 3`?

---

# Parsing

- Sometimes you have code stored in a string, and you want to parse it yourself
  + Use `rlang::parse_expr()` (or `rlang::parse_exprs()` when you have multiple expressions separated by "\n" or ";")

```r
x1 <- "y <- x + 10"
x1
```

```
[1] "y <- x + 10"
```

```r
is.call(x1)
```

```
[1] FALSE
```

```r
x2 <- rlang::parse_expr(x1)
x2
```

```
y <- x + 10
```

```r
is.call(x2)
```

```
[1] TRUE
```

---

# Deparsing

- Given an expression, you want the string that would generate it

```r
z <- expr(y <- x + 10)
expr_text(z)
```

```
[1] "y <- x + 10"
```

--

- Do you see the use of parsing/deparsing?

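---

# Parsing and deparsing: a round trip

A minimal sketch that ties the two previous slides together, assuming `rlang` is installed. The variable names (`code_string`, `parsed`, `many`, `env`) are made up for illustration; only `parse_expr()`, `parse_exprs()` and `expr_text()` come from the slides above.

```r
library(rlang)

# Parse: string -> expression (a call object)
code_string <- "y <- x + 10"
parsed <- parse_expr(code_string)
is.call(parsed)

# Deparse: expression -> string; here we get the original string back
round_trip <- expr_text(parsed)
identical(round_trip, code_string)

# Several expressions in one string, separated by ";"
many <- parse_exprs("a <- 1; b <- a * 2")
length(many)

# Evaluate them one by one in a scratch environment (hypothetical name `env`)
env <- new.env()
for (e in many) eval(e, env)
env$b
```

Evaluating in a separate environment keeps the objects created by the parsed code (`a` and `b`) out of your workspace, which is handy when the strings come from somewhere else.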
---

# An example from Twitter

<div class="figure" style="text-align: center">
<img src="Twitter_example.jpg" alt="Taken from a post from Garrick Aden-Buie (@grrrck, 18-10-2020)" width="78%" />
<p class="caption">Taken from a post from Garrick Aden-Buie (@grrrck, 18-10-2020)</p>
</div>

---

# A real life example (mine)

**My problem:** I have many S4 objects that need to be merged using a special function from a package (`pkg::merge_S4()`). However, the number of S4 objects is different every time, and `pkg::merge_S4()` does not accept lists or strings! :(

```r
# Make a list of my S4 objects (I actually use functions to create those S4 objects and then get the list of their names with `ls()`)
my_list_of_S4s <- c("my_S4_1", "your_S4_2", "your_S4_3")

# Build my call
my_call_in_text <- paste("pkg::merge_S4(", paste(my_list_of_S4s, collapse = ","), ")")
my_call_in_text
```

```
[1] "pkg::merge_S4( my_S4_1,your_S4_2,your_S4_3 )"
```

```r
my_call_ready <- parse_expr(my_call_in_text)
my_call_ready
```

```
pkg::merge_S4(my_S4_1, your_S4_2, your_S4_3)
```

```r
# Actually call it (not run now because my fake function does not exist)
#eval(my_call_ready)
```

---

# Disclaimers

- I did not cover the last two sections of chapter 18!
- There are other ways to do metaprogramming (follow chapter 19 about quasiquotation!) and other ways to solve the examples I put here
- Although I hope you learnt something today, this is by no means a deep dive into metaprogramming... this is just the beginning!

---
class: middle, inverse

# .fancy[Time for exercises!]

<img src="practice.gif" width="90%" style="display: block; margin: auto;" />

---

# Exercise 1

.panelset[
.panel[.panel-name[Question]

Reconstruct the code corresponding to these ASTs

```
o-f 
\-o-g 
  \-o-h 
```

```
o-`+` 
+-o-`+` 
| +-1 
| \-2 
\-3 
```

```
o-`*` 
+-o-`(` 
| \-o-`+` 
|   +-x 
|   \-y 
\-z 
```

]

.panel[.panel-name[Answer]

```r
lobstr::ast(f(g(h())))
```

```
o-f 
\-o-g 
  \-o-h 
```

```r
ast(1 + 2 + 3)
```

```
o-`+` 
+-o-`+` 
| +-1 
| \-2 
\-3 
```

```r
ast((x + y) * z)
```

```
o-`*` 
+-o-`(` 
| \-o-`+` 
|   +-x 
|   \-y 
\-z 
```

]
]

---

# Exercise 2

.panelset[
.panel[.panel-name[Question]

What’s happening with the ASTs below? (Hint: carefully read ?"^")

```r
ast(x ** y)
```

```
o-`^` 
+-x 
\-y 
```

```r
ast(1 -> x)
```

```
o-`<-` 
+-x 
\-1 
```

]

.panel[.panel-name[Answer]

1. `**` is translated by R’s parser into `^`.
2. The expression is flipped when R parses it:

```r
str(expr(a -> b))
```

```
 language b <- a
```

]
]

---

# Exercise 3

.panelset[
.panel[.panel-name[Question]

What does the call tree of an if statement with multiple else if conditions look like? Why?

]

.panel[.panel-name[Answer_curly]

The "else" part of the AST is just another expression to be evaluated, which happens to be an if statement.

```r
ast(
  if (FALSE) {
    1
  } else if (TRUE) {
    2
  }
)
```

```
o-`if` 
+-FALSE 
+-o-`{` 
| \-1 
\-o-`if` 
  +-TRUE 
  \-o-`{` 
    \-2 
```

]

.panel[.panel-name[Answer_no_curly]

Without curly brackets:

```r
ast(
  if (FALSE) 1
  else if (TRUE) 2
)
```

```
o-`if` 
+-FALSE 
+-1 
\-o-`if` 
  +-TRUE 
  \-2 
```

]
]

---

# Exercise 4

.panelset[
.panel[.panel-name[Question]

What happens when you subset a call object to remove the first element, e.g. `expr(read.csv("foo.csv", header = TRUE))[-1]`? Why?

]

.panel[.panel-name[Answer]

When the first element of a call object is removed, the second element moves to the first position, which is the function to call.
Therefore, we get `"foo.csv"(header = TRUE)`

]
]

---

# Exercise 5

.panelset[
.panel[.panel-name[Question]

Why does this code not make sense?

```r
x <- expr(foo(x = 1))
names(x) <- c("x", "")
```

]

.panel[.panel-name[Answer]

We know that the first element of a call is always the function that gets called. Let’s see what happens when we run the code:

```r
x <- rlang::expr(foo(x = 1))
x
```

```
foo(x = 1)
```

```r
names(x) <- c("x", "")
x
```

```
foo(1)
```

```r
names(x) <- c("", "x")
x
```

```
foo(x = 1)
```

So giving the first element a name just adds metadata that R ignores.

]
]

---

# Exercise 6

.panelset[
.panel[.panel-name[Question]

Construct the expression `if(x > 1) "a" else "b"` using multiple calls to `call2()`. How does the code structure reflect the structure of the AST?

]

.panel[.panel-name[Answer]

Similar to the prefix version, we get

```r
call2("if", call2(">", sym("x"), 1), "a", "b")
```

```
if (x > 1) "a" else "b"
```

When we read the AST from left to right, we get the same structure: the function to evaluate (`if`), then the condition, which is itself a call and is evaluated first, and finally two constants, of which only one is evaluated next.

```r
ast(`if`(x > 1, "a", "b"))
```

```
o-`if` 
+-o-`>` 
| +-x 
| \-1 
+-"a" 
\-"b" 
```

]
]

---
class: middle, inverse

# .fancy[We are done!]

<img src="done.gif" width="130%" style="display: block; margin: auto;" />

---
class: middle, inverse

# .fancy[Thank you!]

<img src="thanks.gif" width="40%" style="display: block; margin: auto;" />

## Do you want to present next?

### Or just follow the book club until the end!!