pacman::p_load(tidyverse, patchwork,
               DT, ggiraph, plotly,
               crosstalk)

Hands-on Exercise 3A
3 Programming Interactive Data Visualisation with R
3.1 Overview and Learning Outcomes
This hands-on exercise is based on Chapter 3 of the R for Visual Analytics book.
The learning outcome is to create interactive data visualisations using functions in the ggiraph and plotly packages.
3.2 Getting Started
3.2.1 Installing and Loading Required Libraries
In this hands-on exercise, the following R packages are used:
tidyverse (i.e. readr, tidyr, dplyr) for performing data science tasks such as importing, tidying, and wrangling data;
patchwork for preparing composite figures created using ggplot2;
DT for interfacing with the JavaScript library DataTables, which creates interactive tables on html pages;
ggiraph for making ggplot graphics interactive;
plotly for plotting interactive statistical graphs; and
crosstalk for inter-widget interactivity for html widgets.
The p_load() call at the top of this exercise uses the pacman package to check whether each package is installed. Installed packages are loaded into the R environment; missing packages are installed first and then loaded.
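The install-if-missing-then-load behaviour can be approximated in base R. The sketch below is only an illustration of the idea, not pacman's actual implementation, and the helper name load_or_install is hypothetical.

```r
# Hypothetical helper: a rough base-R approximation of pacman::p_load().
# This is NOT pacman's implementation, only a sketch of the idea.
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE (rather than erroring) if pkg is missing
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    # Attach the package to the search path, as library(pkg) would
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# "stats" and "utils" ship with R, so nothing is installed here
load_or_install(c("stats", "utils"))
```

In practice p_load() remains the more convenient choice, since it accepts unquoted package names.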
3.2.2 Importing Data
The dataset for this hands-on exercise is imported into the R environment using the read_csv() function in the readr package and stored as the R object, exam_data.
exam_data = read_csv("data/Exam_data.csv")

The tibble data frame, exam_data, has 7 columns and 322 rows.
It consists of the year-end examination grades of a cohort of 322 Primary 3 students from a local school.
The 7 variables/attributes are:
Categorical: ID, CLASS, GENDER, and RACE.
Continuous: MATHS, ENGLISH, and SCIENCE.
3.3 Interactive Data Visualisation: ggiraph Methods
The ggiraph package is an html widget and a ggplot2 extension that allows ggplot graphics to be interactive. This is achieved using interactive geometries that understand three aesthetics:
tooltip: a column of the dataset that contains the text to be displayed when the mouse hovers over an element;
onclick: a column of the dataset that contains the JavaScript to be executed when an element is clicked; and
data_id: a column of the dataset that contains an ID to be associated with each element.
If used within a Shiny application, the elements associated with an ID (data_id) can be selected and manipulated on the client and server sides.
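The three aesthetics can be seen together in a minimal sketch. It uses a made-up three-row data frame and geom_point_interactive() rather than the exam data, and the column names (id, group, js) are assumptions chosen for illustration.

```r
library(ggplot2)
library(ggiraph)

# Toy data (made up for illustration; not the exam dataset)
toy <- data.frame(
  id    = c("A", "B", "C"),
  x     = c(1, 2, 3),
  y     = c(3, 1, 2),
  group = c("g1", "g1", "g2")
)
# JavaScript to run on click; alert() keeps the sketch harmless
toy$js <- sprintf("alert(\"%s\")", toy$id)

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point_interactive(
    aes(tooltip = id,     # text shown on hover
        data_id = group,  # elements sharing a data_id are highlighted together
        onclick = js),    # JavaScript executed on click
    size = 4)

widget <- girafe(ggobj = p)
widget
```

Hovering over point A or B should highlight both, because they share the same data_id value.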
3.3.1 Tooltip Effect with tooltip Aesthetic
A typical code chunk to plot an interactive statistical graph using functions in the ggiraph package consists of two parts:
An interactive version of a ggplot object is created using the geom_dotplot_interactive() function; and
The girafe() function is then used to generate an interactive svg object to be displayed on the html page.
The “tooltip” aesthetic argument of the geom_dotplot_interactive() function is used to specify the field that will be displayed in the tooltip.
In the plot below, when the mouse pointer hovers over a data point of interest, the student’s ID is displayed.
p1 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)

girafe(ggobj = p1,
       width_svg = 6,
       height_svg = 6*0.618)

3.3.2 Displaying Multiple Information on Tooltip
The content of the tooltip can be customised by building a character vector. A new field, tooltip, is created in the tibble data frame, exam_data, and populated with information from the ID and CLASS fields. This field is then used in place of ID in the “tooltip” aesthetic argument of the geom_dotplot_interactive() function.
When the mouse pointer hovers over a data point of interest, the student’s ID and class are displayed.
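Because paste0() is vectorised, it produces one tooltip string per row. The sketch below shows this on a made-up two-row data frame; the values are illustrative only.

```r
# Toy data (made up for illustration; not the exam dataset)
toy <- data.frame(ID    = c("Student001", "Student002"),
                  CLASS = c("3A", "3B"))

# paste0() is vectorised: one tooltip string is built per row
toy$tooltip <- paste0("Name = ", toy$ID,
                      "\n Class = ", toy$CLASS)

# cat() prints the embedded newline, so the two fields appear on separate lines
cat(toy$tooltip[1])
```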
# Build one tooltip string per student from the ID and CLASS fields
exam_data$tooltip = paste0(
  "Name = ", exam_data$ID,
  "\n Class = ", exam_data$CLASS)

p2 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    # Refer to the column by name; avoid exam_data$ inside aes()
    aes(tooltip = tooltip),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)

girafe(ggobj = p2,
       width_svg = 8,
       height_svg = 8*0.618)

3.3.3 Customising Tooltip Style
The opts_tooltip() function in the ggiraph package is used to customise the tooltip rendering by adding css declarations.
Note: The background for the tooltip has been changed from black to white colour, and the text colour has been changed from white to black.
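As a self-contained illustration of the same idea, the sketch below styles the tooltip of a small scatter plot built from the built-in mtcars data. The plot and the model column are assumptions made for the example, and opacity is one of several further opts_tooltip() arguments.

```r
library(ggplot2)
library(ggiraph)

# Toy data built from mtcars (not the exam dataset)
cars <- mtcars
cars$model <- rownames(mtcars)

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point_interactive(aes(tooltip = model), size = 3)

# CSS declarations are passed as one string; font-weight controls boldness
css <- "background-color:white; color:black; font-weight:bold; padding:4px;"

widget <- girafe(
  ggobj = p,
  options = list(
    opts_tooltip(css = css,       # custom tooltip styling
                 opacity = 0.95)  # tooltip opacity, between 0 and 1
  )
)
widget
```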
# font-weight (not font-style) is the CSS property that makes text bold
tooltip_css = "background-color:white; font-weight:bold; color:black;"

girafe(ggobj = p2,
       width_svg = 6,
       height_svg = 6*0.618,
       options = list(
         opts_tooltip(
           css = tooltip_css))
)

3.3.4 Displaying Statistics on Tooltip
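Derived statistics can also be displayed in a tooltip. In the example below, a function formats the mean Maths score and its standard error for each RACE, and the means are plotted in a bar chart with error bars. Note that mean_se() returns the mean plus or minus one standard error, which is narrower than a 90% confidence interval.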
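To make the quantities concrete, the sketch below computes the mean, the standard error, and a 90% t-based confidence interval by hand. It uses five made-up scores rather than the exam data, and it shows that the band of one standard error either side of the mean is narrower than a 90% interval.

```r
# Toy scores (made up for illustration; not the exam dataset)
scores <- c(50, 60, 70, 80, 90)

n   <- length(scores)
avg <- mean(scores)
sem <- sd(scores) / sqrt(n)   # standard error of the mean

# Band of one standard error either side of the mean
se_band <- c(ymin = avg - sem, y = avg, ymax = avg + sem)

# A 90% t-based confidence interval scales sem by a t quantile
t_crit <- qt(0.95, df = n - 1)
ci90   <- c(lower = avg - t_crit * sem, upper = avg + t_crit * sem)

se_band
ci90
```

If a 90% interval is what the chart should show, the tooltip function and the summary function would both need to use the wider bounds.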
tooltip = function(y, ymax, accuracy = .01) {
  mean = scales::number(y, accuracy = accuracy)
  sem = scales::number(ymax - y, accuracy = accuracy)
  paste("Mean Maths Scores:", mean, "+/-", sem)
}

gg_point = ggplot(data = exam_data,
                  aes(x = RACE)) +
  stat_summary(aes(y = MATHS,
                   tooltip = after_stat(
                     tooltip(y, ymax))),
               fun.data = "mean_se",
               geom = GeomInteractiveCol,
               fill = "light blue") +
  stat_summary(aes(y = MATHS),
               fun.data = mean_se,
               geom = "errorbar", width = 0.2, size = 0.2)

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)

3.3.5 Hover Effect with data_id Aesthetic
The “data_id” aesthetic argument of the geom_dotplot_interactive() function is used to highlight all elements that share the same value of a designated field.
In the plot below, elements of the same CLASS are highlighted when the mouse hovers over any one of them.
Note: [In-class Exercise (Week 4)] The inclusion of “tooltip = CLASS” in the “aes” argument of the geom_dotplot_interactive() function allows the student’s class to be displayed in the tooltip when the mouse hovers over an element.
p3 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = CLASS,
        data_id = CLASS),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)

girafe(ggobj = p3,
       width_svg = 6,
       height_svg = 6*0.618)

3.3.6 Styling Hover Effect
The highlighting effect can be changed using CSS declarations: opts_hover() styles the hovered elements and those associated with them, while opts_hover_inv() fades the non-selected elements.
girafe(ggobj = p3,
       width_svg = 6,
       height_svg = 6*0.618,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")))

3.3.7 Combining Tooltip and Hover Effect
The tooltip and hover effect can be combined in an interactive statistical graph.
The associated elements are highlighted when the mouse hovers over one of them. At the same time, the tooltip will show which CLASS the highlighted elements belong to.
p4 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = CLASS,
        data_id = CLASS),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)

girafe(ggobj = p4,
       width_svg = 6,
       height_svg = 6*0.618,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")))

3.3.8 Click Effect with onclick Aesthetic
Finally, the “onclick” aesthetic argument of the geom_dotplot_interactive() function is used to provide hotlink interactivity on the web.
Clicking one of the elements opens the linked web page in a new browser window.
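It can help to inspect the JavaScript strings that the onclick column will hold before plotting. The sketch below builds them for two made-up IDs, using the same URL and sprintf() pattern as the exercise; the ids vector is an assumption for illustration.

```r
# Toy IDs (made up for illustration; not the exam dataset)
ids <- c("Student001", "Student002")

# Base URL taken from the exercise; each ID is appended to it unchanged
base_url <- "https://www.moe.gov.sg/schoolfinder?journey=Primary%20school"

# sprintf() is vectorised over ids. The "%20" in base_url is safe because
# base_url is passed as an argument, not as the format string.
onclick <- sprintf("window.open(\"%s%s\")", base_url, as.character(ids))

onclick[1]
```

Each element of the vector is a complete window.open() call, which is what the “onclick” aesthetic expects.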
exam_data$onclick = sprintf("window.open(\"%s%s\")",
                            "https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
                            as.character(exam_data$ID))

p5 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(onclick = onclick),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)

girafe(ggobj = p5,
       width_svg = 6,
       height_svg = 6*0.618)

3.3.9 Coordinated Multiple Views with ggiraph
The coordinated multiple views method can be used to show corresponding data points that share the same ID, with the following steps:
Appropriate interactive functions of the ggiraph package are used to create the multiple views.
The patchwork package is used inside the girafe() function to create the interactive coordinated multiple views.
Note: The “data_id” aesthetic argument is critical for linking observations between plots; the “tooltip” aesthetic argument is optional but useful when the mouse hovers over a point.
Note: [In-class Exercise (Week 4)] The same approach used in sub-section 3.3.2 for displaying multiple information on the tooltip is applied here. Mapping the tooltip2 field to the “tooltip” aesthetic in the “aes” argument of the geom_dotplot_interactive() function allows the student’s ID, class, Maths and English scores to be displayed in the tooltip when the mouse hovers over a point.
# Build one tooltip string per student with ID, class and two scores
exam_data$tooltip2 = paste0(
  "Name = ", exam_data$ID,
  "\n Class = ", exam_data$CLASS,
  "\n Maths Score = ", exam_data$MATHS,
  "\n English Score = ", exam_data$ENGLISH)

p6 = ggplot(data = exam_data,
            aes(x = MATHS)) +
  geom_dotplot_interactive(
    # Refer to columns by name; avoid exam_data$ inside aes()
    aes(data_id = ID,
        tooltip = tooltip2),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  coord_cartesian(xlim = c(0,100)) +
  scale_y_continuous(NULL,
                     breaks = NULL)

p7 = ggplot(data = exam_data,
            aes(x = ENGLISH)) +
  geom_dotplot_interactive(
    aes(data_id = ID,
        tooltip = tooltip2),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  coord_cartesian(xlim = c(0,100)) +
  scale_y_continuous(NULL,
                     breaks = NULL)

# p6 + p7 uses patchwork to place the two plots side by side
girafe(code = print(p6 + p7),
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")))

3.4 Interactive Data Visualisation: plotly Methods
The plotly package can be used to create interactive web graphics from ggplot2 graphs and/or through a custom interface to the (MIT-licensed) JavaScript library plotly.js, inspired by the Grammar of Graphics. The plotly R package itself is free and open source.
There are two ways to create an interactive graph using the plotly package:
Using the plot_ly() function; and
Using the ggplotly() function.
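The two routes can be compared side by side in a minimal sketch. It uses the built-in mtcars data rather than the exam data, and sets the trace type and mode explicitly, which is an addition made here for clarity.

```r
library(ggplot2)
library(plotly)

# Route 1: build the figure directly with plot_ly().
# type and mode are set explicitly so plotly does not have to guess them.
fig1 <- plot_ly(data = mtcars, x = ~wt, y = ~mpg,
                type = "scatter", mode = "markers")

# Route 2: build a ggplot2 object first, then convert it with ggplotly()
g    <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
fig2 <- ggplotly(g)

# Both routes yield the same kind of htmlwidget
class(fig1)
class(fig2)
```

plot_ly() gives direct access to plotly's own arguments, while ggplotly() lets an existing ggplot2 chart be reused with little extra code.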
3.4.1 Creating Interactive Scatter Plot Using plot_ly() Method
A basic interactive plot is created using the plot_ly() function.
plot_ly(data = exam_data,
        x = ~MATHS,
        y = ~ENGLISH)

3.4.2 Working with Visual Variable Using plot_ly() Method
The “color” argument is used to map a qualitative variable (e.g. RACE) to the colour of the markers.
plot_ly(data = exam_data,
        x = ~ENGLISH,
        y = ~MATHS,
        color = ~RACE)

3.4.3 Creating Interactive Scatter Plot Using ggplotly() Method
A basic interactive plot is created using the ggplotly() function.
p8 = ggplot(data = exam_data,
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size = 1) +
  coord_cartesian(xlim = c(0,100),
                  ylim = c(0,100))

ggplotly(p8)

3.4.4 Coordinated Multiple Views with plotly
A coordinated linked plot can be created with the plotly package in three steps:
The highlight_key() function in the plotly package is used to create the shared data.
The two scatterplots are created using functions in the ggplot2 package.
The subplot() function in the plotly package is used to place the two scatterplots side by side.
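The shared data step can be inspected on its own. The sketch below wraps a made-up three-row data frame with highlight_key() and supplies an explicit key, which is an assumption added here; the exercise itself calls highlight_key() without a key.

```r
library(plotly)

# Toy data (made up for illustration; not the exam dataset)
toy <- data.frame(ID      = c("s1", "s2", "s3"),
                  MATHS   = c(40, 65, 90),
                  ENGLISH = c(55, 60, 85))

# highlight_key() wraps the data in a crosstalk SharedData object.
# Supplying ~ID makes the ID column the identifier used to link views.
shared <- highlight_key(toy, ~ID)

crosstalk::is.SharedData(shared)
shared$key()
```

Plots built from the same shared object are linked through these key values, which is why both scatterplots in the exercise use the same object as their data.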
d = highlight_key(exam_data)

p9 = ggplot(data = d,
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size = 1) +
  coord_cartesian(xlim = c(0,100),
                  ylim = c(0,100))

p10 = ggplot(data = d,
             aes(x = MATHS,
                 y = SCIENCE)) +
  geom_point(size = 1) +
  coord_cartesian(xlim = c(0,100),
                  ylim = c(0,100))

subplot(ggplotly(p9),
        ggplotly(p10))

3.5 Interactive Data Visualisation: Interactive Data Table Using DT Package
The DT package provides an R interface to the JavaScript library DataTables. Data objects in R can be rendered as interactive HTML tables (typically via R Markdown or Shiny).
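A few commonly used datatable() arguments are sketched below on a made-up data frame. The choice of filter, rownames and pageLength settings is an assumption for illustration; the exercise itself only sets class.

```r
library(DT)

# Toy data (made up for illustration; not the exam dataset)
toy <- data.frame(ID    = c("s1", "s2", "s3"),
                  CLASS = c("3A", "3A", "3B"),
                  MATHS = c(40, 65, 90))

tbl <- datatable(
  toy,
  class    = "compact",            # tighter row spacing, as in the exercise
  filter   = "top",                # per-column filter boxes above the table
  rownames = FALSE,                # hide the row-name column
  options  = list(pageLength = 5)  # rows shown per page
)
tbl
```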
datatable(exam_data, class = "compact")

3.6 Interactive Data Visualisation: crosstalk Methods
3.6.1 Linked Brushing Using crosstalk Method
Crosstalk is an add-on to the htmlwidgets package. It extends the package with a set of classes, functions, and conventions for implementing cross-widget interactions (currently, linked brushing and filtering).
Coordinated brushing is implemented using:
The highlight() function in the plotly package sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.
The bscols() function in the crosstalk package makes it easy to put html elements side by side. It is especially designed to work in an R Markdown document.
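Besides linked brushing, crosstalk also supports filtering. The sketch below is an illustration under stated assumptions: it uses a made-up data frame, creates the shared data with SharedData$new() directly, and adds a filter_select() dropdown, none of which appear in the exercise.

```r
library(crosstalk)
library(DT)

# Toy data (made up for illustration; not the exam dataset)
toy <- data.frame(ID    = c("s1", "s2", "s3"),
                  CLASS = c("3A", "3A", "3B"),
                  MATHS = c(40, 65, 90))

# Shared data object keyed by ID
shared <- SharedData$new(toy, key = ~ID)

# filter_select() builds a dropdown that filters widgets bound to `shared`
class_filter <- filter_select(id = "class_filter", label = "Class",
                              sharedData = shared, group = ~CLASS)

# bscols() lays the filter and the table out on a 12-column grid
layout <- bscols(class_filter, datatable(shared), widths = c(3, 9))
layout
```

Choosing a class in the dropdown should restrict the table to the matching rows when the page is rendered.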
# d is the shared data object created with highlight_key() in section 3.4.4
p11 = ggplot(d,
             aes(ENGLISH, MATHS)) +
  geom_point(size = 1) +
  coord_cartesian(xlim = c(0,100),
                  ylim = c(0,100))

gg = highlight(ggplotly(p11),
               "plotly_selected")

bscols(gg,
       datatable(d),
       widths = 5)

3.7 References
3.7.1 ggiraph
3.7.2 plotly
Carson Sievert (2020) Interactive Web-based Data Visualization with R, plotly, and Shiny, Chapman and Hall/CRC. Online version.
Plotly R Figure Reference provides a comprehensive discussion of each visual representation.
Plotly R Library Fundamentals is a good place to learn the fundamental features of plotly’s R API.
~~~ End of Hands-on Exercise 3A ~~~