Jupyter Notebook, RMarkdown & PDF

Image from Unsplash

Jupyter notebook is a great documentation option when it comes to writing R or Python code. Personally, I use Jupyter notebook together with Visual Studio code a lot when it comes to data analysis and visualization work. But here’s the catch, oftentimes, I could not share my work to others that use RStudio, since you need additional tool to render the Jupyter notebook and if your laptop does not configure Jupyter with R kernel properly, you cannot reproduce the result using R code written on Jupyter notebook.

Today I will walk you through a simple program written in R that helps you to easily convert a Jupyter notebook written with R to a RMarkdown file and also a PDF output with visuals and visual numbering.

Get The Files Ready

For starter, you can prepare a Jupyter notebook with some R codes in it, notice there is also yaml code which will be used for the PDF formatting later:

The yaml code is written in box like this:

---
title:
author:
date:
---

You can also add more information to the yaml code, you can read more about it here:

---
title: "Title"
subtitle: "Subtitle"
author:
  - Name^[Position, email]
abstract: "Blah * 3."
urlcolor: "blue"
date: 19-09-2023
---

Conversion to RMarkdown

To convert this ipynb notebook to a RMarkdown file, we can use the convert_ipynb() function in the rmarkdown package.

rmarkdown::convert_ipynb('filename.ipynb')

After running the code, you will notice that a filename.Rmd file will be produced. You will notice that the file contains two yml sections, you can leave it as it is since only the one from your ippynb notebook will be read and used.

Conversion to PDF

Now, to produce the PDF output from the RMarkdown file, we will need to perform some configurations. Here are some of the things we plan to do:

Hide the input code, for our PDF output, we would like to see the output only
Label the plot and table, we want to give our plots and tables some numbering with captions

In fact, there is also a problem about converting RMarkdown file to PDF format, sometimes the plot position will be incorrect as it will jump to the other section. To resolve this problem, we will need a tex file to configure the output properly.

By running the below code, we will produce a PDF version of our notebook without the input code. You can create a new ipynb notebook and run the code in it:

# generate the RMarkdown file
rmarkdown::convert_ipynb('filename.ipynb')

# turn of showing the input code
knitr_options(
    opts_chunk = opts_chunk$set(echo = FALSE),
)

# convert Rmd to PDF, we also turn on the plot caption and numbering with some latex formatting
render(
    'filename.Rmd',
    pdf_document(fig_caption = TRUE, includes = includes(in_header = "my_header.tex")),
    clean = TRUE,
)

Notice that we also have a command includes = includes(in_header = "my_header.tex"), this code works to prevent the plot image to be improperly placed. You need to create a .tex file name my_header and paste this content into it:

\usepackage{float}
\let\origfigure\figure
\let\endorigfigure\endfigure
\renewenvironment{figure}[1][2] {
    \expandafter\origfigure\expandafter[H]
} {
    \endorigfigure
}

If you run the above code, you will notice that there is still no numbering for the table and plot. To allow numbering, we need to perform some workarounds:

For table output, we can use the kable function from the knit package, it allows us to convert the table into a markdown format with caption for rendering:

kable(df, caption = "This is caption")

To write caption to out plot, it is simple, in the code chunk that produces plot, you can insert the below code:

opts_chunk$set(fig.cap = "This is caption")

There are other configuration you can add to the opts_chunk function such as setting the plot width and height and more, you can check here.

These code should be written in the notebook that you want to render:

After running the conversion program, you can see the rendered PDF document based on your ipynb notebook. You can see that the plot and table are properly number and the plot position is also correct as it is:

Get The Files Ready#

Conversion to RMarkdown#

Conversion to PDF#

Get The Files Ready

Conversion to RMarkdown

Conversion to PDF