Software Tool Article
Revised

A tool for reproducible research: From data analysis (in R) to a typeset laboratory notebook (as .pdf) using the text editor Emacs with the 'mp' package

[version 2; peer review: 2 not approved]
Growth and motility of a melanoma cell line are inhibited in the presence of beta-hydroxybutyrate.
PUBLISHED 30 Mar 2016

Abstract

Software
Much scientific research makes use of commonly available 'office' software. While numerous more fully-featured open-source alternatives exist, the integration of diverse tools and platforms which their use often entails can be challenging. The mp package for Emacs aims to bring together a number of these elements with the goal of simplifying the process of converting an .R file, as used for data analysis, to a nicely formatted .pdf which includes the complete description of the methods and interpretation. We discuss the rationale for development of the package and illustrate its applications and options with a series of experiments from our laboratory.

Experimental work
We demonstrate the inhibitory effects of the ketone body beta-hydroxybutyrate (BHB) on the growth and motility of a cancer cell line. BHB is produced endogenously; levels may be increased in certain medical conditions, e.g. diabetic ketoacidosis. They may also be raised voluntarily, e.g. by adopting the ketogenic diet.

BHB is known to inhibit the growth of other neoplastic cell lines. However, the finding that it can do so in a cell line selected for its propensity to metastasize to the brain is novel. Given the challenges in treating patients with melanoma metastatic to brain, this work strengthens the rationale for investigating the ketogenic diet as a potential adjunct to treatment in such cases.

Keywords

latex, R, knitr, emacs, reproducible research, beta-hydroxybutyrate, melanoma, brain metastases

Revised Amendments from Version 1

We have re-structured the article and added some new sections and references in order to address the issues raised by the referees.

See the authors' detailed response to the review by Eric Schulte
See the authors' detailed response to the review by Frank Harrell

Introduction

Motivation

One of the primary goals of any experimental research is to produce a nicely typeset document which explains the methods and results. This should be sufficient to allow the reader to recreate the work and thus to verify the results (given the correct tools).

In practice, much research is documented by adapting existing 'office'-type software for this purpose (Microsoft, OpenOffice etc.).

While there is much to be said for the ease of use of these techniques, they are not ideally suited to the purpose. In particular, those that employ a 'point-and-click' graphical user interface (GUI) make it impossible to recreate these steps (mouse movements and clicks). The options for generating graphs and analysing data are typically limited and often require the use of separate 'third-party' software for these steps (e.g. SPSS, GraphPad Prism). This again makes the reproduction of results a challenge.

There are many free and open source alternatives which are designed with the needs of the laboratory researcher in mind. Ease of use appears to be the principal reason for their lack of widespread use. The mp (for 'make-pdf') package came about as an attempt to bring elements from a number of these diverse sources together. Data analysis is performed using R1. Typesetting is performed using LaTeX, which has become the industry standard for scientific publications2,3. mp was also motivated by the repetitive nature of much laboratory research. Successive experiments often differ little in method and the analysis often uses the same techniques with new data each time.

Emacs is the text editor which brings these methods together. Emacs itself has been criticized for lack of ease of use, although if used purely as a text/file editor, as in the examples here, it remains quite simple4.

The examples do require some familiarity with R. The transition from familiar GUI-style data-analysis to terminal-based output may also appear daunting at first. For those considering taking the plunge we hope that these simple examples will help to illustrate how easy the language can be to use. As a long-term investment, we feel that the time taken to become familiar with these methods is likely to be more than compensated by subsequent improvements in the speed and simplicity of work-flow.

What does mp do?

mp is a collection of functions and variables which makes and displays a .pdf from 'the materials available'. It aims to do so 'at the touch of a button'. It works primarily with the following file types:

  1. .R

  2. .Rnw or .org

  3. .tex

  4. .el

The final .pdf is generated with latexmk5. mp supports the use of indexes, glossaries, nomenclatures and bibliographies (e.g. with a separate .bib file)6–8.

The output from the whole process is shown in a new window; once complete, errors, warnings and success are highlighted; this typically makes correcting errors more straightforward. Once the .pdf has been generated, it is opened with a viewer (by default 'evince'9).

The red arrow in Figure 1 shows the default route taken through these file types. The function mp-mp can be called on any of the file types above, or on a directory containing such files. Thus it may be used solely, for example, in converting .tex to .pdf.


Figure 1. Flow diagram showing file types and converters in the mp package.

The red arrow shows the default flow path.
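
As a rough illustration of the default route, the following R sketch maps each file type to its default successor and follows the chain to the final .pdf. The lookup table and the function name default_route are assumptions made here for illustration; they are not part of mp, which is written in elisp and offers other routes (e.g. via .org) when its variables are changed.

```r
# Default successor of each file type. This table is an assumption drawn
# from the text: knitr with an .Rnw intermediary is the default, and .el
# files are documented via an .org intermediary.
next_ext <- c(R = "Rnw", Rnw = "tex", org = "tex", el = "org", tex = "pdf")

# Hypothetical helper: list the files produced along the default route.
default_route <- function(file) {
  path <- file
  ext <- tools::file_ext(file)
  while (ext %in% names(next_ext)) {
    ext <- next_ext[[ext]]
    file <- paste0(tools::file_path_sans_ext(file), ".", ext)
    path <- c(path, file)
  }
  path
}

default_route("hello.R")
```

Calling the same function on "hello.tex" would return only the last step of the chain, which mirrors the use of mp-mp solely for converting .tex to .pdf.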

When the process is repeated, prior files are overwritten. Thankfully, Emacs will, by default, save files automatically when modified, although clearly caution is still required.

The most important variable is the intermediary (or 'go-between') file and the step which follows. This intermediary is either a no-web (.Rnw) or an .org file10,11.

.R files. Typically, it is easier to perform data analysis using an .R file directly vs. a more cumbersome intermediary .Rnw or .org.

The .R file is broken up into 'chunks' of code, which (by default) correspond to sections in a corresponding LaTeX document (to be generated). These are separated by headers: ## ---- chunkName. This follows the convention introduced by knitr for naming chunks.

No other 'markup' is employed when processing the .R file. If .org is used as the intermediary, there is an option to convert LaTeX math mode in the .R comments to inline math in LaTeX (as shown later in example 4).
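
The chunk convention described above can be sketched in a few lines of R. This is only an illustration of how headers of the form ## ---- chunkName divide a script into named chunks; the function split_chunks is hypothetical and is not how mp (or knitr) is implemented. In particular, it does not handle optional trailing dashes after the label.

```r
# Hypothetical helper (not part of mp): split the lines of an .R script
# into named chunks using knitr-style '## ---- label' headers.
split_chunks <- function(lines) {
  is_header <- grepl("^## ---- ", lines)
  # Running count of headers; 0 marks any lines before the first header
  chunk_id <- cumsum(is_header)
  labels <- sub("^## ---- ", "", lines[is_header])
  # Keep only the lines that belong to a chunk and are not headers
  keep <- chunk_id > 0 & !is_header
  chunks <- split(lines[keep],
                  factor(chunk_id[keep], levels = seq_along(labels)))
  names(chunks) <- labels
  chunks
}

script <- c("## ---- data",
            "x <- 1:10",
            "## ---- plot",
            "## an ordinary comment inside the chunk",
            "plot(x, sqrt(x))")
chunks <- split_chunks(script)
names(chunks)
```

Each chunk label would then become a section title in the generated document, as in the 'hello world' example later in the article.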

A typical use case would thus involve writing R code (in a .R file), using mp to generate an 'intermediary' .Rnw file then 'fleshing out' the latter to include additional explanatory text:

  1. Create a new directory to hold all the files below.

  2. Draft the code for analysis in .R.
     Arbitrary 'placeholder' data may be used at this stage.

  3. Run mp to create a 'draft' or 'skeleton' .pdf from .R.

  4. Use the intermediary .Rnw file generated to add an experimental protocol.

  5. Complete the experiment.

  6. Update the .R or intermediary file generated with the results and conclusions.

  7. Run mp again on the intermediary .Rnw file to generate the final .pdf.

.Rnw files. Another approach is to write the .Rnw file directly. R code can be integrated into the document; alternatively, an .R file with the same name may be used for the code chunks; mp will try to match the chunks in that file to the sections in the .Rnw document. New sections are added, as necessary, while preserving a common order between the documents where possible.

Entwiners

The intermediary file is then passed on to one of the tools below. Following the existing trend to name these instruments after processes involved in fabric making, we refer to them collectively as 'entwiners'.

  • knitr ('knitter') - the default for mp.

  • Sweave ('S-weave').

  • Emacs' own Org mode (herein org-mode).

mp uses templates (particularly for the preamble) when generating intermediary files; there is one for each entwiner. These contain defaults that are based on the authors' own work-flow. For example, the LaTeX package siunitx is included to allow for the correct display of scientific units12. Our 'hello world' example (Listing 2 below), while too simple to make use of the additional packages loaded, does show how multiple options are set in LaTeX and R (including knitr).

.el files. mp can also generate documentation for an elisp package which is contained in one file. This is done using .org as the intermediary. This feature was added to emulate the nicely typeset package documentation that is standard in R and LaTeX. As an example, the manual for the mp package itself was generated this way and is given as supplementary material.

Control flow

A more detailed diagram, which also shows the customizable variables, is available as Supplementary material (mpFlow.pdf). Some familiarity with setting 'customizable variables' is required when moving beyond the default settings for mp. For more experienced users, knowledge of these functions and variables allows for highly granular control, if desired.

Why elisp?

mp depends on an array of tools. Some are command-line, some use R and some are Emacs packages5,11,13–15. We settled on elisp, the native language of Emacs, as it allows for easy integration of these diverse methods 'under one roof'16. There is a good deal of 'text manipulation' involved; as an extensible text-editor, Emacs is particularly well suited to this task.

Elisp supports asynchronous processing. That is, the Emacs terminal is not 'frozen' during the process of .pdf generation and thus an R session or text editing can continue uninterrupted.

Lisp itself, while no longer popular, is arguably comparable in efficiency and speed to any of the widely used programming languages. It can be interpreted, allowing for rapid development; and compiled, allowing for improvements in speed when required. While the syntax is often said to be 'off-putting' initially, it may also be described as 'expressive', allowing for the concise and efficient representation of problems. Any language with the ability to write to and read from files and send text to the command line could have been used for the purpose. Thus the methods could have been implemented in R or LaTeX, although perhaps in a more lengthy and less readable form. Doing so would also lose some of the tight integration with Emacs which was a goal of the package.

Related tools

Methods used by mp

Converting from an intermediary to a .tex file is performed by one of the entwiners shown in Table 1.

Table 1. Choice of mp-entwiner when converting from .R to .tex.

Method | Advantages | Disadvantages
knitr | Nice code formatting | Settings may be incompatible with other loaded LaTeX packages
Sweave | Supported by R-core; .tex files easy to read | More limited options than knitr; only one figure per chunk
Org | Tables easy to write; markup is simpler than LaTeX; can include code from other languages; use math in code comments; export to HTML | Formatting not as 'nice' as knitr; only one figure per chunk

Sweave. This is the oldest and best-supported of these converters13. It is the only such method supported by R-core. It continues to be used as a standard tool for R package developers writing vignettes.

It does suffer from a number of limitations relative to its counterparts. In particular, the displayed code 'as-is' has little formatting and no color. Only one figure per 'chunk' is supported. This may be overcome by reading/writing files in R directly, although this can be tricky to implement.

Knitr. This has superseded Sweave for most practical purposes14. The code is much easier to read. It also allows for multiple figures per chunk, with their own captions. It allows the chunks to be kept in a separate source file - as opposed to requiring them to be part of the .Rnw file. There are more options for the display of terminal output, including handling of error messages. Like Sweave, it can be used to build R package vignettes.

org-mode. .org files are arguably more intuitive to read and edit than .tex, particularly for users new to the latter. Tables are simpler to read, create and modify. The use of 'collapsible' section headers makes it easy to see the structure of a document at-a-glance before expanding one section for further editing17,18.

While org-mode loses the attractive code printing of knitr, some worthy alternatives are provided by the LaTeX package listings19. In mp, the default settings adopted for listings are modeled after the knitr defaults (although admittedly not quite as attractive).

These include the option to include LaTeX maths markup in the code commentary, for example to display equations.

Like Sweave, org-mode chunks suffer from the drawback that multiple figures per chunk are not supported by default.

By contrast with the other entwiners, org-mode allows for conversion/export to multiple file types. By default, mp converts .org to .tex but alternatives are straightforward, such as to .html or to Markdown (.md).

Allied approaches

Converting .org to .Rnw is possible using the ravel package for Emacs20. This is also possible with the pander package for R, which integrates R with the Haskell library pandoc21,22. It is broader in scope than ravel and aims to convert a wide range of file types. Both could serve as alternatives to any of the entwiners above or be used in conjunction with them. We have chosen to stick with the three above as these appear better established.

Closer integration between R and LaTeX is possible using the LaTeX packages knitrl and spaper; although not yet part of the Comprehensive TeX Archive Network (CTAN), they are readily available on github23,24. The author, Dr F Harrell, also provides some very useful .Rnw templates for use in statistical reporting; we encourage the reader to explore these methods at https://github.com/harrelfe/rlatex. mp, by contrast, takes the approach of storing its templates as customizable variables within Emacs. This has the potential advantage of having them 'close to hand' when working in Emacs, easy customization, type-checking and persistence across Emacs sessions.

Alternatives to mp

There are many other good tools available which aim to bridge the gap between .R and .pdf, although to our knowledge, mp is the only one which allows the user to generate one file type from the other with a single keystroke. We suggest that having more options available for the task is by no means a bad thing; doubtless some of these methods will be more appealing depending on the user's background or the task at hand.

RStudio. This integrated development environment (IDE) is the leading alternative to Emacs for working with R and generating .pdfs25. It is probably the best choice for those new to combining R and LaTeX. It has a 'friendlier' GUI than Emacs and the menus arguably simplify access to functions. It has better integration with Rmarkdown (see below)26. XeLaTeX is supported, although at the time of writing LuaLaTeX does not appear to be (in contrast to mp).

Having used RStudio on a daily basis for two years, our first author ultimately found Emacs to be preferable, primarily on account of the gain in speed when editing text/code and also due to the ability to customize and improve the environment as required for specific tasks. With RStudio, there is also the major drawback when compiling .pdfs that the application pauses with no output until the process is complete. It is also typically more 'memory hungry' than Emacs. If running multiple R sessions, a new copy of the application needs to be opened for each. These limitations are minor for small files and data sets but can become a major inconvenience with more complex tasks. Finally, RStudio requires a license for commercial use, whereas all the elements in mp are freely available.

Rmarkdown. This R package is another way of feeding text and code to knitr. It combines the attractive features of the latter with a simplicity of style similar to org-mode. It too allows for export to multiple file types, including .html and Microsoft .doc. The process generates an additional .md file which is then processed by pandoc. Due to the additional dependencies introduced, we have not sought to include these methods in mp.

knitr. This entwiner already integrates with some GUI-style .tex editors, particularly LyX. The latter is part of 'Scientific Workplace' (SW), which, like mp, tries to make life easier for the laboratory researcher by providing a simplified work space. In the case of SW, a GUI is preferred to directly editing files27.

Minted. This is an alternative to the listings package for typesetting code with LaTeX28. While admittedly often more attractive for code display, it requires an external Python dependency. Also, for code chunks which span more than one page, automating background coloring is currently challenging.

Experimental work: BHB, cell growth and migration

Beta-hydroxybutyrate is a source of energy produced by the liver when the body is in ketosis, i.e. when the availability of glucose/sugars as a source of fuel is limited.

Increasing ketones in the blood lead to higher rates of fatty acid oxidation and an increase in the production of acetyl-CoA. When the amount of acetyl-CoA exceeds the capacity of the tricarboxylic acid cycle to utilize it, there is an increase in the production of the ketone bodies (BHB and acetoacetate (AcAc)).

One of the hallmarks of cancer is the dysregulation of metabolism. Cancer cells are particularly dependent on glucose as an energy source whereas normal tissue can readily adapt to using ketone bodies as an alternative. This is in part due to genetic and mitochondrial defects in cancer cells29–35.

Thus, a number of treatments involving the modification of diet to stimulate ketone production have been suggested: the ketogenic diet, caloric restriction and intermittent fasting. These strategies have been studied in various in vivo models of glioma, a malignant brain tumor. They have demonstrated increased survival as well as anti-tumor effects36.

The ability of BHB to inhibit cancer cell growth and migration has long been recognized37–39. The work from Magee et al., from 1979, also features the B16 melanoma cell line. These investigators demonstrated a reduction in the number of lung metastases following an injection of cells to the tail vein of mice, in those receiving a diet of just fats and water vs. sucrose and water.

This phenomenon is of particular interest to our laboratory as we have demonstrated that the ketogenic diet (whereby energy requirements are met almost exclusively with fat) enhances the response of glioma to radiation and chemotherapy in a mouse model40.

We sought to determine whether the same phenomenon would be observed with other cancer types which are commonly metastatic to the brain; in particular melanoma.

Methods

Software implementation

The software may be obtained via e.g. git clone https://www.github.com/dardisco/mp

The following should then be placed in your init.el file:

;; replace /path/to/mp with the directory holding the cloned repository
(add-to-list 'load-path "/path/to/mp")
(require 'mp)

mp should then be available once Emacs starts. It is a 'minor mode' for Emacs; the sole keybinding invokes the function mp-mp with Ctrl-Alt-| (the '|' key is usually found above the 'Return' key; in Emacs parlance this binding is written C-M-|). mp-mp is a gateway to all of the package functions; these can also be run individually ('interactively') as required, using execute-extended-command (M-x).

mp-mp prompts for a file name; if none is supplied, it will look first at the current buffer. If this is not an .R, .Rnw, .org, .tex or .el file, it will select from the default-directory the appropriate file which was most recently modified. If no such file is found, it will then search up the directory tree.
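
The fallback rule can be sketched in R as follows. This is a minimal illustration under our reading of the description above, not the elisp used by mp; the function name newest_supported is hypothetical, and returning NA when nothing is found is an assumption.

```r
# File types handled by mp, as listed in the text
supported <- c("R", "Rnw", "org", "tex", "el")

# Hypothetical helper: return the most recently modified supported file
# in 'dir', moving up one directory at a time if none is found.
newest_supported <- function(dir = getwd()) {
  dir <- normalizePath(dir)
  repeat {
    files <- list.files(dir, full.names = TRUE)
    files <- files[!dir.exists(files)]                     # drop sub-directories
    files <- files[tools::file_ext(files) %in% supported]  # keep known types
    if (length(files) > 0) {
      return(files[which.max(as.numeric(file.mtime(files)))])
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) return(NA_character_)      # reached the root
    dir <- parent
  }
}
```

For example, newest_supported("~/experiments/8k") would return the newest .R, .Rnw, .org, .tex or .el file in that (hypothetical) directory, or in the nearest parent directory which contains one.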

Instead of a file name, the single character 'p' may be given to display the appropriate .pdf associated with the current file or directory.

Operation

System requirements: this should work with any recent version of Emacs, which is platform independent (i.e. works on Windows, Linux, Mac-OS). Version ≥ 24.4 is recommended to allow for automated export of .org to .tex.

To export to HTML, the elisp package htmlize is required15.

A recent installation of R (>3.0) and TeX (2013 and on) is also assumed. We used TeX Live 2013 for these examples.

No support for caching is provided, although with short documents similar in scale to the examples below this should not result in much loss of performance. The time to compile is <10 secs with an Intel i5-2430M processor for all of the examples given. When speed is an issue, we suggest temporarily changing your PATH to allow the relevant R and LaTeX binaries to run from a temporary RAM drive.

'Hello world' with mp. We begin with the simplest use case. We create the directory hello and place the contents of Listing 1 into hello.R.

Listing 1: The file hello.R

## ---- Print hello
print("Hello world")
## ---- Here are two graphs
## here's a comment
## (x and y must have the same length)
plot(1:10, seq(10, 100, by=10))
## this second graph should appear
## below the first
plot(1:10, sqrt(1:10))

The final product, hello.pdf, is shown in Figure 2.


Figure 2. The file hello.pdf.

The intermediary file hello.Rnw will be left open in Emacs for further modification and this is shown in Listing 2.

Listing 2: The file hello.Rnw

%
\documentclass{article}
%
% modified from default setup for knitr
%
\usepackage[]{graphicx}
\usepackage[]{color}
\usepackage{framed}
% recommended with ‘knitr’
\usepackage{alltt}
\usepackage{mathtools}
\usepackage[sc]{mathpazo}
\usepackage{geometry}
\geometry{verbose, tmargin=2.5cm, bmargin=2.5cm,
  lmargin=2.5cm, rmargin=2.5cm}
\setcounter{secnumdepth}{2}
\setcounter{tocdepth}{2}
\usepackage{url}
\usepackage{hyperref}
\hypersetup{unicode=true, pdfusetitle}
\hypersetup{bookmarks=true, bookmarksnumbered=true}
\hypersetup{bookmarksopen=true, bookmarksopenlevel=2}
\hypersetup{breaklinks=false, pdfborder={0 0 1}}
\hypersetup{backref=false}
\hypersetup{colorlinks=true}
\definecolor{myDarkBlue}{rgb}{0, 0, 0.5}
\hypersetup{linkcolor=myDarkBlue}
\hypersetup{pdfstartview={XYZ null null 1}}
%
% Palatino family
\usepackage[T1]{fontenc}
\usepackage{mathpazo}
\linespread{1.05}
\usepackage[scaled]{helvet}
%
% other useful additions
%
%% for rerunfilecheck:
% no need to rerun to get outlines right
\usepackage{bookmark}
%% for nice tables
\usepackage{booktabs}
%% for e.g. \formatdate
\usepackage{datetime}
%% for SI units
\usepackage{siunitx}
%% for chemical symbols
\usepackage[version=3]{mhchem}
%% to use forced ’here’
%% e.g. \begin{figure}[H]
\usepackage{float}
%% for large numbers of floats
\usepackage{morefloats}
%% to keep floats in same section
\usepackage[section]{placeins}
%% for tables > 1 page
\usepackage{longtable}
%
%%----------------------------------------
%
\begin{document}
%
% knitr chunks
<<setup, include=FALSE>>=
library(knitr)
### Set global chunk options
opts_chunk$set(
eval=TRUE,
## text results
echo=TRUE,
results=c('markup', 'asis', 'hold', 'hide')[1],
collapse=FALSE,
warning=TRUE, message=TRUE, error=TRUE,
split=FALSE, include=TRUE, strip.white=TRUE,
## code decoration
tidy=FALSE, prompt=FALSE, comment='##',
highlight=TRUE, size='normalsize',
background=c('#F7F7F7', colors()[479], c(0.1, 0.2, 0.3))[1],
## cache
cache=FALSE,
## plots
fig.path=c('figure', 'figure/minimal-'),
fig.keep=c('high', 'none', 'all', 'first', 'last')[1],
fig.align=c('center', 'left', 'right', 'default')[1],
fig.show=c('hold', 'asis', 'animate', 'hide')[1],
dev=c('pdf', 'png', 'tikz')[1],
fig.width=7, fig.height=7, #inches
fig.env=c('figure', 'marginfigure')[1],
fig.pos=c('', 'h', 't', 'b', 'p', 'H')[1])
### Set R options
options(formatR.arrow=TRUE, width=60)
knit_hooks$set(inline = function(x) {
  ## if (is.numeric(x)) return(knitr:::format_sci(x, 'latex'))
  highr::hi_latex(x)
})
## uncomment below to change theme
## knit_theme$get()
## opts_knit$set(out.format='latex')
## thm1 <- knit_theme$get('acid')
## knit_theme$set(thm1)
@
% knitr read chunks
<<readChunks, include=FALSE>>=
read_chunk('hello.R')
@
\title{hello}
\author{Chris Dardis}

\maketitle
% page numbers appear top-right
\pagestyle{headings}
\tableofcontents

\subsection{Print hello}

<<Print hello>>=
@

\subsection{Here are two graphs}

<<Here are two graphs>>=
@

%
%% \bibliographystyle{plain}
%% \bibliography{hello}
\end{document}

Experimental work

The purpose of the current work was to demonstrate an inhibitory effect of BHB on the growth and migration of a melanoma cell line in vitro. Details of the aims, methods, results and conclusions are given for each experiment separately as Supplementary material (see Table 2).

Table 2. Steps taken in generating examples.

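No. | File | Entwiner | Edit | TeX | Final
1 | 8k | knitr | .Rnw* | LaTeX | .pdf*
2 | 16k | knitr | .Rnw* | LaTeX | .pdf*
3 | 16k2 | Sweave | .Rnw* | XeTeX | .pdf*
4 | 32k | Org | .org* | LaTeX | .pdf*
5 | 32k2 | Org | .org* | LaTeX | .html*
6 | sa | knitr | .Rnw* | XeTeX | .pdf*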

* = file available as Supplementary material

These experiments would normally make up just one part of an article, where it would be reasonable to combine all of these as one supplement. We provide the experiments separately for the sake of illustrating features of the mp package.

Cells. Melanoma B-16 cells were obtained from American Type Culture Collection (ATCC). To facilitate quantitative measurement of tumor growth, they were modified as described previously41. The cells were stably transfected with the gene encoding luc2 (luciferase) using the pGL4.51 [luc2/CMV/Neo] vector (Promega Corp, Madison, WI) and FuGENE 6 Transfection Reagent (Roche Applied Science, Indianapolis, IN) following conditions specified by the manufacturer. They were then injected into the right ventricle of a mouse. These animals were sacrificed when bioluminescence was detected in the brain. Cells metastatic to the brain were recovered and put into culture. These cells are designated B16-F1-Luc2-BR2.

Effects of BHB on growth. Firstly, it was necessary to determine the optimal starting number of cells to seed a growth curve. If the cells are plated too sparsely, they have a tendency not to grow. Likewise if the concentration is too great to start with, then this will make any difference due to the effect of BHB difficult to see. This is complicated further by growing the cells in a circular well - the cells are in closer communication close to the center and more separated close to the edge. One technical report suggested a starting number of 4000 per well in a 12-well plate for most cell lines42. In choosing a starting number, we were guided primarily by prior work in our own laboratory.

To determine whether BHB would inhibit cell growth, we chose a concentration of 10 mmol/L, again based primarily on our prior experience. Under normal conditions, serum BHB has an upper reference limit of 0.3–0.5 mmol/L. Tests of BHB concentration (for patients) are available commercially and are commonly used to aid in the diagnosis of diabetic ketoacidosis, where levels are usually at least 2 mmol/L.

Even higher levels are seen in patients adhering to the ketogenic diet. This diet is most commonly employed in the treatment of drug-resistant epilepsy, where serum BHB concentration has been shown to correlate with the degree of seizure control43. In this study of 74 children, median levels were 8–10 mmol/L.

We chose a concentration of 10 mmol/L in order to be certain that an effect was present, before considering whether lower levels would be similarly efficacious.

Effects of BHB on migration. The scratch assay is a simple and widely used test to assess cell migration. Details of a typical protocol for the technique are available44. Using the same cell line, we sought to illustrate impaired migration in the presence of BHB.

Results

Each experiment is contained in a separate file, shown in Table 2. These are all available as Supplementary material and we encourage the reader to have these available for reference. Each experiment is typeset in a different way to demonstrate some of the options available when doing so with mp.

The file stems (i.e. the file name without a suffix) refer to the starting quantity of cells per 12-well plate. 'sa' in the last example stands for 'scratch assay'.

Experiment 1: 8k.pdf

Experiment: Establish number of cells required to establish sustained growth. Start with 8000 cells per 12-well plate. Our first aim was to establish the optimal number of B16-F1-Luc2-BR2 cells with which to seed a growth curve. We wished to establish sustained growth over the period of the experiment, without reaching a ceiling (i.e. maximum density).

Typesetting: default settings. The file 8k.R is given in Listing 3:

Listing 3: The file 8k.R

## ---- data
(df1 <- data.frame(Day = rep(0:4, each=3),
                   Well = rep(LETTERS[1:3], 5),
                   Count = c(rep(8e3, 3),
                             193e3, 148e3, 159e3,
                             78e3, 63e3, 79e3,
                             33e3, 56e3, 33e3,
                             13e3, 15e3, 10e3)))
## ---- stdError
stdErr <- function(x) sqrt(var(x)) / sqrt(length(x))
library(plyr)
(df2 <- ddply(df1, .(Day),
              function(df)
                c(mean = mean(df$Count),
                  SE = stdErr(df$Count),
                  'mean+SE' = mean(df$Count) + stdErr(df$Count),
                  'mean-SE' = mean(df$Count) - stdErr(df$Count))))
## ---- plot
library(plotrix)
# confidence interval plot
with(df2, plotCI(x=Day, y=mean, ui=mean+SE, li=mean-SE,
                 xlab="Day",
                 ylab="Cell counts (Cells/ ml)",
                 axes=FALSE))
axis(side=1, at=c(0:4))
axis(side=2, at=seq(0, 150e3, by=50e3))
box()
with(df2, lines(Day, mean))

With the above file open, we call mp-mp to generate the .pdf; page 1 of the result is shown in part in Figure 3.


Figure 3. Page 1 of the initial 8k.pdf.

As an intermediary step, the file 8k.Rnw is generated. This is an R no-web file, which is essentially a typical .tex file with the additional code chunks inserted10. (The term 'web' is not in reference to the world-wide web but to distinguish it from a contemporaneous approach to literate programming known as 'web'.)

We modify this to add details of the experimental protocol, written in LaTeX. The principal modifications are the use of 12pt for typesetting, which we adopt from here on, and the inclusion of some graphics with the LaTeX subfig package45.

With 8k.Rnw still open, we call mp-mp again to generate the final .pdf.
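
For readers without plyr installed, the summary step of Listing 3 can be approximated in base R. The sketch below uses only the first two days of counts from Listing 3; the names counts and summary_df are illustrative and do not appear in the supplementary files.

```r
# Standard error of the mean, equivalent to stdErr in Listing 3
stdErr <- function(x) sd(x) / sqrt(length(x))

# Days 0 and 1 of the counts given in Listing 3
counts <- data.frame(Day = rep(0:1, each = 3),
                     Count = c(rep(8e3, 3), 193e3, 148e3, 159e3))

# Per-day mean and standard error using tapply instead of plyr::ddply
means <- tapply(counts$Count, counts$Day, mean)
ses <- tapply(counts$Count, counts$Day, stdErr)
summary_df <- data.frame(Day = as.integer(names(means)),
                         mean = as.vector(means),
                         SE = as.vector(ses))
summary_df
```

The resulting data frame has one row per day and could be passed to plotCI in the same way as df2 in Listing 3.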

Experiment 2: 16k.pdf

Experiment: Number of cells required to establish sustained growth. 16000 cells per 12-well plate. Again, we tried to establish the optimal number of B16-F1-Luc2-BR2 cells with which to seed a growth curve. We increased the starting number of cells per well from 8000 to 16000. This appeared to be more promising.

Typesetting: using R to generate .tex output. We again begin with an .R file. This time however, we use R to create LaTeX output, rather than displaying the results of a chunk as R code. The output from R is interpreted as LaTeX directly - rather than as typical terminal output. We use the R xtable package for this purpose; there are many good alternatives46.

An example of R code generating output which can be interpreted by LaTeX is shown in Listing 4.

Listing 4: From the file 16k.R

## ---- Cell counts on Day 0
df1 <- data.frame(Cells_per_ml = c(392e3, 332e3, 300e3))
library(xtable)
print(xtable(df1,
             caption="Cell counts on day 0",
             align=c("l", "c"),
             display=c("d", "e"),
             label="tab:ccd0"),
      booktabs=TRUE)

Again, the .Rnw file needs some modification. The main change is to switch the knitr option results='markup' to results='asis', so that the output of each chunk is passed through unaltered and interpreted as .tex.
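
To make the distinction concrete, the following base-R sketch prints a data frame as a bare LaTeX tabular. In a chunk with results='asis' such output would be typeset by LaTeX; with results='markup' it would instead appear verbatim as terminal output. The helper df_to_tabular is hypothetical and far simpler than xtable, which is what the supplementary files actually use; it does not escape special characters such as underscores.

```r
# Hypothetical helper: write a data frame to the console as a LaTeX
# tabular, one right-aligned column per variable.
df_to_tabular <- function(df) {
  cat("\\begin{tabular}{", strrep("r", ncol(df)), "}\n", sep = "")
  cat(paste(names(df), collapse = " & "), "\\\\\n")
  for (i in seq_len(nrow(df))) {
    row <- format(unlist(df[i, , drop = FALSE]))
    cat(paste(row, collapse = " & "), "\\\\\n")
  }
  cat("\\end{tabular}\n")
}

# Day-0 counts from Listing 4, with a column name free of underscores
df_to_tabular(data.frame(Count = c(392e3, 332e3, 300e3)))
```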

We can also restructure the document to allow the code output (here, in the form of tables or single lines of terminal output) to be mixed with the main text. This ensures there is no needless duplication of data entry, as is the case in the earlier example.

The output is shown as Supplementary material in 16k.pdf. Most of the materials and methods are unchanged from 8k.pdf.

Experiment 3: 16k2.pdf

Experiment: Demonstrate impaired growth in the presence of BHB. 16000 cells per 12-well plate. We sought to determine whether the growth of B16-F1-Luc2-BR2 cells would be impaired in the presence of BHB at a concentration of 10 mmol/L.

Due to the large standard error on day 4, no difference could be demonstrated via a t-test. However, comparing linear models with and without treatment did show a significant effect (analysis of variance (ANOVA), p = 0.048).
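The two approaches can be sketched in R as follows. This is an illustration under stated assumptions, not the analysis in 16k2.pdf: the data are simulated, the counts are modelled on a log scale, and the treatment enters the larger model as an interaction with day. None of these choices is confirmed by the text above.

```r
set.seed(1)  # simulated data only; these are not the authors' measurements

# Three hypothetical wells per day (days 0 to 4) for each of two groups.
day <- rep(0:4, each = 3, times = 2)
treatment <- factor(rep(c("control", "BHB"), each = 15),
                    levels = c("control", "BHB"))

# Assumed growth on a log scale, with a slower rate under BHB.
log_count <- log(16000) + 0.6 * day -
  0.1 * day * (treatment == "BHB") + rnorm(30, sd = 0.3)

# Approach 1: t-test on a single day (day 4 only).
is_day4 <- day == 4
tt <- t.test(x = log_count[is_day4 & treatment == "control"],
             y = log_count[is_day4 & treatment == "BHB"])

# Approach 2: compare nested linear models without and with treatment.
m0 <- lm(log_count ~ day)
m1 <- lm(log_count ~ day * treatment)
cmp <- anova(m0, m1)
p_models <- cmp[["Pr(>F)"]][2]

print(tt$p.value)
print(p_models)
```

The model comparison uses every time point, whereas the t-test uses only the wells counted on one day, which is one reason the two can disagree.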

Typesetting: using Sweave and XeTeX. Here, we change the variable mp-entwiner to Sweave. We also change mp-latex to XeLaTeX to allow the use of an OpenType font (by default this is 'Linux Libertine'). While certain OpenType fonts are available as packages for LaTeX, XeTeX allows for a greater range and greater flexibility.

As in example 2, the output from each chunk is again typically interpreted as .tex (rather than displayed as R code) by including results=tex in mp-Sweave-opts.

Some modifications to the .Rnw file generated follow. There is one chunk where we prefer the output to retain typical R code formatting. Also we need to change the variable mp-Sweave-opts to fig=TRUE for the chunk which produces a plot.
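As a minimal sketch of these two chunk options, the following R code writes a small Sweave source file and weaves it with utils::Sweave, which ships with R. The file name demo.Rnw, the chunk labels and the plotted values are hypothetical; only the day 0 counts come from Listing 4. The sketch stops at the .tex file and does not run XeLaTeX.

```r
# Minimal Sweave source: one results=tex chunk whose output is read as
# LaTeX, and one fig=TRUE chunk whose plot is saved and included.
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<counts, results=tex, echo=FALSE>>=",
  "cat('\\\\textbf{Mean cells per ml:}', mean(c(392e3, 332e3, 300e3)), '\\n')",
  "@",
  "<<growth, fig=TRUE, echo=FALSE>>=",
  "plot(0:4, 2^(0:4), xlab = 'Day', ylab = 'Relative cell count')",
  "@",
  "\\end{document}"
)

# Sweave writes its output to the working directory, so work in a
# temporary directory and restore the original one afterwards.
old_wd <- setwd(tempdir())
writeLines(rnw, "demo.Rnw")
utils::Sweave("demo.Rnw")
tex <- readLines("demo.tex")
setwd(old_wd)

# Show the lines produced by the two chunks.
print(grep("textbf|includegraphics", tex, value = TRUE))
```

Without fig=TRUE the plot chunk would run but no graphic would be included in the woven file, which is why that option has to be set for the plotting chunk specifically.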

This example also shows the value of LaTeX for typesetting equations: the calculations of the correct masses are nicely displayed and easy to follow. This is shown in Figure 4.


Figure 4. Typeset equations, from 16k2.pdf.
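A calculation of this kind might be typeset as in the sketch below. It is not a reproduction of the equations in Figure 4. It assumes the free acid form of BHB (C4H8O3, molar mass approximately 104.1 g/mol) and a hypothetical volume of 100 mL of medium; the form of BHB and the volumes actually used are not stated in this section. The align* environment requires the amsmath package.

```latex
% m: mass of BHB required, c: target concentration,
% V: volume of medium (assumed), M: molar mass (free acid, assumed)
\begin{align*}
  m &= c \times V \times M \\
    &= 10\,\mathrm{mmol/L} \times 0.1\,\mathrm{L} \times 104.1\,\mathrm{g/mol} \\
    &= 104.1\,\mathrm{mg}
\end{align*}
```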

Experiment 4: 32k.pdf

Experiment: Demonstrate impaired growth in the presence of BHB. 32000 cells per 12-well plate. As the preceding experiment was inconclusive, we tried the same technique again, this time using a higher number of cells to start with.

Unfortunately, no observations were taken on day 3 of the experiment. This meant that neither the t-test on day 4 nor the ANOVA comparing linear models was significant, although the latter did come close (p = 0.07).

Typesetting: using Org. Changing mp-entwiner to Org allows us to use an .org file as the intermediary step.

The output from this process is given as 32k.pdf. Again, some minor modifications to the intermediary file are required. In this case we need to change the headers for the chunk which produces a plot to specify the output of graphics (as opposed to text) and to specify an external file for the graphical output.

We also change some of the results to latex and pp (pretty print) to give examples of different styles of output. 32k.pdf also features a printout of help for an R function.

Experiment 5: 32k2.pdf

Experiment: Demonstrate impaired growth in the presence of BHB. 32000 cells per 12-well plate. We repeated the preceding experiment and this time there was a conclusive difference between the cell counts in those grown with BHB vs. controls.

Typesetting: using Org to generate HTML output. This example shows how the .org intermediary can be converted to .html as an alternative to .pdf. Converting to HTML means we lose all but the simplest LaTeX commands and so the document needs to be written in a simpler style. While we lose some of the nice typesetting features of LaTeX, this simplicity has its own attraction.

These files are faster to produce and smaller than their .pdf counterparts (1/5 the size in this example). HTML is also likely to be simpler to integrate into an existing website and offers the possibility of almost 'real-time' reporting of experimental results.

Experiment 6: sa.pdf

Experiment: Demonstrate impaired cell mobility in the presence of BHB. Here we use a scratch assay44. A line is made in the center of the wells on a 12-well plate with confluent cell growth. Images taken at various time points are analyzed with the freely available ImageJ software. The distance between the cells on either side of the scratch decreases as they migrate back to the center. This corresponds to a decrease in 'image density' as measured by ImageJ.

Typesetting: emulating a laboratory notebook with XeTeX. The final example shows how a cursive font may be used to mimic a typical lab notebook. It also shows how graphics may be combined in a table to help with clarity of display. We set mp-entwiner to knitr and set mp-latex to XeTeX to allow the use of an unusual font. A sample is shown in Figure 5.


Figure 5. Cursive script in sa.pdf.

Summary of significant results. As shown in Figure 6, we were able to demonstrate a difference in growth rates with an initial concentration of cells of 32,000 per well and a concentration of BHB of 10 mmol/L. This is shown in full in 32k2.html.


Figure 6. Effect of BHB (10 mmol/L) on the growth of B16-F1-Luc2-BR2 cells.

Taken from 32k2.html (Supplementary material).

We were also able to demonstrate a decrease in cell migration using the scratch assay, as shown in Figure 7, which is taken from sa.pdf.


Figure 7. Effect of BHB (10 mmol/L) on the migration of B16-F1-Luc2-BR2 cells.

Taken from sa.pdf (Supplementary material).

Discussion

Software

The above examples illustrate some of the options available when using mp. While it may not be the tool of choice for every application, it has been a valuable addition to our own laboratory work and we hope this to be the case for others engaged in similar work.

Much scientific research is reported in the form of a summary and it has not been traditional for researchers to provide detailed accounts of the original experiments along with the original observations/results made at the time. We acknowledge that the details of individual experiments, particularly 'pilot' studies of an exploratory nature or those which yielded negative results are likely to be of limited interest to a general readership. Including these as supplements appears a reasonable approach.

The classic quote on reproducible research is from Buckheit and Donoho47:

  •    An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and complete set of instructions which generated the figures.

The same may well be said of biology. Within a traditional print-form journal, this is justifiable as constraints of space naturally limit the length of articles. However, this has been obviated by the option to refer to online-only Supplementary material.

Such resources may be valuable for those seeking to reproduce particular steps in the research. They may also be of interest to those working on closely related problems, in particular by saving time on preparatory work. Providing original 'raw' data may also allow results to be combined with subsequent experiments. Furthermore, novel techniques, perhaps not developed at the time of writing, may later be applied by other researchers, perhaps leading to new or deeper insights.

Experimental work

The inhibitory effects of BHB on the growth of neoplastic cells have now been observed in a number of cell lines in our laboratory, as well as those of other investigators37.

Impairment of growth of neoplastic cells due to ketones has been suggested to be in part due to the polyacetylation of histones48. Ketone bodies also impair glycolysis, on which neoplastic cells are more heavily dependent. This occurs in part through inhibition of the activity of phosphofructokinase and hexokinase49,50.

The finding that inhibition of growth occurs in a cell line selected for its propensity to metastasize to the brain is novel. Brain metastases are a significant contributor to morbidity and mortality in patients with cancer51. We feel that this approach deserves further investigation. In particular, it would be of interest to determine whether the same phenomenon would be observed in breast and lung cancer cell lines with a propensity to metastasize to the brain. It would also be interesting to determine whether the inhibitory effects of BHB on cancer cells are evident in vivo when ketones are given as a dietary supplement with a relatively normal diet, as opposed to the ketogenic diet, which may be challenging to adopt for patients affected by cancer.

Conclusions

Software

The mp package appears to be a useful addition to the tools available for facilitating reproducible research. Future development will likely include increased support for the LaTeX biblatex package. Collaboration and suggestions for improvement are welcome. Please address these to the development site on GitHub if possible.

Experimental work

We have demonstrated here that BHB, at a concentration of 10 mmol/L, is inhibitory to the growth and migration of B16-F1-Luc2-BR2 melanoma cells, which were selected for their propensity to form brain metastases.

Data availability

The data sets for all experiments are available as part of the Supplementary material.

One of the primary motivations for developing this package was to facilitate sharing of data sets such as these.

Software availability

Archived source code as at the time of publication

http://dx.doi.org/10.5281/zenodo.2118345

Software license

GPL >= 3

how to cite this article
Dardis C, Woolf EC and Scheck AC. A tool for reproducible research: From data analysis (in R) to a typeset laboratory notebook (as .pdf) using the text editor Emacs with the 'mp' package [version 2; peer review: 2 not approved]. F1000Research 2016, 4:483 (https://doi.org/10.12688/f1000research.6800.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Version 2
Reviewer Report 31 Mar 2016
Frank Harrell, Department of Biostatistics, Vanderbilt University, Nashville, USA 
Not Approved
The authors have significantly improved the article but I regret that I cannot approve an article that implies that analysts should learn Sweave when it is now completely obsolete.  In addition, the article fails to take into account the Emacs ... Continue reading
HOW TO CITE THIS REPORT
Harrell F. Reviewer Report For: A tool for reproducible research: From data analysis (in R) to a typeset laboratory notebook (as .pdf) using the text editor Emacs with the 'mp' package [version 2; peer review: 2 not approved]. F1000Research 2016, 4:483 (https://doi.org/10.5256/f1000research.8987.r13111)
Version 1
PUBLISHED 05 Aug 2015
Reviewer Report 23 Nov 2015
Frank Harrell, Department of Biostatistics, Vanderbilt University, Nashville, USA 
Not Approved
I cannot approve this article in its current form.  The authors may be able to undertake a major revision that makes the paper acceptable.

I realize that this is a subjective opinion, but I just cannot agree with the software implementation ... Continue reading
HOW TO CITE THIS REPORT
Harrell F. Reviewer Report For: A tool for reproducible research: From data analysis (in R) to a typeset laboratory notebook (as .pdf) using the text editor Emacs with the 'mp' package [version 2; peer review: 2 not approved]. F1000Research 2016, 4:483 (https://doi.org/10.5256/f1000research.7310.r11299)
  • Author Response 30 Mar 2016
    Chris Dardis, Department of Neurology, Barrow Neurological Institute, Phoenix, USA
    30 Mar 2016
    Author Response
    Apologies for the delay in responding. Thank you for this review; we appreciate the points you raise and feel the article has been improved in addressing these.

    1) I (first author) ... Continue reading
Reviewer Report 10 Sep 2015
Eric Schulte, Department of Computer Science, University of New Mexico, Albuquerque, NM, USA 
Not Approved
The report is insufficiently well written for me to evaluate the importance of the suggested tool.

The current version of this article is unclear as to (1) the functionality provided by mp, (2) the intended usage model of mp. It also ... Continue reading
HOW TO CITE THIS REPORT
Schulte E. Reviewer Report For: A tool for reproducible research: From data analysis (in R) to a typeset laboratory notebook (as .pdf) using the text editor Emacs with the 'mp' package [version 2; peer review: 2 not approved]. F1000Research 2016, 4:483 (https://doi.org/10.5256/f1000research.7310.r10253)
  • Author Response 30 Mar 2016
    Chris Dardis, Department of Neurology, Barrow Neurological Institute, Phoenix, USA
    30 Mar 2016
    Author Response
    Apologies for the delay in responding. Thank you for this review; your points are well-taken. I have tried to clarify things by re-structuring the document and adding some new material. ... Continue reading