Using R Markdown to Share Analysis

You have written R code that performs an analysis. How can you share the results without spending even more time building a slideshow or document in another tool? Enter R Markdown, an R package that lets you embed code snippets, along with their output, in Markdown documents.

Definitions

R is a statistical programming language that is often used for data analysis and visualization. R is open source and has a robust community that shares content and ideas.

RStudio is a popular Integrated Development Environment (IDE) for R.

An R package is a collection of functions, often created by the community, that can be installed using R code or through an IDE such as RStudio.

R Markdown is a package for R and the center of this post.

Markdown is a language that uses plain text to format documents.

Getting Started

There are four easy steps to designing and publishing an R Markdown file. First, create a file with the .Rmd extension. Second, write the document text using R Markdown syntax. Third, add R code that generates output for the report. Lastly, render the document into a slideshow, HTML, PDF, or Word file.

For this walkthrough, we will be using RStudio, which is available as a free download. If you do not have the R language installed, you will need to install it as well. In RStudio, you can install packages by clicking the Install button in the Packages window. Alternatively, you can run the following code to install a package directly.

install.packages("rmarkdown")

Figure: The Install Packages window in RStudio.

To create an R Markdown file, go to File > New File > R Markdown… in RStudio. A window will appear where you can select the title, author, and output format of the file. This creates an .Rmd file with those fields already filled in, along with instructions and examples of the proper syntax for inserting code and generating the document. Alternatively, you can save any plain-text file with the .Rmd extension.
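For readers who prefer the console to the menu, a starter file of the same general shape can be written by hand. The following is only a sketch: the file name `analysis.Rmd`, the title, and the author are placeholder values chosen for illustration, and the file is written to a temporary directory so nothing in your project is touched.

```r
# Minimal sketch: create a starter R Markdown file without the RStudio dialog.
# The file name, title, and author are placeholders, not required values.
rmd_path <- file.path(tempdir(), "analysis.Rmd")

# Build the three-backtick chunk fence programmatically so this example can
# itself be displayed inside a fenced code block.
fence <- strrep("`", 3)

starter <- c(
  "---",                      # start of the optional YAML header
  'title: "My Analysis"',
  'author: "Your Name"',
  "output: html_document",
  "---",                      # end of the YAML header
  "",
  "Text written here appears in the rendered document.",
  "",
  paste0(fence, "{r}"),       # opens an R code chunk
  "summary(cars)",
  fence                       # closes the chunk
)

writeLines(starter, rmd_path)
file.exists(rmd_path)
```

Opening the resulting file in RStudio should show the same Knit button and editing tools as a file created through the menu.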

Format

An R Markdown file offers many syntax options for turning your code into a descriptive, well-formatted document. The YAML header at the top is optional but can hold information such as the title, author, date, and much more. From there, you can begin adding formatted text and code snippets. The most useful and distinctive feature of R Markdown is control over how each code snippet behaves: you can choose whether a block of code executes, and whether the code, its output, or both appear in the document. Typical use cases include running code that is referenced later in the script without showing it in the resulting document, running code and showing only its results, and running code while showing both the code and its output.
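The chunk behaviors described above map onto the knitr chunk options `include`, `echo`, and `eval`. The sketch below assembles a small demonstration file so the options can be seen side by side. The chunk labels, the helper function `chunk()`, and the file name are made up for this example.

```r
# Sketch: write an .Rmd file whose chunks each use a different display option.
fence <- strrep("`", 3)  # three backticks, built here to avoid nested fences

# Hypothetical helper: wrap code lines in a chunk with the given header.
chunk <- function(header, code) {
  c(paste0(fence, "{r ", header, "}"), code, fence, "")
}

doc <- c(
  "---",
  'title: "Chunk options demo"',
  "output: html_document",
  "---",
  "",
  # Runs, but neither the code nor its output appears in the document.
  chunk("setup, include=FALSE", "cars_fit <- lm(dist ~ speed, data = cars)"),
  # Runs and shows only the output (here, a plot).
  chunk("plot-only, echo=FALSE", "plot(cars)"),
  # Runs and shows both the code and its output (the default behavior).
  chunk("code-and-output", "coef(cars_fit)"),
  # Shows the code but does not run it.
  chunk("not-run, eval=FALSE", "install.packages(\"rmarkdown\")")
)

demo_path <- file.path(tempdir(), "chunk-options.Rmd")
writeLines(doc, demo_path)
cat(doc, sep = "\n")  # print the generated R Markdown source
```

Knitting the generated file shows the plot without its code, the model coefficients with their code, and the install command as unexecuted text.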

Figure: Example R Markdown file in RStudio showing code with various output types on the left and the resulting output on the right.

Beyond code snippets, there are many text formatting options to improve the professionalism and readability of your document. You can use custom templates, apply a CSS file, and add images, web links, scientific notation, bibliographies, and more. For a full formatting guide, check out the R Markdown Reference Guide and cheat sheet. In addition to R, an R Markdown file can contain code snippets from other programming languages such as Python and SQL.
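A few of these formatting options can be sketched as the body of an R Markdown file. The lines below are illustrative only: the heading text, sentences, and Python snippet are invented for the example, and the Python chunk is marked `eval=FALSE` so it is displayed but not run (running Python chunks needs additional setup that is outside the scope of this post).

```r
# Sketch: common Markdown formatting, inline R code, and a non-R chunk.
fence <- strrep("`", 3)  # three backticks for chunk delimiters
tick  <- "`"             # single backtick for inline code

body <- c(
  "# Top-level heading",
  "",
  "Plain text with *italic* and **bold** words, plus a",
  "[link](https://rmarkdown.rstudio.com).",
  "",
  "- a bullet item",
  "- another bullet item",
  "",
  "Inline math: $y = mx + b$",
  "",
  # Inline R code is evaluated when the document is rendered.
  paste0("The cars data has ", tick, "r nrow(cars)", tick, " rows."),
  "",
  # A chunk in another language; shown but not executed.
  paste0(fence, "{python, eval=FALSE}"),
  "print('hello from Python')",
  fence
)

cat(body, sep = "\n")  # print the R Markdown source
```

Pasting the printed text below a YAML header gives a document that exercises headings, emphasis, links, lists, math, and inline code in one place.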

Figure: Example R Markdown file in RStudio showing code with formatting on the left and the resulting output on the right.

Publishing

The wide variety of publishing options makes R Markdown very versatile. You can present your work as a document, slideshow, website, book, interactive document, and more. There are two ways to publish from RStudio. The first is to call the render function directly. Its first parameter is the R Markdown file to render, and its second parameter is the output format. If the output format is left blank, the file is rendered using the format identified in the YAML header. The second way is to click the Knit button in the RStudio window, which also renders the file using the format selected in the YAML header. Output formats include popular choices such as Word, PowerPoint, PDF, GitHub-flavored Markdown, and HTML. The output format used in the following code example is "html_document".
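As a rough illustration of the first approach, the sketch below wraps `rmarkdown::render()` in a small helper. The helper name `render_report()` and the file `report.Rmd` are invented for this example, and the helper skips rendering when the rmarkdown package or pandoc is not installed rather than failing.

```r
# Sketch: render an .Rmd file from the console instead of the Knit button.
render_report <- function(input, format = NULL) {
  if (!requireNamespace("rmarkdown", quietly = TRUE) ||
      !rmarkdown::pandoc_available()) {
    message("rmarkdown or pandoc not available; skipping render")
    return(invisible(NULL))
  }
  # `input` is the .Rmd file to render. Leaving `output_format` as NULL
  # falls back to the format named in the file's YAML header.
  rmarkdown::render(input, output_format = format, quiet = TRUE)
}

# Write a tiny placeholder report to a temporary directory, then render it.
fence <- strrep("`", 3)
input <- file.path(tempdir(), "report.Rmd")
writeLines(
  c("---", 'title: "Report"', "output: html_document", "---", "",
    paste0(fence, "{r}"), "summary(cars)", fence),
  input
)

out <- render_report(input, "html_document")
out  # path to the rendered HTML file, or NULL if rendering was skipped
```

Swapping `"html_document"` for `"word_document"` or `"pdf_document"` changes the output type; note that PDF output additionally requires a LaTeX installation.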

Code Example

Download the example files

This example walks through the entire process of gathering data, performing simple data modeling tasks, creating visualizations, and turning the code into a Markdown document in R. The example R script works with a public data source, and its comments explain each step: importing the dataset, cleaning and reformatting the fields, creating combined and subset tables for the analysis, and finally creating the visualizations that display the analysis. The R Markdown example reuses the code from the R script but presents it in a format that non-programmers can consume. The document it produces describes each visual while hiding the underlying code used to create it.
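The downloadable script is not reproduced here. As a stand-in, the sketch below follows the same import, clean, summarize, and visualize sequence using R's built-in `mtcars` data set instead of the post's public data source, so the variable names and the chart are assumptions made for illustration only.

```r
# Stand-in for the workflow described above, using built-in data.

# 1. Import: copy the built-in data set so the original stays untouched.
cars_raw <- mtcars

# 2. Clean and reformat: turn numeric codes into labeled factors.
cars_clean <- transform(
  cars_raw,
  cyl = factor(cyl),
  am  = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
)

# 3. Combine and subset: average fuel economy for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars_clean, FUN = mean)

# 4. Visualize: bar chart of the summary table.
barplot(
  mpg_by_cyl$mpg,
  names.arg = as.character(mpg_by_cyl$cyl),
  xlab = "Cylinders",
  ylab = "Average miles per gallon",
  main = "Fuel economy by cylinder count"
)
```

Placed in an R Markdown chunk with `echo=FALSE`, this code would contribute only the chart to the rendered document, which matches the approach of describing each visual while hiding its code.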

Figure: R Markdown data visualization. Download the code examples above to see the full output.

Wrap Up

R Markdown is a simple way to share the results of R code, with several output formats to choose from. The polished output lets readers understand and act on the results quickly. Programmers will find it easy to use and non-programmers will marvel (this is a great joke if you looked at the code examples) at the quality of your slides, document, web page, or interactive content. To learn more about R Markdown, check out the R Markdown website from RStudio and the free book R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, and Garrett Grolemund.

Brian Knox

Brian Knox is a business analyst at Marathon Consulting. He is a recent graduate from the College of William and Mary, having completed the inaugural class of the Master of Science in Business Analytics program. He has an undergraduate degree in Information Science with minors in Computer Science and Business Administration from Christopher Newport University. Brian learned early on that he wanted to combine his passions of business and technology. Brian loves all things data and is always looking for ways to improve his data science skills in order to help others throughout Hampton Roads.  
