STATISTICS 5
CHAPTER 4: REPRODUCIBLE REPORTS
4.2 FUNCTIONS USED
4.3 REQUIRED PACKAGES
1. Install Quarto on your computer
2. Packages:
a. knitr
b. rmarkdown
c. tidyverse
d. palmerpenguins
e. broom
f. gtsummary
g. gt
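A minimal sketch of how the package list above could be checked from the console. It assumes the CRAN names are the lowercase forms of the packages listed; the object names pkgs and missing_pkgs are only illustrative. Quarto itself is separate software and is not installed this way.

```r
# Packages listed in section 4.3 (R package names are case-sensitive)
pkgs <- c("knitr", "rmarkdown", "tidyverse", "palmerpenguins",
          "broom", "gtsummary", "gt")

# Find which of them are not yet installed on this computer
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Install only what is missing (uncomment to run; needs an internet connection)
# install.packages(missing_pkgs)
```

Run this once in the console rather than inside a report, since installing packages is not something a report should do every time it renders.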
4.4 WHY USE REPRODUCIBLE REPORTS?
A reproducible report allows you to achieve the same results by using the same raw data,
computational steps, and methods.
Reproducible reports help prevent transcription and rounding errors, promoting more transparent and
reproducible project communication.
4.5 REPORTS
Projects in RStudio are a way to group all the files you need for one project. Most projects include
scripts, data files and output files like images or PDF reports.
Start a project in RStudio
- Create a new project
o File > New Project > New Directory > New Project, name the project,
save it inside the default R-projects directory, then click Create Project
- Don’t ever save a new project inside another project directory. This can cause some
hard-to-identify problems.
4.6 INTRODUCTION TO QUARTO
Use Quarto to create reproducible reports with a table of contents, text, tables, images and code.
Why use it?
- Integrated workflow: you can write your thesis and run your analysis in the same document, so there is
no more switching between Word and R
- Reproducibility: if your data changes, your document updates automatically
- Clarity and transparency: you can show your methods, code and results clearly
Getting started:
- Open a new Quarto document: File > New File > Quarto Document
- To render a Quarto document means to convert your .qmd file into a final, readable format such
as HTML, PDF or Word
Source vs visual editor
- Visual editor: feels familiar if you are used to working with Word. Examples in this course are shown
for the source editor, so delete the line editor: visual if needed
o In the visual editor, you won’t see the hashes that create headers or the asterisks that create bold and
italic text
o You also won’t see the backticks that demarcate inline code
4.7 QUARTO SYNTAX AND STRUCTURE
A Quarto document (.qmd) is a plain text file with the following main components:
1. Metadata (YAML)
o At the top of the file, you’ll see some text between a pair of lines of 3 dashes (---). This is the YAML
header, which provides info to Quarto about how you want to render the document
o It plays a key role in how your document is processed and displayed
o Components:
i. Title: sets the document’s name
ii. Author: specifies the creator of the document
iii. Format: specifies the desired output (ex: HTML, PDF, Word)
iv. Editor: sets which editor the file opens in (editor: visual opens it in the visual editor)
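For illustration, a YAML header with the four components above might look like the following sketch. The title and author values are placeholders, not taken from the course materials.

```yaml
---
title: "Reporting Example"   # placeholder title
author: "Your Name"          # placeholder author
format: html                 # output format: html, pdf or docx
editor: visual               # delete this line to work in the source editor
---
```

The keys are written in lowercase, and the two lines of three dashes must be the very first and last lines of the header.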
2. Markdown
o If you start a line with hashes, it creates a header
1 hash makes a document title
2 hashes make a document header
3 makes a subheader …
o Make sure you leave a blank line before and after a header and don’t put any spaces or
other characters before the first hash
o Put a blank line between paragraphs of text
o Bullet-point list items start with * or - and numbered list items start with 1.
- Text style
o You can format text (to create headings, bold or italic text) using the toolbar or the
Format menu
o You can insert code chunks, links, images or tables by using the Insert menu
Shortcut: type a slash (/) at the start of a new paragraph, which lets you quickly
insert these elements
o Markdown syntax: a simple, plain-text way to format your document
Typing ## and a space before the text at the start of a line creates a heading (ex: ## Header)
Placing ** around text makes it bold, placing * around text makes it italic
3. Code chunks
o Anything written between lines that start with 3 backticks (```) is processed as code and
anything written outside is processed as markdown
o Code chunks are grey and plain text is white, but the actual colors will depend on which
theme you have applied
o In this book, code chunks will be labelled with whether you should run them in the
console or add the code to a script
4. Running code: there are several ways to run your code
a. Highlight the code, then click Run > Run Selected Line(s) (this is tedious and can cause problems if
you don’t highlight exactly the code you want to run)
b. Press the green play button at the top right of the code chunk. This will run all the lines
of code in that chunk
c. Keyboard shortcuts (for MacBook)
i. Run a single line of code: make sure the cursor is in the line of code you want to
run, then press command + enter
ii. Run all of the code in the code chunk: command + shift + enter
5. Inline code
o You can combine text and code to insert values into your writing using inline coding
o Ex: My name is `r name` and I am `r age` years old. It is `r halloween - today` days until
Halloween, which is my favorite holiday.
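The inline example above only works if the objects name, age, halloween and today already exist, for instance because an earlier code chunk created them. A small sketch of such a chunk follows; the values and the two fixed dates are assumptions chosen for illustration, and days_left and sentence are illustrative names.

```r
# Values assumed for illustration; in a report these would sit in a code chunk
name <- "Lisa"
age <- 48

# Fixed dates (assumed) so the result does not change from day to day
today <- as.Date("2024-10-27")
halloween <- as.Date("2024-10-31")

# Subtracting two dates gives a difftime; as.numeric() extracts the number of days
days_left <- as.numeric(halloween - today)

# paste0() builds the kind of sentence the inline code produces in the rendered text
sentence <- paste0("My name is ", name, " and I am ", age,
                   " years old. It is ", days_left, " days until Halloween.")
print(sentence)
```

In a real report, today would more likely be set with Sys.Date() so the count updates each time the document is rendered.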
6. Rendering your file
o To render your file, click the Render button at the top of the source pane
o The console pane will open a tab called Background Jobs (because Quarto isn’t an R
package)
o Your rendered HTML file may pop up in a separate web browser, a pop-up window in
RStudio, or in the Viewer tab of the lower right pane, depending on your RStudio settings
o Ex from the inline code: My name is Lisa and I am 48 years old. It is 4 days until
Halloween, which is my favorite holiday.
o You can also render your file with code: quarto::quarto_render("reporting_example.qmd")
(never put this in the .qmd file itself, or it will try to render itself in an infinite loop)
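A hedged sketch of rendering from the console or a separate .R script. The file name comes from the notes; the two existence checks and the object names qmd_file and can_render are additions for illustration, so the code does nothing harmful if the file or the quarto R package is missing.

```r
# File name taken from the notes; the existence checks are an added safeguard
qmd_file <- "reporting_example.qmd"

can_render <- file.exists(qmd_file) && requireNamespace("quarto", quietly = TRUE)

if (can_render) {
  # Run this from the console or a separate .R script, never from the .qmd itself
  quarto::quarto_render(input = qmd_file, output_format = "html")
} else {
  message("Skipping render: file or quarto package not found")
}
```

The quarto R package is only a thin interface; Quarto itself still has to be installed on the computer.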
4.8 WRITING A REPORT
We’re going to write a basic report for this dataset using Quarto to show you some more of the
features. We’ll be expanding on almost every bit of what we’re about to show you throughout this
course.
Setup chunk:
- Most of your Quarto docs should have a setup chunk at the top that loads any necessary
libraries and sets default values
Ex: add this just below the YAML header:
- library(tidyverse) makes tidyverse functions available
Chunk options:
- The chunk execution option label designates this as the setup chunk
- The include option makes sure that this chunk and any output it produces don’t end up in your
rendered doc
- Chunk options are structured like #| option: value and go at the very top of a code chunk
- Make sure there are no blank lines, code, or comments before any chunk options, otherwise the
options will not be applied.
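A sketch of what the contents of such a setup chunk could look like, reconstructed from the description above. The option values (setup and false) are assumptions inferred from the text. In the .qmd file these lines sit between an opening line of three backticks followed by {r} and a closing line of three backticks.

```r
#| label: setup
#| include: false

# Load the tidyverse so its functions are available in all later chunks
library(tidyverse)
```

The two #| lines come first, with nothing above them, which is what the last bullet point requires.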
Online sources:
- We will read in a data file
- Load in data that is stored online: create a level 2 header (here called “Data Analysis”), add a
code chunk below it, then copy, paste and run this code: smalldata <-
read_csv("https://psyteachr.github.io/reprores/data/smalldata.csv")
o The data is stored in a .csv file, so we read it with read_csv()
o The URL must be enclosed in double quotation marks (" "), it won’t work without them
- There are multiple ways to view and check a dataset in R
o head()
o summary()
o str()
o View()
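The four functions listed above can be tried on any data frame. The sketch below uses a small made-up table called toy as a stand-in, since the columns of smalldata are not described here; all column names and values are hypothetical.

```r
# Hypothetical stand-in for smalldata; the real file may have different columns
toy <- data.frame(
  id    = 1:6,
  group = c("control", "control", "control", "exp", "exp", "exp"),
  score = c(10, 12, 11, 15, 14, 16)
)

head(toy, n = 3)   # first 3 rows
summary(toy)       # per-column summaries
str(toy)           # column types and a preview of the values

# View() opens a spreadsheet-style viewer, so only call it in an interactive session
if (interactive()) View(toy)
```

Note that R is case-sensitive: View() has a capital V, while head(), summary() and str() are lowercase.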
Local data files:
- More commonly, you’ll be working with data files that are stored on your computer
- You usually want to have all your scripts and data files for a single project inside one folder on
your computer, that project’s working directory, and we have already set up the main
directory reprores for this course
- You can organize files in subdirectories inside this main project directory (such as putting all raw
data files into a subdirectory called data)
- In your reprores directory, create a new folder named data, download a copy of the data file and
save it in this new subdirectory
o Or use this code, which creates the subdirectory data/lecture4 and downloads the file into it:
dir.create(path = "data/lecture4", recursive = TRUE, showWarnings = FALSE)
url <- "https://psyteachr.github.io/reprores/data/smalldata.csv"
download.file(url = url, destfile = "data/lecture4/smalldata.csv")
- To load in data from a local file, we can use the read_csv() function, but this time rather than
specifying a URL, give it the subdirectory and file name
o Ex: smalldata <- read_csv("data/lecture4/smalldata.csv")
- Things to note:
o You must include the file extension (here: .csv)
o The subdirectory folder names (data, lecture4) and the file name are separated by forward
slashes (/)
o Precision is important: if you have a typo in the file name, R won’t be able to find your file
(capital letters count too)
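The points about paths can be checked without touching your project. This sketch works in a temporary folder and uses only base R; the folder, file and object names (project_dir, toy, csv_path) are made up for illustration.

```r
# Work in a temporary folder so nothing in your project is touched
project_dir <- tempfile("demo_project_")
dir.create(file.path(project_dir, "data"), recursive = TRUE)

# Write a small made-up table to data/toy.csv inside that folder
toy <- data.frame(id = 1:3, group = c("control", "control", "exp"))
csv_path <- file.path(project_dir, "data", "toy.csv")
write.csv(toy, csv_path, row.names = FALSE)

# file.exists() shows whether R can find a file at a given path
found_exact   <- file.exists(csv_path)                                   # exact name
found_typo    <- file.exists(file.path(project_dir, "data", "toi.csv"))  # typo in name
found_no_ext  <- file.exists(file.path(project_dir, "data", "toy"))      # extension missing
print(c(exact = found_exact, typo = found_typo, no_extension = found_no_ext))

# Reading works only with the exact path
toy_back <- read.csv(csv_path)
```

Checking a path with file.exists() before reading is a quick way to tell a path problem apart from a problem with the file's contents.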
Data analysis:
- Count the number of rows in each group (when each row = participant): create a new code
chunk, then copy, paste and run the following code: group_counts <- count(smalldata, group)
- Ex: The total number of participants in the **control** condition was `r group_counts$n[1]`.
o The $ sign is used to indicate specific variables (or columns) in an object using the
object$variable syntax
o Square brackets with a number (ex [1]) indicate a particular observation
o So group_counts$n[1] asks the inline code to display the first observation of the variable
n in the dataset group_counts
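To see how count(), $ and [1] fit together, here is a sketch on a small made-up table. The data frame toy and its values are hypothetical; only the column name group and the object name group_counts follow the notes.

```r
library(dplyr)  # part of the tidyverse; provides count()

# Hypothetical data: one row per participant
toy <- data.frame(
  id    = 1:5,
  group = c("control", "control", "control", "exp", "exp")
)

# One row per group, with the number of rows stored in a column called n
group_counts <- count(toy, group)
print(group_counts)

group_counts$n       # the whole n column
group_counts$n[1]    # first value of n: the control group, as groups are sorted alphabetically
```

Because [1] depends on row order, it is worth printing group_counts once to confirm which group sits in the first row.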
Code comments:
- You can add comments inside R chunks with the hash symbol (#)
- It’s usually good practice to start a code chunk with a comment that explains what you’re doing
there, especially if the code isn’t explained in the text of the report
- If you name your objects clearly, you often don’t need to add clarifying comments
Images:
- Create a code chunk to display a graph of the data in your doc after the text we’ve written so
far
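The graph from the course is not reproduced in these notes. As a stand-in, the sketch below draws a simple bar chart of participants per group with ggplot2, using made-up data; the data frame toy, the object name group_plot and the choice of a bar chart are all assumptions.

```r
library(ggplot2)  # part of the tidyverse

# Hypothetical data: one row per participant
toy <- data.frame(
  group = c("control", "control", "control", "exp", "exp")
)

# Bar chart of the number of participants per group
group_plot <- ggplot(toy, aes(x = group)) +
  geom_bar() +
  labs(x = "Group", y = "Number of participants")

group_plot  # printing the plot object displays it in the rendered document
```

When this code sits in a chunk of a .qmd file, the plot appears in the rendered report directly below the chunk.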