4  Basic Syntax, Variables, and Naming Conventions

Note: What This Chapter Covers

This chapter introduces the fundamental rules of writing R code. You will learn how R reads and evaluates code line by line, how to write comments and multi-line expressions, how R handles spaces and case, how to create variables using the three assignment operators, and what names you can legally give those variables. You will also see the naming conventions professional R users follow, the reserved words the language will not let you use, and the built-in functions that let you inspect or clear variables in your workspace. By the end of this chapter you will be able to write short R programs that assign, inspect, and reassign values with confidence.

flowchart LR
    S["Source Script <br> (your .R or .qmd file)"] --> E["R Parser <br> reads one statement"]
    E --> EV["R Evaluator <br> computes the value"]
    EV --> B["Binding <br> name -> value in environment"]
    B --> O["Output or next statement"]
    classDef default fill:#004466,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;


4.1 R Is an Expression-Driven Language

Note: Core Concept: Every Line Returns a Value

Unlike languages where statements and expressions are different things, almost every piece of R code is an expression that evaluates to a value. When you type 2 + 2 at the console and press Enter, R computes the result and prints it. When you type x <- 5, R also produces a value (the number 5), but it hides the output because assignment is treated as an invisible operation. This design keeps the language small and composable; you can put nearly any piece of code inside any other piece of code.
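The behaviour described above can be seen in a short console session. This is a minimal sketch; the names x and y are illustrative only.

```r
# An arithmetic expression evaluates to a value, which the console prints.
2 + 2

# Assignment also yields a value (5), but returns it invisibly.
x <- 5

# Wrapping the assignment in parentheses makes the value visible.
(x <- 5)

# Because assignment is itself an expression, it can be nested in another.
y <- (x <- 5) * 2
y
```

The last statement works only because x <- 5 evaluates to 5, which is then multiplied by 2 and bound to y.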

Tip: Expert Insight: The Console as a Calculator

New users often treat R like a scripting language where they must put everything inside a file. The R console is also a calculator. When you want to check an expression, compute a quick number, or test a function call, type it at the console. That habit shortens the learning loop enormously.


4.2 Statements, Expressions, and Comments

Note: How To: Write Clean R Statements

An R statement is a single expression that R can evaluate. You end a statement by pressing Enter or by writing a semicolon ;. You do not need to end every line with a semicolon in R; line breaks are the natural separator. Comments begin with the hash symbol # and continue to the end of the line.
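As a small illustration of comments and line breaks, the sketch below also shows how an expression can span several lines: when a line ends while the expression is still syntactically incomplete, R keeps reading. The variable names are illustrative.

```r
# A full-line comment: R ignores everything after the hash symbol.
total <- 1 + 2 + 3   # a trailing comment after a statement

# An expression that is incomplete at the end of a line continues on the
# next one. Here the trailing + tells R that more is coming.
long_sum <- 1 + 2 +
  3 + 4

# An open parenthesis has the same effect.
average <- mean(
  c(10, 20, 30)
)
```

Placing the operator at the end of the first line matters: if the + started the second line instead, R would treat the first line as a complete statement.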

Note: Semicolons and Multi-Statement Lines

You can place more than one statement on a single line by separating them with semicolons. This is sometimes useful for tight scripts but is usually discouraged because it reduces readability.
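For comparison, here is the same computation written both ways. The names a, b, and total_ab are illustrative.

```r
# Three statements on one line, separated by semicolons.
a <- 1; b <- 2; total_ab <- a + b

# The same statements, one per line, are easier to read and to debug.
a <- 1
b <- 2
total_ab <- a + b
```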

Tip: Best Practice: One Idea per Line

Write one statement per line, comment the why (not the what), and let R’s own output do the talking. Future-you reading a script six months from now will thank present-you.

Warning: Common Mistake: Forgetting That R Has No Block Comments

R does not have a native multi-line comment syntax the way C uses /* ... */. To comment out a block of code, prefix every line with #. Most IDEs do this for you with a keyboard shortcut: Ctrl+Shift+C in RStudio.


4.3 Case Sensitivity and Whitespace

Note: Core Concept: R Is Case Sensitive

Age, age, and AGE are three different names in R. The same applies to function names: mean() works, Mean() does not. Treat capitalisation as meaningful information.
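A quick way to convince yourself is to bind all three names to different values. The values below are arbitrary.

```r
age <- 30
Age <- 41
AGE <- 52

# Three distinct names, three distinct values.
c(age, Age, AGE)

# Function names are case sensitive too.
exists("mean")   # TRUE: the built-in function
exists("Mean")   # FALSE in a fresh session with only the default packages
```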

Note: Whitespace Is Mostly Ignored

R ignores spaces around operators, inside parentheses, and between tokens, but not inside character strings or in the middle of a multi-character operator such as <-. Use spaces to improve readability; do not use them in a way that hides meaning.

| Readable | Legal but Hard to Read |
|---|---|
| x <- 5 + 3 | x<-5+3 |
| mean(c(1, 2, 3)) | mean(c(1,2,3)) |
| y <- (a + b) / 2 | y<-(a+b)/2 |

Warning: Common Mistake: x<-5 Is Assignment, Not x < -5

R always parses x<-5 as the assignment x <- 5; it never reads it as the comparison x < -5 (“is x less than negative five”). The ambiguity is for the human reader: if you meant the comparison and left out the spaces, R silently overwrites x with 5 instead of testing it. Always write a space on both sides of <-, and write comparisons against negative numbers as x < -5, to avoid the trap.
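The trap is easy to reproduce. In this minimal sketch, the first statement after the setup was meant as a comparison, but it changes x instead.

```r
x <- 10

# Intended as a comparison, but without spaces R reads it as assignment:
# x is silently overwritten with 5.
x<-5
x

# With spaces, the two meanings are unmistakable.
x <- 10
x < -5    # comparison: is 10 less than -5?
x         # still 10
```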


4.4 Variables and the Three Assignment Operators

Note: Core Concept: A Variable Is a Name for a Value

In R, a variable is not a box that holds a value. It is a name that is bound to a value stored somewhere in memory. When you write x <- 10, R creates the number 10 and binds the name x to it. When you reassign x to something else, the old binding is replaced; the old value may be garbage collected.
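One consequence of the name-binding model is that rebinding one name never changes what another name refers to. A small sketch, with illustrative names:

```r
x <- 10      # bind the name x to the value 10
y <- x       # bind a second name, y, to that same value

x <- 99      # rebind x; this does not touch the value y refers to

x
y
```

After the last assignment, x refers to 99 while y still refers to 10.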

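R provides three operators for ordinary assignment. All three work, but they are not identical in where you can use them or how they read. (The superassignment operators <<- and ->> also exist, but they modify variables in enclosing environments and are outside the scope of this chapter.)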

| Operator | Direction | Typical Use |
|---|---|---|
| <- | right-to-left | The idiomatic choice in scripts and the R community. |
| = | right-to-left | Common inside function calls (for named arguments), sometimes used for assignment. |
| -> | left-to-right | Legal but rare; occasionally useful at the end of a pipeline. |
Note: Seeing All Three in Action
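A minimal sketch of the three operators side by side. The names count_a, count_b, count_c, and rounded are illustrative.

```r
count_a <- 10        # leftward arrow
count_b = 20         # equals sign
30 -> count_c        # rightward arrow: value on the left, name on the right

c(count_a, count_b, count_c)

# Inside a function call, = names an argument; it does not create a variable.
rounded <- round(3.14159, digits = 2)
rounded
```

All three forms create an ordinary variable; only the direction and the look of the code differ.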
Tip: Best Practice: Prefer <- for Assignment

The R community, including the Tidyverse style guide and most textbooks, prefers <- for assignment and reserves = for named arguments inside function calls. This convention makes scripts easier to scan, because the eye can distinguish “this creates a variable” (<-) from “this passes an argument” (=). In RStudio, the keyboard shortcut Alt + - (hyphen) inserts <- with spaces on both sides.


4.5 Reassigning and Updating Variables

Note: How To: Update a Variable’s Value

Reassignment simply binds the name to a new value; the old value is discarded. The variable can even change type across reassignments because R is dynamically typed.
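For instance, a variable named score (an illustrative name) can be updated in place and later rebound to a value of a different type:

```r
score <- 82
score <- score + 5      # the right-hand side is evaluated first, then rebound
score

class(score)            # "numeric"

# Rebinding to a value of a different type is allowed.
score <- "eighty-seven"
class(score)            # "character"
```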

Tip: Expert Insight: Dynamic Typing Is Powerful and Dangerous

R letting you rebind score from number to string is convenient in interactive work and treacherous in larger scripts. A common source of bugs is reusing a variable name for something of a different type mid-way through a script. Pick fresh names when the meaning changes.


4.6 Rules for Naming Variables

Note: The Hard Rules R Enforces

| Rule | Example of Legal Name | Example of Illegal Name |
|---|---|---|
| Must start with a letter or a dot . (not followed by a digit). | income, .private | 2ndRound, .3rd |
| May contain letters, digits, underscore _, and dot .. | mean_score_2024, price.usd | mean-score, price usd |
| Cannot contain operators or spaces. | total_cost | total cost, total+cost |
| Cannot be a reserved word. | result, count | TRUE, if, function |
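One way to check a candidate name against these rules without triggering an error is base R's make.names(), which rewrites any string into a syntactically valid name. The sketch below treats a candidate as valid when make.names() returns it unchanged; the candidates vector is illustrative.

```r
candidates <- c("income", ".private", "2ndRound", ".3rd",
                "mean_score_2024", "mean-score", "total cost")

# make.names() rewrites a string into a syntactically valid name.
# A candidate that comes back unchanged was already valid.
is_valid <- make.names(candidates) == candidates
data.frame(candidates, is_valid)
```

The same function is handy for seeing how R would repair a bad name, for example when it cleans up column headers on import.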
Note: Reserved Words You Cannot Use as Names

R refuses to let you assign a value to any of these reserved words. Attempting to use one as a variable name produces an error.

| Category | Reserved Words |
|---|---|
| Logical constants | TRUE, FALSE |
| Missing and special values | NA, NA_integer_, NA_real_, NA_character_, NA_complex_, NULL, NaN, Inf |
| Control flow | if, else, for, in, while, repeat, break, next |
| Declaration | function |
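To see the refusal without halting a script, the attempt can be wrapped in tryCatch(). The helper is_rejected() below is hypothetical and written only for this illustration; it runs a line of code supplied as text and reports whether R raised an error.

```r
# Hypothetical helper: TRUE if R rejects the code, FALSE if it runs cleanly.
is_rejected <- function(code) {
  tryCatch({
    # Evaluate in a throwaway environment so nothing leaks into the workspace.
    eval(parse(text = code), envir = new.env())
    FALSE
  }, error = function(e) TRUE)
}

is_rejected("TRUE <- 0")      # TRUE: reserved word
is_rejected("if <- 0")        # TRUE: reserved word
is_rejected("result <- 0")    # FALSE: ordinary name
```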
Warning: T and F Are Reassignable, But Do Not Do It

The letters T and F are shortcuts for TRUE and FALSE. Unlike TRUE and FALSE themselves, they are ordinary variables that happen to be pre-assigned. You can overwrite them with T <- 0, and disaster follows for every piece of code that assumed T meant TRUE. Write TRUE and FALSE in full and never reassign T or F.
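The difference is easy to demonstrate. This sketch overwrites T, shows the damage, and then removes the new binding so the built-in value becomes visible again.

```r
T <- 0          # legal, because T is an ordinary variable

isTRUE(T)       # FALSE: T no longer means TRUE
isTRUE(TRUE)    # TRUE: the reserved word cannot be changed

rm(T)           # remove our binding; the built-in T is visible again
isTRUE(T)       # TRUE
```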


4.7 Naming Conventions: What the R Community Uses

Note: Common Styles You Will See in the Wild

| Style | Example | Who Uses It |
|---|---|---|
| snake_case | mean_income, customer_id | Tidyverse, modern R, this book. |
| camelCase | meanIncome, customerId | Some older R packages, developers from a Java background. |
| dot.case | mean.income, customer.id | Base R (e.g. data.frame, read.csv), older code. |
| PascalCase | MeanIncome | Rare in R; more common for function objects in some codebases. |
| UPPER_SNAKE | MAX_SCORE, N_RUNS | Constants and configuration values. |
Tip: Best Practice: Pick One Style and Stick With It

Consistency matters more than the style you pick. A script that mixes meanIncome, mean.income, and mean_income is much harder to scan than a script that picks any one style and uses it everywhere. This book and the Tidyverse both use snake_case throughout.

Warning: Avoid Names That Shadow Built-in Functions

Names like data, df, c, t, mean, and sum already exist as built-in functions in R. If you assign your own value to one of them, your variable shadows the original, which confuses your future self and any collaborator. Use customer_df instead of df, avg_score instead of mean, and so on.
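The sketch below shows why shadowing is confusing rather than immediately fatal: after the assignment, the bare name refers to your value, yet a call still reaches the built-in function, because R skips non-function objects when it looks up a name in call position.

```r
mean <- 42                 # shadows the built-in function with a number

mean                       # 42, not the function
mean(c(1, 2, 3))           # still 2: when calling, R skips non-function objects

rm(mean)                   # remove the shadowing variable
```

One name meaning two things at once is exactly the situation a reader of your script should never have to untangle.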


4.8 Inspecting and Managing Your Variables

Note: Core Tools for Workspace Management

R provides a handful of built-in functions that let you see, check, and remove the variables in your current session’s environment.

| Function | Purpose |
|---|---|
| ls() | List all objects in the current environment. |
| exists("name") | Return TRUE if an object with that name can be found. |
| class(x) | Report the class of the value bound to x. |
| str(x) | Show the structure and type of x compactly. |
| rm(x) | Remove the binding for x from the environment. |
| rm(list = ls()) | Remove every user-created object. Use with caution. |
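A short sketch of the inspection functions in use. The variables customer_id and avg_score are illustrative.

```r
customer_id <- 1042L      # the L suffix makes this an integer
avg_score <- 87.5

ls()                      # names of objects in the current environment
exists("avg_score")       # TRUE
class(customer_id)        # "integer"
class(avg_score)          # "numeric"
str(avg_score)            # compact summary of type and value
```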
Note: Removing Variables
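A minimal sketch of removing individual bindings with rm(). The names temp_total and temp_label are illustrative, and inherits = FALSE restricts exists() to the current environment.

```r
temp_total <- 100
temp_label <- "draft"

rm(temp_total)                              # remove one binding
exists("temp_total", inherits = FALSE)      # FALSE
exists("temp_label", inherits = FALSE)      # TRUE

rm(temp_label)                              # tidy up the second one too
```

Removing objects one at a time keeps you in control of what disappears.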
Tip: Expert Insight: Start Each Analysis with a Clean Session, Not rm(list = ls())

A common pattern in old R tutorials is to start every script with rm(list = ls()). That clears your workspace but not the packages that are already attached, and it does not reset random seeds or option settings. A truly clean start comes from restarting R itself (in RStudio: Session > Restart R, or the keyboard shortcut Ctrl+Shift+F10). Restarting R is the modern, reproducible way to start fresh.


4.9 A Small Worked Example

Note: Putting the Chapter Together

The snippet below applies every idea from this chapter: three assignment operators, readable names, comments that explain the why, a reassignment, and a workspace inspection at the end.
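One possible version of such a snippet is sketched here. The scenario and every name in it (unit_price, quantity, discount_rate, order_total) are illustrative assumptions.

```r
# Prices are in USD; kept as plain numbers so they can be multiplied directly.
unit_price <- 19.99          # leftward assignment (the idiomatic form)
quantity = 3                 # equals-sign assignment (legal at top level)
0.10 -> discount_rate        # rightward assignment (legal but rare)

# Compute the gross total first so the discount is easy to audit.
order_total <- unit_price * quantity

# Reassignment: apply the discount by rebinding the same name.
order_total <- order_total * (1 - discount_rate)
order_total

# Inspect what now exists in the workspace.
ls()
class(order_total)
```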

Every one of those three assignment operators works. In production code, the entire block would be written with <- for consistency.



Summary

| Concept | Description |
|---|---|
| Syntax Basics | |
| Expression-Driven Language | Almost every piece of R code is an expression that returns a value |
| Statements and Comments | Statements are separated by newlines; comments use # |
| Semicolons and Multi-Statement Lines | Semicolons let you place multiple statements on one line, but one idea per line is best |
| Case Sensitivity | x and X are different objects in R |
| Whitespace | Whitespace is mostly ignored except inside strings and multi-character operators such as <- |
| Variables and Names | |
| Variable Assignment | Use <- as the standard assignment operator |
| Naming Conventions | snake_case or camelCase, descriptive and consistent |
| Reserved Words | Names like TRUE, FALSE, NULL, NA, function are reserved |