Essentials of R coding I

Author

Dr. Adrian Correndo

Published

January 9, 2026

Introduction

This page provides an overview of the essential types of elements in R, including examples and explanations for each. Use this as a quick reference to understand the basics of data types and operations.

Type of elements in R

01. Numbers

20
[1] 20

02. Math Operations

20+1 # addition
[1] 21
20-4 # subtraction
[1] 16
20*5 # multiplication
[1] 100
20/5 # division
[1] 4
2^2 # exponentiation
[1] 4
sqrt(9) # square root
[1] 3
# Higher-order roots use fractional exponents
# notation is: x^(1/n)

# Cubic root of 27
27^(1/3)  # Result: 3
[1] 3
# 4th root of 16
16^(1/4)  # Result: 2
[1] 2
# 5th root of 32
32^(1/5)  # Result: 2
[1] 2
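The x^(1/n) notation above can be wrapped in a small helper. The sketch below is only an illustration: nth_root() is a hypothetical name, not a base R function. Because fractional exponents are computed in floating-point arithmetic, the results are compared with a tolerance (all.equal()) rather than with ==.

```r
# Hypothetical helper (not part of base R): n-th root via a fractional exponent
nth_root <- function(x, n) {
  x^(1 / n)
}

cube_root   <- nth_root(27, 3)
fourth_root <- nth_root(16, 4)

# Compare with a tolerance instead of exact equality
isTRUE(all.equal(cube_root, 3))
isTRUE(all.equal(fourth_root, 2))
```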

03. Text or characters (also called strings)

"coding is fun"
[1] "coding is fun"

But these elements are not stored as objects yet:

04. Define objects

a <- 20
10 -> b
# We can also use equal:
c = 15
# But using "<-" for assignment, and leaving "=" for arguments inside function calls (so you can notice the difference), is considered a better coding practice.
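Once a value is stored under a name, it can be reused in later operations. A minimal sketch, assuming the same values used above for a and b (total is a name introduced here only for illustration):

```r
a <- 20
b <- 10

total <- a + b   # reuse stored values in a calculation
total            # typing the name prints the value: 30

a <- a + 1       # assigning again overwrites the previous value
a                # now 21
```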

06. Vectors

A vector is one of the most basic data structures. It is a sequence of elements of the same type, such as numbers, characters, or logical values. Vectors are used to store and manipulate collections of data efficiently.

a. Creating a vector

Vectors can be created using the c() function (combine function):

# Numeric vector
numeric_vector <- c(1, 2, 3, 4.5)
numeric_vector
[1] 1.0 2.0 3.0 4.5
# Character vector
character_vector <- c("corn", "wheat", "soybean")
character_vector
[1] "corn"    "wheat"   "soybean"
# Logical vector
logical_vector <- c(TRUE, FALSE, TRUE)
logical_vector
[1]  TRUE FALSE  TRUE
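Because all elements of a vector must share one type, R silently converts mixed inputs to a common type. The short sketch below illustrates this behavior; the object names are chosen here only for the example.

```r
# Mixing numbers, text, and logicals: everything becomes character
mixed_chr <- c(1, "corn", TRUE)
typeof(mixed_chr)   # "character"

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
mixed_num <- c(1.5, TRUE, FALSE)
typeof(mixed_num)   # "double"
mixed_num           # 1.5 1.0 0.0
```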

b. Accessing Elements

You can access elements of a vector using square brackets []:

# Access the first element
numeric_vector[1]
[1] 1
# Access multiple elements
numeric_vector[c(1, 3)]
[1] 1 3
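Square brackets also accept ranges, negative positions, and logical conditions. A brief sketch, re-creating numeric_vector so the example runs on its own:

```r
numeric_vector <- c(1, 2, 3, 4.5)

# A range of positions
middle <- numeric_vector[2:3]                    # 2 3

# A negative index drops that position
without_first <- numeric_vector[-1]              # 2 3 4.5

# A logical condition keeps the elements where it is TRUE
above_two <- numeric_vector[numeric_vector > 2]  # 3 4.5
```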

c. Vectorized Operations

In R, operations on vectors are applied element by element automatically:

# Adding a scalar to a vector
numeric_vector + 2
[1] 3.0 4.0 5.0 6.5
# Element-wise addition
numeric_vector + c(10, 20, 30, 40)
[1] 11.0 22.0 33.0 44.5

d. Common Functions with Vectors

  • ‘length()’: Get the number of elements in a vector.
  • ‘typeof()’ or ‘class()’: Determine the type of elements in a vector.
  • ‘seq()’: Generate a sequence of numbers.
  • ‘rep()’: Repeat elements to create a vector.
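The functions listed above can be tried as follows. This is a small sketch; the result names (n_elements, crop_rep, and so on) are made up for the example.

```r
numeric_vector <- c(1, 2, 3, 4.5)

n_elements <- length(numeric_vector)   # 4
vec_type   <- typeof(numeric_vector)   # "double"
vec_class  <- class(numeric_vector)    # "numeric"

# Sequence from 0 to 10 in steps of 5
steps <- seq(from = 0, to = 10, by = 5)   # 0 5 10

# Repeat a value, or repeat each element of a vector
crop_rep <- rep("corn", times = 3)     # "corn" "corn" "corn"
each_rep <- rep(c(1, 2), each = 2)     # 1 1 2 2
```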

07. Lists

In R, a list is a versatile data structure that can contain elements of different types, including vectors, matrices, data frames, and even other lists. Unlike vectors, which are homogeneous, lists are heterogeneous, meaning their elements can be of different data types and lengths.

Key Characteristics of Lists:

  1. Heterogeneous: Lists can store elements of varying types (numeric, character, logical, etc.) and structures (vectors, data frames, functions, etc.).

  2. Indexed: Elements in a list are accessed by position or name using double square brackets [[ ]], or by name using $.

Why Use Lists?

  1. Flexibility: Lists can store complex and nested data structures.

  2. Data Wrangling: Useful for handling results from models, nested data, or any mixed-type collections.

  3. Functions: Functions in R often return their output as lists (e.g., lm()).
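To illustrate the last point, the sketch below fits a simple linear model and inspects the result as a list. It assumes the built-in mtcars dataset that ships with R; the model formula is arbitrary and chosen only for the example.

```r
# Fit a simple linear model on a built-in dataset (illustrative formula)
fit <- lm(mpg ~ wt, data = mtcars)

is.list(fit)        # TRUE: the fitted model is stored as a list
names(fit)          # element names, e.g. "coefficients", "residuals"

# Elements are extracted like any other list element
fit$coefficients
```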

a. Creating a list

Lists are created using the list() function:

# Create a list with different types of elements
my_list <- list(
  "numeric_v" = numeric_vector,
  "character_v" = character_vector,
  "single_number" = 42,
  "logical_value" = TRUE
)

b. Accessing Elements in a List

You can access elements in a list by their position or name:

By Position:

# Access the first element
my_list[[1]]
[1] 1.0 2.0 3.0 4.5
# Access the second element
my_list[[2]]
[1] "corn"    "wheat"   "soybean"

By name:

# Access by name
my_list$numeric_v
[1] 1.0 2.0 3.0 4.5
my_list$character_v
[1] "corn"    "wheat"   "soybean"

Subelements:

# Access the first value in the numeric vector
my_list$numeric_v[1]
[1] 1
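Single and double brackets behave differently on lists: [ ] returns a smaller list, while [[ ]] returns the element itself. A minimal sketch with a reduced version of the list, so the example runs on its own:

```r
my_list <- list(numeric_v = c(1, 2, 3, 4.5), single_number = 42)

sub_list <- my_list["numeric_v"]     # single brackets: a list of length 1
element  <- my_list[["numeric_v"]]   # double brackets: the numeric vector

class(sub_list)   # "list"
class(element)    # "numeric"

# Subelements can only be indexed once the element itself is extracted
element[2]        # 2
```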

c. Some functions for lists

# Number of elements in the list
length(my_list)
[1] 4
# Names of the elements
names(my_list)
[1] "numeric_v"     "character_v"   "single_number" "logical_value"
# Structure of the list
str(my_list)
List of 4
 $ numeric_v    : num [1:4] 1 2 3 4.5
 $ character_v  : chr [1:3] "corn" "wheat" "soybean"
 $ single_number: num 42
 $ logical_value: logi TRUE

08. Data frame

In R, a data frame is a two-dimensional data structure used for storing tabular data. It is one of the most commonly used data structures in R for data analysis and manipulation.

Key Characteristics of a Data Frame

  1. Tabular Structure: Data is organized in rows and columns.

  2. Heterogeneous Columns: Each column can contain different data types (e.g., numeric, character, logical), but all elements in a column must be of the same type.

  3. Row and Column Names: Rows and columns can have names for easier identification.

Why Use a Data Frame?

  1. Data Analysis: It is ideal for representing structured data like spreadsheets or databases.

  2. Flexible Operations: Columns can be easily added, removed, or modified.

  3. Integration with R Functions: Many R functions for statistical modeling and analysis expect data frames as input.

a. Creating a Data Frame

A data frame can be created using the data.frame() function:

# Create a data frame
my_data <- data.frame(
  Crop = c("Corn", "Wheat", "Soybean"), # Character column
  Yield = c(180, 90, 50), # Numeric column
  Legume = c(FALSE, FALSE, TRUE) # Logical column
)

print(my_data)
     Crop Yield Legume
1    Corn   180  FALSE
2   Wheat    90  FALSE
3 Soybean    50   TRUE
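To confirm that each column keeps its own type, the class of every column can be checked at once. In this sketch, stringsAsFactors = FALSE is set explicitly as an assumption so the result is the same on older R versions (before 4.0), where text columns were converted to factors by default.

```r
my_data <- data.frame(
  Crop = c("Corn", "Wheat", "Soybean"),
  Yield = c(180, 90, 50),
  Legume = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Class of each column, returned as a named character vector
col_types <- sapply(my_data, class)
col_types

# Compact overview of dimensions, column types, and first values
str(my_data)
```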

b. Accessing data in a data frame

Accessing columns:

# Access a column by name
my_data$Crop
[1] "Corn"    "Wheat"   "Soybean"
# Access a column by index
my_data[, 2]
[1] 180  90  50

Accessing rows:

# Access the first row
my_data[1, ]
  Crop Yield Legume
1 Corn   180  FALSE
# Access specific rows
my_data[c(1, 3), ]
     Crop Yield Legume
1    Corn   180  FALSE
3 Soybean    50   TRUE

Accessing specific elements:

# Access the element in the 2nd row, 3rd column
my_data[2, 3]
[1] FALSE
# Access specific cells by column name
my_data[2, "Crop"]
[1] "Wheat"

c. Adding a new column

my_data$Season <- c("Summer", "Winter", "Summer")

d. Modify a column

my_data$Yield <- my_data$Yield + 5

e. Adding a new row

In base R, we can use rbind() to add rows:

new_row <- data.frame(Crop = "Barley", Yield = 80, Legume = FALSE, Season = "Winter")
my_data <- rbind(my_data, new_row)

f. Filtering (rows)

In base R, we can use subset() to filter rows:

subset(my_data, Yield > 150)
  Crop Yield Legume Season
1 Corn   185  FALSE Summer

We can also use logical conditions:

my_data[my_data$Legume == TRUE, ]
     Crop Yield Legume Season
3 Soybean    55   TRUE Summer
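Conditions can be combined with & (and), | (or), and ! (not). The sketch below re-creates the data frame at this stage, without the Season column, so it runs on its own; the result names are illustrative.

```r
my_data <- data.frame(
  Crop = c("Corn", "Wheat", "Soybean", "Barley"),
  Yield = c(185, 95, 55, 80),
  Legume = c(FALSE, FALSE, TRUE, FALSE)
)

# Non-legume crops yielding more than 90
high_non_legume <- my_data[my_data$Yield > 90 & !my_data$Legume, ]
high_non_legume$Crop    # "Corn" "Wheat"

# Same idea with subset(): low yield OR a specific crop
low_or_barley <- subset(my_data, Yield < 60 | Crop == "Barley")
low_or_barley$Crop      # "Soybean" "Barley"
```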

g. Selecting (columns)

In base R, columns are usually selected with brackets [] and a vector of column names built with c() (subset() also accepts a select argument):

my_data[c("Crop", "Yield")]
     Crop Yield
1    Corn   185
2   Wheat    95
3 Soybean    55
4  Barley    80
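Two related details are worth trying: the select argument of subset(), and the drop argument of [ ], which controls whether a single column comes back as a vector or as a data frame. A short sketch on a reduced copy of the data (object names are illustrative):

```r
my_data <- data.frame(
  Crop = c("Corn", "Wheat", "Soybean", "Barley"),
  Yield = c(185, 95, 55, 80),
  Legume = c(FALSE, FALSE, TRUE, FALSE)
)

# Select columns with subset(); a minus sign drops a column
two_cols  <- subset(my_data, select = c(Crop, Yield))
no_legume <- subset(my_data, select = -Legume)

# A single column is simplified to a vector unless drop = FALSE
yield_vec <- my_data[, "Yield"]
yield_df  <- my_data[, "Yield", drop = FALSE]
```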

h. Some functions for data frames

nrow(my_data)        # Number of rows
[1] 4
ncol(my_data)        # Number of columns
[1] 4
colnames(my_data)    # Column names
[1] "Crop"   "Yield"  "Legume" "Season"
summary(my_data)     # Summary statistics
     Crop               Yield          Legume           Season         
 Length:4           Min.   : 55.00   Mode :logical   Length:4          
 Class :character   1st Qu.: 73.75   FALSE:3         Class :character  
 Mode  :character   Median : 87.50   TRUE :1         Mode  :character  
                    Mean   :103.75                                     
                    3rd Qu.:117.50                                     
                    Max.   :185.00                                     

09. Matrix

In R, a matrix is a two-dimensional, rectangular data structure that stores elements of the same type. It is similar to a data frame in structure but less flexible, as all elements in a matrix must be of a single data type (e.g., numeric, character, or logical).

Key Characteristics of a Matrix

  1. Homogeneous: All elements in a matrix must be of the same type.

  2. 2D Structure: A matrix has rows and columns, forming a table-like structure.

  3. Dimensions: Defined by the number of rows and columns.

Why Use a Matrix?

  1. Mathematical Operations: Ideal for linear algebra and mathematical modeling.

  2. Efficient Storage: Matrices typically use less memory than more complex structures like data frames.

  3. Simpler Operations: Homogeneous data ensures consistent behavior across elements.
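As a brief illustration of the mathematical operations mentioned above, the sketch below contrasts element-wise multiplication (*) with the matrix product (%*%) on a small 2 x 2 matrix. The object names are chosen only for the example.

```r
m <- matrix(1:4, nrow = 2)   # filled column by column: rows are (1, 3) and (2, 4)

elementwise <- m * m         # multiplies matching cells
mat_product <- m %*% m       # linear-algebra matrix product
transposed  <- t(m)          # swaps rows and columns

elementwise[1, ]   # 1 9
mat_product[1, ]   # 7 15
transposed[1, ]    # 1 2
```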

a. Creating a Matrix

You can create a matrix using the matrix() function:

# Create a numeric matrix
my_matrix <- matrix(
  data = 1:9,     # Data values
  nrow = 3,       # Number of rows
  ncol = 3        # Number of columns
)

print(my_matrix)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
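As the output shows, matrix() fills values column by column by default. Setting byrow = TRUE fills row by row instead. A small sketch with illustrative object names:

```r
by_col <- matrix(1:6, nrow = 2)                 # default: fill by column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # fill by row

by_col[1, ]   # 1 3 5
by_row[1, ]   # 1 2 3

dim(by_row)   # 2 3 (rows, columns)
```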

b. Accessing elements in a matrix

Accessing rows:

# Access the first row
my_matrix[1, ]
[1] 1 4 7

Accessing columns:

# Access the second column
my_matrix[, 2]
[1] 4 5 6

Accessing specific elements:

# Access the element in the 2nd row, 3rd column
my_matrix[2, 3]
[1] 8

c. Adding a new column

new_col <- c(10, 11, 12) # Create the column
my_matrix <- cbind(my_matrix, new_col) # Bind it to the existing matrix

d. Adding a new row

new_row <- c(13, 14, 15, 16)
my_matrix <- rbind(my_matrix, new_row)

10. Functions

a. Create a function

We need to use the syntax function(x) { task performed on x }. Here ‘x’ is an “argument”, and the body of the function goes inside the {}. For example:

my_function <- function(x) { x + 1 }

b. Check the function

my_function(9)
[1] 10

c. Write a function with 3 arguments

my_xyz_function <- function(x, y, z) { x + y - z }

d. Order of arguments

Note: R matches unnamed arguments by position, so their order matters:

my_xyz_function(12, 3, 4)
[1] 11
my_xyz_function(12, 4, 3)
[1] 13

e. Specifying arguments with names

If you specify each argument by name (name = value), the order doesn’t matter:

my_xyz_function(z = 4, x = 12, y = 3)
[1] 11
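Arguments can also be given default values, which are used whenever the caller leaves them out. A minimal sketch; add_n() is a hypothetical function written for this example.

```r
# n defaults to 1 when it is not supplied
add_n <- function(x, n = 1) {
  x + n
}

add_n(9)          # uses the default: 10
add_n(9, n = 5)   # overrides the default: 14
```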

f. A more complex function

fx <- function(x, y, remove_na = FALSE) {
  # First operation is a sum; NAs are removed only if remove_na = TRUE
  first <- sum(c(x, y), na.rm = remove_na)
  # Add a text message
  second <- "This function is so cool"
  # Store result
  result <- first + x
  # Print output (print() also returns the list invisibly)
  print(list("Message" = second,
             "1st" = first,
             "end" = result))
}

Run the function with alternative arguments:

fx(x = a, y = b, remove_na = FALSE)
$Message
[1] "This function is so cool"

$`1st`
[1] 30

$end
[1] 50
fx(x = a, y = b, remove_na = TRUE)
$Message
[1] "This function is so cool"

$`1st`
[1] 30

$end
[1] 50

Store the output in an object (the list is still printed because fx() calls print() internally):

foo <- fx(x = b, y = a)
$Message
[1] "This function is so cool"

$`1st`
[1] 30

$end
[1] 40
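If printing on every call is not desired, one alternative is to have the function return the list as its last expression and let the caller decide when to display it. The sketch below is a variant under that assumption; fx_quiet() is a hypothetical name, and the values 10 and 20 stand in for b and a.

```r
# Variant of fx() that returns the list without printing it
fx_quiet <- function(x, y, remove_na = FALSE) {
  first <- sum(c(x, y), na.rm = remove_na)
  list(
    "Message" = "This function is so cool",
    "1st" = first,
    "end" = first + x
  )
}

foo <- fx_quiet(x = 10, y = 20)   # nothing is printed here
foo$end                           # 40
foo[["1st"]]                      # 30

# With remove_na = TRUE, missing values are skipped in the sum
with_na <- fx_quiet(x = NA, y = 20, remove_na = TRUE)
with_na[["1st"]]                  # 20
```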