In R, data manipulation and analysis often require combining vectors and data frames into a single data set. The cbind() and rbind() functions do this efficiently.
Let’s start by creating two vectors that hold the first and last names of players, using the following commands.
f_name <- c("Stephen","Chris","Derrick")
l_name <- c("Cury","Paul","Rose")
Using cbind() in R
The first function we use is cbind(), which stands for column bind. It binds vectors or matrices as columns to create a new matrix. To combine the two vectors we just created by column, use the following command
full_cbind <- cbind(f_name,l_name)
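The above command produces a character matrix with three rows and two columns, one column per vector (f_name and l_name).
It’s important to note that cbind() works element-wise: it pairs the elements of the vectors or matrices by position, so the first elements form the first row, the second elements the second row, and so on.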
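To make the positional pairing concrete, here is a small sketch that rebuilds the matrix and inspects it. It uses only the vectors defined above and base R.

```r
# Same vectors as in the text
f_name <- c("Stephen", "Chris", "Derrick")
l_name <- c("Cury", "Paul", "Rose")

full_cbind <- cbind(f_name, l_name)

is.matrix(full_cbind)   # TRUE: the result is a matrix, not a data frame
dim(full_cbind)         # 3 rows, 2 columns
colnames(full_cbind)    # column names are taken from the vector names
full_cbind[1, ]         # first row pairs the first element of each vector
```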
Using rbind() in R
Just as cbind() combines vectors, matrices, or data frames by column, rbind() combines them by row. Binding vectors or matrices gives a matrix, and binding data frames gives a data frame. Let’s demonstrate it with the vectors created earlier. To stack them as rows, use the following command
full_rbind <- rbind(f_name,l_name)
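The command produces a character matrix with two rows and three columns: the first row (f_name) holds the first names and the second row (l_name) holds the last names.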
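As a quick check, the sketch below shows that for these two vectors the rbind() result holds the same values as the cbind() result, just transposed. It uses only base R and the vectors from the text.

```r
f_name <- c("Stephen", "Chris", "Derrick")
l_name <- c("Cury", "Paul", "Rose")

full_rbind <- rbind(f_name, l_name)

dim(full_rbind)        # 2 rows, 3 columns
rownames(full_rbind)   # row names are taken from the vector names

# Same values as the column-bound matrix, with rows and columns swapped
all(full_rbind == t(cbind(f_name, l_name)))   # TRUE
```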
Combining vectors with data frames
A vector can also be combined with an existing data frame using cbind() or rbind(), depending on whether you want to add it as a column or as a row. To see this, let’s create a data frame using the following command
basketball <- data.frame(f_name = c("Stephen", "Chris" , "Derrick"), l_name = c("Cury", "Paul" , "Rose"))
To create a vector that we will later combine with the data frame, use the following command
age <- c("32" , "29" , "34")
Now that both the data frame and the vector have been created, let’s combine them using the following command
all_details <- cbind(basketball,age)
The above command combines the vector and the data frame into a single data frame with three columns: f_name, l_name, and age.
Note that the vector has as many elements as the data frame has rows (three), so the two combine without any problem.
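The sketch below repeats the combination and then shows what happens when the lengths do not fit. The vector short_age is a hypothetical example that is not part of the original data, and tryCatch() is used only so the error message can be inspected instead of stopping the script.

```r
basketball <- data.frame(
  f_name = c("Stephen", "Chris", "Derrick"),
  l_name = c("Cury", "Paul", "Rose")
)
age <- c("32", "29", "34")

all_details <- cbind(basketball, age)
class(all_details)   # still a data frame
names(all_details)   # the vector name becomes the new column name

# Hypothetical shorter vector: two values do not fit three rows,
# so cbind() stops with an error instead of padding.
short_age <- c("32", "29")
result <- tryCatch(
  cbind(basketball, short_age),
  error = function(e) conditionMessage(e)
)
result   # the captured error message
```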
Combining multiple vectors or data frames using cbind() and rbind()
cbind() and rbind() accept more than two arguments. As an example, take the three vectors created above: f_name, l_name, and age. To combine them, use the following command
three_vectors <- cbind(f_name,l_name,age)
This produces a 3-by-3 character matrix.
It’s important to note that the resulting object consists of the three vectors side by side, each forming one column.
The matrix stored as “three_vectors” can be converted into a data frame with the following command
three_vectors <- as.data.frame(three_vectors)
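One thing to keep in mind with this route: a matrix can hold only one type, so going through cbind() turns every column into text. The sketch below illustrates this with a numeric age vector, age_num, which is an assumption made for this illustration (the text stores ages as strings). Building the data frame directly with data.frame() keeps each column's own type.

```r
f_name <- c("Stephen", "Chris", "Derrick")
l_name <- c("Cury", "Paul", "Rose")
age_num <- c(32, 29, 34)   # assumed numeric ages, for illustration only

# Route 1: matrix first, then data frame. The numbers become text.
via_matrix <- as.data.frame(cbind(f_name, l_name, age_num))
is.numeric(via_matrix$age_num)   # FALSE

# Route 2: build the data frame directly. The numbers stay numeric.
direct <- data.frame(f_name, l_name, age_num)
is.numeric(direct$age_num)       # TRUE
```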
Combining vectors and data frames of different lengths
When you want to combine data of different lengths, you have a few options. For instance, let’s create two data frames with different numbers of rows using the following commands
football <- data.frame(f_name_foot = c("Mathew", "Andrew" , "Brad","Joe"), l_name_foot = c("Ryan", "Redmayne" , "Jones","Gauci"))
basketball <- data.frame(f_name = c("Stephen", "Chris" , "Derrick"), l_name = c("Cury", "Paul" , "Rose"))
Here football has four observations in each column and basketball has three, so the two data frames have different numbers of rows. Their column names differ as well.
Now, if we try to combine football and basketball with either rbind() or cbind(), neither command runs. Let’s try both commands, as given below
missing_rbind <- rbind(basketball,football)
missing_cbind <- cbind(basketball,football)
Both commands fail, but for different reasons. rbind() stops because the column names of the two data frames do not match, and cbind() stops because the data frames have different numbers of rows (3 and 4). In other words, rbind() needs data frames with the same columns, and cbind() needs data frames with the same number of rows.
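The two failures can be inspected side by side with the sketch below. The helper try_bind() is hypothetical and exists only to capture the error message instead of halting the script. The last lines show that once the column names match, rbind() is happy to stack data frames with different numbers of rows.

```r
basketball <- data.frame(
  f_name = c("Stephen", "Chris", "Derrick"),
  l_name = c("Cury", "Paul", "Rose")
)
football <- data.frame(
  f_name_foot = c("Mathew", "Andrew", "Brad", "Joe"),
  l_name_foot = c("Ryan", "Redmayne", "Jones", "Gauci")
)

# Hypothetical helper: return the error message rather than stopping
try_bind <- function(expr) {
  tryCatch(expr, error = function(e) conditionMessage(e))
}

rbind_msg <- try_bind(rbind(basketball, football))
cbind_msg <- try_bind(cbind(basketball, football))
rbind_msg   # error about column names
cbind_msg   # error about the number of rows

# With matching column names, different row counts are fine for rbind()
football_renamed <- football
names(football_renamed) <- names(basketball)
stacked <- rbind(basketball, football_renamed)
nrow(stacked)   # 7
```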
One way to combine these data frames is the bind_rows() function from the dplyr package. To use it, first load the library
library(dplyr)
Next, use the following command to combine the data frames
missing_rbind <- bind_rows(basketball,football)
The result is a data frame with seven rows and four columns. The bind_rows() function matches columns by name and stacks the rows; any column that is missing from one of the data frames is filled with NA values for that data frame’s rows.
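If dplyr is not available, the same NA-filling idea can be approximated in base R. The sketch below is only an illustration of the behaviour described above, not how bind_rows() is implemented; fill_missing() is a hypothetical helper written for this example.

```r
basketball <- data.frame(
  f_name = c("Stephen", "Chris", "Derrick"),
  l_name = c("Cury", "Paul", "Rose")
)
football <- data.frame(
  f_name_foot = c("Mathew", "Andrew", "Brad", "Joe"),
  l_name_foot = c("Ryan", "Redmayne", "Jones", "Gauci")
)

# Hypothetical helper: add every column in all_cols that df lacks,
# filled with NA, and return the columns in a fixed order.
fill_missing <- function(df, all_cols) {
  for (col in setdiff(all_cols, names(df))) {
    df[[col]] <- NA
  }
  df[all_cols]
}

all_cols <- union(names(basketball), names(football))
stacked <- rbind(
  fill_missing(basketball, all_cols),
  fill_missing(football, all_cols)
)
dim(stacked)   # 7 rows, 4 columns
```

Each data frame keeps its own values and gets NA in the columns it did not originally have.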
Another way to combine data frames of different lengths is the merge() function, which merges data frames based on their common columns.
The following commands show how to use merge() to place the columns of both data frames side by side.
football <- cbind("row_no"=row.names(football),football)
basketball <- cbind("row_no"=row.names(basketball),basketball)
missing_cbind <- merge(football,basketball, all=TRUE)
The first two commands add a new column called “row_no” to both the “football” and “basketball” data frames, holding each row’s row number. This gives every row a unique identifier that the two data frames share. The third command merges the data frames on that identifier, and all=TRUE keeps rows that have no match in the other data frame. The result has four rows and five columns (row_no plus the two name columns from each data frame). Because basketball has no fourth row, its f_name and l_name values are NA in that row.
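The sketch below runs the same steps from scratch and checks the shape of the result. Passing by = "row_no" is an optional addition made here to name the key explicitly; with these data frames it selects the same column that merge() would pick on its own.

```r
football <- data.frame(
  f_name_foot = c("Mathew", "Andrew", "Brad", "Joe"),
  l_name_foot = c("Ryan", "Redmayne", "Jones", "Gauci")
)
basketball <- data.frame(
  f_name = c("Stephen", "Chris", "Derrick"),
  l_name = c("Cury", "Paul", "Rose")
)

# Add a shared key column built from the row names
football <- cbind(row_no = row.names(football), football)
basketball <- cbind(row_no = row.names(basketball), basketball)

# all = TRUE keeps the football row that has no basketball partner
missing_cbind <- merge(football, basketball, by = "row_no", all = TRUE)

dim(missing_cbind)                             # 4 rows, 5 columns
missing_cbind[missing_cbind$row_no == "4", ]   # basketball columns are NA here
```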
Similarly, if you have vectors of different lengths, first extend each of them to the length of the longest vector, and then combine them using cbind() or rbind(). To find the maximum length among three vectors, use the following command
m_len <- max(length(f_name), length(l_name),length(age))
Here m_len is 3. Next, we assign that length to each vector; R pads any vector shorter than m_len with NA values. (In this example all three vectors already have three elements, so the assignments leave them unchanged.) The following commands do this
length(f_name) <- m_len
length(l_name) <- m_len
length(age) <- m_len
Once all vectors have the same length, combine them using either function, as shown in the command below
diff_len <- cbind(f_name,l_name,age)
The above command combines the vectors and saves the result under the name diff_len.
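Since the vectors in this article all have three elements, the padding step is easier to see with a variation. In the sketch below, age is deliberately one element short; that shorter vector is an assumption made for illustration and is not part of the original data.

```r
f_name <- c("Stephen", "Chris", "Derrick")
l_name <- c("Cury", "Paul", "Rose")
age <- c("32", "29")   # deliberately one element short, for illustration

m_len <- max(length(f_name), length(l_name), length(age))

# Extending the length pads the missing positions with NA
length(f_name) <- m_len
length(l_name) <- m_len
length(age) <- m_len

diff_len <- cbind(f_name, l_name, age)
dim(diff_len)        # 3 rows, 3 columns
diff_len[3, "age"]   # NA: the padded position
```

Without the padding step, cbind() would recycle the shorter vector and issue a warning, which is rarely what you want.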