## Creating dummy variables in R

In statistical modeling and data analysis, dummy variables are often used to represent categorical data. A dummy variable is a binary variable that takes the value 0 or 1. This article shows how to create dummy variables in R. Let’s create a data set of students containing their scores in different subjects, using the following commands […]

## Count Number of Observations by Group (Category) in R

When analyzing data in R, it is essential to understand the structure of the data and to count the number of observations for each variable or within certain categories. This article deals with counting the number of observations in the diamonds data set. Before starting, first load the data set by using the following

## Fama and French Three-Factor Model

What exactly is the Fama-French three-factor model? Before the introduction of multi-factor asset pricing models, the Capital Asset Pricing Model (CAPM) played a significant role in explaining the changes in stock returns. According to CAPM (which has only one independent variable, the market premium, i.e. market return minus the risk-free rate), the market premium can explain the

## Identify, Remove and Tag Duplicate Observations in R

Data cleaning is one of the most fundamental aspects of data analysis, helping to ensure reliable and accurate results. Duplicate observations, however, can undermine that accuracy and lead to skewed results. Handling duplicate observations in R is straightforward and helps keep the analysis accurate and reliable.
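As a rough illustration of what identifying, tagging and removing duplicates can look like, here is a minimal sketch using base R's `duplicated()`. The `students` data frame, its column names and its values are hypothetical and made up for this example; the full article may use a different data set or approach.

```r
# Hypothetical example data (names and values are assumptions for illustration).
students <- data.frame(
  id    = c(1, 2, 2, 3, 4, 4),
  name  = c("Ali", "Sara", "Sara", "John", "Mina", "Mina"),
  score = c(85, 90, 90, 78, 92, 88),
  stringsAsFactors = FALSE
)

# Identify: duplicated() returns TRUE for rows that exactly repeat an earlier row.
dup_flag <- duplicated(students)

# Tag: keep the flag as a column so duplicates stay visible in the data.
students$is_duplicate <- dup_flag

# Remove: keep only the rows that were not flagged as full-row duplicates.
students_unique <- students[!dup_flag, ]

# Alternative: treat rows as duplicates when only the key column (id) repeats.
students_unique_id <- students[!duplicated(students$id), ]
```

With this data, the full-row check drops one row (the repeated record for id 2), while the id-based check drops two, because id 4 appears twice with different scores. Which rule is appropriate depends on what counts as "the same observation" in your data.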

## Standardization, Normalization and Mean Centering of Variables in R

Standardization, normalization and mean centering of variables are common data processing techniques in statistics and data analysis. Standardizing variables is important because it lets you compare and analyze different variables on the same scale. If you have two variables, one in inches and the other in centimeters, it’s not possible to compare these variables