
In this video we discuss the different types of t-test that we can apply: the one-sample t-test, the two-sample (independent-sample) t-test, and the paired-sample t-test.
00:00 Introduction to video
1:00 One-sample t-test
2:50 Independent-sample t-test
5:30 Paired-sample t-test
Website: thedatahall.com
0:00
Welcome to the Data Hall YouTube channel. In this video we are going to talk about
0:03
t-tests in R. What we are going to do
0:07
is what we call mean comparison. We test whether a mean is
0:15
significantly different from a given value, or whether the means of two different groups
0:20
are the same or different. For that we are going to use the auto data. So let's
0:25
first load the data. We have the data over here. We have
0:29
different firms, different cars, their prices, their mileage, their weight and length, etc.
0:36
And then there is one variable of interest, that is, whether the car is
0:40
domestically produced or foreign produced. So we have different types of t-tests. One is called the
0:49
one-sample t-test. Then we have the two-sample, or what we also call the independent-sample, t-test. Then we have the dependent-sample or
0:58
paired-sample t-test. So let's start with the one-sample t-test. We have this variable called the price of the car. What we want to do
1:06
is take that variable and see whether
1:10
its mean value is different from 6000 or not. So our null hypothesis is that
1:16
the mean price of the car is 6000 in our data set. We are going to use the t.test function,
1:25
take the data and the specific column, specify the mean that we want to test against,
1:31
and press Ctrl+Enter. What we are interested in is the p-value. This p-value is greater than 0.05.
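A minimal sketch of the one-sample test just described. The video's auto data is not reproduced here, so the `auto` data frame below is a simulated stand-in with a hypothetical `price` column (an assumption for illustration only); `t.test` and its `mu` argument are base R.

```r
# Simulated stand-in for the auto data (NOT the real data set from the video).
set.seed(1)
auto <- data.frame(price = rnorm(74, mean = 6000, sd = 3000))

# One-sample t-test. Null hypothesis: the true mean price equals 6000.
res <- t.test(auto$price, mu = 6000)

res$p.value   # compare this against 0.05
res$estimate  # the sample mean of price
```

Changing `mu` changes the value the mean is tested against; if `mu` is left out, `t.test` tests against zero.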
1:40
That means we fail to reject the null hypothesis. And in this case the null hypothesis is that
1:48
the mean value is equal to
1:53
6000. Because we are not able to reject the null hypothesis, we cannot conclude that the mean value is
2:02
different from 6000. The alternative hypothesis is specified over here, and it says
2:07
that the true mean value is not equal to 6000. So our conclusion is that the mean price in this data set
2:13
is not significantly different from 6000. If we were to check whether the mean
2:21
value is different from zero or not, we would specify mu as zero. What this says is that the p-value is quite low; it is
2:31
less than 0.05. So we reject the null hypothesis in favor of the alternative hypothesis, and the alternative hypothesis says that the
2:38
mean value is not equal to zero. So that was the one-sample t-test. We apply
2:42
it when we have data and we want to check its
2:47
mean value against a specific value. Okay, let's move to the two-sample t-test. It's also called the independent-sample t-test for a reason: we assume
2:56
that these samples are independent of one another. So we have this variable called foreign,
3:02
and it records whether the car is domestically produced or a foreign
3:05
produced car. We want to compare the mean price between the two categories of this foreign variable, that is,
3:14
we want to see whether the average price of a car is the same whether the car is produced domestically or
3:23
in a foreign country. So what we do is use the t.test function:
3:28
we take the price variable, use the tilde sign, then specify the
3:32
categorical variable, and of course specify the data. If I press Ctrl+Enter I can see that our p-value is
3:40
greater than 0.05. In this case the null hypothesis is that the
3:46
average price of foreign and domestically produced cars is the same, and in this case we
3:54
fail to reject that null hypothesis. The alternative is that the true difference in means
4:00
between group domestic and group foreign is not equal to zero, that is,
4:05
their means are different. So the null is that the means are the same and the alternative is
4:10
that the means are different. Now in this case we had our data stacked
4:14
in one column, so our categories were stacked on top of one another. But in some cases we would have, let's say, a column
4:20
called foreign price and another column called domestic price. In that case, what do we do?
4:27
Let's generate that data. Using the split function, I am going to generate
4:32
that data. So now we have this groups list (it can be a data frame or a list).
4:40
The idea is that now we have two columns, one for domestic cars and one for foreign cars. We have
4:46
52 observations for domestic cars and 22 observations for foreign cars, and we want to compare their means. The result will be
4:56
exactly the same, but the way we specify it is different: we just
5:00
specify these two columns, and R will
5:05
generate the exact same p-value and the exact same mean values. Previously the
5:13
domestic cars' mean was 6072, and now it is also
5:20
6072. So there are different ways of doing this two-sample t-test, and this
5:29
is what we have demonstrated. Next is the paired-sample t-test, which we also call the dependent-sample t-test, and for that I'm going to use a
5:37
different data set. I'm going to use the chicken weights data, and what we have here is the weights of the chicks
5:47
and the day on which each weight was taken, that is,
5:54
zero means the day of birth and two means
5:58
two days after birth. So what we are going to do is filter the data
6:04
to keep just two categories, weight on day zero and weight on day two, and we want to compare whether their
6:13
mean weight is different or not. But this case is different from our previous data. In the previous data we had two different
6:20
categories and they were totally independent, but in this case the subjects are the same, that is, the same chicks, and we want to compare their
6:28
weights. So what we do is use the paired-sample t-test. We apply the
6:32
function, specify the weight column and then the categorical variable, and specify whether we want to do a
6:42
paired-sample t-test or not. If you do not specify this parameter, it will apply the
6:52
two-sample independent-sample t-test, so this argument is important for
6:58
doing the paired-sample t-test. Let me apply this, and again we are going to look at the p-value.
7:06
It is less than 0.05. In this case the null hypothesis is that the two
7:11
means are the same and the alternative is that the two means are different.
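A minimal sketch of the paired comparison, assuming the data set is R's built-in `ChickWeight` (columns `weight`, `Time`, `Chick`). The video passes the weight column and the categorical variable together with the paired argument; this sketch passes two matched vectors instead, because some recent R versions may not accept `paired = TRUE` in the formula form. The helper names `day0`, `day2` and `both` are hypothetical.

```r
# Built-in data: chick weights recorded on several days after birth.
cw <- as.data.frame(ChickWeight)
cw$Chick <- as.character(cw$Chick)

# Keep only the weights on day 0 and day 2.
day0 <- cw[cw$Time == 0, c("Chick", "weight")]
day2 <- cw[cw$Time == 2, c("Chick", "weight")]

# Match the two measurements chick by chick, so each row is one subject.
both <- merge(day0, day2, by = "Chick", suffixes = c("_day0", "_day2"))

# Paired-sample t-test. Null hypothesis: the mean difference is zero.
res <- t.test(both$weight_day2, both$weight_day0, paired = TRUE)

res$p.value  # compare this against 0.05
```

Without `paired = TRUE`, the same call would run the independent two-sample test and ignore the fact that both weights come from the same chick.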
7:16
So our conclusion, based on the p-value, is that the two
7:21
means are different, because when the p-value is less than 0.05
7:25
we reject the null hypothesis in favor of the alternative hypothesis. So I hope this was useful. Do subscribe to this channel
7:33
and do hit the bell icon.
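For the two-sample comparison from earlier in the video, here is a minimal sketch of both ways of specifying the test: the formula form on stacked data, and the two-column form built with `split`. The `auto` data frame is a simulated stand-in with hypothetical `price` and `foreign` columns (not the real data from the video); only the group sizes of 52 and 22 are taken from the transcript.

```r
# Simulated stand-in for the auto data (NOT the real data set from the video).
set.seed(42)
auto <- data.frame(
  price   = c(rnorm(52, mean = 6000, sd = 3000), rnorm(22, mean = 6400, sd = 2600)),
  foreign = factor(rep(c("Domestic", "Foreign"), times = c(52, 22)))
)

# Form 1: stacked data, outcome ~ grouping variable.
res_formula <- t.test(price ~ foreign, data = auto)

# Form 2: split the prices into one vector per group and pass both vectors.
groups <- split(auto$price, auto$foreign)
res_vectors <- t.test(groups$Domestic, groups$Foreign)

# Both forms run the same test, so the p-values and group means agree.
res_formula$p.value
res_vectors$p.value
res_vectors$estimate
```

By default `t.test` does not assume equal variances in the two groups (the Welch test); passing `var.equal = TRUE` gives the classic pooled-variance version.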


