Interval Estimation of the Population Mean

x <- rnorm(36,80,5)
t.test(x)

    One Sample t-test

data:  x
t = 91.933, df = 35, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 77.76681 81.27893
sample estimates:
mean of x 
 79.52287 
str(t.test(x))
List of 10
 $ statistic  : Named num 91.9
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named num 35
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 2.49e-43
 $ conf.int   : num [1:2] 77.8 81.3
  ..- attr(*, "conf.level")= num 0.95
 $ estimate   : Named num 79.5
  ..- attr(*, "names")= chr "mean of x"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "mean"
 $ stderr     : num 0.865
 $ alternative: chr "two.sided"
 $ method     : chr "One Sample t-test"
 $ data.name  : chr "x"
 - attr(*, "class")= chr "htest"
t.test(x) $ conf.int[1] # lower limit of the confidence interval
[1] 77.76681
t.test(x) $ conf.int[2] # upper limit of the confidence interval
[1] 81.27893
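# Manual 95% CI: sample mean +/- t quantile * standard error
margin_of_error <- qt(0.975, df = 35) * sd(x) / sqrt(36)
ll <- mean(x) - margin_of_error  # lower limit
up <- mean(x) + margin_of_error  # upper limit
ci <- c(ll, up)
ci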
[1] 77.76681 81.27893
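
The manual calculation above is the usual t-based interval for a mean. Written out, with the sample mean $\bar{x}$, the sample standard deviation $s$, and the sample size $n$ (notation introduced here for illustration, not used in the original code):

```latex
\[
\bar{x} \;\pm\; t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}},
\qquad n = 36,\quad n - 1 = 35
\]
```

Here $t_{0.975,\,35}$ is the value returned by `qt(0.975, 35)`, and $s/\sqrt{n}$ is the standard error, which is why the result agrees with the interval reported by `t.test(x)`.
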
x <- rnorm(36,80,5)
t.test(x)

    One Sample t-test

data:  x
t = 109.62, df = 35, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 79.05487 82.03822
sample estimates:
mean of x 
 80.54655 
str(t.test(x))
List of 10
 $ statistic  : Named num 110
  ..- attr(*, "names")= chr "t"
 $ parameter  : Named num 35
  ..- attr(*, "names")= chr "df"
 $ p.value    : num 5.37e-46
 $ conf.int   : num [1:2] 79.1 82
  ..- attr(*, "conf.level")= num 0.95
 $ estimate   : Named num 80.5
  ..- attr(*, "names")= chr "mean of x"
 $ null.value : Named num 0
  ..- attr(*, "names")= chr "mean"
 $ stderr     : num 0.735
 $ alternative: chr "two.sided"
 $ method     : chr "One Sample t-test"
 $ data.name  : chr "x"
 - attr(*, "class")= chr "htest"
t.test(x) $ conf.int[1]
[1] 79.05487
t.test(x) $ conf.int[2]
[1] 82.03822
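
By default `t.test` reports a 95% interval. The sketch below shows one way to store the result once and to request a different level through the `conf.level` argument. The seed and the names `res95` and `res90` are assumptions made for this illustration and do not appear in the original notes.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x <- rnorm(36, 80, 5)       # same setup as above: n = 36, mean 80, sd 5

res95 <- t.test(x)                      # default conf.level = 0.95
res90 <- t.test(x, conf.level = 0.90)   # request a 90% interval

res95$conf.int
res90$conf.int

# The 90% interval is narrower than the 95% interval for the same sample
diff(res90$conf.int) < diff(res95$conf.int)
```

Storing the result in an object avoids calling `t.test` once per limit when both ends of the interval are needed.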

Understanding the Confidence Level

# Understanding the confidence level


# Draw 5000 samples of size 36 from N(80, 5^2) and store each 95% CI
ci <- array(0, dim = c(5000, 2))
for (i in 1:5000) {
  x <- rnorm(36, 80, 5)
  ci[i, ] <- t.test(x)$conf.int  # lower and upper limits from a single call
}

head(ci)
         [,1]     [,2]
[1,] 77.92869 81.13243
[2,] 78.00351 81.05289
[3,] 78.66402 81.66217
[4,] 78.47628 81.56928
[5,] 78.20933 80.71063
[6,] 79.37287 82.78821
# Proportion of the 5000 intervals that contain the true mean 80
mean((ci[,1] < 80) & (ci[,2] > 80))
[1] 0.9538
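
The observed proportion 0.9538 is close to the nominal 0.95 but not equal to it, because it is itself an estimate based on 5000 simulated intervals. As a rough sketch, the size of that simulation error can be gauged with the binomial standard error of a proportion. The names `p_hat`, `n_sim`, `mc_se`, and `z` are introduced here for illustration and are not part of the original code.

```r
# Illustrative names, not from the original notes
p_hat <- 0.9538   # coverage proportion reported above
n_sim <- 5000     # number of simulated intervals

# Standard error of a proportion if the true coverage is exactly 0.95
mc_se <- sqrt(0.95 * 0.05 / n_sim)
mc_se

# Distance of the observed proportion from 0.95, in standard errors
z <- (p_hat - 0.95) / mc_se
z
```

The standard error is roughly 0.003, so 0.9538 lies a little more than one standard error above 0.95, which is consistent with ordinary simulation noise.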

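Chapter Exercises

The population follows a normal distribution with mean 15 and standard deviation 2.

Draw a sample of size 50 from the population and use it to construct a 95% confidence interval for the population mean. Repeat this process 1000 times, so that the 1000 samples yield 1000 95% confidence intervals. What proportion of these 1000 intervals contain the population mean?

Draw a sample of size 50 from the population and use it to construct a 90% confidence interval for the population mean. Repeat this process 1000 times, so that the 1000 samples yield 1000 90% confidence intervals. What proportion of these 1000 intervals contain the population mean?

Submission requirements: upload your code together with the output shown in the console.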