Dataset: mtcars
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
1.1
Use select() to extract 5 variables from mtcars and create a new data frame.
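A minimal sketch of this step. The five columns and the name `cars5` are illustrative choices, not prescribed by the task.

```r
library(dplyr)

# Keep five columns of mtcars; which five to keep is an arbitrary choice here.
cars5 <- mtcars %>%
  select(mpg, cyl, hp, wt, am)

# Inspect the structure of the new data frame.
str(cars5)
```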
1.2
Use mutate() to append a new variable to the data frame, converting the fuel-economy variable mpg (miles per gallon) into kilometers per liter.
1 mile per gallon ≈ 0.425 kilometers per liter
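The conversion can be sketched as below, using the factor 0.425 stated above. The names `mpg_to_kpl`, `kpl`, and `cars_kpl` are made up for this example.

```r
library(dplyr)

# Conversion factor given in the task: 1 mpg is roughly 0.425 km/L.
mpg_to_kpl <- 0.425

# Append the converted variable as a new column; mpg itself is left unchanged.
cars_kpl <- mtcars %>%
  mutate(kpl = mpg * mpg_to_kpl)

# Compare the original and converted values for the first few cars.
head(select(cars_kpl, mpg, kpl))
```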
1.3
Pick any variable in mtcars and use if_else() to recode its values conditionally.
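One possible recoding, assuming the exercise is meant to run on the mtcars data frame named at the top. It turns the 0/1 code in `am` into a readable label (in mtcars, 0 = automatic and 1 = manual). The column name `transmission` and the object `cars_am` are illustrative.

```r
library(dplyr)

# Recode the numeric transmission code into a character label.
cars_am <- mtcars %>%
  mutate(transmission = if_else(am == 0, "automatic", "manual"))

# Tabulate the new variable to check the recoding.
count(cars_am, transmission)
```

if_else() is stricter than base ifelse(): both branches must have compatible types, which catches accidental mixing of numbers and strings.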
1.4
Set 2 filtering conditions and use filter() to select cases from mtcars and create a new data frame.
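A sketch with two conditions, again assuming mtcars is the intended data. The thresholds (mpg above 25, weight below 2.5, i.e. 2,500 lb) and the name `light_efficient` are arbitrary illustrations.

```r
library(dplyr)

# Conditions separated by a comma are combined with AND.
light_efficient <- mtcars %>%
  filter(mpg > 25, wt < 2.5)

light_efficient
```

To keep rows that satisfy either condition instead, the two expressions can be joined with `|` inside a single filter() call.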
1.5
Report the correlation matrix of disp, hp, drat, wt, and qsec, and visualize it.
## corrplot 0.92 loaded
mtcars %>% 
  select(disp:qsec) %>% 
  cor() %>% 
  round(3) %>% 
  corrplot(addCoef.col = "white",
           number.cex = 0.8,
           number.digits = 3,
           tl.cex = 0.8,
           tl.col = 1,
           cl.length = 11,
           type = "upper",
           method = "square")

mtcars %>% 
  select(disp:qsec) %>% 
  cor() %>% 
  round(3) %>% 
  corrplot(addCoef.col = "white",
           col =c("cyan","white","purple"), 
           number.cex = 0.8,
           number.digits = 3,
           tl.cex = 0.8,
           tl.col = 1,
           cl.length = 11,
           type = "lower",
           method = "circle")

mtcars %>% 
  select(disp:qsec) %>% 
  cor() %>% 
  round(3) %>%
  corrplot.mixed(upper = "circle",
                 lower = "number",
                 addgrid.col = "grey",
                 tl.col = "black")
HTML COLOR CODES: https://htmlcolorcodes.com
Colors in R: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
gr <- colorRampPalette(c("cyan","white","purple"))
mtcars %>% 
  select(mpg:qsec) %>% 
  cor.plot(cex = 0.6,
           stars = TRUE,
           gr = gr,
           cex.axis = 0.8)

gr <- colorRampPalette(c("#E9E8FF", "white", "#A22491"))
mtcars %>% 
  select(mpg:qsec) %>% 
  cor.plot(cex = 0.6,
           stars = TRUE,
           gr = gr,
           cex.axis = 0.8)
Sort the absolute values of the correlation coefficients in descending order
## Returning only the top 15. You may override with the 'top' argument
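The call that produced this ranking is not shown. One way to get a comparable ranking with base R and dplyr is sketched below; this is not necessarily the function the author used, and the names `cor_mat`, `cor_pairs`, and `abs_r` are illustrative.

```r
library(dplyr)

# Correlation matrix of the five variables from task 1.5.
vars <- c("disp", "hp", "drat", "wt", "qsec")
cor_mat <- cor(mtcars[, vars])

# Take the upper triangle so that each pair of variables appears only once.
idx <- which(upper.tri(cor_mat), arr.ind = TRUE)

# One row per pair, sorted by the absolute value of the correlation.
cor_pairs <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r    = cor_mat[idx]
) %>%
  mutate(abs_r = abs(r)) %>%
  arrange(desc(abs_r))

cor_pairs
```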
1.6
Pick any two quantitative variables and draw a scatter plot with ggplot; set the color, shape, and size of the points and add a regression line.
# Mapping a numeric variable to color produces a continuous gradient
mtcars %>% 
  ggplot(aes(wt,mpg,col = am))+
  geom_point()+
  geom_smooth(method = lm,
              se = F)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
##   the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
##   variable into a factor?
# Use factor() to convert the numeric variable into a factor
mtcars %>% 
  ggplot(aes(wt,mpg,col = factor(am)))+
  geom_point()+
  geom_smooth(method = lm,
              se = F)+
  scale_color_manual(values = c("cyan4", "purple"),
                     label = c("automatic",
                               "manual"))+
  labs(title = "Scatter Diagram of Weight vs. MPG",
       x = "Weight",
       y = "Miles per Gallon",
       col = "transmission")+
  # theme_bw() goes first: a complete theme added later would reset the centered title
  theme_bw()+
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
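# Create a character variable for transmission type (am: 0 = automatic, 1 = manual)
mtcars %>% 
  mutate(transmission = if_else(am == 0, "automatic", "manual")) %>% 
  ggplot(aes(wt,mpg,col = transmission))+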
  geom_point()+
  geom_smooth(method = lm,
              se = F)+
  scale_color_manual(values = c("cyan4", "purple"),
                     label = c("automatic",
                               "manual"))+
  labs(title = "Scatter Diagram of Weight vs. MPG",
       x = "Weight",
       y = "Miles per Gallon",
       col = "transmission")+
  # theme_bw() goes first: a complete theme added later would reset the centered title
  theme_bw()+
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
# Map color to factor(cyl): discrete colors and one regression line per cylinder group
mtcars %>% 
  ggplot(aes(wt,mpg,col = factor(cyl)))+
  geom_point()+
  geom_smooth(method = lm,
              se = F)+
  labs(title = "Scatter Diagram of Weight vs. MPG",
       x = "Weight",
       y = "Miles per Gallon",
       col = "Number of Cylinders")+
  # theme_bw() goes first: a complete theme added later would reset the centered title
  theme_bw()+
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
1.7
Split the 32 cars into two groups by vs (Engine: 0 = V-shaped, 1 = straight) and draw a grouped scatter plot with ggplot; set the color, shape, and size of the points and add a regression line.
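# Map color and shape to discrete variables
# Note: aes(shape = factor(am)) is placed in ggplot() here, so every layer inherits it, including geom_smooth()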
mtcars %>% 
  ggplot(aes(wt,mpg,
         col = factor(vs),
         size = cyl,
         shape = factor(am)))+
  geom_point()+
  geom_smooth(method = lm, se = F)+
  scale_color_manual(values = c("deeppink", "darkviolet"),
                     label = c("V-shaped",
                               "Straight"))+
  labs(title = "Scatter Diagram of Weight vs. MPG",
       x = "Weight",
       y = "Miles per Gallon",
       col = "Engine",
       shape = "Transmission")+
  # theme_bw() goes first: a complete theme added later would reset the centered title
  theme_bw()+
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
# Note: aes(shape = factor(am)) is placed in geom_point() here, so it applies to the points only
mtcars %>% 
  ggplot(aes(wt,mpg,
         col = factor(vs),
         size = cyl))+
  geom_point(aes(shape = factor(am)))+
  geom_smooth(method = lm,se = F)+
  scale_color_manual(values = c("deeppink", "darkviolet"),
                     label = c("V-shaped",
                               "Straight"))+
  scale_shape_manual(values = c("circle","triangle"),
    label = c("automatic","manual"))+
  labs(title = "Scatter Diagram of Weight vs. MPG",
       x = "Weight",
       y = "Miles per Gallon",
       col = "Engine",
       shape = "Transmission",
       size = "Number of Cylinders")+
  # theme_bw() goes first: a complete theme added later would reset the centered title
  theme_bw()+
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'