I. Basic data frame operations
1. Read the dataset into a data frame and view the first few rows
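# read.csv() already returns a data frame
data <- read.csv("../input/ab_data.csv", header = TRUE)
# Explicit conversion; a no-op here because data is already a data frame
data <- data.frame(data)
# View the first six rows
head(data)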
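As a self-contained sketch of the same step, the snippet below writes a tiny CSV to a temporary file and reads it back. The three rows are made up for illustration (they are not taken from ab_data.csv); only the column names mirror the ones used in this post. It suggests that the result of read.csv() is already a data frame, so wrapping it in data.frame() is optional.

```r
# Toy CSV with made-up rows; column names mirror those used in this post
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "user_id,timestamp,group,landing_page,converted",
  "1,2017-01-21 22:11:48,control,old_page,0",
  "2,2017-01-12 08:01:45,treatment,new_page,1",
  "3,2017-01-11 16:55:06,treatment,old_page,0"
), csv_path)

toy <- read.csv(csv_path, header = TRUE)

is.data.frame(toy)   # TRUE: no extra conversion needed
head(toy, n = 2)     # first two rows instead of the default six
str(toy)             # column types at a glance
```

head() takes an n argument, which is handy when six rows are too many or too few.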
2. Select specific rows, e.g. the rows where group is "treatment" and landing_page is "old_page"
library(dplyr)  # provides %>% and filter()
get1 <- data %>% filter(group == "treatment" & landing_page == "old_page")
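If dplyr is not available, the same row selection can be sketched in base R. The toy data frame below is made up for illustration; logical indexing and subset() are standard base R, shown here as an alternative rather than as the method this post relies on.

```r
# Made-up rows for illustration
toy <- data.frame(
  user_id      = 1:4,
  group        = c("control", "treatment", "treatment", "control"),
  landing_page = c("old_page", "new_page", "old_page", "new_page")
)

# Logical indexing: keep the rows where both conditions hold
mask <- toy$group == "treatment" & toy$landing_page == "old_page"
hits <- toy[mask, ]

# subset() expresses the same condition without repeating toy$
hits2 <- subset(toy, group == "treatment" & landing_page == "old_page")

hits$user_id   # 3
```

Note the comma in toy[mask, ]: the part before it selects rows, and leaving the part after it empty keeps all columns.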
3. Extract a column by name: just use $
data1 <- data$converted
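A short sketch of the related ways to pull a column, on a made-up data frame. It also shows names(), which is what returns the column names themselves, as opposed to the values in a column.

```r
# Made-up values for illustration
toy <- data.frame(user_id = c(10, 11, 12), converted = c(0, 1, 0))

v1 <- toy$converted        # plain vector
v2 <- toy[["converted"]]   # same vector, with the name given as a string
d1 <- toy["converted"]     # one-column data frame, not a vector

names(toy)                 # "user_id" "converted"
```

The [[ ]] form is the one to reach for when the column name is stored in a variable.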
4. Create a new column; here, extract the date from timestamp
data$day_date <- as.Date(data$timestamp)
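To see what as.Date() does to a timestamp string, here is a minimal sketch. The two timestamps are made up and assume a "YYYY-MM-DD HH:MM:SS" layout; if the real column uses a different layout, an explicit format argument would be needed.

```r
# Made-up timestamps in "YYYY-MM-DD HH:MM:SS" layout (an assumption)
ts <- c("2017-01-21 22:11:48", "2017-01-12 08:01:45")

d <- as.Date(ts)                          # the time-of-day part is dropped
d_explicit <- as.Date(ts, format = "%Y-%m-%d")

class(d)     # "Date"
format(d)    # "2017-01-21" "2017-01-12"
```

Because the result has class Date, it can be sorted, compared, and subtracted directly.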
5. Extract unique values, similar to unique in Python
unique_id <- unique(data$user_id)
# Number of distinct users
length(unique_id)
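A small sketch on a made-up vector of ids shows unique() next to duplicated(), which is useful for finding which values repeat rather than just counting the distinct ones.

```r
# Made-up ids for illustration
ids <- c(101, 102, 101, 103, 102)

unique(ids)             # 101 102 103, in order of first appearance
length(unique(ids))     # 3
sum(duplicated(ids))    # 2 entries repeat an earlier value
ids[duplicated(ids)]    # 101 102
```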
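6. Combine rows and columns: rbind and cbind
notaligned_user <- data %>% filter(group == "treatment" & landing_page == "old_page")
notaligned_user2 <- data %>% filter(group == "control" & landing_page == "new_page")
# rbind: the column count stays the same and the row counts add up, like stacking cats
notaligned_user_all <- rbind(notaligned_user, notaligned_user2)
# cbind: the row count stays the same and the column counts add up; column names are carried over too.
# cbind needs equal row counts, so both pieces are trimmed to the shorter one first
n <- min(nrow(notaligned_user), nrow(notaligned_user2))
notaligned_user_wide <- cbind(notaligned_user[seq_len(n), ], notaligned_user2[seq_len(n), ])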
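The shape rules for rbind and cbind are easiest to check on small inputs. The data frames below are made up for illustration; the last call is wrapped in tryCatch() to show that cbind refuses data frames whose row counts do not match.

```r
# Made-up data frames for illustration
a <- data.frame(user_id = 1:2, group = c("treatment", "treatment"))
b <- data.frame(user_id = 3:5, group = c("control", "control", "control"))

# rbind: same columns, rows stacked
stacked <- rbind(a, b)
dim(stacked)   # 5 2

# cbind: same rows, columns placed side by side
extra <- data.frame(converted = c(0, 1))
wide <- cbind(a, extra)
dim(wide)      # 2 3

# cbind with 2 rows against 3 rows fails; capture the error instead of stopping
res <- tryCatch(cbind(a, b), error = function(e) "error")
res            # "error"
```

rbind also expects the column names of both inputs to match, which holds here because both pieces have the same columns.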
7. Get the number of rows: use nrow (ncol gives the number of columns)
data_row_num <- nrow(data)
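Since nrow and ncol are easy to mix up, here is a quick sketch on a made-up data frame with four rows and three columns.

```r
# Made-up data frame: 4 rows, 3 columns
toy <- data.frame(a = 1:4, b = letters[1:4], c = c(TRUE, FALSE, TRUE, FALSE))

nrow(toy)   # 4
ncol(toy)   # 3
dim(toy)    # 4 3 (rows first, then columns)
```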
ggplot reference:
http://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization