How to split a ggplot histogram on a boolean
Using a public dataset (note that downloading and parsing it takes a few minutes):
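library(data.table)  # fread()
library(stringr)     # str_split_fixed()
library(dplyr)       # %>%, mutate(), left_join()
library(ggplot2)     # ggplot(), geom_histogram()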
dl <- tempfile()
download.file("http://files.grouplens.org/datasets/movielens/ml-10m.zip",dl)
ratings <- fread(text = gsub("::","\t",readLines(unzip(dl,"ml-10M100K/ratings.dat"))),col.names = c("userId","movieId","rating","timestamp"))
movies <- str_split_fixed(readLines(unzip(dl,"ml-10M100K/movies.dat")),"\\::",3)
colnames(movies) <- c("movieId","title","genres")
if (as.numeric(version$year) < 2020 | (version$year=="2020" & as.numeric(version$month) < 3)){
# if using R 3.6 or earlier
movies <- as.data.frame(movies) %>% mutate(movieId = as.numeric(levels(movieId))[movieId],title = as.character(title),genres = as.character(genres))
} else {
# if using R 4.0 or later
movies <- as.data.frame(movies) %>% mutate(movieId = as.numeric(movieId),genres = as.character(genres))}
movielens <- left_join(ratings,movies,by = "movieId")
The result looks like this:
> head(movielens)
userId movieId rating timestamp title genres
1: 1 122 5 838985046 Boomerang (1992) Comedy|Romance
2: 1 185 5 838983525 Net,The (1995) Action|Crime|Thriller
3: 1 231 5 838983392 Dumb & Dumber (1994) Comedy
4: 1 292 5 838983421 Outbreak (1995) Action|Drama|Sci-Fi|Thriller
5: 1 316 5 838983392 Stargate (1994) Action|Adventure|Sci-Fi
6: 1 329 5 838983392 Star Trek: Generations (1994) Action|Adventure|Drama|Sci-Fi
I'm trying to split a ggplot histogram with fill to show the difference between whole-star and half-star ratings:
movielens %>%
mutate(whole = rating == round(rating)) %>%
ggplot(mapping=aes(x=rating),fill=whole) +
geom_histogram()
I want this because half-star ratings are less common, but for some reason fill is not working...
Solution
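You need to put fill inside the aesthetics (that is, the mapping built by aes()) rather than passing it as a separate argument to the ggplot() call: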
library(data.table)
library(stringr)
library(dplyr)
library(ggplot2)
dl <- tempfile()
download.file("http://files.grouplens.org/datasets/movielens/ml-10m.zip",dl)
ratings <- fread(text = gsub("::","\t",readLines(unzip(dl,"ml-10M100K/ratings.dat"))),col.names = c("userId","movieId","rating","timestamp"))
movies <- str_split_fixed(readLines(unzip(dl,"ml-10M100K/movies.dat")),"\\::",3)
colnames(movies) <- c("movieId","title","genres")
if (as.numeric(version$year) < 2020 | (version$year=="2020" & as.numeric(version$month) < 3)){
# if using R 3.6 or earlier
movies <- as.data.frame(movies) %>% mutate(movieId = as.numeric(levels(movieId))[movieId],title = as.character(title),genres = as.character(genres))
} else {
# if using R 4.0 or later
movies <- as.data.frame(movies) %>% mutate(movieId = as.numeric(movieId),genres = as.character(genres))}
movielens <- left_join(ratings,movies,by = "movieId")
movielens %>%
mutate(whole = rating == round(rating)) %>%
ggplot(mapping=aes(x=rating,fill=whole)) +
geom_histogram()
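To check the difference without downloading the full dataset, here is a minimal sketch on a small made-up data frame. The names toy, p_outside and p_inside are hypothetical and only used for illustration, and the ratings are invented. It compares which aesthetics end up in the plot's mapping in each case. This assumes a ggplot2 version where the mapping is accessible as p$mapping and where the extra fill argument is silently ignored, which may not hold in every release.

```r
library(ggplot2)

# Hypothetical toy data standing in for movielens (values are made up)
toy <- data.frame(rating = c(5, 4.5, 3, 3.5, 4, 4, 2.5, 5, 3, 1))
toy$whole <- toy$rating == round(toy$rating)

# fill passed next to aes(): it is just an extra argument to ggplot()
p_outside <- ggplot(toy, mapping = aes(x = rating), fill = whole)

# fill passed inside aes(): it becomes part of the aesthetic mapping
p_inside <- ggplot(toy, mapping = aes(x = rating, fill = whole))

# Inspect which aesthetics each plot actually maps
print(names(p_outside$mapping))
print(names(p_inside$mapping))

# Only the second plot should colour the bars by whole
p_inside + geom_histogram(binwidth = 0.5)
```

Only the second mapping should list fill, which is why the bars in the original plot were not split.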