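How to run a Fisher test or chi-squared test in RStudio
This question uses the data file at https://people.ucsc.edu/~mclapham/eart125/data/georoc.csv
Rhyolites can be divided into high-silica types (rhyolites with SiO2 greater than 75%) and more typical rhyolites (SiO2 less than 75%). Do the proportions of the two rhyolite types differ significantly between intraplate and convergent-margin tectonic settings? Enter the p-value below:
I tried to put this together with the following code: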
kick <- matrix(c(georoc$tectonic.setting == "Intraplate" | georoc$tectonic.setting == "Convergent margin", georoc$SIO2), ncol = 2)
chisq.test(kick)
This is what I got:
Pearson's Chi-squared test
data: kick
X-squared = 380.59, df = 999, p-value = 1
Warning message:
In chisq.test(kick) : Chi-squared approximation may be incorrect
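What am I doing wrong, and how can I fix it? I am new to R.
Solution
I believe this is correct, but I don't know this particular domain, so treat it with some caution!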
library(tidyverse)

# Read the GEOROC data set
data <- read_csv('https://people.ucsc.edu/%7Emclapham/eart125/data/georoc.csv')

data_tidy <- data %>%
  # Keep rhyolites only and flag the high-silica samples
  filter(rock.type == "Rhyolite") %>%
  mutate(high_SiO2 = SIO2 > 75) %>%
  select("setting" = tectonic.setting, "type" = rock.type, high_SiO2) %>%
  # Count samples per tectonic setting and silica class
  group_by(setting, type) %>%
  count(high_SiO2) %>%
  ungroup() %>%
  # Restrict to the two settings being compared
  filter(setting %in% c("Convergent margin", "Intraplate")) %>%
  select(-type) %>%
  # Reshape to a 2 x 2 table: one row per setting, one column per class
  pivot_wider(names_from = high_SiO2, values_from = n) %>%
  select(setting, "low_SiO2" = `FALSE`, "high_SiO2" = `TRUE`) %>%
  column_to_rownames(var = "setting") %>%
  as.matrix()
This gives the following output:
                  low_SiO2 high_SiO2
Convergent margin       62        10
Intraplate              43        22
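As a side note, base R's table() can build the same kind of contingency table directly from two vectors, which is what the matrix(c(...)) call in the question was missing: chisq.test() needs counts, not the raw values. The sketch below is only an illustration. It rebuilds row-level data from the counts shown above rather than reading the file, and the vector names setting and high_SiO2 are assumptions chosen to match the answer.

```r
# Rebuild row-level data from the counts in the table above
# (72 convergent-margin and 65 intraplate rhyolites). With the real file,
# these two vectors would come from the filtered georoc data instead.
setting <- rep(c("Convergent margin", "Intraplate"), times = c(72, 65))
high_SiO2 <- c(rep(c(FALSE, TRUE), times = c(62, 10)),
               rep(c(FALSE, TRUE), times = c(43, 22)))

# table() cross-tabulates the two vectors into a 2 x 2 table of counts
tab <- table(setting, high_SiO2)
print(tab)
```

The resulting table can be passed to chisq.test() in the same way as the matrix.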
We can then run:
chisq.test(data_tidy)
...which gives:
Pearson's Chi-squared test with Yates' continuity correction
data: data_tidy
X-squared = 6.5263, df = 1, p-value = 0.01063
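I was initially worried that I had mixed up the rows and columns of the matrix, but I don't think that matters for this test: the chi-squared test of independence treats rows and columns symmetrically, so transposing the table gives the same statistic and p-value.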
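For anyone who wants to check the row/column point, or to try the Fisher test mentioned in the title, here is a minimal sketch. It types in the counts from the output above instead of reading the file, and the object names (counts, chi_orig, chi_flip, fisher_res) are just illustrative.

```r
# 2 x 2 table of counts copied from the output above
counts <- matrix(c(62, 10,
                   43, 22),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Convergent margin", "Intraplate"),
                                 c("low_SiO2", "high_SiO2")))

# Chi-squared test on the table and on its transpose
chi_orig <- chisq.test(counts)
chi_flip <- chisq.test(t(counts))

# TRUE if swapping rows and columns leaves the p-value unchanged
print(all.equal(chi_orig$p.value, chi_flip$p.value))

# Fisher's exact test on the same table, as an alternative for small counts
fisher_res <- fisher.test(counts)
print(fisher_res$p.value)
```

With counts this large, both tests should be usable; Fisher's exact test is mainly of interest when expected cell counts are small.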