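R Beginners' Study Session, Part 2
- Today is the second session
- Looking at the relationship between two variables
- Randomly generating two-variable data
- Binary and continuous variables
- How R stores data: vectors, data frames, matrices, lists
- Functions for generating random data: sample(), rnorm()
- Functions for examining the relationship between two variables: chisq.test(), cor()
- Next time: "Writing functions"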
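
The four data containers listed above can be sketched as follows. This is only a minimal illustration; the names `v`, `df`, `mat` and `lst` are made up for this example and are not used elsewhere in the notebook.

```{r}
# Vector: a one-dimensional sequence of values of a single type
v <- c(0, 1, 1, 0)

# Data frame: a table whose columns are equal-length vectors, possibly of different types
df <- data.frame(x = v, label = c("a", "b", "c", "d"))

# Matrix: a two-dimensional array of a single type, filled column by column
mat <- matrix(1:6, nrow = 2, ncol = 3)

# List: an ordered collection whose elements may differ in type and length
lst <- list(numbers = v, table = df, grid = mat)

str(v)
str(df)
dim(mat)
names(lst)
```
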
---
title: "Untitled"
output: html_document
---
```{r}
n <- 100
# Alternate 0 and 1: 0,1,0,1,...
X <- rep(0:1,50)
X
```
```{r}
# 50 zeros followed by 50 ones
X. <- rep(0:1,each=50)
X.
```
```{r}
# n random draws of 0 or 1 with equal probability
X.. <- sample(0:1,n,prob=c(0.5,0.5),replace = TRUE)
X..
sum(X..)
```
```{r}
# Repeat the random draw n.iter times and record the number of ones each time
n.iter <- 1000
num.record <- rep(0,n.iter)
for(i in 1:n.iter){
  X.. <- sample(0:1,n,prob=c(0.5,0.5),replace = TRUE)
  num.record[i] <- sum(X..)
}
```
```{r}
str(num.record)
```
```{r}
mean(num.record)
```
```{r}
# Median by hand: average of the two middle values of the 1000 sorted counts
sorted <- sort(num.record)
(sorted[500] + sorted[501])/2
```
```{r}
sorted[496:505]
```
```{r}
plot(sorted)
abline(h=50,col=2)
```
```{r}
hist(num.record)
```
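
The number of ones among `n` fair 0/1 draws follows a binomial distribution with mean `n * 0.5` and standard deviation `sqrt(n * 0.5 * 0.5)`, which is why the histogram is centered near 50. The sketch below repeats the simulation with a fixed seed (an assumption added here for reproducibility) and compares the simulated summary with these theoretical values. `rbinom()` is shown as a one-line alternative to the loop; the names `counts`, `counts2`, `theo.mean` and `theo.sd` are introduced only for this example.

```{r}
set.seed(1)  # fixed seed, added only so that this sketch is reproducible
n <- 100
n.iter <- 1000

# Count the ones in n fair 0/1 draws, repeated n.iter times
counts <- replicate(n.iter, sum(sample(0:1, n, replace = TRUE)))

# Theoretical mean and standard deviation of Binomial(n, 0.5)
theo.mean <- n * 0.5
theo.sd <- sqrt(n * 0.5 * 0.5)

c(simulated = mean(counts), theoretical = theo.mean)
c(simulated = sd(counts), theoretical = theo.sd)

# The same kind of counts drawn directly from the binomial distribution
counts2 <- rbinom(n.iter, size = n, prob = 0.5)
mean(counts2)
```
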
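```{r}
# Probability that the outcome is 1 in the X = 0 group (p0) and in the X = 1 group (p1)
p0 <- 0.7
p1 <- 0.75
```
```{r}
# Group sizes: number of records with X = 0 and with X = 1
n0 <- n - sum(X..)
n1 <- sum(X..)
n0
n1
```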
```{r}
res0 <- sample(0:1,n0,prob=c(1-p0,p0),replace=TRUE)
mean(res0)
```
```{r}
res1 <- sample(0:1,n1,prob=c(1-p1,p1),replace=TRUE)
mean(res1)
```
```{r}
X..
res0
res1
```
```{r}
which(X.. == 0)
```
```{r}
# 9 is only a placeholder; every entry is overwritten with 0 or 1 below
Y <- rep(9,n)
Y[which(X.. == 0)] <- res0
Y
```
```{r}
Y[which(X.. == 1)] <- res1
Y
```
```{r}
data <- data.frame(X..,Y)
data
```
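
The chunks above build `Y` in two passes, one per group. An equivalent one-step construction can be sketched by giving each record its own success probability and drawing one Bernoulli value per record with `rbinom()`. The seed and the names `x.demo`, `p.each` and `y.demo` are assumptions made for this example; this is an alternative formulation, not the method used in the notebook.

```{r}
set.seed(4)  # fixed seed, added only for reproducibility of this sketch
n <- 100
p0 <- 0.7
p1 <- 0.75

# Group labels, as in the notebook
x.demo <- sample(0:1, n, replace = TRUE)

# Success probability for each record, chosen by its group
p.each <- ifelse(x.demo == 1, p1, p0)

# One Bernoulli draw per record
y.demo <- rbinom(n, size = 1, prob = p.each)

table(x.demo, y.demo)
```
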
```{r}
tab <- table(data)
```
```{r}
chisq.test(tab)
```
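
For a 2x2 table, `chisq.test()` applies Yates' continuity correction by default. As a cross-check, the uncorrected Pearson statistic can be computed by hand from the expected counts and compared with `chisq.test(..., correct = FALSE)`. The sketch below uses a small made-up table (`tab.demo`, not the notebook's `tab`) so that it runs on its own; `expected`, `stat`, `p` and `res` are likewise names introduced here.

```{r}
# A made-up 2x2 table of counts, used only for this illustration
tab.demo <- matrix(c(15, 35, 12, 38), nrow = 2)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(tab.demo), colSums(tab.demo)) / sum(tab.demo)

# Pearson chi-square statistic without continuity correction
stat <- sum((tab.demo - expected)^2 / expected)

# p-value from the chi-square distribution with 1 degree of freedom
p <- pchisq(stat, df = 1, lower.tail = FALSE)

res <- chisq.test(tab.demo, correct = FALSE)
c(by.hand = stat, chisq.test = unname(res$statistic))
c(by.hand = p, chisq.test = res$p.value)
```
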
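```{r}
# 0.34684 appears to be the X-squared value that chisq.test(tab) printed in one run;
# the data are random, so a rerun will report a different value
pchisq(0.34684,df=1,lower.tail=FALSE)
```
```{r}
# Upper-tail p-value as a function of the chi-square statistic (df = 1)
ch <- seq(from=0,to=10,by=0.1)
pval <- pchisq(ch,df=1,lower.tail=FALSE)
plot(ch,pval,type="l")
abline(v=0.34684,col=3)  # the statistic used above
abline(h=0.5559,col=4)   # its p-value
```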
# Linear Algebra
```{r}
m <- matrix(c(2,3,4,5),2,2)
m
chisq.test(m)
```
```{r}
m2 <- matrix(c(1,2,3,4),2,2)
m
m2
m %*% m2
```
```{r}
eout <- eigen(m)
eout
print("----")
eout1 <- eout$values   # eigenvalues
eout1
print("===")
eout2 <- eout$vectors  # eigenvectors, one per column
eout2
```
```{r}
eout2
print("---")
eout2[,1]
```
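```{r}
# Multiply m by the first eigenvector
a <- m %*% eout2[,1]
```
```{r}
# The result is the eigenvector scaled by a constant, so every ratio is the same
a/eout2[,1]
a[1] / eout2[1,1]
# That constant is the first eigenvalue
eout1
```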
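
The defining property of an eigenpair is that multiplying the eigenvector by `m` only rescales it by the eigenvalue. A self-contained sketch that checks this for all eigenpairs at once is given below; the names `e`, `vals`, `vecs`, `lhs` and `rhs` are introduced here for the two components returned by `eigen()` and the two sides of the identity.

```{r}
m <- matrix(c(2, 3, 4, 5), 2, 2)
e <- eigen(m)
vals <- e$values    # eigenvalues
vecs <- e$vectors   # eigenvectors, one per column

# Left side: m applied to every eigenvector
lhs <- m %*% vecs
# Right side: every eigenvector scaled by its own eigenvalue
rhs <- vecs %*% diag(vals)

all.equal(lhs, rhs)

# Each eigenvector returned by eigen() has unit length
colSums(vecs^2)
```
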
# PCA (Principal component analysis)
```{r}
n <- 1000
n.var <- 4
data <- matrix(rnorm(n*n.var),ncol=n.var)
pairs(data)
```
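
One way to carry out PCA on a matrix like `data` is base R's `prcomp()`. As a hedged sketch of how this connects to the eigendecomposition from the previous section, the block below runs `prcomp()` on freshly generated data and compares the component variances with the eigenvalues of the sample covariance matrix. The seed and the names `dat`, `pca` and `ev` are assumptions made for this example.

```{r}
set.seed(2)  # fixed seed, added only for reproducibility of this sketch
n <- 1000
n.var <- 4
dat <- matrix(rnorm(n * n.var), ncol = n.var)

# PCA on the centered (unscaled) data
pca <- prcomp(dat)

# Eigenvalues of the sample covariance matrix, largest first
ev <- eigen(cov(dat))$values

# Variances of the principal components match those eigenvalues
rbind(prcomp = pca$sdev^2, eigen = ev)
```

Because the four columns are independent standard normal draws, all four variances come out close to 1 and no component dominates.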
# Correlation coefficient
```{r}
plot(data[,1:2])
cor(data[,1:2])
```
```{r}
z <- data[,1] * 2 + rnorm(n,3,0.5)
plot(data[,1],z)
abline(3,2,col=2)
cor(data[,1],z)
```
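
`cor()` returns the Pearson correlation, that is, the covariance divided by the product of the two standard deviations. For `z = 2x + noise` with `x` standard normal and noise standard deviation 0.5, the theoretical value is `2 / sqrt(2^2 + 0.5^2)`, about 0.97. A minimal sketch that checks both statements follows; the seed and the names `x`, `r.builtin`, `r.by.hand` and `r.theory` are assumptions added for this example.

```{r}
set.seed(3)  # fixed seed, added only for reproducibility of this sketch
n <- 1000
x <- rnorm(n)
z <- x * 2 + rnorm(n, 3, 0.5)

# Built-in Pearson correlation
r.builtin <- cor(x, z)
# The same quantity from its definition
r.by.hand <- cov(x, z) / (sd(x) * sd(z))
# Value implied by the slope and the two standard deviations
r.theory <- 2 / sqrt(2^2 + 0.5^2)

c(builtin = r.builtin, by.hand = r.by.hand, theory = r.theory)
```
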