Start R.
Note: R in the lab or on your own computer may not have every add-on package available (e.g. rggobi or rgl). In that case, try to install the missing packages yourself; R allows installation into user directories. If you run into problems, you can use R on the machine helix.fi.muni.cz over ssh (on Windows, together with the Xlaunch program). I will give you the user name and password during the exercise session.
Linux: ssh -X username@helix.fi.muni.cz
Windows: Start Xlaunch (click through the settings), then start putty, enable "X forwarding" in the X11 menu, and connect to the machine.
This lets you work with R (and other programs) on helix in graphical mode.
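A minimal sketch of installing missing packages into a per-user library. The package list `wanted` is only an illustration, and the code assumes that the `R_LIBS_USER` environment variable points at a writable directory, which is the usual default but may differ on your machine.

```r
# Per-user library location (assumption: R_LIBS_USER is set, as it usually is).
user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))
if (nzchar(user_lib)) {
  dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(user_lib, .libPaths()))
}

# Packages used later in this exercise (illustrative list).
wanted <- c("lattice", "rgl")
missing_pkgs <- setdiff(wanted, rownames(installed.packages()))
print(missing_pkgs)

# Install only in an interactive session, so the sketch has no side effects in scripts.
if (interactive() && length(missing_pkgs) > 0 && nzchar(user_lib)) {
  install.packages(missing_pkgs, lib = user_lib)
}
```

Running `.libPaths()` afterwards shows which library directories R searches, in order.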
Load the data file cars.data, which contains information about automobiles from the 1970s and 1980s. (Windows: in the graphical version of R you can change the working directory via File -> Change dir)
mpg - distance travelled in miles per gallon (the inverse of fuel consumption)
cylinders - number of cylinders
displacement - engine displacement
horsepower - engine power
weight - weight
acceleration - acceleration
model.year - year of manufacture
cardata <- read.table("data/cars.data")
Name the columns:
names(cardata) <- c("mpg","cylinders","displacement","horsepower","weight","acceleration","model.year","origin")
Load the car names:
carnames <- read.table("data/cars.names")
names(carnames) <- c("name")
Combine the two tables into one:
cars <- data.frame(c(carnames,cardata))
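The loading and combining steps can be tried without the data files. The sketch below reads two small made-up tables from inline text (the values and the names `demo_data`, `demo_names`, `demo_cars` are illustrative, not taken from cars.data) and joins them with cbind(), which binds two data frames with the same number of rows column by column.

```r
# Made-up stand-ins for data/cars.data and data/cars.names.
demo_data <- read.table(text = "
18 8 307
NA 4 97
24 4 113
")
names(demo_data) <- c("mpg", "cylinders", "displacement")

demo_names <- read.table(text = '
"car a"
"car b"
"car c"
', col.names = "name")

# Combine column-wise; both tables must have the same number of rows.
demo_cars <- cbind(demo_names, demo_data)
str(demo_cars)
```

Note that read.table() turns the literal text NA into a missing value by default, which is why the second mpg entry is missing after loading.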
Basic information about the data set is obtained with:
summary(cars)
##         name               mpg          cylinders     displacement      horsepower        weight       acceleration
##  ford pinto    :  6   Min.   : 9.00   Min.   :3.000   Min.   : 68.0   Min.   : 46.00   Min.   :1613   Min.   : 8.00
##  amc matador   :  5   1st Qu.:17.50   1st Qu.:4.000   1st Qu.:105.0   1st Qu.: 75.75   1st Qu.:2226   1st Qu.:13.70
##  ford maverick :  5   Median :23.00   Median :4.000   Median :151.0   Median : 95.00   Median :2822   Median :15.50
##  toyota corolla:  5   Mean   :23.51   Mean   :5.475   Mean   :194.8   Mean   :105.08   Mean   :2979   Mean   :15.52
##  amc gremlin   :  4   3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:302.0   3rd Qu.:130.00   3rd Qu.:3618   3rd Qu.:17.18
##  amc hornet    :  4   Max.   :46.60   Max.   :8.000   Max.   :455.0   Max.   :230.00   Max.   :5140   Max.   :24.80
##  (Other)       :377   NA's   :8                                       NA's   :6
##    model.year        origin
##  Min.   :70.00   Min.   :1.000
##  1st Qu.:73.00   1st Qu.:1.000
##  Median :76.00   Median :1.000
##  Mean   :75.92   Mean   :1.569
##  3rd Qu.:79.00   3rd Qu.:2.000
##  Max.   :82.00   Max.   :3.000
##
The first 5 rows:
cars[1:5,]
##                        name mpg cylinders displacement horsepower weight acceleration model.year origin
## 1 chevrolet chevelle malibu  18         8          307        130   3504         12.0         70      1
## 2         buick skylark 320  15         8          350        165   3693         11.5         70      1
## 3        plymouth satellite  18         8          318        150   3436         11.0         70      1
## 4             amc rebel sst  16         8          304        150   3433         12.0         70      1
## 5               ford torino  17         8          302        140   3449         10.5         70      1
Table columns can also be accessed by name:
cars$mpg
## [1] 18.0 15.0 18.0 16.0 17.0 15.0 14.0 14.0 14.0 15.0 NA NA NA NA NA 15.0 14.0 NA 15.0 14.0 24.0 22.0 18.0 21.0
## [25] 27.0 26.0 25.0 24.0 25.0 26.0 21.0 10.0 10.0 11.0 9.0 27.0 28.0 25.0 25.0 NA 19.0 16.0 17.0 19.0 18.0 14.0 14.0 14.0
## [49] 14.0 12.0 13.0 13.0 18.0 22.0 19.0 18.0 23.0 28.0 30.0 30.0 31.0 35.0 27.0 26.0 24.0 25.0 23.0 20.0 21.0 13.0 14.0 15.0
## [73] 14.0 17.0 11.0 13.0 12.0 13.0 19.0 15.0 13.0 13.0 14.0 18.0 22.0 21.0 26.0 22.0 28.0 23.0 28.0 27.0 13.0 14.0 13.0 14.0
## [97] 15.0 12.0 13.0 13.0 14.0 13.0 12.0 13.0 18.0 16.0 18.0 18.0 23.0 26.0 11.0 12.0 13.0 12.0 18.0 20.0 21.0 22.0 18.0 19.0
## [121] 21.0 26.0 15.0 16.0 29.0 24.0 20.0 19.0 15.0 24.0 20.0 11.0 20.0 21.0 19.0 15.0 31.0 26.0 32.0 25.0 16.0 16.0 18.0 16.0
## [145] 13.0 14.0 14.0 14.0 29.0 26.0 26.0 31.0 32.0 28.0 24.0 26.0 24.0 26.0 31.0 19.0 18.0 15.0 15.0 16.0 15.0 16.0 14.0 17.0
## [169] 16.0 15.0 18.0 21.0 20.0 13.0 29.0 23.0 20.0 23.0 24.0 25.0 24.0 18.0 29.0 19.0 23.0 23.0 22.0 25.0 33.0 28.0 25.0 25.0
## [193] 26.0 27.0 17.5 16.0 15.5 14.5 22.0 22.0 24.0 22.5 29.0 24.5 29.0 33.0 20.0 18.0 18.5 17.5 29.5 32.0 28.0 26.5 20.0 13.0
## [217] 19.0 19.0 16.5 16.5 13.0 13.0 13.0 31.5 30.0 36.0 25.5 33.5 17.5 17.0 15.5 15.0 17.5 20.5 19.0 18.5 16.0 15.5 15.5 16.0
## [241] 29.0 24.5 26.0 25.5 30.5 33.5 30.0 30.5 22.0 21.5 21.5 43.1 36.1 32.8 39.4 36.1 19.9 19.4 20.2 19.2 20.5 20.2 25.1 20.5
## [265] 19.4 20.6 20.8 18.6 18.1 19.2 17.7 18.1 17.5 30.0 27.5 27.2 30.9 21.1 23.2 23.8 23.9 20.3 17.0 21.6 16.2 31.5 29.5 21.5
## [289] 19.8 22.3 20.2 20.6 17.0 17.6 16.5 18.2 16.9 15.5 19.2 18.5 31.9 34.1 35.7 27.4 25.4 23.0 27.2 23.9 34.2 34.5 31.8 37.3
## [313] 28.4 28.8 26.8 33.5 41.5 38.1 32.1 37.2 28.0 26.4 24.3 19.1 34.3 29.8 31.3 37.0 32.2 46.6 27.9 40.8 44.3 43.4 36.4 30.0
## [337] 44.6 40.9 33.8 29.8 32.7 23.7 35.0 23.6 32.4 27.2 26.6 25.8 23.5 30.0 39.1 39.0 35.1 32.3 37.0 37.7 34.1 34.7 34.4 29.9
## [361] 33.0 34.5 33.7 32.4 32.9 31.6 28.1 NA 30.7 25.4 24.2 22.4 26.6 20.2 17.6 28.0 27.0 34.0 31.0 29.0 27.0 24.0 23.0 36.0
## [385] 37.0 31.0 38.0 36.0 36.0 36.0 34.0 38.0 32.0 38.0 25.0 38.0 26.0 22.0 32.0 36.0 27.0 27.0 44.0 32.0 28.0 31.0
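The output above contains several NA entries. A small sketch, on a made-up three-row table (the name `demo` and its values are illustrative), of the equivalent ways to select a column and of how missing values affect a summary:

```r
# Made-up miniature table; the values are illustrative only.
demo <- data.frame(name = c("car a", "car b", "car c"),
                   mpg  = c(18, NA, 24))

by_dollar  <- demo$mpg        # access by name with $
by_bracket <- demo[["mpg"]]   # same column; handy when the name is stored in a variable
by_index   <- demo[, 2]       # same column, selected by position

n_missing <- sum(is.na(demo$mpg))          # number of missing values: 1
mean_all  <- mean(demo$mpg)                # NA, because one value is missing
mean_obs  <- mean(demo$mpg, na.rm = TRUE)  # 21, the mean of 18 and 24
```

Most summary functions in R (mean, sum, max, and so on) return NA when any input is missing unless na.rm = TRUE is given.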
We can find out which cars have "ford" in their name:
fords <- grep("ford",cars$name)
fords
## [1] 5 6 13 18 24 32 39 44 48 51 56 69 73 82 88 96 100 108 112 120 134 138 144 147 163 167 174 176 182 198 201
## [32] 208 214 222 236 240 244 253 262 263 272 290 294 298 322 344 359 360 374 382 398 402 405
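grep() returns the positions of the matching elements, as in the output above. A short sketch of a few variants on a made-up character vector (`car_names` is illustrative):

```r
# Made-up names; note the upper-case "FORD" in the third entry.
car_names <- c("ford pinto", "amc hornet", "FORD torino", "toyota corolla")

idx          <- grep("ford", car_names)                      # positions of matches: 1
is_ford      <- grepl("ford", car_names)                     # one TRUE/FALSE per element
matched      <- grep("ford", car_names, value = TRUE)        # the matching strings themselves
idx_any_case <- grep("ford", car_names, ignore.case = TRUE)  # positions 1 and 3
```

The pattern is a regular expression and matching is case-sensitive by default, so "FORD torino" is found only with ignore.case = TRUE.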
Then we can create a new table of their fuel economy (mpg):
data.frame(cars$name[fords],cars$mpg[fords])
##            cars.name.fords. cars.mpg.fords.
## 1               ford torino            17.0
## 2          ford galaxie 500            15.0
## 3          ford torino (sw)              NA
## 4     ford mustang boss 302              NA
## 5             ford maverick            21.0
## 6                 ford f250            10.0
## 7                ford pinto            25.0
## 8           ford torino 500            19.0
## 9          ford galaxie 500            14.0
## 10 ford country squire (sw)            13.0
## 11             ford mustang            18.0
## 12      ford pinto runabout            21.0
## 13         ford galaxie 500            14.0
## 14    ford gran torino (sw)            13.0
## 15          ford pinto (sw)            22.0
## 16         ford gran torino            14.0
## 17                 ford ltd            13.0
## 18            ford maverick            18.0
## 19             ford country            12.0
## 20               ford pinto            19.0
## 21            ford maverick            21.0
## 22               ford pinto            26.0
## 23         ford gran torino            16.0
## 24    ford gran torino (sw)            14.0
## 25            ford maverick            15.0
## 26                 ford ltd            14.0
## 27          ford mustang ii            13.0
## 28               ford pinto            23.0
## 29               ford pinto            18.0
## 30         ford gran torino            14.5
## 31            ford maverick            24.0
## 32        ford granada ghia            18.0
## 33               ford pinto            26.5
## 34                ford f108            13.0
## 35             ford granada            18.5
## 36         ford thunderbird            16.0
## 37      ford mustang ii 2+2            25.5
## 38              ford fiesta            36.1
## 39     ford fairmont (auto)            20.2
## 40      ford fairmont (man)            25.1
## 41              ford futura            18.1
## 42          ford fairmont 4            22.3
## 43          ford ltd landau            17.6
## 44 ford country squire (sw)            15.5
## 45            ford fairmont            26.4
## 46       ford mustang cobra            23.6
## 47           ford escort 4w            34.4
## 48           ford escort 2h            29.9
## 49          ford granada gl            20.2
## 50     ford fairmont futura            24.0
## 51           ford granada l            22.0
## 52          ford mustang gl            27.0
## 53              ford ranger            28.0
Next, select the Fords that have 8 cylinders and those that have 4:
bigfords <- fords[which(cars$cylinders[fords] == 8)]
smallfords <- fords[which(cars$cylinders[fords] == 4)]
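The same selection can be written as a single logical condition. A minimal sketch on made-up data (the table `demo` and its values are illustrative), which also shows why which() is useful when a column contains missing values:

```r
# Made-up table; the last row has an unknown cylinder count.
demo <- data.frame(
  name      = c("ford a", "ford b", "amc c", "ford d"),
  cylinders = c(8, 4, 8, NA)
)

is_ford <- grepl("ford", demo$name)

# Combine both conditions with &; which() keeps only the TRUE positions
# and silently drops positions where the condition is NA.
big   <- which(is_ford & demo$cylinders == 8)  # row 1
small <- which(is_ford & demo$cylinders == 4)  # row 2

demo[big, ]
```

Indexing directly with the logical vector (without which()) would return an extra all-NA row for the car whose cylinder count is missing.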
Now list them from the car table (you can adjust the width of printed lines with the command "options(width=128)"):
options(width=128)
cars[bigfords,]
##                         name  mpg cylinders displacement horsepower weight acceleration model.year origin
## 5                ford torino 17.0         8          302        140   3449         10.5         70      1
## 6           ford galaxie 500 15.0         8          429        198   4341         10.0         70      1
## 13          ford torino (sw)   NA         8          351        153   4034         11.0         70      1
## 18     ford mustang boss 302   NA         8          302        140   3353          8.0         70      1
## 32                 ford f250 10.0         8          360        215   4615         14.0         70      1
## 48          ford galaxie 500 14.0         8          351        153   4154         13.5         71      1
## 51  ford country squire (sw) 13.0         8          400        170   4746         12.0         71      1
## 73          ford galaxie 500 14.0         8          351        153   4129         13.0         72      1
## 82     ford gran torino (sw) 13.0         8          302        140   4294         16.0         72      1
## 96          ford gran torino 14.0         8          302        137   4042         14.5         73      1
## 100                 ford ltd 13.0         8          351        158   4363         13.0         73      1
## 112             ford country 12.0         8          400        167   4906         12.5         73      1
## 144         ford gran torino 16.0         8          302        140   4141         14.0         74      1
## 147    ford gran torino (sw) 14.0         8          302        140   4638         16.0         74      1
## 167                 ford ltd 14.0         8          351        148   4657         13.5         75      1
## 174          ford mustang ii 13.0         8          302        129   3169         12.0         75      1
## 198         ford gran torino 14.5         8          351        152   4215         12.8         76      1
## 222                ford f108 13.0         8          302        130   3870         15.0         76      1
## 240         ford thunderbird 16.0         8          351        149   4335         14.5         77      1
## 272              ford futura 18.1         8          302        139   3205         11.2         78      1
## 294          ford ltd landau 17.6         8          302        129   3725         13.4         79      1
## 298 ford country squire (sw) 15.5         8          351        142   4054         14.3         79      1
Finally, compute the mean fuel economy (mpg) of each group, skipping missing values with na.rm=TRUE:
mean(cars[bigfords,]$mpg, na.rm=TRUE)
## [1] 14.335
mean(cars[smallfords,]$mpg, na.rm=TRUE)
## [1] 25.82222
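When more than two groups are of interest, the per-group means can be computed in one call. A minimal sketch on made-up values (the table `demo` is illustrative and not taken from cars.data):

```r
# Made-up values: two 8-cylinder cars (one with unknown mpg) and three 4-cylinder cars.
demo <- data.frame(cylinders = c(8, 8, 4, 4, 4),
                   mpg       = c(14, NA, 25, 27, 26))

# tapply() splits mpg by cylinders and applies mean() to each group;
# extra arguments such as na.rm are passed on to mean().
group_means <- tapply(demo$mpg, demo$cylinders, mean, na.rm = TRUE)
print(group_means)

# aggregate() gives the same numbers as a data frame; its formula
# interface drops rows with missing values by default.
aggregate(mpg ~ cylinders, data = demo, FUN = mean)
```

Here the 4-cylinder group has mean 26 and the 8-cylinder group has mean 14, since the missing value is ignored.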
We get a better idea of the data by plotting its 2-dimensional projections:
plot(cars)
… and projections of interesting pairs of variables, conditioned on a further variable:
library(lattice)
xyplot(displacement ~ mpg | cylinders, data = cars)
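In the call above cylinders is numeric, so lattice treats it as a shingle and the panel strips do not show the cylinder count directly. One possible variation wraps the conditioning variable in factor() so that each panel is labelled with its value. The sketch uses the built-in mtcars data set as a stand-in for cars, so the column names (disp, cyl) differ from those in this exercise.

```r
library(lattice)

# mtcars is a stand-in data set shipped with R; cyl takes the values 4, 6 and 8.
p <- xyplot(disp ~ mpg | factor(cyl), data = mtcars,
            xlab = "mpg", ylab = "displacement",
            layout = c(3, 1))   # three panels side by side

# Lattice functions return a plot object; printing it draws the plot.
print(p)
```

With the cars table from this exercise the equivalent conditioning term would be factor(cylinders).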
… or a 3D view of three variables:
cloud(displacement ~ cylinders * mpg, data = cars)
using OpenGL:
library(rgl)
plot3d(cars[,2:4])
interactively:
library(sculpt3d)
sculpt3d(cars[,2:4])
or
library(rggobi)
g <- ggobi(cars)
By now you should have some feel for working with data in R. Work through the further steps you may need when designing your own visualizations:
- color and shape of symbols in the commands plot(), cloud() and plot3d()
- background color, font style
- other types of plots (polar coordinates, parallel axes)
- drawing additional objects into a plot
- saving a figure to a file
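As a starting point for several of these items, here is a minimal base-graphics sketch. It uses the built-in mtcars data set as a stand-in for cars, and the colors, symbols, and output file (a temporary PDF) are arbitrary choices made for illustration.

```r
# Write the figure to a temporary PDF file (any path of your choice works).
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 5)

# Background color and font settings; par() returns the old values.
op <- par(bg = "grey95", family = "serif", font.main = 3)

# One color and one symbol per cylinder count.
cyl  <- factor(mtcars$cyl)
cols <- c("blue", "darkgreen", "red")
syms <- c(1, 2, 19)
plot(mtcars$mpg, mtcars$disp,
     col = cols[as.integer(cyl)], pch = syms[as.integer(cyl)],
     xlab = "mpg", ylab = "displacement", main = "Displacement vs. mpg")

# Draw additional objects into the existing plot.
abline(lm(disp ~ mpg, data = mtcars), lty = 2)
legend("topright", legend = levels(cyl), col = cols, pch = syms,
       title = "cylinders")

par(op)
dev.off()   # close the device so the file is written
```

Replacing pdf() with png() or svg() writes other formats, provided your R build supports them.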
When looking for suitable functions, the gallery of R graphs at https://www.r-graph-gallery.com/all-graphs/ and the commands help() and help.search() may be useful, among other resources.
help(help)
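A few ways of searching for functions from within R, as a small sketch. The search terms are arbitrary examples, and the results depend on which packages are installed and attached.

```r
# Names of attached functions that start with "plot".
plot_fns <- apropos("^plot", mode = "function")
head(plot_fns)

# Full-text search of the installed help pages, restricted here to one package.
hits <- help.search("histogram", package = "graphics")
print(hits)

# Help page for a specific function; ?plot is the interactive shorthand.
help("plot")
```

In an interactive session `??histogram` is a shorthand for help.search("histogram").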
Get familiar with them; we will use them next week.
Have a nice weekend!