Computational Intelligence Project


The data are for the year 2012.

Attributes:

Country area

Country population

GDP

GDP per capita

GDP growth rate (measured in percent)

Inflation = inflation rate (%)

Trade = sum of exports and imports of goods and services (% of GDP)

Employment rate

Sources:

http://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG/countries

http://ec.europa.eu/eurostat/tgm/table.do?tab=table&language=en&pcode=tsdec420&tableSelection=3&footnotes=yes&labeling=labels

http://ec.europa.eu/eurostat/tgm/table.do?tab=table&init=1&language=en&pcode=tec00115&plugin=1

1. Compute the descriptive statistics of the data set: mean, dispersion, variances, covariance and correlation matrices, histograms.

2. Identify possible dependencies between variables, regression equations, the estimated coefficients of the regression lines, plots of the regression lines + interpretations.

a <- read.table("tariue.txt", header = TRUE, sep = "\t")

attach(a)

summary(a)

hist(Suprafata)

hist(Populatie)

hist(GDP)


hist(GDPgrowthrate)

hist(GDPpercapita)

hist(Employmentrate)

hist(Trade)

hist(Inflation)

lin <- lm(Employmentrate ~ Inflation)  # regression line / dependence between variables

summary(lin)
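Task 1 also asks for variances and for the covariance and correlation matrices, which summary() and hist() do not produce. The following is a minimal sketch of those steps. Since tariue.txt is not reproduced here, it assumes a small data frame `d` built from the first six rows of the `fisier` data frame listed later in this document; the name `d` is only illustrative.

```r
# First six rows of the project's data (values copied from the fisier data frame).
# The name `d` is an assumption made for this sketch.
d <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

means     <- colMeans(d)     # per-column means
variances <- sapply(d, var)  # per-column sample variances
cov_mat   <- cov(d)          # covariance matrix
cor_mat   <- cor(d)          # Pearson correlation matrix

print(means)
print(variances)
print(round(cor_mat, 3))
```

The diagonal of `cov_mat` holds the same variances as `variances`, and `cor_mat` is the covariance matrix rescaled so that every variable has unit variance.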

3. PCA: eigenvectors, eigenvalues, criteria for choosing the principal components, scree plot, score matrix, biplot: plots + interpretations.

4. SVM: building the training and test sets, various forms of the kernel function: linear, polynomial, sigmoid, radial, number of support vectors in each case, predictions, confusion matrix, accuracy rate of the model in each case, Cohen's coefficient.

5. Cluster analysis: k-means, k-medoids, fuzzy clustering, hierarchical, dendrograms, plots, interpretations; various values for the number of clusters, comments on the cluster silhouettes, confusion matrix, accuracy rate of the model in each case.

6. Decision trees.

7. SOM: building the maps.

8. Naive Bayes classifier.

9. Neural networks, prediction, accuracy rate of the model, Cohen's coefficient.
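For item 3 in the list above, the eigenvalues, eigenvectors and score matrix can all be read off a single prcomp() fit. This is only a sketch under assumptions: it uses the first six rows of four numeric columns of the `fisier` data frame listed later in this document, the name `d` is illustrative, and the eigenvalue > 1 (Kaiser) rule is just one of several possible selection criteria.

```r
# First six rows of four numeric columns (values copied from the fisier data frame).
d <- data.frame(
  Suprafata    = c(83879, 30528, 110899, 9251, 87661, 42915),
  Populatie    = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516),
  PIB          = c(307, 375.881, 39.927, 17.72, 43.682, 245.252),
  RataInflatie = c(2.6, 2.6, 2.4, 3.1, 3.4, 2.4)
)

# PCA on standardized variables, i.e. on the correlation matrix.
pca <- prcomp(d, center = TRUE, scale. = TRUE)

eigenvalues  <- pca$sdev^2      # variances of the principal components
eigenvectors <- pca$rotation    # loadings, one column per component
scores       <- pca$x           # score matrix, one row per country

prop_var    <- eigenvalues / sum(eigenvalues)  # share of variance per component
keep_kaiser <- which(eigenvalues > 1)          # Kaiser criterion (an assumption here)

print(round(prop_var, 3))
print(keep_kaiser)

# The plots are drawn only in an interactive session.
if (interactive()) {
  screeplot(pca, type = "lines")  # scree plot
  biplot(pca)                     # biplot of scores and loadings
}
```

Because the variables are standardized, the eigenvalues sum to the number of variables, so `prop_var` can be read directly as the share of total variance explained by each component.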

fisier <- data.frame(
  Tari = c("Austria", "Belgia", "Bulgaria", "Cipru", "Croatia", "Danemarca", "Estonia", "Finlanda",
           "Franta", "Germania", "Grecia", "Irlanda", "Italia", "Letonia", "Lituania", "Luxemburg",
           "Malta", "Polonia", "Portugalia", "Regatul Unit", "Republica Ceha", "Romania", "Slovacia",
           "Slovenia", "Spania", "Suedia", "Tarile de Jos", "Ungaria"),
  # country area (thousands separators removed)
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915, 45227, 338434, 632833, 357137, 131957, 69797,
                301336, 64562, 65300, 2586, 316, 312679, 92211, 248527, 78866, 238390, 49036, 20273,
                505990, 438575, 41540, 93023),
  # country population
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516, 1325217, 5401267, 65287861,
                80327900, 11123034, 4582707, 59394207, 2044813, 3003641, 524853, 417546, 38538447,
                10542398, 63495303, 10505445, 20095996, 5404322, 2055496, 46818219, 9482855,
                16730348, 9931925),
  # GDP
  PIB = c(307, 375.881, 39.927, 17.72, 43.682, 245.252, 17.415, 192.35, 2032, 2666,
          193.347, 163.938, 1567, 22.257, 32.94, 42.918, 6.88, 381.48, 165.107, 1933,
          152.926, 131.579, 71.096, 35.319, 1029, 407.82, 599.338, 96.98),
  # GDP growth rate (%)
  RataPIB = c(0.9, 1.6, 2, -2.4, -2.2, -0.7, 4.7, -1.5, 0.3, 0.4, -6.6, -0.3, -2.3, 4.8, 3.8, -0.2,
              2.5, 1.8, -3.3, 0.7, 0, 0.6, 1.6, -2.6, -2.1, -0.3, -1.6, -1.5),
  # employment rate (%)
  RataAngajare = c(70.3, 61.7, 60, 64.8, 50.2, 72.2, 69.4, 72.5, 65.1, 71.5, 45.2, 59.4, 50.5, 6.4,
                   67.9, 64.1, 46.6, 57.5, 63, 68.4, 62.5, 56.3, 57.3, 64.6, 54.6, 76.8, 71.9, 56.4),
  # inflation rate (%)
  RataInflatie = c(2.6, 2.6, 2.4, 3.1, 3.4, 2.4, 4.2, 3.2, 2.2, 2.1, 1, 1.9, 3.3, 2.3, 3.2, 2.9, 3.2,
                   3.7, 2.8, 2.8, 3.5, 3.4, 3.7, 2.8, 2.4, 0.9, 2.8, 5.7)
)

fisier

set.seed(5)

km <- kmeans(fisier[, 2:4], 3, 15)

The data are clustered with the k-means algorithm, using 3 clusters and at most 15 iterations.

print(km)

plot(fisier, col = km$cluster)
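Columns 2 to 4 (Suprafata, Populatie, PIB) differ by several orders of magnitude, so the Euclidean distances used by k-means are dominated by Populatie. The sketch below compares the clustering on the raw columns with a clustering on standardized columns. Standardizing with scale() and using `nstart = 10` are assumptions added for illustration; they are not part of the original analysis, and the name `x` is illustrative.

```r
set.seed(5)

# Columns 2:4 of the fisier data frame, copied here so the sketch is self-contained.
x <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915, 45227, 338434, 632833, 357137,
                131957, 69797, 301336, 64562, 65300, 2586, 316, 312679, 92211, 248527,
                78866, 238390, 49036, 20273, 505990, 438575, 41540, 93023),
  Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516, 1325217, 5401267,
                65287861, 80327900, 11123034, 4582707, 59394207, 2044813, 3003641, 524853,
                417546, 38538447, 10542398, 63495303, 10505445, 20095996, 5404322, 2055496,
                46818219, 9482855, 16730348, 9931925),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252, 17.415, 192.35, 2032, 2666,
                193.347, 163.938, 1567, 22.257, 32.94, 42.918, 6.88, 381.48, 165.107, 1933,
                152.926, 131.579, 71.096, 35.319, 1029, 407.82, 599.338, 96.98)
)

# Same call as in the project: 3 clusters, at most 15 iterations, raw units.
km_raw <- kmeans(x, centers = 3, iter.max = 15)

# Variant on standardized columns, with several random starts.
km_scaled <- kmeans(scale(x), centers = 3, iter.max = 15, nstart = 10)

# Cross-tabulate the two assignments to see how much they differ.
print(table(raw = km_raw$cluster, scaled = km_scaled$cluster))
print(km_scaled$size)
```

Cluster labels are arbitrary, so the cross-table should be read by looking at which cells are non-empty rather than by comparing label numbers.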

4. SVM/L

library(e1071)

library(scales)

data <- read.table(file = "Tari.txt", header = TRUE, sep = "\t")

data

train <- data  # the training set is the initial data set itself

t <- ncol(train)  # t = number of columns in the training set

t

target <- train[, t]  # target holds the last column of the training set, which contains the qualitative variable "calitate" => Ratainf

model <- svm(Ratai~ ., data = train)

model

NEURAL NETWORK
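For the SVM task (item 4), the comparison of kernels, the confusion matrix, the accuracy rate and Cohen's coefficient can be sketched as follows. Tari.txt and its class column are not reproduced in this document, so this sketch substitutes R's built-in `iris` data set as a stand-in; the 100/50 train/test split and the helper `cohen_kappa` are likewise assumptions made for illustration. The SVM part runs only if the e1071 package is installed.

```r
# Cohen's kappa from a square confusion matrix (base R only; hypothetical helper).
cohen_kappa <- function(cm) {
  n  <- sum(cm)
  po <- sum(diag(cm)) / n                      # observed agreement
  pe <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

if (requireNamespace("e1071", quietly = TRUE)) {
  set.seed(5)
  idx   <- sample(nrow(iris), 100)  # 100 rows for training, the rest for testing
  train <- iris[idx, ]
  test  <- iris[-idx, ]

  for (k in c("linear", "polynomial", "radial", "sigmoid")) {
    model <- e1071::svm(Species ~ ., data = train, kernel = k)
    pred  <- predict(model, newdata = test)
    cm    <- table(predicted = pred, actual = test$Species)  # confusion matrix
    cat(sprintf("%-10s support vectors = %d, accuracy = %.3f, kappa = %.3f\n",
                k, as.integer(model$tot.nSV),
                sum(diag(cm)) / sum(cm), cohen_kappa(cm)))
  }
}
```

Unlike the project code, which trains on the full data set, this sketch holds out a test set, so the accuracy and kappa are measured on rows the model has not seen.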

1-2.

Call:
lm(formula = PIB ~ Suprafata)

Residuals:
      1       2       3       4       5       6
 154.90  178.65  -89.32 -197.50 -105.22   58.50

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.230e+02  1.381e+02   1.615    0.182
Suprafata   -8.458e-04  1.958e-03  -0.432    0.688

Residual standard error: 171.3 on 4 degrees of freedom
Multiple R-squared: 0.04458, Adjusted R-squared: -0.1943
F-statistic: 0.1867 on 1 and 4 DF, p-value: 0.688

Interpretation: The estimated equation is PIB = 2.230e+02 - 8.458e-04 * Suprafata + ε. The slope is not statistically significant (p = 0.688) and R-squared is about 0.045, so Suprafata explains almost none of the variation in PIB in this fit.
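The output above has six residuals and 4 residual degrees of freedom, and its coefficients appear consistent with a fit on the first six rows of the `fisier` data frame. The sketch below refits that model on those six rows and shows one way to draw the regression line. The name `d` and the helper `plot_fit` are illustrative assumptions.

```r
# First six rows of Suprafata and PIB (values copied from the fisier data frame).
d <- data.frame(
  Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915),
  PIB       = c(307, 375.881, 39.927, 17.72, 43.682, 245.252)
)

fit <- lm(PIB ~ Suprafata, data = d)
s   <- summary(fit)

print(coef(fit))        # intercept and slope of the regression line
print(s$coefficients)   # estimates, standard errors, t values, p-values
print(s$r.squared)      # share of variance in PIB explained by Suprafata

# Scatter plot with the fitted regression line (hypothetical helper).
plot_fit <- function() {
  plot(d$Suprafata, d$PIB, xlab = "Suprafata", ylab = "PIB")
  abline(fit)
}
if (interactive()) plot_fit()
```

With only six observations the standard errors are large, which is why the p-value of the slope should be read before interpreting its sign.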
