Computational Intelligence Project
Uploaded by marius-cristian
The data are for the year 2012.
Attributes:
Country area
Country population
GDP = PIB (gross domestic product)
GDP per capita
GDP growth rate = growth rate of GDP (measured in percent)
Inflation = inflation rate (%)
Trade = sum of exports and imports of goods and services (% of GDP)
Employment rate
Sources:
http://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG/countries
http://ec.europa.eu/eurostat/tgm/table.do?tab=table&language=en&pcode=tsdec420&tableSelection=3&footnotes=yes&labeling=labels
http://ec.europa.eu/eurostat/tgm/table.do?tab=table&init=1&language=en&pcode=tec00115&plugin=1
1. Compute the descriptive statistics of the data set: mean, dispersion, variances, covariance and correlation matrices, histograms.
2. Determine possible dependencies between variables, the regression equations and the estimated coefficients of the regression lines; plot the regression lines + interpretations.
a=read.table("tariue.txt",header=TRUE, sep="\t")
attach(a)
summary(a)
hist(Suprafata)
hist(Populatie)
hist(GDP)
hist(GDPgrowthrate)
hist(GDPpercapita)
hist(Employmentrate)
hist(Trade)
hist(Inflation)
lin <- lm(Employmentrate ~ Inflation)  # regression line / dependence between variables
summary(lin)
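Task 1 also asks for variances and for the covariance and correlation matrices, and task 2 asks for the regression line to be plotted. A minimal sketch of those steps follows. It assumes the built-in mtcars data set as a stand-in for tariue.txt, so the column names mpg, hp and wt are placeholders and not the project's variables.

```r
# Stand-in data (assumption): built-in mtcars instead of tariue.txt.
d <- mtcars[, c("mpg", "hp", "wt")]

means <- colMeans(d)       # mean of each variable
vars  <- sapply(d, var)    # sample variance of each variable
sds   <- sapply(d, sd)     # standard deviation of each variable
covm  <- cov(d)            # covariance matrix
corm  <- cor(d)            # correlation matrix

# Simple linear regression and its fitted line over the scatter plot.
fit <- lm(mpg ~ wt, data = d)
plot(d$wt, d$mpg, xlab = "wt", ylab = "mpg")
abline(fit, col = "red")   # draws the estimated regression line
coef(fit)                  # intercept and slope estimates
```

With the project data, the same calls would be applied to the numeric columns of a, for example cov(a[, sapply(a, is.numeric)]).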
3. PCA: eigenvectors, eigenvalues, criteria for choosing the principal components, scree plot, score matrix, biplot: plots + interpretations.
4. SVM: building the training and test sets; various forms of the kernel function: linear, polynomial, sigmoid, radial; number of support vectors in each case; predictions, confusion matrix, model accuracy rate in each case, Cohen's coefficient.
5. Cluster analysis: kmeans, kmedoids, fuzzy clustering, hierarchical, dendrograms, plots, interpretations; various values for the number of clusters, comments on the cluster silhouettes, confusion matrix, model accuracy rate in each case.
6. Decision trees.
7. SOM: building the maps.
8. Naive Bayes classifier.
9. Neural networks, prediction, model accuracy rate, Cohen's coefficient.
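The PCA items in the list above (eigenvalues, eigenvectors, selection criterion, scree plot, scores, biplot) can be sketched with base R's prcomp. This is only an illustration under assumptions: it uses the built-in mtcars data as a stand-in for the country data, and it uses the Kaiser rule (eigenvalue greater than 1) as one possible selection criterion.

```r
# Stand-in data (assumption): four numeric columns of the built-in mtcars.
d <- mtcars[, c("mpg", "disp", "hp", "wt")]

# PCA on standardized variables, i.e. on the correlation matrix.
pca <- prcomp(d, center = TRUE, scale. = TRUE)

eigenvalues  <- pca$sdev^2     # variances of the principal components
eigenvectors <- pca$rotation   # loadings, one column per component
scores       <- pca$x          # score matrix, one row per observation

# Two common selection criteria: Kaiser rule and cumulative explained variance.
keep      <- sum(eigenvalues > 1)
explained <- cumsum(eigenvalues) / sum(eigenvalues)

screeplot(pca, type = "lines") # scree plot
biplot(pca)                    # observations and variables on PC1/PC2
```

Standardizing matters when variables are on very different scales, as area, population and GDP are in this project.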
> fisier <- data.frame(
+   Tari = c("Austria", "Belgia", "Bulgaria", "Cipru", "Croatia", "Danemarca", "Estonia",
+            "Finlanda", "Franta", "Germania", "Grecia", "Irlanda", "Italia", "Letonia",
+            "Lituania", "Luxemburg", "Malta", "Polonia", "Portugalia", "Regatul Unit",
+            "Republica Ceha", "Romania", "Slovacia", "Slovenia", "Spania", "Suedia",
+            "Tarile de Jos", "Ungaria"),
+   Suprafata = c(83879, 30528, 110899, 9251, 87661, 42915, 45227, 338434, 632833, 357137,
+                 131957, 69797, 301336, 64562, 65300, 2586, 316, 312679, 92211, 248527,
+                 78866, 238390, 49036, 20273, 505990, 438575, 41540, 93023),
+   Populatie = c(8408121, 11094850, 7327224, 82011, 4275984, 5580516, 1325217, 5401267,
+                 65287861, 80327900, 11123034, 4582707, 59394207, 2044813, 3003641,
+                 524853, 417546, 38538447, 10542398, 63495303, 10505445, 20095996,
+                 5404322, 2055496, 46818219, 9482855, 16730348, 9931925),
+   PIB = c(307, 375.881, 39.927, 17.72, 43.682, 245.252, 17.415, 192.35, 2032, 2666,
+           193.347, 163.938, 1567, 22.257, 32.94, 42.918, 6.88, 381.48, 165.107, 1933,
+           152.926, 131.579, 71.096, 35.319, 1029, 407.82, 599.338, 96.98),
+   RataPIB = c(0.9, 1.6, 2, -2.4, -2.2, -0.7, 4.7, -1.5, 0.3, 0.4, -6.6, -0.3, -2.3, 4.8,
+               3.8, -0.2, 2.5, 1.8, -3.3, 0.7, 0, 0.6, 1.6, -2.6, -2.1, -0.3, -1.6, -1.5),
+   RataAngajare = c(70.3, 61.7, 60, 64.8, 50.2, 72.2, 69.4, 72.5, 65.1, 71.5, 45.2, 59.4,
+                    50.5, 6.4, 67.9, 64.1, 46.6, 57.5, 63, 68.4, 62.5, 56.3, 57.3, 64.6,
+                    54.6, 76.8, 71.9, 56.4),
+   RataInflatie = c(2.6, 2.6, 2.4, 3.1, 3.4, 2.4, 4.2, 3.2, 2.2, 2.1, 1, 1.9, 3.3, 2.3,
+                    3.2, 2.9, 3.2, 3.7, 2.8, 2.8, 3.5, 3.4, 3.7, 2.8, 2.4, 0.9, 2.8, 5.7)
+ )
> fisier
> set.seed(5)
> km <- kmeans(fisier[, 2:4], centers = 3, iter.max = 15)  # k-means with 3 clusters and at most 15 iterations
> print(km)
> plot(fisier, col = km$cluster)
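Task 5 asks for several values of the number of clusters. The sketch below shows one way to compare them by total within-cluster sum of squares. It assumes the built-in mtcars data as a stand-in, and it standardizes the columns first, which is a common choice when variables such as area, population and GDP differ by orders of magnitude. The nstart = 10 setting is also an assumption, not something taken from the project.

```r
# Stand-in data (assumption): three numeric columns of mtcars, standardized.
d <- scale(mtcars[, c("mpg", "hp", "wt")])

set.seed(5)

# Total within-cluster sum of squares for k = 2..5 (elbow-style comparison).
ks  <- 2:5
wss <- sapply(ks, function(k) {
  kmeans(d, centers = k, iter.max = 15, nstart = 10)$tot.withinss
})
names(wss) <- ks

# Final model with 3 clusters, as in the call above.
km <- kmeans(d, centers = 3, iter.max = 15, nstart = 10)
table(km$cluster)            # cluster sizes
plot(d[, c("hp", "wt")], col = km$cluster)
```

Without standardization, the distance computation is dominated by whichever column has the largest numeric range.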
4. SVM/L
library(e1071)
library(scales)
data <- read.table(file = "Tari.txt", header = TRUE, sep = "\t")
data
train <- data          # the training set is the initial data set itself
t <- ncol(train)       # t = number of columns of the training set
t
target <- train[, t]   # target holds the last column of the training set, which contains the qualitative variable "calitate" => Ratainf
model <- svm(Ratai~ ., data = train)
model
NEURAL NET
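Both the SVM task and the neural network task ask for a confusion matrix, an accuracy rate and Cohen's coefficient. A minimal base-R sketch of these three quantities follows. The label vectors actual and predicted are made up purely for illustration, and in practice predicted would come from predict() on a fitted model.

```r
# Hypothetical labels, for illustration only (not project data).
actual    <- factor(c("a", "a", "a", "b", "b", "b"))
predicted <- factor(c("a", "a", "b", "b", "b", "a"), levels = levels(actual))

# Confusion matrix: rows are true classes, columns are predicted classes.
conf <- table(actual = actual, predicted = predicted)

# Accuracy: share of observations on the diagonal.
accuracy <- sum(diag(conf)) / sum(conf)

# Cohen's kappa: observed agreement corrected for chance agreement.
p_obs    <- accuracy
p_chance <- sum(rowSums(conf) * colSums(conf)) / sum(conf)^2
kappa    <- (p_obs - p_chance) / (1 - p_chance)

conf
accuracy
kappa
```

Because the training set here is the full data set, accuracy computed on it measures fit rather than predictive performance. A separate test set, as task 4 requests, avoids that.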
1-2.
Call:
lm(formula = PIB ~ Suprafata)

Residuals:
      1       2       3       4       5       6
 154.90  178.65  -89.32 -197.50 -105.22   58.50

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.230e+02  1.381e+02   1.615    0.182
Suprafata   -8.458e-04  1.958e-03  -0.432    0.688

Residual standard error: 171.3 on 4 degrees of freedom
Multiple R-squared: 0.04458, Adjusted R-squared: -0.1943
F-statistic: 0.1867 on 1 and 4 DF, p-value: 0.688
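Interpretation: the estimated equation is PIB = 2.230e+02 - 8.458e-04 * Suprafata + ε. The slope is not statistically significant (p-value 0.688) and the Multiple R-squared is only 0.04458, so this fit gives no evidence of a linear dependence of PIB on Suprafata.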
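As a worked check of the equation, the sketch below plugs the printed coefficients into a small helper and recomputes the slope's t value from the estimate and its standard error. The helper name predict_pib and the input value 100000 are hypothetical, chosen only to show the arithmetic.

```r
# Coefficients and standard error copied from the lm() output above.
b0  <- 2.230e+02    # intercept estimate
b1  <- -8.458e-04   # slope estimate for Suprafata
se1 <- 1.958e-03    # standard error of the slope

# Hypothetical helper: fitted PIB for a given Suprafata.
predict_pib <- function(suprafata) b0 + b1 * suprafata

predict_pib(100000)   # 223 - 84.58 = 138.42

# The t value is estimate / standard error; compare with the printed -0.432.
t_slope <- b1 / se1
round(t_slope, 3)

# For a single predictor the F statistic equals the squared t value.
round(t_slope^2, 4)
```

The squared t value reproduces the printed F-statistic of 0.1867 up to rounding, which confirms the coefficient table and the summary line are consistent.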