Data Analysis Project


The project was supervised by Prof. Dr. Gilbert Saporta, C.N.A.M. Paris.

INTRODUCTION

The purpose of this project is to show the value and effectiveness of descriptive statistical methods in the analysis of large data tables.

The chosen example concerns the population of […] countries, which are to be analyzed according to the following variables:

• demographic: population ([…]), the estimate for […], birth and death rates, total fertility rate, the number of people under […] years of age and of those aged […] or over, life expectancy. These data are compiled by the Population Reference Bureau in Washington.

• geographic: area, continent

• economic: Gross National Product per capita for […], in dollars, taken from World Bank publications; Gross Domestic Product per capita for […], in dollars, at the prices and purchasing power parity of […], supplied by the Centre for Prospective Studies and International Information.

• sociological: the majority religion in the country

Given the quantitative nature of the variables, the first part carries out a Principal Component Analysis. The aim is to represent the various countries in a two-dimensional space. We will then examine the principal axes thus created, as well as the quality of representation of the variables and of the individuals in this two-dimensional space.

We will also try to explain how the population projection is positioned in this space.

In the second part, with the help of a classification, we will obtain the most coherent partition of this set of countries.

PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis is a descriptive method for quantitative data.

1.1. Description of the data

1.1.1. The variables

Two types of variables are distinguished:

• active quantitative variables, which determine the principal axes

• illustrative variables which, by their (qualitative) nature, cannot take part in creating the principal axes. They can nevertheless be represented, in a second step, on a correlation circle.

We also keep one quantitative variable as an illustrative variable: the population projection. In fact, we will try to explain this variable in terms of the active variables.

In our example we retained:

11 active variables

[…]. SUPERFIC (continuous)
[…]. POPULATI (continuous)
[…]. NATALITE (continuous)
[…]. MORTALIT (continuous)
8. INFANTIL (continuous)
[…]. FECONDIT (continuous)
[…]. IN[…] (continuous)
[…]. SU[…] (continuous)
[…]. ESPERANC (continuous)
[…]. PNB (continuous)
[…]. PIB (continuous)

3 illustrative variables

1. RELIGION ([…] categories)
2. CONTINEN ([…] categories)
7. PROJECTI (continuous)

1.1.2. The individuals

The statistical individuals are the countries: […] countries were retained for the analysis, with a uniform weight equal to […]. France does not take part in this analysis and will be treated as a supplementary individual.

1.1.3. Univariate statistics

This table shows that the variables have very different orders of magnitude.

An analysis of the demographic data using star diagrams, an extract of which can be found on page […], already brings out the following features. On the one hand, there are countries with a high life expectancy (triangular shape), such as Canada, Germany and Italy. On the other hand, there are countries whose birth rate, infant mortality and percentage of people under […] are well above average (star-shaped diagrams), such as Algeria, Saudi Arabia and Côte d'Ivoire.

Another plot (a box-and-whisker plot) ranks the countries by population density (population divided by area). One country does not appear and is excluded from the calculation because of its exceptional character: Singapore.

Q1 = […] (first quartile)
Q2 = 0.0490 (median)
Q3 = […] (third quartile)
E = Q3 + 1.5 (Q3 - Q1) = 0.2676
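The upper limit of a box plot can be checked numerically. The sketch below assumes the usual rule E = Q3 + 1.5 (Q3 - Q1); the density values are hypothetical, not the project's data, and the function names are illustrative.

```python
from statistics import quantiles

def upper_fence(values, k=1.5):
    """Return (q1, q2, q3, fence), where fence = q3 + k * (q3 - q1)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3, q3 + k * (q3 - q1)

def above_fence(named_values, k=1.5):
    """Names whose value lies above the upper fence, in alphabetical order."""
    fence = upper_fence(list(named_values.values()), k)[3]
    return sorted(name for name, v in named_values.items() if v > fence)

# Hypothetical densities for eight made-up countries (assumption, not source data).
densities = {"A": 0.01, "B": 0.02, "C": 0.03, "D": 0.05,
             "E": 0.06, "F": 0.08, "G": 0.10, "H": 0.45}
q1, q2, q3, fence = upper_fence(list(densities.values()))
dense_countries = above_fence(densities)
```

Quartile conventions differ between software packages, so the fence obtained here may differ slightly from the one a given statistics program reports for the same data.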

This plot shows the countries whose density is very high relative to the norm: South Korea, the Netherlands, Japan and Belgium.

1.2 Principal Component Analysis

The Principal Component Analysis is carried out on the table of transformed data. Indeed, given the very different orders of magnitude of the data, the inverse-variance metric is used, which amounts to carrying out the Principal Component Analysis on centered and reduced (standardized) data.

In addition, all the individuals have an identical weight, equal to […].

1.2.1. Inertia and the search for the principal axes

The initial inertia of the cloud of points is the weighted sum of the squared distances of the individuals from the center of gravity. It can be shown that when the data are centered and reduced, this inertia is equal to the number of variables, that is, 11.
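The claim that the inertia of standardized data equals the number of variables can be checked on a small example. This is a minimal sketch, assuming uniform weights normalized to 1/n and the population standard deviation; the table is hypothetical and the function names are illustrative.

```python
from math import sqrt

def standardize(rows):
    """Center each column and divide it by its population standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]

def total_inertia(z_rows):
    """Weighted sum (weights 1/n) of squared distances to the center of gravity,
    which is the origin once the data are centered."""
    n = len(z_rows)
    return sum(sum(v * v for v in r) for r in z_rows) / n

# Hypothetical table: 5 individuals, 3 variables on very different scales.
table = [
    [120.0, 3.1, 0.02],
    [950.0, 2.4, 0.10],
    [40.0, 6.5, 0.07],
    [3000.0, 1.8, 0.01],
    [560.0, 4.9, 0.05],
]
z = standardize(table)
inertia = total_inertia(z)  # equals the number of variables (3) up to rounding
```

Each standardized column has variance 1, so the columns contribute 1 each to the inertia, whatever the original units.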

PCA consists in determining the axes (called principal axes) that maximize the inertia of the projected cloud of points. This maximization requires finding the eigenvalues of the matrix VM, where V is the variance-covariance matrix and M is the metric used.

Check on the precision of the calculations: trace before diagonalization 11.00, sum of the eigenvalues 11.00

HISTOGRAM OF THE FIRST 11 EIGENVALUES

The first two eigenvalues determine principal axes 1 and 2. These two axes form the first principal plane. The first principal axis accounts for […] % of the inertia, and the second for 13.80 %.

The first principal plane therefore explains […] % of the total inertia of the cloud of points.

The histogram of the eigenvalues shows a break after the second eigenvalue. This criterion determines the number of axes to interpret.
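The trace check and the share of inertia carried by each axis can be reproduced with a short NumPy sketch. It assumes that PCA on standardized data amounts to diagonalizing the correlation matrix; the data are randomly generated stand-ins, not the project's table, and `pca_eigen` is an illustrative name.

```python
import numpy as np

def pca_eigen(data):
    """Eigen-decomposition of the correlation matrix (PCA on standardized data)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)  # population std (ddof=0)
    corr = z.T @ z / len(z)
    eigvals, eigvecs = np.linalg.eigh(corr)            # returned in ascending order
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
    return z, corr, eigvals[order], eigvecs[:, order]

# Hypothetical data: 30 individuals, 4 variables on very different scales.
rng = np.random.default_rng(0)
data = rng.normal(size=(30, 4)) * [1.0, 10.0, 100.0, 1000.0]
data[:, 1] += 5 * data[:, 0]                           # introduce some correlation

z, corr, eigvals, eigvecs = pca_eigen(data)
share = eigvals / eigvals.sum()                        # share of inertia per axis
first_plane = share[:2].sum()                          # share of the first principal plane
```

The trace of the correlation matrix and the sum of the eigenvalues should both equal the number of variables, which is the same check as the "11.00 / 11.00" line in the text.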

1.2.2. Representation of the variables

To obtain an interpretation of the axes, the variables must be represented in the first principal plane. To do so, the correlations between the principal components and the variables are computed. This gives the correlation circle.
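The coordinates of the variables on the correlation circle can be sketched as follows. The sketch relies on a standard property of PCA on standardized data: the correlation between variable j and component k equals the square root of the k-th eigenvalue times the j-th entry of the k-th eigenvector. The data are random stand-ins and `variable_coordinates` is an illustrative name.

```python
import numpy as np

def variable_coordinates(data, n_axes=2):
    """Correlations between each variable and the first n_axes principal components."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    corr = z.T @ z / len(z)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_axes]
    # corr(variable j, component k) = sqrt(lambda_k) * u_jk
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

# Hypothetical data: 40 individuals, 3 variables, the third close to the first.
rng = np.random.default_rng(1)
data = rng.normal(size=(40, 3))
data[:, 2] = data[:, 0] + 0.1 * data[:, 2]

coords = variable_coordinates(data)        # one row per variable, columns = axes 1 and 2
quality = (coords ** 2).sum(axis=1)        # squared distance to the origin, at most 1
```

A variable whose `quality` is close to 1 lies close to the circle and is well represented in the plane. The sign of each axis is arbitrary, so only oppositions between variables are meaningful.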

Quality of representation of the variables

This table, through the squared correlations, makes it possible to interpret the axes by examining the quality of representation of the variables.

On axis 1, the best-represented variables are the birth rate, infant mortality, fertility and the percentage of people under […]. These variables are opposed to the percentage of people over […], life expectancy and GNP, which are less well represented.

Axis 1 can therefore be interpreted as opposing the young countries to the old countries.

On axis 2, the best-represented variables are population and area.

It is therefore a "size" axis, which opposes the large, populous countries to the others.

Overall, the variables best represented in the 1-2 plane are the birth rate and the percentage of people under […]: graphically, they are the closest to the correlation circle.

The illustrative variable Projection (population projection) is very closely correlated with axis 2 and very weakly with axis 1. It is weakly correlated with the demographic variables.

1.2.3. Representation of the individuals

The coordinates needed to represent the individuals in the first principal plane are given in appendix […], together with the elements needed for interpretation: the contributions of the individuals to the first principal plane and the squared cosines.

The axis-by-axis contributions indicate how important each individual is in the construction of the axes. It is undesirable for an individual to have an excessive contribution, since this would be a source of instability: if that individual were dropped from the PCA, the axes obtained could have a different meaning. Graphically, individuals with a strong contribution lie on the "borders" of the representation, because the contribution of individual i to axis j is the ratio of the squared coordinate of individual i on that axis to the inertia carried by axis j, up to a factor equal to the individual's weight.

Taking as the decision criterion a contribution more than twice the individual's weight, the individuals with a very large contribution to axis 1 are:

• Côte d'Ivoire, Mali, Senegal, Togo, Ethiopia, Somalia and Chad for the young countries. These countries take high values on the variables that contribute positively to this axis (birth rate, fertility, percentage of people under […], infant mortality) and low values on those that contribute negatively (life expectancy, percentage of people over […] and GNP).

• The situation is, on the contrary, reversed for Japan.

The countries with a large contribution are:

• China, the USSR and India. These countries have both a large population and a large area. They are nevertheless kept in the analysis: the results of the PCA redone without these countries remain the same.

• The United Arab Emirates are in the opposite situation: small area and small population.
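The contribution rule can be sketched in a few lines. The sketch assumes the usual definition, contribution = weight × squared coordinate / inertia of the axis, with uniform weights 1/n; the coordinates and names are hypothetical.

```python
def contributions(coords, weights=None):
    """Contribution of each individual to one axis: w_i * c_i**2 / lambda,
    where lambda = sum_i w_i * c_i**2 is the inertia carried by the axis."""
    n = len(coords)
    weights = weights or [1.0 / n] * n
    inertia = sum(w * c * c for w, c in zip(weights, coords))
    return [w * c * c / inertia for w, c in zip(weights, coords)]

def strong_contributors(names, coords, factor=2.0):
    """Individuals whose contribution exceeds `factor` times their (uniform) weight."""
    n = len(coords)
    weights = [1.0 / n] * n
    ctr = contributions(coords, weights)
    return [name for name, c, w in zip(names, ctr, weights) if c > factor * w]

# Hypothetical centered coordinates of five individuals on one axis.
names = ["a", "b", "c", "d", "e"]
axis1 = [3.0, -0.5, 0.4, -0.6, -2.3]
ctr = contributions(axis1)
flagged = strong_contributors(names, axis1)
```

The contributions to an axis sum to 1, so with uniform weights the "twice the weight" threshold is simply 2/n.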

In a second step, the quality of representation of the individuals must be assessed. The individuals, initially located in an 11-dimensional space, are projected onto a 2-dimensional space. An individual is better represented the smaller its loss of distance, meaning the difference between the distance from individual i to the origin in the 11-dimensional space and the distance from the same individual to the origin in the first principal plane. Consequently, an individual is better represented the smaller the angle between the individual and its projection, or in other words the larger the squared cosine of this angle (this holds in particular for individuals far from the origin).

Appendix […] contains the losses of distance. The 4 worst-represented individuals are, in order: the United Arab Emirates, Singapore, Colombia and Syria.
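The loss of distance and the squared cosine described above can be computed directly from an individual's coordinates on all principal axes. This is a minimal sketch with hypothetical coordinates in a 4-dimensional space; the function names are illustrative.

```python
from math import sqrt

def squared_cosine(full_coords, n_axes=2):
    """cos^2 of the angle between an individual and its projection onto the
    first n_axes principal axes (coordinates are given on all axes)."""
    total = sum(c * c for c in full_coords)
    kept = sum(c * c for c in full_coords[:n_axes])
    return kept / total

def distance_loss(full_coords, n_axes=2):
    """Distance to the origin in the full space minus distance in the plane."""
    total = sqrt(sum(c * c for c in full_coords))
    kept = sqrt(sum(c * c for c in full_coords[:n_axes]))
    return total - kept

# Hypothetical individuals: the first lies in the plane, the second mostly outside it.
well_represented = [3.0, 4.0, 0.0, 0.0]
badly_represented = [1.0, 2.0, 2.0, 4.0]
cos2_well = squared_cosine(well_represented)
cos2_badly = squared_cosine(badly_represented)
loss_badly = distance_loss(badly_represented)
```

An individual located exactly at the origin has no defined squared cosine (the total is zero), which is one reason the criterion is mainly read for individuals far from the origin.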

1.2.4. Representation of the illustrative variables

Each category of an illustrative variable is represented by the center of gravity of the individuals that belong to it.

Note in particular that the "Hindu" religion has the same coordinates as India: it is the only country in which Hinduism is the majority religion.
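Placing a category at the center of gravity of its individuals can be sketched as a simple group mean. The coordinates and religion labels below are invented for illustration and are not the project's values.

```python
from collections import defaultdict

def category_centroids(coords, categories):
    """Mean position, in the principal plane, of the individuals in each category."""
    groups = defaultdict(list)
    for point, category in zip(coords, categories):
        groups[category].append(point)
    return {cat: tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for cat, pts in groups.items()}

# Hypothetical plane coordinates and majority religions (invented values).
countries = ["India", "X1", "X2", "X3"]
coords = [(0.5, 4.0), (-2.0, 0.5), (-3.0, -0.5), (2.0, -1.0)]
religions = ["Hindu", "Christian", "Christian", "Muslim"]
centroids = category_centroids(coords, religions)
```

A category held by a single individual, like "Hindu" here, necessarily coincides with that individual's point.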

In conclusion, the graphical representations clearly bring out three distinct groups of countries:

• the young ones: mainly African countries, little developed, of animist or Muslim religion

• the old ones: the developed countries, including the European countries, Oceania and North America, of Christian religion

• and a third, intermediate group: the developing countries of Asia and South America.

A classification will allow us, in the second part, to obtain a more precise partition of these countries.

CLASSIFICATION

The aim of classification is to group the individuals into homogeneous classes. There are two main types of methods: non-hierarchical classification, which produces a partition into a given number of classes, and hierarchical classification, which produces a sequence of nested partitions.

2.1. Hierarchical classification by Ward's method

The criterion for merging two individuals is based on the notion of inertia: at each step, the two individuals (or classes) that are merged are those whose merger loses the least between-class inertia.

The between-class inertia is the weighted mean of the squared distances from the center of gravity of each class to the overall center of gravity. As in the PCA, the inverse-variance metric is used.
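In symbols, the criterion can be written as below. The notation is introduced here and is not in the source: classes A and B have weights p_A and p_B and centers of gravity g_A and g_B, g is the overall center of gravity, and d is the distance under the inverse-variance metric.

```latex
I_{\text{between}} = \sum_{k} p_k \, d^2(g_k, g),
\qquad
\Delta(A, B) = \frac{p_A \, p_B}{p_A + p_B} \, d^2(g_A, g_B)
```

Here Δ(A, B) is the loss of between-class inertia caused by merging A and B, and Ward's method merges the pair with the smallest Δ.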

The level indices express the loss of between-class inertia at each merger. The sum of the level indices should be equal to 11, that is, to the total inertia of the cloud of points.

The two closest individuals are therefore […] and […], namely Portugal and New Zealand.

At the next step, only […] individuals remain, plus one class made up of the two countries merged previously, which is now represented by its center of gravity. Once again the two elements whose merger loses the least between-class inertia are merged, namely the United Kingdom and Denmark.

This iteration is then repeated until all the individuals are grouped into a single class.
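The iteration can be sketched with a naive implementation. It assumes uniform weights 1/n and plain Euclidean distance on already standardized coordinates; the points are hypothetical. The sketch is for small examples only, and a real analysis would use a dedicated library.

```python
from itertools import combinations

def mean_point(points):
    return [sum(axis) / len(points) for axis in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def merge_loss(a, b, n):
    """Loss of between-class inertia when classes a and b are merged."""
    wa, wb = len(a) / n, len(b) / n
    return wa * wb / (wa + wb) * sq_dist(mean_point(a), mean_point(b))

def ward_levels(points):
    """Merge classes until one remains; return the level index of each merge."""
    n = len(points)
    classes = [[p] for p in points]
    levels = []
    while len(classes) > 1:
        # Pick the pair of classes whose merger loses the least inertia.
        i, j = min(combinations(range(len(classes)), 2),
                   key=lambda ij: merge_loss(classes[ij[0]], classes[ij[1]], n))
        levels.append(merge_loss(classes[i], classes[j], n))
        merged = classes[i] + classes[j]
        classes = [c for k, c in enumerate(classes) if k not in (i, j)] + [merged]
    return levels

def total_inertia(points):
    g = mean_point(points)
    return sum(sq_dist(p, g) for p in points) / len(points)

# Hypothetical standardized coordinates of six individuals.
points = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.2], [3.0, 3.1], [3.2, 2.9], [5.0, 0.5]]
levels = ward_levels(points)
```

The sum of `levels` equals `total_inertia(points)`, the same identity the text uses as a check (sum of level indices equal to 11).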

The "break" criterion on the histogram determines the number of classes to keep. This criterion led to keeping 3 classes.

The colors make it easy to distinguish the 3 classes thus created (the same colors were used to represent the individuals in the first principal plane): class 1 in blue, class 2 in red, class 3 in green.

2.2. Consolidation of the classes

The moving-centers method is used to consolidate the partition obtained with Ward's method. Starting from the centers of gravity of the 3 classes, new classes are formed: each individual is assigned to the nearest center of gravity. The centers of gravity of these new classes are then computed, and the operation is repeated until the partition is stable.
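The consolidation step can be sketched as follows. It assumes Euclidean distance on standardized coordinates and starts from an initial labelling such as one produced by a hierarchical method; the points, the two-class setup and the name `consolidate` are hypothetical.

```python
def mean_point(points):
    return [sum(axis) / len(points) for axis in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def consolidate(points, labels, max_iter=100):
    """Moving centers: recompute class centers, reassign each individual to the
    nearest center, and repeat until the labels no longer change."""
    labels = list(labels)
    for _ in range(max_iter):
        classes = sorted(set(labels))
        centers = {k: mean_point([p for p, l in zip(points, labels) if l == k])
                   for k in classes}
        new_labels = [min(classes, key=lambda k: sq_dist(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Hypothetical individuals; the last one starts in class 0 but is closer to class 1.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (3.5, 3.5)]
initial = [0, 0, 0, 1, 1, 1, 0]
final = consolidate(points, initial)
moved = sum(1 for a, b in zip(initial, final) if a != b)
```

In this example a single individual changes class and the partition is stable at the next pass.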

The moving-centers method converges very quickly, so the first partition was already satisfactory. A single individual changed class: Argentina, which moves from class […] to the developing countries.

The final composition of the classes is therefore:

Description of the classes

Several elements help characterize the classes created: the coordinates and test values of the class centers of gravity, the "typical individuals", and the most characteristic categories and variables.

Classes 2 and 3 have large but opposite coordinates on axis 1, with high test values. Axis 1 therefore makes it easy to differentiate the individuals. Class 1, by contrast, is very "average".

Characterization by "typical individuals"

The "typical individuals" are those closest to the center of gravity of the class. The following tables give, for each class, the […] most characteristic individuals and their distance to the center of gravity.
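Selecting the typical individuals of a class amounts to ranking its members by distance to the class center of gravity. A minimal sketch, with hypothetical names and coordinates:

```python
def mean_point(points):
    return [sum(axis) / len(points) for axis in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def typical_individuals(names, points, k=3):
    """The k members of a class closest to its center of gravity,
    returned as (name, distance) pairs, nearest first."""
    g = mean_point(points)
    ranked = sorted(zip(names, points), key=lambda item: sq_dist(item[1], g))
    return [(name, sq_dist(p, g) ** 0.5) for name, p in ranked[:k]]

# Hypothetical class of four individuals in the principal plane.
names = ["a", "b", "c", "d"]
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
typical = typical_individuals(names, points, k=2)
```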

Characterization by illustrative and active variables

The following two tables give, for each class, the most and the least representative variables.

Definition

• CLA/MOD: the ratio between the number of individuals belonging to both the class and the category and the number of individuals belonging to the category. Example: 83.33% of the Asian countries belong to class 1 (10/12 = 0.8333).

• MOD/CLAS: the ratio between the number of individuals belonging to both the class and the category and the number of individuals in the class. Example: 45.45% of the individuals in class 1 are Asian (10/22 = 0.4545).
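The two ratios can be computed from the class and category labels. The labels below are constructed to reproduce the Asian example, with 10 Asian countries in class 1 and 12 Asian countries in total. The class size of 22 is an assumption inferred from the 45.45% figure, and the function names are illustrative.

```python
def cla_mod(classes, categories, cls, cat):
    """Share of the individuals in category `cat` that belong to class `cls`."""
    both = sum(1 for c, m in zip(classes, categories) if c == cls and m == cat)
    in_category = sum(1 for m in categories if m == cat)
    return both / in_category

def mod_cla(classes, categories, cls, cat):
    """Share of the individuals in class `cls` that belong to category `cat`."""
    both = sum(1 for c, m in zip(classes, categories) if c == cls and m == cat)
    in_class = sum(1 for c in classes if c == cls)
    return both / in_class

# Constructed labels: 22 individuals in class 1 (assumed size), 10 of them Asian,
# plus 2 Asian individuals in other classes.
classes = [1] * 22 + [2, 3]
continents = ["Asia"] * 10 + ["Other"] * 12 + ["Asia"] * 2
share_of_asia_in_class_1 = cla_mod(classes, continents, 1, "Asia")
share_of_class_1_asian = mod_cla(classes, continents, 1, "Asia")
```

The two ratios share the same numerator and differ only in the denominator, which is why the same count of 10 gives 83.33% in one case and 45.45% in the other.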

Class 1 is made up mainly of Asian countries and contains no European country.

Class […] is predominantly African and animist, and very little Christian. Class […], by contrast, is European and Christian and contains no African or Muslim country.

Overall, the standard deviations of the variables within a given class are smaller than their standard deviations over the whole data set.

The partition does indeed create much more homogeneous classes. Moreover, classes 2 and 3 have "characteristic variables, but in opposition": those that are above the mean for one class are below it for the other, and vice versa. Class 1 is indeed an intermediate class, without clearly marked characteristics.

Classification is therefore complementary to Principal Component Analysis. It sharpens the presentation of the results by giving very precise characteristics of the classes, whose outlines were drawn by the PCA.

Conclusion

Descriptive statistical methods, thanks to their strong visual (graphical representations and classification trees) and intuitive aspects, make it possible to describe complex data sets relatively simply.

The advantage of these methods, beyond their descriptive value, therefore lies in the fact that they are accessible to a broad, non-specialist public.

Appendices