

Proceedings of the International MultiConference of Engineers and Computer Scientists 2010 Vol I, IMECS 2010, March 17 - 19, 2010, Hong Kong. A Novel K-Means Based Clustering Algorithm for High Dimensional Data Sets.




A Novel K-Means Based Clustering Algorithm for High Dimensional Data Sets

Madjid Khalilian, Norwati Mustapha, MD Nasir Suliman, MD Ali Mamat

M. Khalilian is with the Islamic Azad University, Karaj Branch, Iran. He is now a PhD candidate in the Faculty of Computer Science and Information Technology, University Putra Malaysia.
Dr Norwati Mustapha is with the Computer Science Department, Faculty of Computer Science and Information Technology, University Putra Malaysia.
Dr Nasir Suliman is with the Computer Science Department. He is an Associate Professor in the Faculty of Computer Science and Information Technology, University Putra Malaysia.
Dr Ali Mamat is with the Computer Science Department. He is an Associate Professor in the Faculty of Computer Science and Information Technology, University Putra Malaysia.

IMECS 2010. ISBN: 978-988-17012-8-2. ISSN: 2078-0958 (Print); ISSN: 2078-0966 (Online).

Abstract: Data clustering is an unsupervised method for extracting hidden patterns from huge data sets. Achieving both accuracy and efficiency for high dimensional data sets with an enormous number of samples is a challenging arena. For really large and high dimensional data sets, vertical data reduction, that is, dimension reduction, should be performed prior to applying clustering techniques; its main disadvantage is sacrificing the quality of the results. Because dimensionality reduction methods inevitably cause some loss of information, or may damage the interpretability of the results and even distort the real clusters, extra caution is advised. Furthermore, in some applications data reduction is not possible, e.g. personal similarity, clustering of customer preference profiles in a customer recommender system, or data generated by sensor networks. Existing clustering techniques would normally be applied in a large space with high dimension; dividing the big space into subspaces horizontally can lead to high efficiency and accuracy. In this study we propose a method that uses the divide and conquer technique with equivalency and compatible relation concepts to improve the performance of the K-Means clustering method on high dimensional datasets. Experiment results demonstrate appropriate accuracy and speed-up.

Index Terms: Clustering, K-Means Algorithm, Personality Similarity, High Dimensional Data, Equivalency Relation

I. INTRODUCTION

Many applications need unsupervised techniques because there is no previous knowledge about the patterns inside the samples or their grouping, so clustering can be useful. Clustering groups samples based on their similarity, such that samples in different groups are dissimilar. Both similarity and dissimilarity need to be defined in a clear way. High dimensionality is one of the major causes of data complexity. Technology makes it possible to automatically obtain a huge amount of measurements. However, these often do not precisely identify the relevance of the measured features to the specific phenomena of interest. Data observations with thousands of features or more are now common, such as profile clustering in recommender systems, personality similarity, genomic data, financial data, web document data and sensor data. High-dimensional data, however, poses different challenges for clustering algorithms that require specialized solutions. Recently, some researchers have proposed solutions to the high-dimensional problem. In the well-known survey [1] the problem is introduced in a very illustrative way and some approaches are sketched. There is no clear distinction between different sub-problems (axis-parallel or arbitrarily oriented), and the corresponding algorithms are discussed without pointing out the underlying differences in the respective problem definitions.

Our main objective is to propose a framework that combines a relational definition of the clustering space with the divide and conquer method, in order to overcome the aforementioned difficulties and to improve the efficiency and accuracy of the K-Means algorithm on high dimensional datasets. Unlike previous studies, which divide the whole space into subspaces vertically based on the objects' features [1-6], we apply a horizontal method that divides the entire space into subspaces based on the objects. We conducted experiments on a real-world dataset (personality similarity) as an example of a special group of applications for which other methods offer no suitable solution.

In section 2 the preliminaries for the research are described. The proposed method is explained in section 3. In section 4 we demonstrate our methodology, the dataset description and the experiment procedure. Finally we give the conclusion and future work.

II. PRELIMINARIES

Clustering is the unsupervised grouping of samples into classes whose labels are unknown. Consider a space of samples defined by vectors, $S = \{V_1, V_2, \ldots, V_n\}$, where each vector has its own properties: $V_i \in R^d$, the $j$-th property of $V_i$ is denoted $v_{ij}$, and $R^d$ is a space with $d$ dimensions.

A. K-Means Clustering Algorithm

K-Means is regarded as a staple of clustering methods, due to its ease of implementation. It works well for many practical problems, particularly when the resulting clusters are compact and hyperspherical in shape.
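The K-Means description above can be illustrated with a minimal sketch. It assumes the standard Lloyd-style iteration (assign each sample to its nearest centroid, then move each centroid to the mean of its members) with squared Euclidean distance; the text shown here names K-Means but does not spell out these details. The function names `kmeans`, `squared_distance` and `mean_vector`, and the parameters `max_iter` and `seed`, are hypothetical and chosen only for illustration. This is plain K-Means, not the divide and conquer variant the paper proposes.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(samples, k, max_iter=100, seed=0):
    """Plain Lloyd-style K-Means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k samples drawn without replacement.
    centroids = [list(v) for v in rng.sample(samples, k)]
    labels = [-1] * len(samples)
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in samples
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(samples, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return centroids, labels
```

A call such as `kmeans(S, k=2)`, with `S` a list of $d$-dimensional vectors as in section II, returns one centroid per cluster and one label per sample. Each iteration costs on the order of $n \cdot k \cdot d$ distance terms, which is why the number of samples and the dimension both matter for efficiency on large, high dimensional data sets.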
