
Network



Presentation Transcript


  1. Network Pajek

  2. Introduction • Pajek is a Windows program for the analysis and visualization of large networks with thousands or even millions of vertices. In Slovenian, the word pajek means spider.

  3. Application • Pajek provides tools for the analysis and visualization of networks such as: • collaboration networks, • organic molecules in chemistry, • protein-receptor interaction networks, • genealogies, • Internet networks, • citation networks, • diffusion (AIDS, news, innovations) networks, • data-mining (2-mode networks), etc. • See also the collection of large networks at: • http://vlado.fmf.uni-lj.si/pub/networks/data/

  4. Main goals • to support abstraction by (recursive) decomposition of a large network into several smaller networks that can be treated further using more sophisticated methods; • to provide the user with some powerful visualization tools; • to implement a selection of efficient (subquadratic) algorithms for analysis of large networks.

  5. Six data structures in Pajek • network – main object (vertices and lines: arcs, edges); can be a graph, valued network, 2-mode or temporal network. Default extension: .net • partition – nominal property of vertices. Default extension: .clu • vector – numerical property of vertices. Default extension: .vec • permutation – reordering of vertices. Default extension: .per • cluster – subset of vertices (e.g. a class from a partition). Default extension: .cls • hierarchy – hierarchically ordered clusters and vertices. Default extension: .hie

  6. Network – .net • A network can be defined in the input file in different ways. Three of them are shown here. • 1. List of neighbours (Arcslist / Edgeslist) (see test 1.net)

     *Vertices 5
     1 "a"
     2 "b"
     3 "c"
     4 "d"
     5 "e"
     *Arcslist
     1 2 4
     2 3
     3 1 4
     4 5
     *Edgeslist
     1 5

  7. Explanation • Data must be prepared in an input (ASCII) file. Notepad can be used for editing; the shareware editor TextPad is a better choice. • Words starting with * must always be written in the first column of the line. They indicate the start of a definition of vertices or lines. • *Vertices 5 defines a network with 5 vertices. This must always be the first statement in the definition of a network. • The definition of vertices follows: each vertex gets a label, written between double quotation marks. • *Arcslist declares a list of directed lines from the selected vertex (1 2 4 means that there are two lines from vertex 1, one to vertex 2 and another to vertex 4). • Similarly, *Edgeslist declares a list of undirected lines from the selected vertex. • No empty lines are allowed in the file: an empty line means the end of the network.
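The rules above can be sketched as a small reader. This is an illustrative Python sketch, not Pajek's own parser: the function name `parse_pajek_lists` is hypothetical, labels are assumed to be enclosed in straight double quotes, and drawing attributes after the label are ignored.

```python
def parse_pajek_lists(text):
    """Read a network written with *Arcslist / *Edgeslist sections.

    Returns (labels, arcs, edges): labels maps vertex number to label,
    arcs and edges are lists of (source, target) pairs.
    """
    labels, arcs, edges = {}, [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            break  # an empty line ends the network
        if line.startswith("*"):
            section = line.split()[0].lower()  # e.g. "*vertices"
            continue
        parts = line.split()
        if section == "*vertices":
            # vertex number, then the label between double quotes (assumed)
            labels[int(parts[0])] = line.split('"')[1]
        elif section == "*arcslist":
            arcs.extend((int(parts[0]), int(t)) for t in parts[1:])
        elif section == "*edgeslist":
            edges.extend((int(parts[0]), int(t)) for t in parts[1:])
    return labels, arcs, edges


EXAMPLE = '''*Vertices 5
1 "a"
2 "b"
3 "c"
4 "d"
5 "e"
*Arcslist
1 2 4
2 3
3 1 4
4 5
*Edgeslist
1 5
'''

labels, arcs, edges = parse_pajek_lists(EXAMPLE)
print(labels)
print(arcs)
print(edges)
```

Each row of a list section expands into one line per neighbour, so the row 1 2 4 yields the arcs (1, 2) and (1, 4).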

  8. Network – .net • 2. Pairs of lines (Arcs / Edges) (see test 2.net)

     *Vertices 5
     1 "a"
     2 "b"
     3 "c"
     4 "d"
     5 "e"
     *Arcs
     1 2 1
     1 4 1
     2 3 2
     3 1 1
     3 4 2
     4 5 1
     *Edges
     1 5 1

  9. Explanation • Directed lines are defined using *Arcs, undirected lines are defined using *Edges. The third number in each row defining an arc or edge gives its value (weight). • In the previous format (Arcslist / Edgeslist) values of lines are not defined, so that format is suitable only if all values of lines are 1. • If values of lines are not important, the third number can be omitted (all lines get value 1). • No empty lines are allowed in the file: an empty line means the end of the network.
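A minimal Python sketch of reading the pairs format, including the default value of 1 when the third number is omitted. The function name `parse_pajek_pairs` is hypothetical, and the *Vertices section is skipped for brevity.

```python
def parse_pajek_pairs(text):
    """Read *Arcs / *Edges sections into lists of (source, target, value)."""
    arcs, edges = [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            break  # an empty line ends the network
        if line.startswith("*"):
            section = line.split()[0].lower()
            continue
        if section not in ("*arcs", "*edges"):
            continue  # vertex definitions are ignored in this sketch
        parts = line.split()
        source, target = int(parts[0]), int(parts[1])
        # a missing third number means the line gets value 1
        value = float(parts[2]) if len(parts) > 2 else 1.0
        (arcs if section == "*arcs" else edges).append((source, target, value))
    return arcs, edges


EXAMPLE = '''*Vertices 5
1 "a"
2 "b"
3 "c"
4 "d"
5 "e"
*Arcs
1 2 1
1 4
2 3 2
3 1 1
3 4 2
4 5 1
*Edges
1 5 1
'''

arcs, edges = parse_pajek_pairs(EXAMPLE)
print(arcs)
print(edges)
```

The arc 1 4 is written without a value on purpose, to show that it is read with value 1.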

  10. Network – .net • 3. Matrix (see test 3.net)

     *Vertices 5
     1 "a"
     2 "b"
     3 "c"
     4 "d"
     5 "e"
     *Matrix
     0 1 0 1 1
     0 0 2 0 0
     1 0 0 2 0
     0 0 0 0 1
     1 0 0 0 0

  11. Explanation • In this format directed lines (arcs) are given in matrix form (*Matrix). To transform bidirected arcs into edges, use “Network>Create New Network>Transform>Arcs to Edges>Bidirected Only”.
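The relation between the matrix form and the arcs/edges form can be illustrated with a short Python sketch. The helper names are hypothetical, and combining the two values of a bidirected pair with `min` is an assumption made here for illustration, not a statement about Pajek's default.

```python
MATRIX = [
    [0, 1, 0, 1, 1],
    [0, 0, 2, 0, 0],
    [1, 0, 0, 2, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]


def matrix_to_arcs(matrix):
    """Entry (i, j) != 0 becomes an arc i -> j with that value (1-based)."""
    return {
        (i + 1, j + 1): value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    }


def bidirected_to_edges(arcs, combine=min):
    """Replace each pair of opposite arcs with a single undirected edge.

    `combine` decides the value of the new edge (assumption: min).
    """
    remaining, edges = {}, {}
    for (u, v), value in arcs.items():
        if u != v and (v, u) in arcs:
            key = (min(u, v), max(u, v))
            edges[key] = combine(value, arcs[(v, u)])
        else:
            remaining[(u, v)] = value
    return remaining, edges


arcs = matrix_to_arcs(MATRIX)
remaining, edges = bidirected_to_edges(arcs)
print(remaining)
print(edges)
```

For this matrix only vertices 1 and 5 are linked in both directions, so the result has six arcs and one edge.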

  12. Additional definition of network • Additionally, Pajek enables precise definition of the elements used for drawing networks (coordinates of vertices, shapes and colors of vertices and lines, ...). • Example: (see test 4.net)

     *Vertices 5
     1 "a" box
     2 "b" ellipse
     3 "c" diamond
     4 "d" triangle
     5 "e" empty
     ...

  13. Draw • Layout of networks • Energy: the network is treated as a physical system, and the layout searches for the state with minimal energy. • Kamada-Kawai: with the separate components option, connected components can be tiled in a plane. • Fruchterman-Reingold: draws in a plane or in space, with a selectable repulsion factor. • Eigenvalues: 2 or 3 selected eigenvectors become the coordinates of the vertices; this can produce nice pictures.

  14. Partition – .clu • Partitions are used to describe nominal properties of vertices. • e.g., 1-men, 2-women • Definition in input file (see test.clu)

     *Vertices 5
     1
     2
     2
     2
     1

  15. Vector – .vec • Vectors are used to describe numerical properties of vertices (e.g., centralities). • Definition in input file (see test.vec)

     *Vertices 5
     0.58
     0.25
     0.25
     0.08
     0.25
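Partition and vector files share the same layout: a *Vertices line followed by one value per vertex, in vertex order. A hedged Python sketch of a reader follows; the function name `read_vertex_values` is hypothetical.

```python
def read_vertex_values(text, cast):
    """Read a .clu or .vec style file: the i-th value belongs to vertex i."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    count = int(lines[0].split()[1])  # "*Vertices 5" -> 5
    values = [cast(item) for item in lines[1:count + 1]]
    if len(values) != count:
        raise ValueError("fewer values than declared vertices")
    return {vertex: value for vertex, value in enumerate(values, start=1)}


CLU = "*Vertices 5\n1\n2\n2\n2\n1\n"
VEC = "*Vertices 5\n0.58\n0.25\n0.25\n0.08\n0.25\n"

partition = read_vertex_values(CLU, int)   # nominal classes
vector = read_vertex_values(VEC, float)    # numerical values

# Vertices grouped by class, e.g. class 1 and class 2
clusters = {}
for vertex, cls in partition.items():
    clusters.setdefault(cls, []).append(vertex)
print(clusters)
print(vector)
```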

  16. Pajek project files • Loading objects one by one is time consuming. It is therefore convenient to store all data in one file, called a Pajek project file (.paj). (see test.paj) • Project files can be produced manually by using “File>Pajek Project File>Save” • To load objects stored in a Pajek project file, select “File>Pajek Project File>Read”

  17. Menu structure • Commands are assigned to menus according to the following criterion: • commands that need only a network as input are available in menu Net, • commands that need two networks as input are available in menu Networks, • commands that need two objects as input (e.g., network and partition) are available in menu Operations, • commands that need only a partition as input are available in menu Partition . . .

  18. Global and local views on network

  19. Global and local views on network • A local view is obtained by extracting the sub-network induced by a selected cluster of vertices. • A global view is obtained by shrinking the vertices in the same cluster to a new (compound) vertex. In this way relations among clusters of vertices are shown. • A combination of the local and global views is the contextual view: relations among clusters of vertices and selected vertices are shown.
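The two views can be sketched in Python on a small valued network. This is a simplified illustration with hypothetical helper names: the shrink here sums line values between clusters and drops lines inside a cluster, which may differ from the options Pajek offers.

```python
def extract(arcs, partition, cluster):
    """Local view: keep only lines whose both endpoints are in `cluster`."""
    keep = {v for v, cls in partition.items() if cls == cluster}
    return {(u, v): w for (u, v), w in arcs.items() if u in keep and v in keep}


def shrink(arcs, partition):
    """Global view: one compound vertex per cluster, values summed."""
    shrunk = {}
    for (u, v), w in arcs.items():
        cu, cv = partition[u], partition[v]
        if cu != cv:  # lines inside a cluster are dropped (assumption)
            shrunk[(cu, cv)] = shrunk.get((cu, cv), 0) + w
    return shrunk


arcs = {(1, 2): 1, (1, 4): 1, (2, 3): 2, (3, 1): 1, (3, 4): 2, (4, 5): 1}
partition = {1: 1, 2: 2, 3: 2, 4: 2, 5: 1}

local_view = extract(arcs, partition, cluster=2)
global_view = shrink(arcs, partition)
print(local_view)
print(global_view)
```

The local view keeps the lines among vertices 2, 3 and 4; the global view reduces the network to two compound vertices, one per cluster.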

  20. Example • Imports and exports among 80 countries in 1994 are given, with values in thousands of dollars. (See Country_Imports.net) • Partition according to continents (see Country_Continent.clu) • 1 – Africa, 2 – Asia, 3 – Europe, 4 – N. America, 5 – Oceania, 6 – S. America. • Operations>Extract from Network>Partition • Operations>Shrink Network>Partition

  21. Extracting Subnetwork • Operations>Extract from Network>Partition

  22. Shrinking Network • Operations>Shrink Network>Partition

  23. Removing lines with low values • Network>Info>Line Values

  24. Removing lines with low values • Network>Create New Network>Transform>Remove>Lines with value>lower than (340000)
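As a rough Python illustration of the two steps above (inspect the line values, then drop the low ones): the country codes and values below are made up for the example and are not taken from Country_Imports.net, and `remove_low_lines` is a hypothetical helper.

```python
def remove_low_lines(lines, threshold):
    """Remove lines with value lower than `threshold`."""
    return {pair: value for pair, value in lines.items() if value >= threshold}


# Made-up values for illustration only
trade = {
    ("A", "B"): 120000,
    ("A", "C"): 560000,
    ("B", "C"): 340000,
    ("C", "A"): 15000,
}

# Step 1: look at the range of line values before choosing a threshold
values = sorted(trade.values())
print("lowest:", values[0], "highest:", values[-1])

# Step 2: keep only lines with value 340000 or more
strong = remove_low_lines(trade, 340000)
print(strong)
```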

  25. Resources • Download • The latest version of Pajek is freely available, for non-commercial use, at its home page: http://vlado.fmf.uni-lj.si/pub/networks/pajek/ • Text file into Pajek • http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/text2pajek.htm • WoS to Pajek • http://vlado.fmf.uni-lj.si/pub/networks/pajek/WoS2Pajek/default.htm • Tutorial • Exploratory Social Network Analysis with Pajek • visit Pajek wiki for more information • http://pajek.imfm.si/doku.php

  26. WoS to Pajek • http://pajek.imfm.si/doku.php?id=wos2pajek/

  27. S519 Web of Science

  28. S519 Output

  29. S519 Output

  30. wos2pajek • The download link:  • http://pajek.imfm.si/doku.php?id=wos2pajek • The new tutorial slides:  • http://pajek.imfm.si/lib/exe/fetch.php?media=faq:wos:wos2pajek07.pdf

  31. MontyLingua • Download from: http://web.media.mit.edu/~hugo/montylingua/ • Unpack it and copy ‘montylingua-2.1’ to C:\Python26\Lib\site-packages • Set up a new environment variable named ‘MONTYLINGUA’ and set the variable value as c:\Python26\Lib\site-packages\MontyLingua-2.1\Python

  32. wos2pajek • Download the latest version of WoS2Pajek. • http://pajek.imfm.si/doku.php?id=wos2pajek • Unpack it, and double click on WoS2Pajek.py to show the main interface of program:

  33. You can also put all wos files in a folder

  34. WoS2Pajek Program • The current version of WoS2Pajek requires 7 parameters to be given by the user: • MontyLingua directory: path to the directory in which the MontyLingua package is installed; • project directory: where the output files are saved; • WoS file; • maxnum: estimate of the number of all vertices (number of records + number of cited works), approximately 30*number of records; • step: prints info about each k*step record as a trace; step = 0 means no trace; • use ISI name / short name; • make a clean WoS file without duplicates; • boolean list [DE, ID, TI, AB] specifying which fields are sources of keywords.

  35. Wos-pajek.txt

  36. Cite.net • Network/Info/General • Network/Create New Network/Transform/Remove/Loops • Network/Create New Network/Transform/Remove/Multiple lines/Single line

  37. CiteNew.net • Paper citation network • Questions • What are highly cited articles? • The diameter of the network? • What are the major clusters? • More questions?

  38. Strong component of cite network • Network/Create Partition/Components/Strong [2] • Operations/Network+Partition/Extract SubNetwork[1-*] • Operations/Network+Partition/Transform/Remove Lines/Between Cluster • Save citestrong.clu

  39. Co-author network • Read WA.net • Network/2-mode network/2-mode to 1-mode/Columns • Network/Create Partition/Components/Weak [2] • Operations/Network+Partition/Extract SubNetwork[1-*] • Network/Create New Network/Transform/Remove/Loops • WANew.net (which is a co-author network) • Questions: • Which author has the most co-authors?

  40. Bibliographic coupling network • [Read Cite.net] • Network/Create New Network/Transform/1-mode to 2-mode • Network/2-mode Network/2-mode to 1-mode/Rows • Network/Create Partition/Components/Weak [2] • Operations/Network + Partition/Extract SubNetwork[1-*]

  41. Co-citation network • [Read Cite.net] • Network/Create Partitions/Degree/Output • Operations/Network+Partition/Extract subNetwork[1-*] • Network/Create New Network/Transform/1-mode to 2-mode • Network/2-mode network/2-mode to 1-mode/Columns • Network/Create Partition/Components/Weak [2] • Operations/Network+Partition/Extract SubNetwork[1-*]
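The idea behind the two derived networks can be sketched in Python. The sketch assumes that each arc in the citation network points from the citing paper to the cited work; the paper names and the function name are hypothetical. Two papers are bibliographically coupled when they cite the same work, and two works are co-cited when the same paper cites both.

```python
from collections import Counter, defaultdict
from itertools import combinations


def coupling_and_cocitation(citations):
    """citations: iterable of (citing, cited) pairs (assumed direction)."""
    references = defaultdict(set)  # paper -> works it cites
    citers = defaultdict(set)      # work -> papers that cite it
    for citing, cited in citations:
        references[citing].add(cited)
        citers[cited].add(citing)

    coupling = Counter()  # pairs of papers sharing a cited work
    for papers in citers.values():
        for a, b in combinations(sorted(papers), 2):
            coupling[(a, b)] += 1

    cocitation = Counter()  # pairs of works cited by the same paper
    for works in references.values():
        for a, b in combinations(sorted(works), 2):
            cocitation[(a, b)] += 1
    return coupling, cocitation


citations = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y"), ("C", "Y")]
coupling, cocitation = coupling_and_cocitation(citations)
print(dict(coupling))
print(dict(cocitation))
```

Papers A and B share two references, so their coupling value is 2; works X and Y are cited together by A and by B, so their co-citation value is also 2.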

  42. Network analysis

  43. Two-mode network • One-mode network • each vertex can be related to each other vertex. • Two-mode network • vertices are divided into two sets and vertices can only be related to vertices in the other set.

  44. Example • Suppose we have data as below: • P1: Au1, Au2, Au5 • P2: Au2, Au4, Au5 • P3: Au4 • P4: Au1, Au5 • P5: Au2, Au3 • P6: Au3 • P7: Au1, Au5 • P8: Au1, Au2, Au4 • P9: Au1, Au2, Au3, Au4, Au5 • P10: Au1, Au2, Au5 • The corresponding 2-mode network file (see two_mode.net):

     *vertices 15 10
     1 "P1"
     2 "P2"
     3 "P3"
     4 "P4"
     5 "P5"
     6 "P6"
     7 "P7"
     8 "P8"
     9 "P9"
     10 "P10"
     11 "Au1"
     12 "Au2"
     13 "Au3"
     14 "Au4"
     15 "Au5"
     *edgeslist
     1 11 12 15
     2 12 14 15
     3 14
     4 11 15
     5 12 13
     6 13
     7 11 15
     8 11 12 14
     9 11 12 13 14 15
     10 11 12 15

  45. Transforming to valued networks • The network is transformed into an ordinary network, where the vertices are elements from the first subset, using • “Network>2 mode network>2-Mode to 1-Mode>Rows”.

  46. Transforming to valued networks • If we want to get a network with elements from the second subset we use • “Network>2 mode network>2-Mode to 1-Mode>Columns”.
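A Python sketch of both projections, using the paper-author data from the example slide. The function names are hypothetical; line values count shared items (shared papers for two authors, shared authors for two papers), and loops are left out, which is a simplification made here.

```python
from collections import Counter
from itertools import combinations

# Rows (first subset) are papers, columns (second subset) are authors
PAPERS = {
    "P1": ["Au1", "Au2", "Au5"],
    "P2": ["Au2", "Au4", "Au5"],
    "P3": ["Au4"],
    "P4": ["Au1", "Au5"],
    "P5": ["Au2", "Au3"],
    "P6": ["Au3"],
    "P7": ["Au1", "Au5"],
    "P8": ["Au1", "Au2", "Au4"],
    "P9": ["Au1", "Au2", "Au3", "Au4", "Au5"],
    "P10": ["Au1", "Au2", "Au5"],
}


def project_columns(rows):
    """1-mode network on the second subset: authors linked by shared papers."""
    weights = Counter()
    for members in rows.values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return weights


def project_rows(rows):
    """1-mode network on the first subset: papers linked by shared authors."""
    columns = {}
    for row, members in rows.items():
        for member in members:
            columns.setdefault(member, []).append(row)
    return project_columns(columns)


coauthors = project_columns(PAPERS)
papers = project_rows(PAPERS)
print(coauthors[("Au1", "Au5")])  # number of papers written together
print(papers[("P1", "P10")])      # number of authors in common
```

Au1 and Au5 appear together on P1, P4, P7, P9 and P10, so the line between them has value 5; P1 and P10 have the same three authors, so their line has value 3.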
