Online Information Visualization of Huge Data Spaces

Online Information Visualization of Huge Data Spaces by Mao Lin Huang Internetworking Research

This thesis applies graph drawing methods in information visualization to solve the problem of navigating large information spaces. The thesis covers three areas:
1. Information visualization:
• the “small window” problem
2. Graph drawing:
• the “drawing partially unknown graphs” problem
• the “online graph drawing” problem
• the “preserving the mental map” problem
3. Information discovery (browsing & navigation):
• the “lost in hyperspace” problem

1) The “small window” problem
• Information visualization must allow the user to view and browse information spaces and focus quickly on items of interest.
• However, the limited number of pixels on the screen makes it difficult to display large information spaces completely and in detail. This is known as the “small window” problem.

“Static layout + dynamic viewing”
• The most common solution for viewing moderate amounts of data (hundreds or up to thousands of nodes) and addressing the “small window” problem is to use “static layout + dynamic viewing” approaches, which build a static global context of the graph and then allow the user to navigate through it.
• Since the amount of data that can be effectively displayed at one time is limited, and the whole global context may not be displayable in detail at one time, these approaches always involve a mechanism to change the view (dynamic viewing). This lets the user view only a small area of the whole visualisation at one time, by changing the viewing area, the zoomed focus point, or the viewpoint of the visualisation.

Using a very large virtual page
The virtual page technique predefines the drawing of the whole graph, and then provides a small window and scroll bar to allow the user to navigate through it (by changing the viewing area).

Fish-eye views
The fish-eye technique keeps a detailed picture of a part of a graph as well as the global context of the graph. It changes the zoomed focus point.
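The slide does not say which distortion function the fish-eye technique uses. As a minimal sketch, the code below assumes the commonly used Sarkar-Brown function g(t) = (d + 1)t / (dt + 1), applied to one coordinate; the function name `fisheye` and the parameter names are hypothetical.

```python
def fisheye(x, focus, boundary, d=3.0):
    """Distort coordinate x around a focus point (one axis only).

    boundary is the edge of the drawing on the same side of the focus as x.
    d is the distortion factor (assumed parameter); d = 0 leaves x unchanged.
    """
    span = boundary - focus
    if span == 0:
        return x
    t = (x - focus) / span            # normalised distance from the focus, in [0, 1]
    g = (d + 1) * t / (d * t + 1)     # stretches points near the focus, compresses the rest
    return focus + g * span
```

Points close to the focus are spread apart (detail), while points near the boundary are squeezed together but stay on screen (global context).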

Hyperbolic tree
The hyperbolic browser technique performs fish-eye viewing with animated transitions to preserve the user’s mental map. It changes both the viewing area and the zoomed focus point.

3D Cone trees
3D methods, such as cone trees, enlarge the immediate UI workspace and increase the apparent density of information on the screen. They change the viewpoint.

A summary of previous visualisation techniques
While these techniques deal with graphs of moderate size, they do not handle huge graphs (with millions or perhaps billions of nodes). The major problems are outlined below:
• These techniques predefine the layout. In most cases, the whole graph may not be known. In some cases, a local node in a distributed system may know only a small subgraph of the graph. It may be impossible to pre-compute the layout of the whole graph.
• Pre-computing the overall geometrical structure of a huge graph is computationally very expensive. Most layout algorithms have super-linear time complexity, and in practice are too slow for interactive graphics if the number of nodes is larger than a few hundred.
• The layout is predefined and views are extracted from this layout. The user is unable to navigate logically through the graph, although users naturally think in terms of logical relations, not in terms of the synthetic geometrical mapping onto the screen.

The “static layout + dynamic viewing” method is the traditional solution to the “small window” problem.

Online Information Visualisation
• This thesis proposes a new approach that addresses the above three problems by using a sequence of dynamic visual frames called “logical frames”.
• Online Navigational Visualisation: the user sees a tiny subset of the graph at any one time. The user changes the view by traversing the graph logically.

The Online Graph Model

Let’s imagine that we are exploring a large snowfield. We are unable to see the entire snowfield; we can see only what lies within our current field of vision.

Online Navigational Visualisation (OFDAV)
OFDAV is a major departure from traditional methods. We visualise a tiny part (a “frame” F_i) of a huge graph at time t. We change from F_i to F_{i+1} by user interaction. OFDAV does not need to know the whole graph, it does not predefine the geometry (the user can navigate logically), and it is user-oriented.

Online Navigational Visualisation
• In OFDAV, the view of the user focuses on a small subgraph of a large graph G at any point in time.
• The subgraph is defined by its focus nodes.
• Conceptually, the focus nodes form a FIFO queue. We allow the user to change the set of focus nodes by selecting another node on the screen.
• We use a force-directed graph drawing algorithm to draw the subgraph of G and a logical neighbourhood of this subgraph.
• We use animation to guide the user between views, reduce the cognitive effort and preserve the mental map.
• We also keep a history that traces the subgraphs that the user has visited. This assists in backtracking through the graph.

Transitions
To change from one logical key frame F_i to the next frame F_{i+1}, the user selects a node v_{i+1} in F_i with a mouse click. The node v_{i+1} is appended to the queue of focus nodes, and the oldest node is deleted from the queue in a FIFO manner.
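The transition rule can be sketched as a bounded FIFO queue. This is a minimal illustration, not the OFDAV implementation: the class name `FocusQueue` and the `capacity` parameter (the number B of focus nodes kept on screen) are assumptions, and the sketch does not handle a click on a node that is already a focus node.

```python
from collections import deque


class FocusQueue:
    """Bounded FIFO queue of focus nodes (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity      # assumed maximum number of focus nodes
        self.queue = deque()

    def select(self, node):
        """Append the clicked node; return the evicted focus node, or None."""
        self.queue.append(node)
        if len(self.queue) > self.capacity:
            return self.queue.popleft()   # the oldest focus node leaves first
        return None


q = FocusQueue(capacity=3)
for clicked in ["a", "b", "c", "d"]:
    evicted = q.select(clicked)
```

After the fourth click the queue holds b, c, d and the node a has been evicted, which corresponds to moving from frame F_i to F_{i+1}.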

2) The “partially unknown graphs” problem
• The whole graph that we want to draw may not be known. In some cases, only a small subgraph is known at the time of viewing. Thus, it is impossible to predefine a drawing of the whole graph.
• We solve this problem by incrementally calculating and maintaining a small local visualization online, instead of predefining the overall visual structure of the graph at once.

The graph is supplied to the system by a series of requests for neighbourhoods of focus nodes.
[Figure: the graph is partially unknown. A small local graph within the huge graph, a new focus node v, and the neighbourhood of v.]
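A minimal sketch of this request-driven model follows. It assumes the huge graph is reachable only through a callback `neighbours(v)` that returns the neighbourhood of a node; the function name `build_frame` and the toy graph are hypothetical.

```python
def build_frame(focus_nodes, neighbours):
    """Build the local graph from one neighbourhood request per focus node.

    neighbours is an assumed callback: it may query a web server, a database,
    or any other source that knows only part of the graph.
    """
    nodes = set(focus_nodes)
    edges = set()
    for v in focus_nodes:
        for u in neighbours(v):
            nodes.add(u)
            edges.add(frozenset((u, v)))   # undirected edge between u and v
    return nodes, edges


def toy_neighbours(n):
    """Toy infinite graph: node n is linked to 2n, 2n + 1 and its parent n // 2."""
    return [2 * n, 2 * n + 1] + ([n // 2] if n > 1 else [])


nodes, edges = build_frame([1, 2], toy_neighbours)
```

Only the neighbourhoods of the current focus nodes are ever requested, so the rest of the graph never needs to be known or stored.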

3) The “online graph drawing” problem
• We address the general graph drawing problem, that is, making the layout of a graph comprehensible and easy to read.
• We also address some criteria specific to online graph drawing. We achieve this by using a “modified spring algorithm” for graph drawing.

The “online graph drawing” problem
The specific criteria for online drawing:
• The layout of each logical frame must show the direction of the exploration.
• Reduce the overlaps among the local regions.
• The sequence of drawings preserves the mental map.
The general criteria for graph drawing:
• Reduce the edge crossings.
• …

The layout of F_i must show the direction of the exploration.
[Figure: spring model compared with modified spring model.]

Reducing the overlaps among the local regions.
[Figure: spring model compared with modified spring model.]

Reducing the number of edge crossings.
[Figure: spring model compared with modified spring model.]

Spring model
In the spring model, each node is replaced by a steel ring, and each edge is replaced by a Hooke’s law spring. The rings have a gravitational repulsion acting between them, and we can find a drawing that minimizes the energy of the system.
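The two basic forces of the spring model can be sketched as follows. This is an illustration under stated assumptions: the repulsion is taken to fall off with the square of the distance, and the constants `stiffness`, `rest_length` and `strength` are hypothetical parameters, not values from the thesis.

```python
import math


def spring_force(pu, pv, stiffness=1.0, rest_length=1.0):
    """Force on node v from a Hooke's law spring joining u and v."""
    dx, dy = pv[0] - pu[0], pv[1] - pu[1]
    dist = math.hypot(dx, dy) or 1e-9               # avoid division by zero
    magnitude = -stiffness * (dist - rest_length)   # negative: pulls v toward u
    return (magnitude * dx / dist, magnitude * dy / dist)


def repulsion_force(pu, pv, strength=1.0):
    """Inverse-square repulsion pushing node v away from node u (assumed form)."""
    dx, dy = pv[0] - pu[0], pv[1] - pu[1]
    dist = math.hypot(dx, dy) or 1e-9
    magnitude = strength / dist ** 2
    return (magnitude * dx / dist, magnitude * dy / dist)
```

A stretched spring pulls its endpoints together, a compressed one pushes them apart, and the repulsion keeps unconnected nodes from collapsing onto each other; a drawing is stable when these forces cancel at every node.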

The physical forces
The Modified Spring Algorithm has many forces, including:
• Hooke’s law springs for all edges, with varying strengths depending on whether the endpoints are focus nodes or not.
• Gravitational repulsion forces for all non-edges.
• Special gravitational forces between nodes in each neighbourhood.
• Some further forces.
The effect of these forces is to:
• try to keep the queue of focus nodes in a left-right line
• keep node images disjoint
• radially display neighbourhoods around each focus node

Modified spring algorithm
In order to address the specific criteria of online drawing, we add extra forces among the neighbourhoods N(v_i), N(v_{i+1}), …, N(v_{i+B-1}) of the focus nodes. These extra forces separate the neighbourhoods so that the user can visually identify the changes. This extra force is also a Newtonian gravitational force.

The force model
Suppose that F_i = (G_i, Q_i) is the logical frame currently being viewed on the screen, and G_i = (V_i, E_i). The total force applied on node v is:
(1)
where f_uv is the force exerted on v by the spring between u and v, and g_uv and h_uv are the gravitational repulsions exerted on v by another node u in F_i.
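Equation (1) can be written in a form consistent with the definitions above. The version below is a sketch under assumptions that do not come from the slide: the symbol f(v) for the total force, the spring term summed over the neighbours N(v) of v, the repulsion g_uv summed over the other nodes of V_i, and the extra repulsion h_uv summed over the focus nodes in Q_i.

```latex
\begin{equation}
f(v) \;=\; \sum_{u \in N(v)} f_{uv}
      \;+\; \sum_{u \in V_i \setminus \{v\}} g_{uv}
      \;+\; \sum_{u \in Q_i \setminus \{v\}} h_{uv}
\tag{1}
\end{equation}
```

Under these assumptions, the first term draws adjacent nodes together, the second keeps node images disjoint, and the third separates the neighbourhoods of different focus nodes.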

An example of the modified spring algorithm. In this frame, there are two focus nodes, x and y. The total force on node v is:

4) The “preserving the mental map” problem
• Preserving the mental map is a key quality issue in information visualization. The user has difficulty in quickly understanding the underlying structure of the current view when moving the focus around a huge graph by changing views. The user has to spend time re-forming the mental map and understanding the changes and relationships between the previous view and the current view.

The mental map
Our goal is to preserve the user’s mental map, while taking best advantage of the view screen. In OFDAV, we use three types of animation to assist the user in understanding the change in view.
• Fade Animation: We use shrinking/growing to help the user identify nodes that are disappearing/appearing.
• Camera Animation: This moves the whole drawing so that the new focus node moves toward the centre of the screen.
• Layout Animation: We use a complex system of forces based on Hooke’s law springs to adjust the layout between logical key frames.

Layout Animation
For each logical key frame F_i there is a graph drawing D(F_i), which consists of a sequence D_1, D_2, …, D_k of drawings of F_i; each is a screen of F_i. We use the spring algorithm to achieve the layout animation. The algorithm creates the in-between sequence of screens that smoothly revises the layout from the old key frame D(F_i) to the new key frame D(F_{i+1}). The change from one screen D_j to the next screen D_{j+1} is computed by a numerical method that converges to a stable configuration of the force system. Layout animation is the most important mechanism that we provide to achieve a smooth transition between views and preserve the mental map.
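The in-between screens can be sketched as the successive states of an iterative force solver. The code below is a simplified illustration, not the modified spring algorithm itself: it assumes plain Hooke's law springs of rest length 1 on edges, inverse-square repulsion between all pairs of nodes, and a fixed step size; `animate_layout` and its parameters are hypothetical names.

```python
import math


def animate_layout(pos, edges, step=0.1, tol=1e-3, max_screens=500):
    """Yield one screen (node -> (x, y)) per iteration until forces are small."""
    pos = dict(pos)
    nodes = list(pos)
    if not nodes:
        return
    for _ in range(max_screens):
        force = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
                dist = math.hypot(dx, dy) or 1e-9
                mag = 1.0 / dist ** 2                 # repulsion pushes v away from u
                if (u, v) in edges or (v, u) in edges:
                    mag -= dist - 1.0                 # spring pulls toward rest length 1
                fx, fy = mag * dx / dist, mag * dy / dist
                force[v][0] += fx
                force[v][1] += fy
                force[u][0] -= fx                     # equal and opposite force on u
                force[u][1] -= fy
        pos = {v: (pos[v][0] + step * force[v][0],
                   pos[v][1] + step * force[v][1]) for v in nodes}
        yield dict(pos)                               # one in-between screen
        if max(math.hypot(*force[v]) for v in nodes) < tol:
            break                                     # stable configuration reached


screens = list(animate_layout({"a": (0.0, 0.0), "b": (3.0, 0.0)}, {("a", "b")}))
```

Drawing the yielded screens one after another gives the animation: each screen differs only slightly from the previous one, so nodes appear to glide from the old layout to the new stable layout.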

5) The “lost in hyperspace” problem
• “Lost in hyperspace” is a well-known problem of navigating a huge data space, where users become disoriented with respect to a complex system of hypertext links.
• When users move around a large information space as much as they do in hypertext, there is a real risk that they may become disoriented or have trouble finding the information they need [Nielsen, 1990].

5) The “lost in hyperspace” problem
• Even in this small document, which could be read in one hour, users experienced the “lost in hyperspace” phenomenon, as exemplified by the following user comment: “I soon realized that if I did not read something when I stumbled across it, then I would not be able to find it later.” Of the respondents, 56% agreed fully or partly with the statement, “When reading the report, I was often confused about where I was.” [Nielsen, 1990].

Overview Diagrams
This is a real problem that arises when reading text in a nonsequential way with too many cross-references. A number of researchers have noted that overview diagrams provide a reasonable solution to the “lost in hyperspace” problem. Our system can dynamically generate a sequence of such diagrams. Other overview diagram systems have been proposed.

Overview diagrams using the biform views
However, these systems all predefine the layout, and they visualise the history only within very limited context levels (e.g. 4 levels). In contrast, OFDAV provides an online browsing environment in which we can navigate through unlimited context levels.

Focus+Context views
Other researchers have developed new dynamic methods to visualise the query results of web searches. Mukherjea proposes a dynamic focus+context view technique that shows the focus node, the immediate neighbourhood of the node, and some landmark nodes in a web site. This helps users quickly gain an understanding of where they are.

Focus+Context views
However, from the visualisation and navigation points of view, this technique has a number of weaknesses:
• The mental map is broken when jumping from one view to another. (OFDAV adopts three types of animation to smoothly transform one view into another.)
• The user understands where they are, but has no guide for returning to where they have been in the past. (OFDAV adopts a “history” tail to trace the previous focus nodes that the user has visited. This assists the user in backtracking through the graph.)

The current Web browsing technique
• The amount of information now available through the WWW has grown explosively. An increasing number of tools are also available to assist the user in finding and accessing information on the WWW. One of the key requirements for a WWW navigator is to maintain the user’s sense of orientation and facilitate navigation within the context of the total information space.

Web browser
• The current generation of Web browsers, such as Netscape and Microsoft Internet Explorer, provide users with an effective and convenient way to move in cyberspace. This is done by clicking on a series of hyperlinks embedded in Web pages.
• However, this arrangement does not give users a visual “map” to guide them in their Web journey. It does not provide a sense of “space” while the user is exploring cyberspace; instead, it only gives a series of linear lists.

Web browser
• This is mainly because of the difficulty of constructing such a huge, complex, and dynamic map with a (virtually) unlimited number of hyper-documents (nodes) and hyperlinks (edges).
• Most existing visualisation techniques and current research interests emphasise “site mapping”. That is, they try to find an effective way of constructing a structured geometrical map for one Web site. This can only guide the user through a very limited region of cyberspace, and does not help users in their overall journey through cyberspace.

Graphic Web browser
• Graphic Web Browser: mapping and browsing the entire cyberspace.
• We look at the whole of cyberspace as one graph; a huge and partially unknown graph. We use online visualisation to maintain and display a subset of this huge graph incrementally.

Graphic Web Browser addresses the problem of “lost in hyperspace” with a sense of “space”.
• Graphic Web Browser addresses the fundamental problem of “lost in hyperspace” by displaying a sequence of logical visual frames with a graphic “history tail” that tracks the user’s current location and keeps a record of their previous locations in the huge information space.
• The logical neighbourhood of the focus nodes indicates the current location of the user, and the history tail indicates the path of past locations during the navigation.


Visualising the history of exploration
The queue of focus nodes is the recent history of the exploration. The force model that we use tends to keep this history in a horizontal line; new nodes are added at one end of the line and the nodes that disappear are at the other end. As well as the queue of focus nodes, OFDAV keeps track of past history: a queue of all previous focus nodes. An option in OFDAV is to show the past history.
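The relationship between the recent history (the focus queue) and the past history can be sketched as below. The class name `Exploration`, the `capacity` parameter, and the `back` operation are assumptions made for illustration; the slides do not describe how OFDAV stores its history or implements backtracking.

```python
from collections import deque


class Exploration:
    """Recent history (focus queue) plus full past history (hypothetical helper)."""

    def __init__(self, capacity):
        self.focus = deque(maxlen=capacity)   # focus nodes currently on screen
        self.past = []                        # every focus node visited, oldest first

    def visit(self, node):
        self.focus.append(node)   # a full deque drops its oldest node automatically
        self.past.append(node)

    def back(self):
        """Undo the latest visit and rebuild the previous focus queue."""
        if len(self.past) <= 1:
            return
        self.past.pop()
        size = self.focus.maxlen
        self.focus = deque(self.past[-size:], maxlen=size)


trip = Exploration(capacity=2)
for node in ["a", "b", "c"]:
    trip.visit(node)
trip.back()
```

After visiting a, b and c the focus queue shows b and c; stepping back restores a and b, because the past history still remembers the node that had dropped off the queue.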

Conclusion
• More sophisticated filtering strategies and rules should be created. Existing filtering rules may sometimes cause useful information to be lost.
• The labelling problem has not been completely solved yet. If we put an entire long URL string into a box as its label, the box is enlarged and takes up more display space. The issues are: 1) how to shorten the labels, and 2) how to make these short labels unique. The investigation of these issues is proceeding.
