
Graph Models from the Kevin Bacon Game to Biomolecular Computing and Beyond! Jo Ellis-Monaghan, St. Michaels College, Colchester, VT 05439. e-mail: jellis-monaghan@smcvt.edu website: http://academics.smcvt.edu/jellis-monaghan


Presentation Transcript


  1. Graph Models from the Kevin Bacon Game to Biomolecular Computing and Beyond! • Jo Ellis-Monaghan • St. Michaels College, Colchester, VT 05439 • e-mail: jellis-monaghan@smcvt.edu website: http://academics.smcvt.edu/jellis-monaghan

  2. Graphs and Networks. A graph or network is a set of vertices (dots) with edges (lines) connecting them. Two vertices are adjacent if there is an edge between them. The vertices A and B in the figure are adjacent because the edge AB is between them. An edge is incident to each of the vertices which are its end points. The degree of a vertex is the number of edge ends sticking out from it, so a loop contributes two. [Figure: example graphs on vertices A, B, C, D, including one with a multiple edge and one with a loop.]
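The definitions on this slide can be checked on a small example. The sketch below is illustrative only: the edge list is hypothetical (not read off the slide's figure) and is chosen to include one multiple edge and one loop. It assumes the usual convention that a loop adds two to the degree of its vertex.

```python
from collections import Counter

def degrees(edges):
    """Return the degree of every vertex in an undirected multigraph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1  # for a loop, u == v, so that vertex gains 2
    return dict(deg)

def adjacent(edges, x, y):
    """Two vertices are adjacent if some edge has them as its end points."""
    return any({u, v} == {x, y} for u, v in edges)

# Hypothetical graph: AB is a multiple edge, CC is a loop.
edges = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "C"), ("C", "D")]
deg = degrees(edges)
```

Every edge has two ends, so the degrees always sum to twice the number of edges.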

  3. The Kevin Bacon Game, or 6 Degrees of Separation. http://www.spub.ksu.edu/issues/v100/FA/n069/fea-making-bacon-fuqua.html Total number of linkable actors: 631275. Weighted total of linkable actors: 1860181. Average Bacon number: 2.947. Kevin Bacon is not even among the top 1000 most connected actors in Hollywood (1222nd). Average Connery number: 2.706. Data from The Oracle of Bacon at UVA.
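A Bacon number is a shortest-path distance in the graph whose vertices are actors and whose edges join co-stars, so it can be found by breadth-first search. The sketch below uses a made-up co-star graph with placeholder names. Averaging over all linkable actors, including the source itself, is an assumption about how the quoted average is defined.

```python
from collections import deque

def bacon_numbers(costars, source):
    """Breadth-first search: distance from `source` to every linkable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for other in costars.get(actor, ()):
            if other not in dist:
                dist[other] = dist[actor] + 1
                queue.append(other)
    return dist

# Hypothetical co-star graph; "E" cannot be linked to the source.
costars = {
    "Bacon": ["A", "B"],
    "A": ["Bacon", "C"],
    "B": ["Bacon"],
    "C": ["A", "D"],
    "D": ["C"],
    "E": [],
}
dist = bacon_numbers(costars, "Bacon")
average = sum(dist.values()) / len(dist)

# The slide's average is the weighted total divided by the number of linkable actors.
slide_average = round(1860181 / 631275, 3)
```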

  4. Maximal Matchings in Bipartite Graphs. A bipartite graph. Start with any matching. Find an alternating path: start at an unmatched vertex on the left and end at an unmatched vertex on the right. Switch matching to nonmatching and vice versa along the path; each switch enlarges the matching by one edge. A maximal matching!
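The alternating-path idea can be sketched as a short recursive search, in the style usually credited to Kuhn. This is one common way to organize the search, not necessarily the presentation's. The function name and the example graph are hypothetical.

```python
def bipartite_matching(left_neighbors):
    """Grow a matching by repeatedly finding alternating (augmenting) paths.

    left_neighbors maps each left vertex to the right vertices it may pair with.
    Returns a dict {left vertex: matched right vertex}.
    """
    match_of_right = {}  # right vertex -> left vertex currently matched to it

    def try_augment(u, visited):
        # Look for an alternating path starting at the left vertex u.
        for v in left_neighbors[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or the left vertex holding v can be re-matched elsewhere.
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u  # switch matched and unmatched edges
                return True
        return False

    for u in left_neighbors:
        try_augment(u, set())
    return {u: v for v, u in match_of_right.items()}

# Hypothetical example: "b" can only take "x", which forces "a" onto "y".
graph = {"a": ["x", "y"], "b": ["x"], "c": ["y", "z"]}
matching = bipartite_matching(graph)
```

When no alternating path remains from any unmatched left vertex, the matching cannot be enlarged further.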

  5. The small world phenomenon • Stanley Milgram sent a series of traceable letters from people in the Midwest to one of two destinations in Boston. The letters could be sent only to someone whom the current holder knew by first name. Milgram kept track of the letters and found a median chain length of about six, thus supporting the notion of "six degrees of separation." http://mathforum.org/mam/04/poster.html

  6. Social Networks • Stock Ownership (2001 NY Stock Exchange) • Children’s Social Network • Social Network of Sexual Contacts http://mathforum.org/mam/04/poster.html

  7. Infrastructure and Robustness. Scale Free (JetBlue) versus Distributed (MapQuest). [Figures: for each type of network, a plot of number of vertices against vertex degree.]

  8. Rolling Blackouts in August 2003 http://encyclopedia.thefreedictionary.com/_/viewer.aspx?path=2/2f/&name=2003-blackout-after.jpg

  9. Some networks are more robust than others. But how do we measure this? http://www.caida.org/tools/visualization/mapnet/Backbones/

  10. A network modeled by a graph (electrical, communication, transportation). Question: If each edge operates independently with probability p, what is the probability that the whole network is functional? A functional network: can get from any vertex to any other along functioning edges. A dysfunctional network: vertices s and t can’t communicate. [Figure: a functional and a dysfunctional network, with vertices s and t marked.]

  11. Deletion and Contraction is a Natural Reduction for Network Reliability. If an edge is working (this happens with probability p), it’s as though the two vertices were “touching”, i.e. just contract the edge. If an edge is not working (this happens with probability 1-p), it might as well not be there, i.e. just delete it. Thus, if R(G;p) is the reliability of the network G where all edges function with a probability of p, and e is neither a bridge nor a loop, then $R(G;p) = (1-p)\,R(G-e;p) + p\,R(G/e;p)$.

  12. Reliability Example. Note that if every edge of the network is a bridge (i.e. the network is a disjoint union of trees), then $R(G;p) = p^{E}$, where E is the number of edges. Also note that $R(\text{loop};p) = 1$. E.g. (graph figures omitted): $R(G;p) = (1-p)p^2 + p\,[(1-p)p + p] = (1-p)p^2 + p(1-p)p + p^2 = 3p^2 - 2p^3$. So $R(G;p) = 3p^2 - 2p^3$ gives the probability that the network is functioning. E.g. $R(G; 0.5) = 3(0.25) - 2(0.125) = 0.5$. Bothersome question: Does the order in which the edges are deleted and contracted matter?
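The deletion-contraction recursion can be run numerically. The sketch below is a variant, not the slides' exact formulation. It applies the recursion to every non-loop edge, bridges included, and uses two base cases: a single vertex has reliability 1, and several vertices with no edges have reliability 0. Loops are simply discarded. The function name and the test graph (a triangle, whose reliability is 3p^2 - 2p^3) are choices made here for illustration.

```python
def reliability(vertices, edges, p):
    """Probability that all vertices stay connected when each edge
    works independently with probability p (multigraphs allowed)."""
    vertices = frozenset(vertices)
    if len(vertices) == 1:
        return 1.0          # nothing left to connect
    if not edges:
        return 0.0          # several vertices, no edges: disconnected
    (u, v), rest = edges[0], list(edges[1:])
    if u == v:
        return reliability(vertices, rest, p)   # a loop never matters
    # Contract e = uv: rename v to u in every remaining edge.
    merged = [(u if a == v else a, u if b == v else b) for a, b in rest]
    return ((1 - p) * reliability(vertices, rest, p)
            + p * reliability(vertices - {v}, merged, p))

triangle = [("A", "B"), ("B", "C"), ("C", "A")]
p = 0.3
value = reliability("ABC", triangle, p)
# Same graph, edges processed in a different order.
reordered = reliability("ABC", [triangle[2], triangle[0], triangle[1]], p)
```

On this example the two edge orders give the same value, which agrees with 3p^2 - 2p^3 up to rounding.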

  13. Conflict Scheduling. Draw edges between classes with conflicting times. Color so that adjacent vertices have different colors. Minimum number of colors = minimum required classrooms. [Figure: a conflict graph on classes A, B, C, D, E, shown uncolored and colored.]
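For a handful of classes, the minimum number of colors can be found by brute force: try 1 color, then 2, and so on, until a proper coloring exists. The sketch below does exactly that. The conflict list is hypothetical, since the slide's figure did not survive. The search is exponential in the number of vertices, so it is only meant for small examples.

```python
from itertools import product

def min_colors(vertices, conflicts):
    """Smallest k for which a proper k-coloring exists, with one such coloring."""
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            assignment = dict(zip(vertices, colors))
            # Proper coloring: every conflicting pair gets different colors.
            if all(assignment[u] != assignment[v] for u, v in conflicts):
                return k, assignment
    return 0, {}

# Hypothetical conflicts among classes A..E (a 5-cycle).
classes = ["A", "B", "C", "D", "E"]
conflicts = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
k, rooms = min_colors(classes, conflicts)
```

An odd cycle cannot be properly colored with two colors, so this example needs three.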

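  14. Coloring Algorithm • The Chromatic Polynomial counts the ways to vertex color a graph: • C(G, n) = # proper vertex colorings of G in n colors. Recursively: let e be an edge of G. Then $C(G, n) = C(G-e, n) - C(G/e, n)$, where G-e is G with e deleted and G/e is G with e contracted. [Worked example on graph figures, not recoverable from the transcript; its surviving algebra ends with $n(n-1)^2 + n(n-1) + 0 = n^2(n-1)$.]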
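The recursion C(G, n) = C(G-e, n) - C(G/e, n) can be evaluated directly for a fixed number of colors n. The sketch below assumes a simple graph (no loops, no multiple edges) and uses the base case that a graph with no edges has n^(number of vertices) colorings. The function names are made up for this example.

```python
def chromatic_count(vertices, edges, n):
    """Number of proper colorings of a simple graph with n colors."""
    edge_set = frozenset(frozenset(e) for e in edges)
    return _count(frozenset(vertices), edge_set, n)

def _count(vertices, edges, n):
    if not edges:
        return n ** len(vertices)      # every vertex colored freely
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}              # G - e
    # G / e: rename v to u; parallel edges collapse because this is a set.
    contracted = frozenset(
        frozenset(u if x == v else x for x in f) for f in deleted
    )
    return _count(vertices, deleted, n) - _count(vertices - {v}, contracted, n)

triangle = chromatic_count("ABC", [("A", "B"), ("B", "C"), ("C", "A")], 3)
path = chromatic_count("ABC", [("A", "B"), ("B", "C")], 3)
edge_plus_isolated = chromatic_count("ABC", [("A", "B")], 3)
```

The last graph, a single edge plus an isolated vertex, has chromatic polynomial n^2(n-1), the same closed form that survives on the slide.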

  15. Conflict Scheduling • Register Allocation • Assign variables to hardware registers during program execution. Variables conflict with each other if one is used both before and after the other within a short period of time (for instance, within a subroutine). Minimize the use of non-register memory. • Vertices: the different variables • Edges: between variables which conflict with each other • Colors: assignment of registers • Need at least as many registers as the minimum number of colors required! • Frequency Assignment • Assign frequencies to mobile radios and other users of the electromagnetic spectrum. Two customers that are sufficiently close must be assigned different frequencies, while those that are distant can share frequencies. Minimize the number of frequencies. • Vertices: users of mobile radios • Edges: between users whose frequencies might interfere • Colors: assignments of different frequencies • Need at least as many frequencies as the minimum number of colors required!

  16. Rectilinear pattern recognition (joint work with J. Cohn (IBM), R. Snapp and D. Nardi (UVM)) • IBM’s objective is to check a chip’s design and find all occurrences of a simple pattern to: • Find possible error spots • Check for already patented segments • Locate particular devices for updating [Figures: “The Haystack” and “The Needle…”.]

  17. Pre-Processing. Raw data format: BEGIN /* GULP2A CALLED ON THU FEB 21 15:08:23 2002 */ EQUIV 1 1000 MICRON +X,+Y MSGPER -1000000 -1000000 1000000 1000000 0 0 HEADER GYMGL1 'OUTPUT 2002/02/21/14/47/12/cohn' LEVEL PC LEVEL RX CNAME ULTCB8AD CELL ULTCB8AD PRIME PGON N RX 1467923 780300 1468180 780300 1468180 780600 + 1469020 780600 1469020 780300 1469181 780300 1469181 + 781710 1469020 781710 1469020 781400 1468180 781400 + 1468180 781710 1467923 781710 PGON N PC 1468500 782100 1468300 782100 1468300 781700 + 1468260 781700 1468260 780300 1468500 780300 1468500 + 780500 1468380 780500 1468380 781500 1468500 781500 RECT N PC 1467800 780345 1503 298 ENDMSG. Two different layers/rectangles are combined into one layer that contains three shapes: one rectangle (purple) and two polygons (red and blue). The algorithm is cutting edge, and not currently used for this application in industry.

  18. Linear time subgraph search for target. Both the target pattern and the entire chip are encoded like this, with the vertices also holding geometric information about the shape they represent. Then we do a depth-first search for the target subgraph. The additional information in the vertices reduces the search to linear time, while the entire chip encoding is theoretically N^2 in the number of faces, but practically N log N.

  19. Netlist Layout (joint work with J. Cohn, A. Dean, P. Gutwin, J. Lewis, G. Pangborn). How do we convert this… … into this?

  20. Netlist • A set S of vertices (the pins): hundreds of thousands. • A partition P1 of the pins (the gates): 2 to 1000 pins per gate, average of about 3.5. • A partition P2 of the pins (the wires): again 2 to 1000 pins per wire, average of about 3.5. • A maximum permitted delay between pairs of pins. [Figure: an example with a gate, a pin, and a wire labeled.]

  21. The Wires

  22. The Wiring Space. [Figure labels: placement layer (gates/pins go here); vias (vertical connectors); horizontal wiring layer; vertical wiring layer; up to 12 or so layers.]

  23. The general idea • Place the pins so that pins are in their gates on the placement layer with non-overlapping gates. • Place the wires in the wiring space so that the delay constraints on pairs of pins are met, where delay is proportional to minimum distance within the wiring, and via delay is negligible.

  24. Lots of Problems… • Identify Congestion • Identify dense substructures from the netlist • Develop a congestion ‘metric’ [Figure: layouts of gates A through H captioned “What would be good” and “What often happens”, with congested areas marked.]

  25. Automate Wiring Small Configurations • Some are easy to place and route • Simple left to right logic • No / few loops (circuits) • Uniform, low fan-out • Statistical models work • Some are very difficult • E.g. ‘Crossbar Switches’ • Many loops (circuits) • Non-uniform fan-out • Statistical models don’t work

  26. SPRING EMBEDDING

  27. Random layout Spring embedded layout

  28. Biomolecular constructions • Nano-Origami: Scientists At Scripps Research Create Single, Clonable Strand Of DNA That Folds Into An Octahedron • A group of scientists at The Scripps Research Institute has designed, constructed, and imaged a single strand of DNA that spontaneously folds into a highly rigid, nanoscale octahedron that is several million times smaller than the length of a standard ruler and about the size of several other common biological structures, such as a small virus or a cellular ribosome. http://www.sciencedaily.com/releases/2004/02/040212082529.htm

  29. DNA Strands Forming a Cube http://seemanlab4.chem.NYU.edu

  30. Assuring cohesion. A problem from biomolecular computing: physically constructing graphs by ‘zipping together’ single strands of DNA. [Figure: a configuration marked “not allowed”.] N. Jonoska, N. Saito, ’02

  31. A Characterization • A theorem of C. Thomassen specifies precisely when a graph may be constructed from a single strand of DNA, and theorems of Hongbing and Zhu characterize graphs that require at least m strands of DNA in their construction. • Theorem: A graph G may be constructed from a single strand of DNA if and only if G is connected, has no vertex of degree 1, and has a spanning tree T such that every connected component of G – E(T) has an even number of edges or a vertex v with degree greater than 3.

  32. L. M. Adleman, Molecular Computation of Solutions to Combinatorial Problems. Science, 266 (5187) Nov. 11 (1994) 1021-1024. • Oriented Walk Double Covering and Bidirectional Double Tracing • Fan Hongbing, Xuding Zhu, 1998 • “The authors of this paper came across the problem of bidirectional double tracing by considering the so called “garbage collecting” problem, where a garbage collecting truck needs to traverse each side of every street exactly once, making as few U-turns (retractions) as possible.”

  33. DNA sequencing (joint work with I. Sarmiento). It is very hard in general to “read off” the sequence of a long strand of DNA. Instead, researchers probe for “snippets” of a fixed length, and read those. The problem then becomes reconstructing the original long strand of DNA from the set of snippets. [Figure: example sequences AGGCTC, AGGCT, GGCTC, CTACT, TCTAC, CTCTA, TTCTA.]

  34. Enumerating the reconstructions • This leads to a directed graph with the same number of in-arrows as out-arrows at each vertex. • The number of reconstructions is then equal to the number of paths through the graph that traverse every edge exactly once, in the direction of its arrow.
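The count described here can be explored by brute force on a small example. The sketch below assumes one standard encoding, which may differ in detail from the presentation's: each snippet of length k is a directed edge from its first k-1 letters to its last k-1 letters. It then enumerates every walk that uses each edge exactly once. The snippet set is invented, and it describes an open strand, so the start and end vertices are the only unbalanced ones.

```python
from collections import Counter

def reconstructions(snippets):
    """All strands whose length-k substrings are exactly the given snippets."""
    # Each snippet is an edge: prefix (k-1 letters) -> suffix (k-1 letters).
    remaining = Counter((s[:-1], s[1:]) for s in snippets)
    starts = {u for u, _ in remaining}
    found = set()

    def walk(vertex, text, edges_left):
        if edges_left == 0:
            found.add(text)            # every snippet used exactly once
            return
        for (u, v), count in list(remaining.items()):
            if u == vertex and count > 0:
                remaining[(u, v)] -= 1
                walk(v, text + v[-1], edges_left - 1)
                remaining[(u, v)] += 1  # backtrack

    for start in starts:
        walk(start, start, len(snippets))
    return sorted(found)

# Hypothetical snippets of length 3; the pair "TA" occurs three times in the strand.
snippets = ["CTA", "TAG", "AGT", "GTA", "TAA", "AAT", "ATA", "TAC"]
strands = reconstructions(snippets)
```

In this example the walk passes through the vertex TA three times, and the two closed detours at TA can be taken in either order, which gives two reconstructions.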

  35. Graph Polynomials Encode the Enumeration • A very fancy polynomial, the interlace polynomial of Arratia, Bollobás, and Sorkin (2000), encodes the number of ways to reassemble the original strand of DNA. • It is related, with a lot of work, to the contraction-deletion approach of the Chromatic and Reliability polynomials.

  36. The interlace polynomial is computed, not on the “snippet” graph, but on an associated circle graph. [Figure panels, with vertices and chords labeled a, b, c, d: the “snippet” graph; a chord diagram; the associated circle graph.]

  37. Pendant Duplicate Graphs. Effect of adding a pendant vertex or duplicating a vertex. [Figure panels: adding a pendant vertex v' to v; duplicating the vertex v as v'.]

  38. Theorem • A set of subsequences of DNA permits exactly two reconstructions iff the circle graph associated to any Eulerian circuit of the ‘snippet’ graph is a pendant-duplicate graph. Side note to the cognoscenti: pendant-duplicate graphs correspond to series-parallel graphs via a medial graph construction, so the count of two reconstructions is actually a new interpretation of the beta invariant.
