
Visualizing the Legislature

Howard University Systems and Computer Science. Mugizi Robert Rwebangira.


Presentation Transcript


  1. Visualizing the Legislature Howard University Systems and Computer Science Mugizi Robert Rwebangira

  2. How to get students interested? Show them something relevant! What is relevant right now? BIG DATA!

  3. BIG DATA: 1,200 billion terabytes of data generated in 2008. More than was generated in the first 6000 years of human history. Growing at 60% per year. Source: The Economist, “The Data Deluge”, February 26, 2010

  4. Problem: Storing this data. Understanding/summarization. Emergence of the “Data Scientist”: has skills in programming and math/statistics.

  5. APPLICATION: POLITICS Getting data on congressional votes used to be difficult. Now it is easily downloadable from a government web site. For example, the US Senate takes about 600 votes a session. Question: how do we present this data in a useful way?

  6. Solution: Math For each senator we have about 600 pieces of information (their votes). We can view the Senate as a “cloud” of points in a high-dimensional space. We want to PROJECT these points into 2 dimensions while preserving the features of the dataset (i.e., similar senators should be close together). The method should also be efficient.
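
As a rough illustration of the "cloud of points" view, each senator's votes can be turned into a numeric vector, and the distance between two vectors measures how differently the two senators vote. The sketch below is not from the slides: the encoding (yea = +1, nay = -1, absent = 0), the vote strings, and the senator labels A, B, C are all hypothetical.

```python
import math

# Hypothetical encoding (an assumption, not from the slides):
# yea = +1, nay = -1, absent or not voting = 0.
ENCODING = {"Y": 1.0, "N": -1.0, "-": 0.0}

def encode(votes):
    """Turn a string of votes such as 'YYN-' into a numeric vector."""
    return [ENCODING[v] for v in votes]

def distance(a, b):
    """Euclidean distance between two vote vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Three made-up senators voting on six made-up bills.
senators = {
    "A": encode("YYNYN-"),
    "B": encode("YYNYNN"),
    "C": encode("NNYNYY"),
}

d_ab = distance(senators["A"], senators["B"])  # A and B vote almost alike
d_ac = distance(senators["A"], senators["C"])  # A and C vote opposite ways
print(d_ab, d_ac)
```

With about 600 votes per senator the vectors live in a roughly 600-dimensional space, which is why a projection down to 2 dimensions is needed before anything can be plotted.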

  7. Principal Component Analysis Let X be the n × d data matrix, where n = number of senators and d = number of votes. We want to compute an n × 2 matrix Y. PCA computes the Y such that each dimension is maximally informative (it captures as much of the variance in the data as possible). Can be computed by Singular Value Decomposition.

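  8. Principal Component Analysis (cont.) Center the columns of X, then take X = U Σ V^T (Singular Value Decomposition). Then Y = X V_2, where V_2 is the d × 2 matrix holding the first two columns of V (equivalently, Y = U_2 Σ_2). Can be computed quickly: O(n^2 d) time when n ≤ d, which is O(n^3) for a square matrix.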
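
The PCA computation on slides 7 and 8 can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the presenter's code: it uses the standard formulation (center the columns, take the SVD, project onto the first two right singular vectors), and the data is a synthetic stand-in for roll-call votes, with two made-up voting blocs whose members deviate from their bloc on about 10% of votes. The function name `pca_2d` and all parameters are illustrative.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of an n x d matrix onto its top two principal components."""
    Xc = X - X.mean(axis=0)                      # center each column (each vote)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                         # n x 2, equal to U[:, :2] * s[:2]

# Synthetic stand-in for roll-call data (an assumption, not the real dataset):
# two blocs of 10 senators, 50 votes, encoded as +1 (yea) and -1 (nay).
rng = np.random.default_rng(0)
n_per_bloc, d = 10, 50
bloc_a = rng.choice([-1.0, 1.0], size=d)         # typical votes of bloc A
bloc_b = -bloc_a                                 # bloc B votes the opposite way

def noisy_members(base):
    """Copies of a bloc's typical votes, each vote flipped with probability 0.1."""
    flips = rng.random((n_per_bloc, d)) < 0.1
    return np.where(flips, -base, base)

X = np.vstack([noisy_members(bloc_a), noisy_members(bloc_b)])
Y = pca_2d(X)
print(Y.shape)  # one 2-D point per senator
```

In this toy setup the first coordinate of Y separates the two blocs, which is the kind of structure a scatter plot of Y is meant to reveal.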

  9. Dataset: Senate Roll Call, 110th Senate: 2007 - 2008. n = 102 senators, d = 634 votes. Data can be obtained from here: http://voteview.com/senate110.htm

  10. Results

  11. Conclusions Visualizing high-dimensional data will become increasingly important as data proliferates. Good motivation for the study of linear algebra/statistics.

  12. Questions?
