
CrimeScore

Brennon Bortz, Marcos Carzolio, Andrew Hoegh, Shashidhar Sundareisan. CrimeScore is the predicted number of violent crimes per month within a 1 km radius of a given location in Washington, D.C.


Presentation Transcript


  1. CrimeScore • Brennon Bortz, Marcos Carzolio, Andrew Hoegh, Shashidhar Sundareisan

  2. What is CrimeScore? • CrimeScore is the predicted number of violent crimes per month within a 1 km radius of a given location in Washington, D.C.

  3. Training Data • Uniform random sample points throughout Washington, D.C. • Data collected within a 1 km radius of each sample point • Barbershops, bus stops, gas stations, schools, registered property, liquor establishments, and more • Distance to the nearest police station, distance to the nearest public housing project, etc.
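A minimal sketch of how one training row could be assembled from the description above: count places by category within 1 km of a sample point and record the distance to the nearest police station. The function names, the `{lat, lng, category}` record shape, and the use of the haversine formula are assumptions made for illustration; the slides do not say how distances were actually computed.

```javascript
// Mean Earth radius in kilometres (standard approximation).
const EARTH_RADIUS_KM = 6371;

// Great-circle (haversine) distance between two {lat, lng} points, in km.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Hypothetical feature builder: per-category counts within radiusKm of the
// sample point, plus the distance to the nearest police station.
function buildFeatures(sample, places, policeStations, radiusKm = 1) {
  const counts = {};
  for (const place of places) {
    if (haversineKm(sample, place) <= radiusKm) {
      counts[place.category] = (counts[place.category] || 0) + 1;
    }
  }
  // Math.min() of an empty list is Infinity, i.e. "no station known".
  const nearestPoliceKm = Math.min(
    ...policeStations.map((station) => haversineKm(sample, station))
  );
  return { counts, nearestPoliceKm };
}
```

Calling `buildFeatures(samplePoint, places, policeStations)` once per random sample point would give one row of predictors for the model.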

  4. Data Aggregation • Parsed crime data from DC Data Catalog • Classified crime data • Violent and non-violent crimes • Focused on violent crimes, consisting of homicide, robbery, assault with a deadly weapon, and sexual abuse
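The classification step can be sketched as a simple lookup against the four violent offense types listed on this slide. This is only an illustration: the `offense` field name and the assumption that offense labels arrive as plain strings matching the slide's wording are hypothetical, and the labels in the real DC Data Catalog export may be spelled differently.

```javascript
// Violent offense types as named on the slide (assumed label spelling).
const VIOLENT_OFFENSES = new Set([
  "homicide",
  "robbery",
  "assault with a deadly weapon",
  "sexual abuse",
]);

// True when the offense label is one of the violent types.
function isViolent(offense) {
  return VIOLENT_OFFENSES.has(offense.trim().toLowerCase());
}

// Split parsed crime records into violent and non-violent groups.
function splitByViolence(records) {
  const violent = [];
  const nonViolent = [];
  for (const record of records) {
    (isViolent(record.offense) ? violent : nonViolent).push(record);
  }
  return { violent, nonViolent };
}
```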

  5. Implementation Goals • Simple, elegant and familiar • User Interface like Google maps • Dynamic; easily accommodates multiple queries • Represent crime score as a color and a number with an associated interpretation • Pack as much information as possible • Make queries fast and display results faster

  6. Data Flow

  7. Implementation • Submit a search query or listen for a click on the map • Use the Google Maps API to get the positions of the search results on the map • Feed the results to an R script to calculate the CrimeScore using Shiny • Use the CrimeScore to display color-coded markers on the map

  8. Rook • Wraps R environment • Bootstraps R’s internal web server • Maintains environment • Finicky!

  9. Why JavaScript? • Omnipresent in HTML scripting • Prevalent support and acceptance • Ability to write asynchronous functions so that queries over the internet and to the database do not halt the web page • Google Maps JavaScript API v3 is heavily documented • Supports JSON data interchange format

  10. Google Maps API • Use URL requests to access geocoding, directions, elevation, place and time zone information • Embed an interactive Google Map in the web page using JavaScript by creating markers, info windows, etc. • The JavaScript Maps API v3 is a free service, available for any web site that is free to consumers

  11. Google Maps API • Map • MapOptions • Geocoder • Marker • InfoWindow • PlacesService • LatLng • Events

  12. Google Maps API • Center the map on Washington, D.C. • Restrict queries to a 10 km radius • Retrieve latitude and longitude values for results • Place markers with appropriate colors depending on the crime score • Place an info window on every marker to show satellite information • Allow the user to give a lat/lon manually by clicking on the map
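One way the "color and number with an associated interpretation" idea could look in code is a small banding function. The thresholds, colors, and labels below are invented purely for illustration (the slides do not give the actual cut-offs), and the sketch deliberately makes no Google Maps calls; its output would simply be handed to whatever code creates the marker and info window.

```javascript
// Hypothetical bands, in predicted violent crimes per month within 1 km.
// These cut-offs are assumptions, not values from the presentation.
const SCORE_BANDS = [
  { max: 5, color: "green", label: "low" },
  { max: 15, color: "yellow", label: "moderate" },
  { max: Infinity, color: "red", label: "high" },
];

// Map a numeric CrimeScore to a marker color and a short interpretation.
function describeScore(score) {
  if (!Number.isFinite(score) || score < 0) {
    throw new RangeError("score must be a non-negative finite number");
  }
  // The last band has max = Infinity, so a band is always found.
  const band = SCORE_BANDS.find((b) => score <= b.max);
  return {
    color: band.color,
    label: band.label,
    text: `CrimeScore ${score.toFixed(1)} (${band.label})`,
  };
}
```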

  13. Data Storage • The data for the project were stored in a centralized MySQL database • The main use of the database was to store the latitude, longitude and details of places, as well as crimes relevant to the mining process • Data collected from the crime data set and the DC Data Catalog

  14. Challenges • Incomplete or missing data • Dealing with spatial data • Simultaneously dealing with polygons and points in the DC Data Catalog • Finding the distances to the nearest barbershops, schools, churches, police stations, bus stops, etc. is time-consuming

  15. Challenges • Limit on the number of queries in the Google Maps API • Using radarsearch over textsearch • Cannot specify a search boundary other than a rectangle or a circle in the Google Maps API • Maximum of 200 results per query

  16. Improving Implementation • Make results appear faster • Instead of calculating distances from every place to compute the crime score, divide the city into a grid with pre-calculated crime scores • A query then only needs to find which grid cell the place belongs to and return that cell's crime score
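The grid idea above can be sketched as a constant-time lookup: convert a latitude/longitude into a row and column index and read the pre-calculated score. The bounding-box field names, the row-major score array, and the function names are assumptions for illustration; the slides do not describe the grid resolution or storage layout.

```javascript
// Describe a rectangular grid over a bounding box with pre-calculated scores
// stored row-major (row 0 is the southernmost row, col 0 the westernmost).
function makeGrid(bounds, rows, cols, scores) {
  if (scores.length !== rows * cols) {
    throw new Error("scores must contain rows * cols entries");
  }
  return { bounds, rows, cols, scores };
}

// Return the pre-calculated score of the cell containing the point,
// or null when the point lies outside the grid.
function lookupScore(grid, point) {
  const { south, north, west, east } = grid.bounds;
  if (point.lat < south || point.lat > north ||
      point.lng < west || point.lng > east) {
    return null;
  }
  // Clamp so that points exactly on the north/east edge fall in the last cell.
  const row = Math.min(
    grid.rows - 1,
    Math.floor(((point.lat - south) / (north - south)) * grid.rows)
  );
  const col = Math.min(
    grid.cols - 1,
    Math.floor(((point.lng - west) / (east - west)) * grid.cols)
  );
  return grid.scores[row * grid.cols + col];
}
```

With this layout a query costs two divisions and one array read, regardless of how many places or crimes are in the database.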

  17. Random Forest Regression • Each tree trains on a bootstrapped subset of data • At each node on all trees, algorithm randomly chooses predictors on which to build a regression model and create a split in feature space • Response in regression model is actual (observed) CrimeScore • Excellent predictions; difficult interpretations • Analysis done with randomForest R package

  18. Random Forest Regression • Diagram: the original data is resampled into Bootstraps 1 to 4 • Each bootstrap trains one regression tree (Regression Trees 1 to 4) • The regression trees together form the random forest
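The bootstrap-then-aggregate structure shown in the diagram can be illustrated with a toy model. This is not the randomForest R package the authors used: each "tree" here is a one-split stump fitted on a single randomly chosen predictor, and the seeded generator, the `{x, y}` row shape, and all function names are assumptions made to keep the sketch short and repeatable.

```javascript
// Small seeded pseudo-random generator (mulberry32) for repeatable runs.
function makeRng(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
const sse = (values, m) => values.reduce((sum, v) => sum + (v - m) ** 2, 0);

// Draw rows.length rows with replacement (one bootstrap sample).
function bootstrapSample(rows, rng) {
  return rows.map(() => rows[Math.floor(rng() * rows.length)]);
}

// Fit a one-split regression "stump" on one randomly chosen feature,
// choosing the threshold that minimises the summed squared error.
function fitStump(rows, featureCount, rng) {
  const feature = Math.floor(rng() * featureCount);
  const overall = mean(rows.map((r) => r.y));
  // Fallback when no valid split exists: predict the overall mean.
  let best = { threshold: -Infinity, left: overall, right: overall, error: Infinity };
  for (const candidate of rows) {
    const threshold = candidate.x[feature];
    const left = rows.filter((r) => r.x[feature] <= threshold).map((r) => r.y);
    const right = rows.filter((r) => r.x[feature] > threshold).map((r) => r.y);
    if (right.length === 0) continue; // not a real split
    const leftMean = mean(left);
    const rightMean = mean(right);
    const error = sse(left, leftMean) + sse(right, rightMean);
    if (error < best.error) {
      best = { threshold, left: leftMean, right: rightMean, error };
    }
  }
  return (x) => (x[feature] <= best.threshold ? best.left : best.right);
}

// Train treeCount stumps, each on its own bootstrap sample, and
// predict by averaging the individual predictions.
function fitForest(rows, featureCount, treeCount, seed = 1) {
  const rng = makeRng(seed);
  const trees = [];
  for (let i = 0; i < treeCount; i++) {
    trees.push(fitStump(bootstrapSample(rows, rng), featureCount, rng));
  }
  return (x) => mean(trees.map((tree) => tree(x)));
}
```

A real random forest grows deep trees and re-draws candidate predictors at every node; the stump only keeps the two ingredients the diagram shows, namely one bootstrap sample per tree and an averaged prediction.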

  19. Model Validation • Algorithm holds out 20% of data to test against model • Performance at each node measured by mean squared error and mean decrease in accuracy
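A hold-out check of the kind described above can be sketched as a shuffled 80/20 split followed by a mean squared error calculation on the held-out rows. The function names and the injectable `random` parameter are assumptions for illustration; how the authors actually drew the 20% is not stated on the slide.

```javascript
// Shuffle a copy of the rows (Fisher-Yates) and hold out testFraction of them.
function holdoutSplit(rows, testFraction = 0.2, random = Math.random) {
  const shuffled = rows.slice();
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const testSize = Math.round(shuffled.length * testFraction);
  return { test: shuffled.slice(0, testSize), train: shuffled.slice(testSize) };
}

// Mean squared error between observed and predicted CrimeScores.
function meanSquaredError(actual, predicted) {
  if (actual.length === 0 || actual.length !== predicted.length) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let total = 0;
  for (let i = 0; i < actual.length; i++) {
    total += (actual[i] - predicted[i]) ** 2;
  }
  return total / actual.length;
}
```

The model would be fitted on `train` only, and `meanSquaredError` would then compare its predictions for the `test` rows with their observed values.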

  20. Results

  21. Results

  22. CrimeScore Functionality • Travelers seeking a safe place to stay • City planners choosing locations for parks, etc. • Police mapping out patrol routes • Homebuyers selecting a new residence • Hotel and real estate advertising

  23. CrimeScore

  24. Future Work • Implement CrimeScore in other cities • Develop an interface within travel websites • Improve interactivity for city planners
