
Camera Culture

Ramesh Raskar Associate Prof, Media Lab, MIT. Camera Culture. Course WebPage : http://raskar.info/course.html. Today’s Plan. Summary, Camera for image ‘search’ Visual Social Computing + Citizen Journalism Next class big question: ‘Opportunities in Pervasive Public Recording’ Big concept


Presentation Transcript


  1. Ramesh Raskar Associate Prof, Media Lab, MIT Camera Culture Course WebPage : http://raskar.info/course.html

  2. Today’s Plan • Summary, Camera for image ‘search’ • Visual Social Computing + Citizen Journalism • Next class big question: • ‘Opportunities in Pervasive Public Recording’ • Big concept • (Last week) • Understanding Camera Constraints • (This week) • What matters in photography: pixels (Low-level cues) or low-dimensional features (Mid-level cues)? • Decomposing pixels into meaningful values

  3. Camera for ‘image search’ How can we augment the camera to best support 'image search'? • 'Search' = segment/identify/recognize/transform/compare/archive • Or, more precisely, object matching across images. • (For example, if we want to find a specific face image, we need a procedure to segment and identify (detect) the pixels likely to belong to a face, then recognize the candidate face by transforming it into a representation where we can match it against that specific face image. Currently, this is all performed in software using traditional cameras. Typically, the algorithms try to reduce the image to lower-dimensional 'features' and do the matching in this feature space. Unlike text search, where the pipeline is simple thanks to an easy matching process, object matching in images is quite difficult. What additional data can we capture while recording pixels, and what new algorithms can exploit this augmented photo?) • How can we make the scene ingredients machine readable so that we can easily perform the 'search'? Is this the key problem? 3D reconstruction (so that it is view independent)? Hardware and software solutions? Crowdsourcing (let people do marking/sorting/indexing for others)? Metadata tagging (tag high-level text labels rather than pixel-level tagging)? • Do we need to capture a material index (where is all the wood in this image)? Segmentation boundaries (shape versus reflectance edges)? Repeatable view and illumination invariance (be able to recreate an image from a given view so it can be compared with another image, or create images that look the same independent of time of day)? • Some ideas: (i) to locate all 'images' with faces, record the iris biometric, which validates whether a photo includes a human eye, and then we can search all images across an album with that face/eye/iris; (ii) embed an RFID tag (electronic bar-code) in every object and record the binary index with an RFID reader.
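The feature-space matching described in slide 3 can be sketched roughly as follows. This is a minimal illustration, not the method from the lecture: the choice of a grayscale intensity histogram as the low-dimensional 'feature', the Euclidean distance, and all names (`histogram_feature`, `feature_distance`, `best_match`, the toy album) are assumptions made for the example.

```python
from math import sqrt


def histogram_feature(pixels, bins=8, max_value=255):
    """Reduce a non-empty flat list of grayscale pixels to a normalized histogram."""
    counts = [0] * bins
    for p in pixels:
        # Map the pixel value to a bin index, clamping to the last bin.
        index = min(p * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    total = len(pixels)
    return [c / total for c in counts]


def feature_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(query_pixels, album):
    """Return the name of the album image whose feature is closest to the query."""
    query = histogram_feature(query_pixels)
    return min(
        album,
        key=lambda name: feature_distance(query, histogram_feature(album[name])),
    )


# Hypothetical toy data: each "image" is a short list of grayscale values.
album = {
    "dark_scene": [10, 20, 30, 25, 15, 5, 40, 35],
    "bright_scene": [200, 220, 240, 250, 210, 230, 245, 225],
}
query = [12, 22, 28, 30, 18, 8, 38, 33]
result = best_match(query, album)
```

A histogram discards all spatial layout, so it only hints at why matching happens in a reduced space; recognizing a specific face would need far richer features than this.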

  4. Next Class • Homework • What are the opportunities in pervasive recording of public spaces? • Pervasive public recording = surveillance/GoogleEarthLive/Subscription cameras • Technology: • See through fog, time-lapse processing, day-night/season/multi-modal fusion, how to consume these images, how to merge with static/dynamic content, merge with static/dynamic cameras, support object recognition, refine GPS coords, crowdsourcing, metadata (video frame) tagging • Society: • Commerce (real-estate, reviews, remote maintenance), Environment (earthquake-prediction-like opportunities), Politics (protests) • Volunteer • Class notes: Lav (today), next .. • Select/read/present/paper • Visual Social Computing: Tom • Mobile Photography: Eugene • Beyond Visible Spectrum: Brandon • Emerging sensors: Matt • Developing Countries: Lav/Tilke • Solutions for Visually Challenged: James

  5. Today 3pm • Less is More: Coded Computational Photography • Speaker: Ramesh Raskar, MIT Media Lab • Date: Wednesday, February 20, 2008 • Time: 3:00PM to 4:00PM • Refreshments: 2:45PM • Location: Star Seminar Room (32-D463)

  6. Topics • Imaging Devices, Modern Optics and Lenses • Emerging Sensor Technologies • Mobile Photography • Visual Social Computing and Citizen Journalism • Imaging Beyond Visible Spectrum • Computational Imaging in Sciences • Trust in Visual Media • Solutions for Visually Challenged • Cameras in Developing Countries • Future Products and Business Models

  7. Feedback • What are your questions about camera/technology/society? • Your expectations from the course?

  8. Topics • Other courses • Art and Photography • CSAIL: Computational Photography • MechE: Optics • Fall 2008 • ‘Intro to Computational Camera and Photography’ • I will teach this course in the Fall • Current course • More emphasis on future cameras • Faster review of technology and then look at impact/applications/opportunities • Big ideas/technologies/applications • Understand rules-of-thumb and trade-offs • Ideal for thesis/projects/research papers/business models • Learn fun stuff before the nitty-gritty

  9. No-flash Photography: Full of Tradeoffs... • Available light vs. exposure time vs. scene movement vs. field of view vs. focus depth vs. sensitivity vs. noise vs. color rendition vs. color gamut vs. contrast vs. visible detail vs. …. Flash

  10. Available Light vs Parameter/Specs ‘box’ • Exposure • Dynamic Range • Focus distance • Resolution/Frame rate • Focal Length (zoom) • Field of view • Depth of field • Aperture • Limited Parameters, Limited Abilities

  11. Dynamic Range • Short Exposure • Goal: High Dynamic Range • Long Exposure
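One simple way to combine a short and a long exposure into a higher-dynamic-range result is sketched below. It is not taken from the slides: it assumes a linear sensor response, a static and perfectly aligned scene, and a known exposure-time ratio, and the name `merge_exposures` and the sample values are hypothetical. Practical HDR pipelines typically use weighted averages over more exposures and recover the camera response curve.

```python
def merge_exposures(short, long, exposure_ratio, saturation=255):
    """Merge two exposures of one scene into relative radiance values.

    short, long: equal-length lists of pixel values (linear response assumed).
    exposure_ratio: long exposure time divided by short exposure time.
    Output is expressed in the units of the long exposure.
    """
    if len(short) != len(long):
        raise ValueError("exposures must have the same number of pixels")
    merged = []
    for s, l in zip(short, long):
        if l >= saturation:
            # The long exposure clipped here, so trust the short exposure,
            # scaled up to the long exposure's units.
            merged.append(float(s * exposure_ratio))
        else:
            # The long exposure is unclipped and less noisy in dark regions.
            merged.append(float(l))
    return merged


# Hypothetical scanline: the last two pixels saturate in the long exposure.
short = [1, 4, 40, 100]
long = [8, 32, 255, 255]
radiance = merge_exposures(short, long, exposure_ratio=8)
```

The merged values exceed the 0 to 255 range of either input, which is the sense in which the pair is "better than any one photo".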

  12. Phase 1 of Better Photography • Epsilon Photography • Low-level vision • Best pixel and pixel-features • Vary focus, exposure, polarization, illumination • Vary time, view • Better than any one photo (resolution/frame rate, fov, dynamic range etc) • Achieve effects via multi-photo fusion • Create a Super-camera • Mimic human eye

  13. Phase 1.1 of Better Photography • Create a Super-camera • Mimic human eye • Which aspects of the human eye are critical/useless? • Eye: Feedback wrt brain, After-image/illusions • Camera: geometry/stereo pair, multispectral, uniform res, memory • What are other parameters/Design/Features to improve? • Very small camera/thin camera .. • Tight loop with illumination • ..

  14. The Eye’s Lens

  15. Varioptic Liquid Lens: Electrowetting Varioptic, Inc., 2007

  16. Varioptic Liquid Lens (Courtesy Varioptic Inc.)

  17. Captured Video (Courtesy Varioptic Inc.)

  18. Conventional Compound Lens

  19. “Origami Lens”: Thin Folded Optics (2007) “Ultrathin Cameras Using Annular Folded Optics,” E. J. Tremblay, R. A. Stack, R. L. Morrison, J. E. Ford, Applied Optics, 2007 (OSA)

  20. Origami Lens • Conventional Lens • Origami Lens

  21. Origami Lens Image • Conventional Lens Image • Optical Performance • Conventional • Origami • Scene

  22. Compound Lens of Dragonfly

  23. TOMBO: Thin Camera (2001) “Thin observation module by bound optics (TOMBO),” J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, Applied Optics, 2001

  24. TOMBO: Thin Camera

  25. Image = Optics · Scene • Captured Image • TOMBO • Scene • Captured Image (multiple low-resolution copies of the scene)

  26. Reconstructed Image
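The step from multiple low-resolution copies (slide 25) to a reconstructed image (slide 26) can be illustrated with a heavily idealized sketch. It assumes each sub-camera samples the scene on a coarse grid offset by an exact, known one-pixel shift, with no blur or noise, so reconstruction reduces to interleaving samples. This is not the reconstruction used in the TOMBO paper, and the names `capture_low_res_copies` and `interleave` are hypothetical.

```python
def capture_low_res_copies(scene, factor=2):
    """Simulate factor*factor sub-cameras, each sampling a shifted coarse grid.

    scene: 2D list whose height and width are divisible by factor.
    Returns a dict mapping each (dy, dx) offset to its low-resolution image.
    """
    copies = {}
    for dy in range(factor):
        for dx in range(factor):
            copies[(dy, dx)] = [row[dx::factor] for row in scene[dy::factor]]
    return copies


def interleave(copies, factor=2):
    """Rebuild the high-resolution grid by placing each sample at its offset."""
    low_h = len(copies[(0, 0)])
    low_w = len(copies[(0, 0)][0])
    high = [[0] * (low_w * factor) for _ in range(low_h * factor)]
    for (dy, dx), image in copies.items():
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                high[y * factor + dy][x * factor + dx] = value
    return high


# Hypothetical 4x4 scene with distinct values so every sample is traceable.
scene = [[r * 4 + c for c in range(4)] for r in range(4)]
copies = capture_low_res_copies(scene)
reconstructed = interleave(copies)
```

With real optics the shifts are fractional and each unit image is blurred, so the inverse problem needs calibration and regularization rather than plain interleaving.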

  27. Phase 1 of Better Photography • Epsilon Photography • Low-level vision • Best pixel and pixel-features • Vary focus, exposure, polarization, illumination • Vary time, view • Better than any one photo (resolution/frame rate, fov, dynamic range etc) • Achieve effects via multi-photo fusion • Create a Super-camera • Mimic human eye

  28. Phase 1.1 of Better Photography • Create a Super-camera • Mimic human eye • Which aspects of the human eye are critical/useless? • .. • What are other parameters/Design/Features to improve? • Very small camera/thin camera .. • Tight loop with illumination • ..

  29. Phase 2 of Better Photography • Coded Photography • Mid-level cues • Regions, shapes (depth), edges, motion, material-index (…) • Cartoons via Multi-flash camera (depth edges), Wavelength profile • Visual interface issue (human eye expects pixels) • Decompose pixel values (…) • Single or few photos • Create a functional super-camera • Don’t mimic human eye
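The depth-edge cue mentioned for the multi-flash camera can be sketched on a single scanline. The idea is that a flash placed to one side of the lens casts a thin shadow on the far side of any foreground object, so a lit-to-shadow transition, found while walking away from the flash, marks a depth edge. The code below is a simplified 1D illustration with only two flashes and a fixed threshold; the function name `depth_edges_1d`, the threshold value, and the sample data are assumptions, and the published multi-flash method works in 2D with more flash positions.

```python
def depth_edges_1d(left_flash, right_flash, shadow_threshold=0.5):
    """Find depth-edge pixel indices along one scanline from two flash images.

    left_flash / right_flash: pixel rows captured with the flash placed to the
    left / right of the lens. Returns sorted indices of the foreground pixels
    that border a cast shadow.
    """
    # The per-pixel maximum approximates a shadow-free image.
    max_image = [max(a, b) for a, b in zip(left_flash, right_flash)]

    def ratio(image):
        # Ratio is near 1 where lit by this flash and near 0 in its shadow.
        return [p / m if m > 0 else 1.0 for p, m in zip(image, max_image)]

    left_ratio = ratio(left_flash)
    right_ratio = ratio(right_flash)
    n = len(max_image)
    edges = set()

    # Left flash: shadows fall to the right of occluders, so walk left to right.
    for x in range(1, n):
        if left_ratio[x - 1] >= shadow_threshold and left_ratio[x] < shadow_threshold:
            edges.add(x - 1)

    # Right flash: shadows fall to the left, so walk right to left.
    for x in range(n - 2, -1, -1):
        if right_ratio[x + 1] >= shadow_threshold and right_ratio[x] < shadow_threshold:
            edges.add(x + 1)

    return sorted(edges)


# Hypothetical scanline: a bright foreground object occupies pixels 3..5.
left_flash = [100, 100, 100, 200, 200, 200, 10, 10, 100]
right_flash = [100, 10, 10, 200, 200, 200, 100, 100, 100]
edges = depth_edges_1d(left_flash, right_flash)
```

Because the ratio image divides out surface brightness, reflectance edges (paint, texture) produce no transition, which is what separates shape edges from reflectance edges.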

  30. Multiperspective Camera? [Jingyi Yu, 2004]

  31. Phase 3 of Better Photography • Essence Photography • High-level cues • Inference, perception, cognition • Intent based (like biovision systems) • Not a ‘single-solution fits-all’ • ? Single or few ‘photos’ • Beats ‘photography’ • Don’t just mimic human eye, or record pixels/mid-level cues • Create a meaningful representation of visual experience • New art form, new commerce models

  32. Visual Social Computing and Citizen Journalism • What is VSC? • Social Computing is well known; I made up VSC • My definition of SC: Online computation of the people, by the people, for the people (old world: govt, economy, epidemiology) • Subsets • Crowdsourcing (CAPTCHA) (by the people, but maybe for just one person) • Participatory sensing (of the people, but no active part by individuals, not for the people) • Recommendation systems (by the people and for the people) • Tagging (Digg) (all three) • Blogs, social networks, auctions, Wikipedia, tags • 90% of all data will be ‘about people’ • Example problem: Can we reduce distrust among Kenya’s groups? • Easy to predict certain trends .. • Just add dimensions • Text, audio/music, images, video, (what’s next?) • LP -> Cassette -> VHS player -> CD player -> DVD player (ok, Blu-ray DVD player) -> (what’s next?) • Radio -> TV -> .. • Gopher -> Newsgroups -> Wikipedia -> (what’s next?) • Take anything text/audio based -> image/video • Take anything image based -> video (Flickr -> YouTube)

  33. Today 3pm • Less is More: Coded Computational Photography • Speaker: Ramesh Raskar, MIT Media Lab • Date: Wednesday, February 20, 2008 • Time: 3:00PM to 4:00PM • Refreshments: 2:45PM • Location: Star Seminar Room (32-D463)

  34. Next Class • Homework • What are the opportunities in pervasive recording of public spaces? • Pervasive public recording = surveillance/GoogleEarthLive/Subscription cameras • Technology: • See through fog, time-lapse processing, day-night/season/multi-modal fusion, how to consume these images, how to merge with static/dynamic content, merge with static/dynamic cameras, support object recognition, refine GPS coords, crowdsourcing, metadata (video frame) tagging • Society: • Commerce (real-estate, reviews, remote maintenance), Environment (earthquake-prediction-like opportunities), Politics (protests) • Volunteer • Class notes: Lav (today), next .. • Select/read/present/paper • Visual Social Computing: Tom • Beyond Visible Spectrum: Brandon • Mobile Photography • Emerging sensors • Developing Countries: Lav
