Video retrieval and User interaction and digital rights management

From Multimedia Retrieval, Springer, Blanken et al.

“Multimodal” is the keyword…
  • Based on a case study
    • Formula race cars video recordings
  • Fusion of multimodal information
  • Sound
    • Audio signal analysis to detect interesting events – when the commentator gets excited
    • At the beginning of an event, there is an overview by the commentator
    • They capture the audio signal and filter out frequencies outside the voice range
    • They also look for specific words – not general voice recognition, but searching only for a handful of race-specific words
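The keyword search described above can be sketched as follows. This is a minimal illustration that assumes an upstream recognizer (not shown) has already produced timestamped word hypotheses. The keyword list and the `spot_keywords` helper are hypothetical; the slides do not say which words were used.

```python
# Hypothetical race-specific vocabulary; the source does not list the actual words.
RACE_KEYWORDS = {"start", "overtake", "crash", "pit", "flag"}


def spot_keywords(timed_words, keywords=RACE_KEYWORDS):
    """Return (time_in_seconds, word) pairs whose word is in the keyword set.

    timed_words: iterable of (time_in_seconds, word) pairs, assumed to come
    from an upstream recognizer that is not part of this sketch.
    """
    hits = []
    for time_s, word in timed_words:
        # Normalize case and strip trailing punctuation before matching.
        normalized = word.lower().strip(".,!?")
        if normalized in keywords:
            hits.append((time_s, normalized))
    return hits


if __name__ == "__main__":
    transcript = [(12.0, "And"), (12.7, "START!"), (95.2, "Crash"), (96.0, "ahead")]
    print(spot_keywords(transcript))
```

Matching against a small fixed vocabulary keeps the task much narrower than general speech recognition, which is the trade-off the slide points to.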
Fusion
  • Audio
  • Analysis of image stream
    • To catch start of race and other events
    • Used to locate time boundaries of isolatable events
  • Superimposed text
    • Projected on tv screen
    • Information on the driver
    • Driver’s place in race, etc.
Audio processing
  • Mix of human language, car noise, background noise, crowd cheering, horns
  • Look for human voice frequency
  • Short time energy (STE)
    • To remove noise
    • Waveform based
  • Pitch – fundamental frequency (F0); the higher the pitch, the more excitement in the voice
  • Search for phonemes
  • Pause rate – to detect quantity of speech
  • Keyword spotting – less semantics, but lower error rate
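Three of the features listed above (short time energy, pause rate, and pitch) can be sketched in a few lines. This is an illustrative sketch, not the method used in the case study: the frame length, the silence threshold, the 80 to 400 Hz search range, and the autocorrelation approach to F0 are all assumptions made for the example.

```python
import math


def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def pause_rate(energies, threshold):
    """Fraction of frames whose energy is below an assumed silence threshold."""
    if not energies:
        return 0.0
    return sum(1 for e in energies if e < threshold) / len(energies)


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate F0 in Hz from the strongest autocorrelation lag in [f_min, f_max].

    Returns 0.0 when no positive autocorrelation peak is found.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


if __name__ == "__main__":
    rate = 8000
    # Synthetic 200 Hz tone standing in for a voiced frame.
    tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
    print(estimate_f0(tone, rate))
    print(pause_rate(short_time_energy(tone + [0.0] * 400, 400), 0.01))
```

Tracking `estimate_f0` from frame to frame gives a rough excitement cue under the slide's premise that a higher F0 indicates a more excited voice.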
Image stream
  • Searched for places where commentator raised his voice
  • Searched histogram, looking for certain colors and shapes
  • Tracked the changing of colors and shapes over a series of frames
  • Focus on
    • Start of race
    • Passing
    • Fly-outs (sand and dust)
  • Two classes
    • Scene text
    • Superimposed text
  • The same text can span many frames, and so they count on its position being fixed to limit processing time
  • Ways to pose queries
  • Ways to give feedback
  • Ways to explore
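The histogram tracking mentioned for the image stream (following how colors change over a series of frames) can be sketched as a frame-to-frame histogram comparison. This sketch assumes grayscale frames given as flat lists of 0 to 255 values, an 8-bin histogram, and an L1 distance with an arbitrary threshold; none of these choices come from the source, which also looks at colors and shapes.

```python
def gray_histogram(frame, bins=8, max_value=256):
    """Normalized histogram of a non-empty frame of 0..255 gray values."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def find_changes(frames, threshold=0.5):
    """Indices i where frame i differs strongly from frame i - 1."""
    hists = [gray_histogram(f) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if histogram_distance(hists[i - 1], hists[i]) > threshold
    ]


if __name__ == "__main__":
    dark = [10] * 16
    bright = [240] * 16
    print(find_changes([dark, dark, bright, bright]))
```

A sudden jump in histogram distance is one plausible signal for the time boundary of an event such as a fly-out, where sand and dust change the color distribution abruptly.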
Interaction types
  • Retrieval
    • Query formulation
      • Concept based
      • Content based
  • Concept-based
    • Key words in natural language
    • People use different words for the same thing
    • Metadata is often missing
    • Easy for user, hard for software
  • Content-based
    • Query by example paradigm
    • User provides examples
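The query-by-example paradigm can be sketched as a similarity ranking: the user's example is turned into a feature vector and the collection is ordered by closeness to it. The feature vectors, the clip names, and the choice of cosine similarity below are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_by_example(example, collection, top_k=3):
    """Rank items in collection (name -> feature vector) by similarity to example."""
    scored = [(cosine_similarity(example, vec), name) for name, vec in collection.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical two-dimensional features for three clips.
    clips = {"start": [1.0, 0.0], "pass": [0.7, 0.7], "flyout": [0.0, 1.0]}
    print(query_by_example([0.9, 0.1], clips, top_k=2))
```

Unlike the concept-based route, nothing here depends on keywords or metadata, which is why content-based querying still works when metadata is missing.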
Dynamic query interaction
  • Sliders, buttons, etc.
  • Visual is the key
    • Of the query
    • Of the results
  • Example system, page 299
  • Interaction cycle is short
  • Links, with a feeling similar to using the web
  • Browsing model
    • To get impression of search space
    • To find something when you aren’t sure what it is
    • Browsing a collection of objects and browsing a single object
    • Browsing keywords or namespace hierarchy
    • Example on page 301
User input and relevance feedback
  • Modalities
    • Visual, audio, tactile
    • Or touch screen, electronic pen, camera, mic, eye tracker, locality sensor, mouse, keyboard
    • No user guide needed
    • If it is speech only, it is difficult to process
    • Multiple modalities at once
      • Such as speech and a map for location or distance
    • Use of ambient intelligence to collect information
  • Relevance feedback
    • Binary feedback
    • Weighted relevance feedback – image page 305
    • Personalization
      • Similar to 1-to-1 marketing concept
      • User profiles are used
        • Users are reluctant to provide profile information, though
      • Users are grouped into content interest groups
  • Passive feedback works well, such as noting when a user skips songs on a feed
  • Making an offer that adds to a query works sometimes, like Amazon trying to sell you similar books
  • User profiles can be built automatically from a history of purchases or a clickstream
  • Filtering techniques
    • Content based – based on triples
      • Attribute – value – fit
      • Title – War and Peace – 0
    • Social based – users are placed into groups, which pools their profiles and yields larger user samples
  • Must provide metadata and data in an integrated way
  • Inherently multimedia in nature in query and response
  • Tree maps for complex metadata or data
  • Graphs to put multimedia objects together into single conceptual objects
  • Starfield display
  • Breaking videos into segments to aid non-linear searching
    • Providing sample frame for each segment
  • Images on pages 314, 315, and 316
  • Key factors in presenting multimedia data – content adaptation
    • What capabilities the device has
    • Limits of device – like size, color, formats of data
    • Must often change formats of data to fit a device
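The content adaptation step above can be sketched as a selection problem: given what a device can handle, pick the best available version of an item. The `Device` and `Rendition` types, their fields, and the "widest that fits" rule are hypothetical and serve only to make the three listed limits (size, color, format) concrete.

```python
from dataclasses import dataclass


@dataclass
class Device:
    max_width: int        # largest displayable width in pixels
    supports_color: bool  # False for a monochrome display
    formats: frozenset    # container or codec names the device can play


@dataclass
class Rendition:
    name: str
    width: int
    color: bool
    fmt: str


def pick_rendition(device, renditions):
    """Return the widest rendition the device can show, or None if none fits."""
    usable = [
        r
        for r in renditions
        if r.width <= device.max_width
        and r.fmt in device.formats
        and (device.supports_color or not r.color)
    ]
    return max(usable, key=lambda r: r.width, default=None)


if __name__ == "__main__":
    phone = Device(max_width=480, supports_color=True, formats=frozenset({"mp4"}))
    available = [
        Rendition("hd", 1920, True, "mp4"),
        Rendition("mobile", 480, True, "mp4"),
        Rendition("thumb", 160, True, "mp4"),
        Rendition("web", 480, True, "webm"),
    ]
    print(pick_rendition(phone, available))
```

A `None` result corresponds to the case the slide describes, where no stored version fits and the data has to be converted to a format the device supports.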
Digital rights
  • DRM (digital rights management)
  • Preventative approach
    • Encryption
    • Node locking
    • Dongle
  • Reactive approach
    • Embedding extra information in the product
    • Tracking behavior and looking for a violation
    • Sometimes called forensic tracking
      • Looking for specific watermarks, often specific to a given user
      • Makes it hard to pass content on
  • Application domains
    • Legal – concept: Personal Entertainment Domain (PED)
    • Keeping content secure, for both commercial and intelligence purposes
    • Diagrams on pages 325, 326, and 331
  • Sometimes the media is free and commercials are embedded
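The reactive approach of embedding a user-specific watermark can be illustrated with a toy example. The sketch below hides a user identifier in the least significant bits of the first few integer samples (pixel or audio values). Least-significant-bit embedding is chosen here only for simplicity; the source does not name a watermarking scheme, and a practical forensic watermark would have to survive compression and editing, which this one does not.

```python
def embed_user_id(samples, user_id, id_bits=16):
    """Return a copy of samples with user_id written into the low bits.

    samples: list of non-negative integers (for example 8-bit pixel values).
    """
    if len(samples) < id_bits:
        raise ValueError("not enough samples to hold the identifier")
    if not 0 <= user_id < (1 << id_bits):
        raise ValueError("user_id does not fit in id_bits")
    marked = list(samples)
    for i in range(id_bits):
        bit = (user_id >> i) & 1
        # Clear the lowest bit, then set it to the identifier bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_user_id(samples, id_bits=16):
    """Read the identifier back from the low bits of the first id_bits samples."""
    user_id = 0
    for i in range(id_bits):
        user_id |= (samples[i] & 1) << i
    return user_id


if __name__ == "__main__":
    pixels = list(range(100, 132))
    marked = embed_user_id(pixels, 4242)
    print(extract_user_id(marked))
```

Because each copy carries a different identifier, a leaked copy can be traced to the user it was issued to, which is the deterrent the slide describes.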