
Voice Activated Un-Lock Technology

Voice Activated Un-Lock Technology (V.A.U.L.T): a MATLAB-based simulation by Siddharth Advani (B2213401), Anand Gokhale (B2213420) and Vishal Jain (B2213426), guided by Dr. P.M. Patil.


Presentation Transcript


  1. Voice Activated Un-Lock Technology (V.A.U.L.T): A MATLAB-based Simulation

  2. By Siddharth Advani (B2213401), Anand Gokhale (B2213420), Vishal Jain (B2213426). Guided by Dr. P.M. Patil

  3. OBJECTIVE To make the correct decision on a speaker’s identity claim, given a speech segment (password)

  4. MOTIVATION • Speech contains speaker specific characteristics • Voiceprint as a biometric (distinguishing trait) • Natural & economical way of identification

  5. DEFINITIONS • Client: a speaker registered on the system • Impostor: a speaker who claims a false identity • Mel-filtering: a frequency scaling that reflects how the ear responds roughly linearly to changes in frequency below 1000 Hz and roughly logarithmically to changes above 1000 Hz
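
The mel scaling described above can be illustrated with a short sketch. The slides give no formula, so the conversion below (2595 · log10(1 + f/700)) is an assumption: it is one commonly used approximation, and the function names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mels with a commonly used approximation (assumed, not from the slides)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

if __name__ == "__main__":
    # Equal 100 Hz steps cover fewer mels as the frequency rises.
    for f in (100, 500, 1000, 2000, 4000):
        print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

With this approximation a 100 Hz step near the bottom of the range spans more mels than the same step around 4000 Hz, which is the compression the definition refers to.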

  6. What is Simulation? A simulation is the imitation of the operation of a real world process or system over time. Using MATLAB as a tool, VAULT aims at simulating a voice recognition system

  7. Software Implementation

  8. MATLAB Features: • Interpreter → meant for simulation in R&D • High-performance numerical computation • Signal Processing Toolbox

  9. Visual Basic Features • Easy to implement • Very user-friendly and interactive • Compatible with MATLAB and any Windows version • Less complicated than the GUI of MATLAB • Any Microsoft application can be embedded in a VB program

  10. Zones Of VAULT

  11. Phase 1 - Identification [Block diagram: User ID word, Feature extraction, Pattern recognition, Training, System database]

  12. PHASE 1 - IDENTIFICATION • The word is sampled at 11.025 kHz • The word is divided into segments of 256 samples each [Process diagram: Word, Feature extraction, Vector quantizer, Decision]
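
A minimal sketch of the segmentation step, using the sampling rate and segment length from the slide. The slide does not say whether segments overlap or how a trailing partial segment is handled, so non-overlapping segments and dropping the remainder are assumptions here; the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 11025  # sampling rate stated on the slide
FRAME_LENGTH = 256      # samples per segment stated on the slide

def split_into_frames(samples, frame_length=FRAME_LENGTH):
    """Split samples into consecutive, non-overlapping frames.

    Assumption: trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = len(samples) // frame_length
    return [samples[i * frame_length:(i + 1) * frame_length] for i in range(n_frames)]

def frame_duration_ms(frame_length=FRAME_LENGTH, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_length / sample_rate_hz

if __name__ == "__main__":
    one_second = [0.0] * SAMPLE_RATE_HZ
    print(len(split_into_frames(one_second)), "frames of",
          round(frame_duration_ms(), 2), "ms each")
```

At 11.025 kHz a 256-sample segment lasts about 23 ms, short enough for the speech signal to be treated as roughly stationary within one segment.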

  13. PHASE 1 - IDENTIFICATION • 8 cepstrum coefficients are calculated for each segment [Process diagram: Word, Feature extraction, Vector quantizer, Decision]

  14. PHASE 1 - IDENTIFICATION • Vector quantization is used to create the codebook • Cepstrum coefficients are quantized using a codebook of 128 vectors [Process diagram: Word, Feature extraction, Vector quantizer, Decision]

  15. PHASE 1 - IDENTIFICATION [Diagram: the unknown word (?) is compared with speakers 1 to 4 in the database; distances shown: 8, 16, 5 and 12; speaker 3 is marked as the client] [Process diagram: Word, Feature extraction, Vector quantizer, Decision]
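
The decision step of the identification phase can be sketched as a minimum-distance search over the registered speakers' codebooks. The slides do not define the distance measure, so the average squared Euclidean distance to the nearest codevector is an assumption, and all names and the toy 2-D data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codevector (assumed measure)."""
    total = sum(min(squared_distance(f, c) for c in codebook) for f in features)
    return total / len(features)

def identify_speaker(features, codebooks):
    """Return (speaker_id, distance) for the codebook with the smallest distance."""
    scores = {sid: average_distortion(features, cb) for sid, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

if __name__ == "__main__":
    # Toy 2-D codebooks; the real system uses 8 cepstrum coefficients per vector.
    codebooks = {1: [[0.0, 0.0], [1.0, 1.0]], 2: [[5.0, 5.0], [6.0, 6.0]], 3: [[10.0, 10.0]]}
    print(identify_speaker([[5.1, 5.0], [6.0, 5.9]], codebooks))
```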

  16. Identification • Every speaker is given a tag [Diagram: database holding speakers 1 to 4; labels ‘Zero’ and 4]

  17. PHASE 2 - Authentication [Block diagram: Password, Feature extraction, Pattern recognition, Training, System database, Accept / Reject]

  18. PHASE 2 - AUTHENTICATION • The speech is sampled and the cepstrum coefficients are calculated the same way as in the identification phase [Process diagram: Word, Feature extraction, Vector quantizer, Codebook]

  19. PHASE 2 - AUTHENTICATION • This time the quantizer uses a personal codebook trained by the real user [Process diagram: Word, Feature extraction, Vector quantizer, Codebook; labels: Password, User]

  20. PHASE 2 - AUTHENTICATION • The threshold determines the decision (accept / reject) [Process diagram: Word, Feature extraction, Vector quantizer, Codebook; labels: User, Password, Client, Distance, Threshold, Decision]
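
A rough sketch of the authentication decision: the password's feature vectors are scored against the claimed client's personal codebook and the resulting distance is compared with a threshold. The distance measure and the use of "less than or equal" for acceptance are assumptions, and the names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codevector (assumed measure)."""
    total = sum(min(squared_distance(f, c) for c in codebook) for f in features)
    return total / len(features)

def authenticate(features, personal_codebook, threshold):
    """Return ("ACCEPT" or "REJECT", distance) for a claim against one client's codebook."""
    distance = average_distortion(features, personal_codebook)
    decision = "ACCEPT" if distance <= threshold else "REJECT"  # assumed comparison
    return decision, distance

if __name__ == "__main__":
    client_codebook = [[0.0, 0.0], [4.0, 4.0]]  # toy 2-D codebook
    print(authenticate([[0.0, 1.0], [4.0, 3.0]], client_codebook, threshold=5.0))
    print(authenticate([[10.0, 10.0]], client_codebook, threshold=5.0))
```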

  21. Main Obstacle • How to define and extract the unique features of the human voice • CEPSTRUM: cepstrum(frame) = IDFT(log(|DFT(frame)|))
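
The cepstrum formula above can be written out directly. The following is a minimal, unoptimized sketch using a plain O(N²) DFT from the standard library; a real implementation would use an FFT. Keeping the first 8 coefficients follows the figure given for the identification phase, while including coefficient 0 and flooring the magnitude before the logarithm are assumptions.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform (O(N^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def real_cepstrum(frame, n_coeffs=8, floor=1e-12):
    """cepstrum(frame) = IDFT(log(|DFT(frame)|)), truncated to n_coeffs values."""
    # The floor avoids log(0) for empty spectral bins (an assumption, not from the slides).
    log_magnitude = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [c.real for c in idft(log_magnitude)[:n_coeffs]]

if __name__ == "__main__":
    frame = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 for t in range(64)]
    print([round(c, 3) for c in real_cepstrum(frame)])
```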

  22. PATTERN MATCHING • Template model (deterministic; better score → minimum distance): Dynamic Time Warping, Vector Quantization, Nearest Neighbour • Stochastic model (probabilistic; better score → maximum probability): Hidden Markov Model, Gaussian Mixture Model

  23. VECTOR QUANTIZATION Goal: finding how the data is clustered • A (feature) vector space is broken into cells • Speaker model: codebook • Codebook: set of prototype vectors (codevectors) • Codevector: vector computed from "similar" single (feature) vectors (e.g. 8 cepstrum coefficients make 1 codevector)
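
One way to build such a codebook is plain k-means clustering, sketched below. The slides do not name the training algorithm, so k-means with a deterministic, evenly spaced initialization is an assumption (the LBG algorithm is another common choice); the function names and toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(vector, codebook):
    """Index of the codevector closest to the given vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))

def train_codebook(features, codebook_size, iterations=20):
    """Cluster feature vectors into codebook_size codevectors with k-means.

    Assumes len(features) >= codebook_size.
    """
    # Deterministic start: evenly spaced training vectors become the initial codevectors.
    step = len(features) / codebook_size
    codebook = [list(features[int(i * step)]) for i in range(codebook_size)]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for f in features:
            cells[nearest_index(f, codebook)].append(f)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codevector
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell) for d in range(dim)]
    return codebook

if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(train_codebook(data, codebook_size=2))
```

In the system described here each feature vector would hold 8 cepstrum coefficients and the codebook 128 codevectors; the toy call uses 2-D points and 2 codevectors only to keep the output readable.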

  24. CLUSTERING

  25. RESULTS • Threshold = 5 [Figure: accept and reject outcomes]

  26. PERFORMANCE EVALUATION • False Rejection (FR): a client who claims his/her own identity is rejected • False Acceptance (FA): an impostor who claims a client's identity is accepted • Genuine Acceptance (GA): a client who claims his/her own identity is accepted

  27. ACCURACY • FAR (False Acceptance Rate): probability of false acceptance. Estimate: FAR = (# false acceptances) / (# false claims) • FRR (False Rejection Rate): probability of false rejection. Estimate: FRR = (# false rejections) / (# true claims) • GAR (Genuine Acceptance Rate): probability of genuine acceptance. Estimate: GAR = (# true acceptances) / (# true claims)
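
The three estimates can be computed from a list of trial outcomes, as in the sketch below. The trial format (a pair of booleans per attempt) and the sample data are made up for illustration and are not results from the project.

```python
def error_rates(trials):
    """Estimate FAR, FRR and GAR from (is_true_claim, accepted) pairs."""
    true_claims = sum(1 for is_true, _ in trials if is_true)
    false_claims = len(trials) - true_claims
    if true_claims == 0 or false_claims == 0:
        raise ValueError("need at least one true claim and one false claim")
    false_accepts = sum(1 for is_true, accepted in trials if not is_true and accepted)
    false_rejects = sum(1 for is_true, accepted in trials if is_true and not accepted)
    true_accepts = true_claims - false_rejects
    return {
        "FAR": false_accepts / false_claims,
        "FRR": false_rejects / true_claims,
        "GAR": true_accepts / true_claims,
    }

if __name__ == "__main__":
    # Illustrative data only: 8 client attempts (2 rejected), 5 impostor attempts (1 accepted).
    trials = ([(True, True)] * 6 + [(True, False)] * 2
              + [(False, False)] * 4 + [(False, True)])
    print(error_rates(trials))
```

Because every true claim is either accepted or rejected, GAR and FRR always sum to 1.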

  28. GRAPHS

  29. THRESHOLD The threshold T can be determined by: 1) choosing T to satisfy a fixed FA or FR criterion 2) varying T to find different FA/FR ratios and choosing T to give the desired FA/FR ratio.
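
The first option (fixing an FA criterion) can be sketched as a sweep over candidate thresholds. The acceptance rule (distance at most T), the tie-breaking and the sample distances are assumptions made for illustration; the function names are hypothetical.

```python
def rates_at_threshold(client_distances, impostor_distances, threshold):
    """Return (FAR, FRR) when a claim is accepted if its distance is <= threshold."""
    far = sum(d <= threshold for d in impostor_distances) / len(impostor_distances)
    frr = sum(d > threshold for d in client_distances) / len(client_distances)
    return far, frr

def pick_threshold(client_distances, impostor_distances, candidates, max_far):
    """Among candidates with FAR <= max_far, return (T, FAR, FRR) with the lowest FRR.

    Returns None if no candidate meets the FAR criterion; ties keep the smaller T.
    """
    best = None
    for t in sorted(candidates):
        far, frr = rates_at_threshold(client_distances, impostor_distances, t)
        if far <= max_far and (best is None or frr < best[2]):
            best = (t, far, frr)
    return best

if __name__ == "__main__":
    clients = [2.0, 3.0, 3.5, 4.0, 6.0]     # illustrative client distances
    impostors = [4.5, 7.0, 8.0, 9.0, 12.0]  # illustrative impostor distances
    print(pick_threshold(clients, impostors, candidates=[3, 4, 5, 6, 7], max_far=0.2))
```

Raising T lowers FRR but raises FAR, so the sweep makes the trade-off behind the second option visible as well.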

  30. SOURCES OF ERROR CLIENT: • Bad pronunciation • Extreme emotional states (e.g. stress) • Sickness (head colds alter the vocal tract) • Aging (the vocal tract can drift away from the models with age) • Channel mismatch (using different microphones for enrollment and verification) IMPOSTOR: • Mimicry ENVIRONMENT: • Ambient noise

  31. STRENGTHS & WEAKNESSES Strengths • Speech is easy to produce • Low computation requirements Weaknesses • Speech is a behavioral signal • Spoofing of systems

  32. APPLICATIONS • Security Systems • Voice Dialing • Access control to computers / databases • Remote access to computer networks • Electronic commerce • Forensic • Telephone banking

  33. Hardware Application Robotics Aim: To control a robot via voice

  34. Robot Control via Voice

  35. Parallel Port Interface • 25-pin D-type male connector • Parallel port of the computer has 3 registers: data register, status register, control register

  36. FM Transmitter-Receiver • Frequency of operation: 433.92 MHz • Modulation type: ASK • Bandwidth: 200 kHz

  37. FEA – The Robot Features • Wireless • Prime Mover: DC motors

  38. Relay Driver IC ULN2803 • Array of eight Darlington pairs • Internal free-wheeling diodes • Inputs compatible with TTL logic

  39. FEA’s Drivers: L293B Motor Driver IC • Four-channel driver • Bidirectional motor drive • High-voltage, high-current output

  40. PROJECT TIME DISTRIBUTION • JAN: Participated at IIT Techfest • FEB: (a) Submitted paper at Techkriti, Kanpur (b) Made FEA for Fervor at COEP (c) MATLAB & Visual Basic training • MAR: Phase 1 & 2 completed in MATLAB • APR: MATLAB & Visual Basic interface • MAY: Evaluation of software: FAR, FRR & GAR • JUNE: Application board

  41. Future Expansion • Implementation on a DSP board • Making the system work in real time • Speech recognition
