General Signal Processing and Machine Learning Tools for BCI Analysis (Part-1)

Md. Zia Uddin, Bio-Imaging Lab, Department of Biomedical Engineering, Kyung Hee University

Contents • Abstract • Introduction • Spectral Filtering • Spatial Filtering • Classification

Abstract • This presentation covers signal processing and machine learning techniques and their application to BCI. • It gives an overview of the general signal processing and classification methods used in single-trial EEG analysis. • For further study, readers are encouraged to consult the original publications.

Why ML for BCI • Subject-wise experiments • Results vary from subject to subject for the same kind of experiment. • Session-wise experiments • Results vary widely from session to session for the same person. • Real-time experiments • The system must identify the subject's mental state from a single trial. • This adds considerable complexity. Solution • A system that adapts to each user's brain signature and to each session is necessary to overcome the large subject-to-subject and session-to-session variability.

Why Preprocessing? • Extracting relevant information is difficult because the data are high-dimensional (the curse of dimensionality). • Dimensionality has to be reduced while keeping the discriminative information and eliminating the non-discriminative information. • Most classification methods compute the covariance matrix of the data for further feature analysis, and for high-dimensional data this matrix becomes very large. • Thus, preprocessing steps that reduce dimensionality are required. • In some cases a priori knowledge is used (e.g., a spatial Laplace filter at predefined scalp locations). • In other cases automatic methods are used (e.g., spatial filters determined by common spatial pattern analysis).

Spectral Filter: FIR & IIR • A common approach is to use a digital frequency filter to select the desired frequency range. • Two coefficient sequences a (feedback, setting the poles) and b (feed-forward, setting the zeros), with lengths $n_a$ and $n_b$, are required; they can be calculated with, e.g., a Butterworth or elliptic design. • The source signal x is filtered to y as $a(1)\,y(t) = b(1)\,x(t) + b(2)\,x(t-1) + \dots + b(n_b)\,x(t-n_b+1) - a(2)\,y(t-1) - \dots - a(n_a)\,y(t-n_a+1)$ • The special case $n_a = 1$ and $a = 1$ is called a finite impulse response (FIR) filter (i.e., an all-zero filter); otherwise the filter is an infinite impulse response (IIR) filter. • Advantage of IIR • Produces steeper slopes between pass band and stop band for a given filter order. • Most BCI applications require a band-pass filter to select a specific frequency range.
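
The difference equation on this slide can be sketched directly in NumPy. This is a minimal illustration, not a production filter: the function name `difference_filter` and the toy coefficients are assumptions made for the example, the state before the first sample is assumed to be zero, and arrays are 0-indexed, so `b[k]` here plays the role of b(k+1) on the slide.

```python
import numpy as np

def difference_filter(b, a, x):
    """Filter x with feed-forward coefficients b and feedback coefficients a.

    Implements a[0]*y[t] = sum_k b[k]*x[t-k] - sum_{k>=1} a[k]*y[t-k],
    assuming x and y are zero before the first sample.
    """
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for t in range(len(x)):
        acc = 0.0
        for k in range(len(b)):        # feed-forward part (zeros)
            if t - k >= 0:
                acc += b[k] * x[t - k]
        for k in range(1, len(a)):     # feedback part (poles)
            if t - k >= 0:
                acc -= a[k] * y[t - k]
        y[t] = acc / a[0]
    return y

# FIR case: a = [1], so the output depends on the input only
# (here a two-point moving average).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
fir_out = difference_filter([0.5, 0.5], [1.0], x)

# IIR case: a single feedback coefficient already gives an impulse
# response that never reaches exactly zero.
impulse = np.array([1.0, 0.0, 0.0, 0.0])
iir_out = difference_filter([1.0], [1.0, -0.5], impulse)
```

In practice the coefficients would come from a filter design routine; SciPy, for example, provides `scipy.signal.butter` for Butterworth designs and `scipy.signal.lfilter` for applying the same difference equation.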

Spectral Filter: Fourier-Based Filter • A good alternative to FIR and IIR filtering in BCI is temporal Fourier-based filtering. • The signal is transformed from the temporal to the spectral domain. • The filtered signal is obtained by applying suitable weights to the relevant frequency components and then applying the inverse Fourier transform (IFT). • The length of the short-time window determines the frequency resolution.
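
A minimal sketch of Fourier-based filtering, assuming a simple 0/1 weighting of the frequency bins (the slide only asks for "suitable" weights, so this choice, the function name `fourier_bandpass`, and the test signal are illustrative assumptions):

```python
import numpy as np

def fourier_bandpass(x, fs, f_lo, f_hi):
    """Keep only the frequency bins in [f_lo, f_hi] Hz of a real signal x."""
    n = len(x)
    spectrum = np.fft.rfft(x)                    # temporal -> spectral domain
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)       # bin centre frequencies in Hz
    weights = ((freqs >= f_lo) & (freqs <= f_hi)).astype(float)
    return np.fft.irfft(spectrum * weights, n=n) # back to the temporal domain

fs = 100.0                          # sampling rate in Hz (assumed)
t = np.arange(200) / fs             # a 2 s window
alpha = np.sin(2 * np.pi * 10 * t)  # 10 Hz component to keep
slow = np.sin(2 * np.pi * 2 * t)    # 2 Hz component to remove
filtered = fourier_bandpass(alpha + slow, fs, 8.0, 12.0)

# The window length fixes the spacing of the frequency bins.
freq_resolution = fs / len(t)       # 0.5 Hz for a 2 s window
```

A longer window gives finer frequency bins at the cost of temporal resolution, which is the trade-off the last bullet refers to.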

Spatial Filter: Bipolar Filtering • EEG channels are usually measured as voltage potentials relative to a standard reference (referential recording). • Alternatively, all channels can be recorded as voltage differences between electrode pairs (bipolar recording). • From referential EEG, bipolar channels can be obtained by subtracting the respective channels: $FC4 - CP4 = (FC4 - ref) - (CP4 - ref) = FC4_{ref} - CP4_{ref}$ • Reduces the effect of spatial smearing by computing the local gradient. • Focuses on local activity, while contributions of more distant sources are attenuated.
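
As a small sketch of the subtraction above, the following assumes referential data stored as a (channels x samples) array; the helper name `bipolar` and the synthetic far-field signal are illustrative assumptions. A component that reaches both electrodes equally cancels in the bipolar channel.

```python
import numpy as np

def bipolar(eeg_ref, names, ch_a, ch_b):
    """Return the bipolar channel ch_a - ch_b from referential recordings."""
    idx = {name: i for i, name in enumerate(names)}
    return eeg_ref[idx[ch_a]] - eeg_ref[idx[ch_b]]

rng = np.random.default_rng(0)
names = ["FC4", "CP4"]
local = rng.standard_normal((2, 500))   # activity specific to each electrode
far_field = rng.standard_normal(500)    # distant source seen equally by both
eeg_ref = local + far_field             # referential recordings

fc4_cp4 = bipolar(eeg_ref, names, "FC4", "CP4")
```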

Spatial Filter: Common Average Reference • The mean of all EEG channels is subtracted from each channel to obtain the common average reference signals. • Reduces the influence of far-field sources but may introduce some undesired spatial smearing. • For example, artifacts in one channel may spread to all other channels.
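
A minimal sketch of common average referencing, assuming a (channels x samples) array. The toy data contain a single artifact on one channel, which makes the spreading effect mentioned above visible; the function name and the numbers are assumptions for illustration.

```python
import numpy as np

def common_average_reference(eeg):
    """Subtract the across-channel mean from every channel."""
    return eeg - eeg.mean(axis=0, keepdims=True)

eeg = np.zeros((4, 5))   # 4 channels, 5 samples, otherwise silent
eeg[0, 2] = 8.0          # artifact on channel 0 at sample 2

car = common_average_reference(eeg)
# The artifact is reduced on channel 0 but now appears, with opposite
# sign and 1/4 of the amplitude, on the three clean channels.
```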

Spatial Filter: Laplace Filtering • A more localized filter can be obtained with the Laplace filter. • Laplace signals are obtained by subtracting the average of the surrounding electrodes from each individual channel: $C4_{Lap} = C4_{ref} - \frac{1}{4}(C2_{ref} + C6_{ref} + FC4_{ref} + CP4_{ref})$ • The choice of surrounding channels determines the characteristics of the filter. • Usually, small Laplacians are used (as in the example above). • Large Laplacians use neighbors at 20% distance, as defined in the international 10-20 system.
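
The C4 example can be sketched as follows, assuming referential data in a (channels x samples) array and the neighbor set given on the slide; the helper name `laplace_filter` and the toy values are illustrative assumptions.

```python
import numpy as np

def laplace_filter(eeg_ref, names, center, neighbors):
    """Subtract the mean of the neighboring channels from the center channel."""
    idx = {name: i for i, name in enumerate(names)}
    surround = np.mean([eeg_ref[idx[n]] for n in neighbors], axis=0)
    return eeg_ref[idx[center]] - surround

names = ["C4", "C2", "C6", "FC4", "CP4"]
eeg_ref = np.array([
    [5.0, 6.0],   # C4
    [1.0, 2.0],   # C2
    [3.0, 2.0],   # C6
    [1.0, 0.0],   # FC4
    [3.0, 4.0],   # CP4
])

c4_lap = laplace_filter(eeg_ref, names, "C4", ["C2", "C6", "FC4", "CP4"])
```

Passing a different neighbor list (for example, electrodes one step further out) would give a large Laplacian with the same code.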

Spatial Filter: Principal Component Analysis (1) • Represents multidimensional data with fewer variables while retaining the main features of the data. • Reducing dimensionality inevitably loses some features of the data; the hope is that the lost features are comparable to noise and say little about the underlying population. • PCA projects multidimensional data onto a lower-dimensional space while retaining as much of the variability of the data as possible. • Its simplicity makes it very popular, but care should be taken: before applying it, check whether the technique is appropriate for the data.

Spatial Filter: Principal Component Analysis (2) • Finds orthogonal directions of greatest variance in the data. • Projections along PC1 spread the data out more than projections along any other single axis. • The first principal component is the direction of greatest variability (variance) in the data. • The second is the next orthogonal (uncorrelated) direction of greatest variability. • So first remove all the variability along the first component, then find the next direction of greatest variability. • And so on … [Figure: principal axes PC 1 and PC 2 drawn over data plotted against Original Variable A and Original Variable B]

Spatial Filter: Principal Component Analysis (3) • We can ignore the components of lesser significance. • We do lose some information, but if the eigenvalues are small, we don't lose much. • n dimensions in the original data • calculate n eigenvectors and eigenvalues • choose only the first p eigenvectors, based on their eigenvalues • the final data set has only p dimensions Basic Steps • Get data • Subtract the mean • Calculate the covariance matrix • Calculate the eigenvectors and eigenvalues of the covariance matrix • Choose the top components and form a feature vector [Figure: eigenplot]
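
The basic steps can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic two-dimensional data; the function name `pca`, the data layout (samples x dimensions), and the toy data are assumptions made for the example.

```python
import numpy as np

def pca(data, p):
    """Project data (n_samples, n_dims) onto its first p principal components."""
    centered = data - data.mean(axis=0)          # subtract the mean
    cov = np.cov(centered, rowvar=False)         # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigen-decomposition
    order = np.argsort(eigvals)[::-1]            # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :p]                  # keep the top p eigenvectors
    return centered @ components, components, eigvals

rng = np.random.default_rng(0)
a = rng.standard_normal(500)
# Variable B is almost a multiple of variable A, so one direction dominates.
data = np.column_stack([a, 2.0 * a + 0.1 * rng.standard_normal(500)])

scores, components, eigvals = pca(data, p=1)
explained = eigvals[0] / eigvals.sum()   # fraction of variance kept by PC1
```

Plotting `eigvals` in decreasing order gives the kind of eigenplot used to decide how many components p to keep.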

Visual Evoked Potential Extraction from Single Trial EEG signals using PCA filtering (1) • Problem Definition: • Remove noise to recover the VEP from single-trial, 29-channel EEG data without ensemble averaging. • Technique adopted to solve the problem: • Selection of principal components as the basis for reconstructing the signal. • Methodology • For each channel, the given signal is divided into an ensemble of signals. • An ensemble average for each channel is obtained as a reference. • PCA is applied to find the orthonormal eigenvectors, which are used as the basis for signal approximation. • Principal components are selected as the basis by looking at the frequency components present in the "prototype signal", i.e., the averaged signal.
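
A rough sketch of the reconstruction idea on synthetic data. It deliberately simplifies the method above: components are selected by keeping the k leading ones rather than by comparing frequency content with the averaged prototype signal, and the basis is taken from an SVD of the uncentered epoch matrix. The function name, the template, and the noise level are all assumptions for illustration.

```python
import numpy as np

def pca_filter_epochs(epochs, k):
    """Reconstruct each epoch (row) from the k leading principal directions.

    epochs: array of shape (n_trials, n_samples) for one channel.
    """
    # Rows of vt are orthonormal eigenvectors of epochs.T @ epochs.
    _, _, vt = np.linalg.svd(epochs, full_matrices=False)
    basis = vt[:k]                       # (k, n_samples)
    return epochs @ basis.T @ basis      # project onto the basis and back

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100, endpoint=False)
template = 2.0 * np.sin(2 * np.pi * 3 * t)            # synthetic "evoked" waveform
epochs = template + rng.standard_normal((50, 100))    # 50 noisy single trials

filtered = pca_filter_epochs(epochs, k=1)
mse_raw = np.mean((epochs - template) ** 2)
mse_filtered = np.mean((filtered - template) ** 2)
```

In this toy setting each filtered single trial is much closer to the template than the raw trial, without averaging across trials.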

Visual Evoked Potential Extraction from Single Trial EEG signals using PCA filtering (2) [Figure panels: original signal and filtered signal; original signal and template signal; single epoch after PCA filtering; reconstructed epoch stacks]

Spatial Filter: Independent Component Analysis (1) • ICA is basically applied for blind source separation (BSS). • Assume an observation (signal) is a linear mix of unknown independent source signals. • The mixing (not the signals) is stationary. • We have as many observations as unknown sources. • To find the sources in the observations, we need to define a suitable measure of independence. • Example: the cocktail party problem (sources are speakers): find Z. • Formal Statement • N independent sources … $Z_{mn}$ ($M \times N$) • linear square mixing … $A_{nn}$ ($N \times N$) • produces a set of observations … $X_{mn}$ ($M \times N$) • … $X^T = A Z^T$

Spatial Filter: Independent Component Analysis (2) • "Demix" the observations $X^T$ ($N \times M$) into $Y^T = W X^T$, where $Y^T$ ($N \times M$) $\approx Z^T$ and $W$ ($N \times N$) $\approx A^{-1}$. • How do we recover the independent sources? • (We are trying to estimate $W \approx A^{-1}$.) • … We require a measure of independence!
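
The mixing and demixing model can be checked numerically. In this sketch the mixing matrix A is known, which is exactly what ICA does not have; it only illustrates that the ideal demixing matrix is the inverse of A. The matrix values and the Laplace-distributed sources are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 1000, 3                        # M samples, N sources

z = rng.laplace(size=(m, n))          # Z: M x N independent sources
a = np.array([[1.0, 0.5, 0.2],        # A: N x N square mixing matrix
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])

x_t = a @ z.T                         # X^T = A Z^T, shape N x M
w = np.linalg.inv(a)                  # ideal demixing matrix W = A^{-1}
y_t = w @ x_t                         # Y^T = W X^T recovers Z^T
```

Because A is unknown in practice, W has to be estimated from the observations alone, which is why a measure of independence is needed.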

Spatial Filter: Independent Component Analysis (3) • The source signals are mixed by a random non-orthogonal matrix. • The JADE algorithm was applied to demix the signals. • After reordering and scaling, the demixed signals are very similar to the sources. • PCA would fail here because the mixing directions are not orthogonal to each other, and orthogonality is the key assumption of PCA. • Other ICA algorithms • Infomax • FastICA
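
A compact demixing sketch on two synthetic sources. Note that it uses a hand-written symmetric FastICA with a tanh nonlinearity, not the JADE algorithm used for the slide's figure; the sources, the mixing matrix, and all function names are assumptions for illustration.

```python
import numpy as np

def whiten(x):
    """Center and whiten x (n_channels, n_samples); also return the whitening matrix."""
    xc = x - x.mean(axis=1, keepdims=True)
    d, e = np.linalg.eigh(xc @ xc.T / xc.shape[1])
    v = e @ np.diag(1.0 / np.sqrt(d)) @ e.T
    return v @ xc, v

def sym_decorrelate(w):
    """Return (w w^T)^(-1/2) w, which makes the rows of w orthonormal."""
    d, e = np.linalg.eigh(w @ w.T)
    return e @ np.diag(1.0 / np.sqrt(d)) @ e.T @ w

def fastica(x, n_iter=200, tol=1e-10, seed=0):
    """Estimate a demixing matrix for the centered observations x."""
    xw, v = whiten(x)
    n, m = xw.shape
    w = sym_decorrelate(np.random.default_rng(seed).standard_normal((n, n)))
    for _ in range(n_iter):
        g = np.tanh(w @ xw)
        # Fixed-point update: E[g(y) x^T] - E[g'(y)] w, then re-orthonormalize.
        w_new = sym_decorrelate(g @ xw.T / m - np.diag((1.0 - g ** 2).mean(axis=1)) @ w)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ v

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 5000)
z = np.vstack([np.sin(2 * np.pi * 7 * t),          # source 1: sine wave
               rng.uniform(-1.0, 1.0, t.size)])    # source 2: uniform noise
a = np.array([[1.0, 0.6],
              [0.4, 1.0]])                         # non-orthogonal mixing
x = a @ z

w = fastica(x)
y = w @ (x - x.mean(axis=1, keepdims=True))

# Absolute correlation between each demixed signal (rows) and each source (columns).
corr = np.abs(np.corrcoef(np.vstack([y, z]))[:2, 2:])
```

The correlations are taken in absolute value and matched by maximum because ICA recovers the sources only up to order, sign, and scale, which is the "reordering and scaling" step mentioned above.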

Spatial Filter: Independent Component Analysis (4) Applying ICA to single-trial EEG epochs • Data Collection • EEG data were recorded from 31 electrodes: • 29 placed at scalp locations based on a modified International 10-20 system, • one placed below the right eye (VEOG), • one placed at the left outer canthus (HEOG). • All 31 channels were referred to the right mastoid and were digitally sampled for analysis at 256 Hz with a 0.01- to 100-Hz analog bandpass plus a 50-Hz lowpass filter. • Subjects participated in a 2-hour visual spatial selective attention task in which they were instructed to attend to filled circles flashed in random order in five locations. • Component IC1 was generated by blinks. • Component IC4 was generated by temporal muscle activity.

Spatial Filter: Independent Component Analysis(5) Applying ICA to single-trial EEG epochs (2) • The scalp maps and power spectra of the 31 independent components derived from target response epochs from a 32-year-old autistic subject. • Blink and eye movement artifact components (IC1 and IC9) had a typical strong low frequency peak. • Temporal muscle artifact components (i.e., ICs 14, 22, 27, and 29) had characteristic focal optima at temporal sites and power plateaus at 20 Hz and higher.

Conclusion • Next class: more classification techniques and some practical examples.

Thank you
