
Hierarchical Temporal Memory

Presentation Transcript


  1. Hierarchical Temporal Memory “The Classification of Un-preprocessed Waveforms through the Application of the Hierarchical Temporal Memory Model” April 18, 2009 Version 1.0; 04/18/2009 John M. Casarella Ivan G. Seidenberg School of CSIS, Pace University

  2. Topics to be Discussed • Introduction • Intelligence • Artificial Intelligence • Neuroscience • Connectionism and Classical Neural Nets • Pattern Recognition, Feature Extraction & Signal Processing • Hierarchical Temporal Memory (Memory – Prediction) • Hypothesis • Research • Results

  3. Introduction "I was proceeding down the road. The trees on the right were passing me in orderly fashion at 60 miles per hour. Suddenly one of them stepped in my path." John von Neumann providing an explanation for his automobile accident.

  4. Intelligence • What is Intelligence? • A uniquely human quality? • “The ability to solve hard problems” • Ability to create memories • Learning, language development, memory formation (synaptic pattern creation) • The human ability to adapt to a changing environment, or to change our environment for survival

  5. Alan Turing • “Can machines think?” • Connectionism • Model the digital computer on a child’s mind, then “educate” it to obtain the “adult” • “Unorganized Machines”: a network of neuron-like Boolean elements randomly connected together • Proposed that machines should be able to ‘learn by experience’ • The Turing Test - constrained and focused research • Imitate human behavior • Evaluate AI only on the basis of behavioral response

  6. Turing’s unorganized machine

  7. Machine Intelligence • What is Artificial Intelligence? • The science and engineering of making intelligent machines, especially intelligent computer programs • The Objectives of AI • Create machines to do things which would require intelligence if done by a human • To solve the problem of how to solve the problem • von Neumann and Shannon • Sequential vs. parallel processing • McCarthy (helped define AI), Minsky (founded the first dedicated AI lab at MIT) and Zadeh (fuzzy logic) • Varieties • Expert Systems (rule based, fuzzy, frames) • Genetic Algorithms • Perceptrons (Classical Neural Networks)

  8. Neural Nets • McCulloch and Pitts • Model of the neurons of the brain • Proposed a model of artificial neurons • Cornerstone of neural computing and neural networks • Boolean nets of simple two-state ‘neurons’ • Concept of ‘threshold’ • No mechanism for learning • Hebb - Pattern recognition learned by changing the strength of the connections between neurons
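To make the McCulloch-Pitts idea concrete, here is a minimal sketch (in Python, purely illustrative) of a two-state threshold unit: it fires only when the weighted sum of its Boolean inputs reaches a threshold, and, as the slide notes, it has no mechanism for learning.

```python
# Minimal sketch of a McCulloch-Pitts style threshold unit (illustrative only).
def mcp_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of Boolean inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A two-input AND gate: both inputs must be active to reach the threshold of 2.
print(mcp_neuron([1, 1], [1, 1], threshold=2))  # -> 1
print(mcp_neuron([1, 0], [1, 1], threshold=2))  # -> 0
```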

  9. Classical Neural Networks • Rosenblatt • Perceptron Model - permitted mathematical analysis of neural networks • Based on McCulloch and Pitts • Linear combiner followed by a hard limiter • Activation and weight training • Linear Separation - No XOR
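A hedged sketch of the perceptron idea described above: a linear combiner followed by a hard limiter, with the weights adjusted toward misclassified examples. The function name, learning rate, and epoch count are illustrative choices, not details from the study.

```python
import numpy as np

# Hedged sketch of the perceptron rule: a linear combiner followed by a hard
# limiter, with weights nudged toward misclassified points. Names, learning
# rate and epoch count are illustrative choices, not details from the study.
def train_perceptron(X, y, epochs=20, lr=0.1):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b >= 0 else 0  # hard limiter on the combiner
            err = target - pred
            w += lr * err * xi                  # move the boundary toward the error
            b += lr * err
    return w, b

# Linearly separable OR data trains fine; XOR targets would never converge.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w, b = train_perceptron(X, np.array([0, 1, 1, 1]))
print([1 if xi @ w + b >= 0 else 0 for xi in X])  # [0, 1, 1, 1]
```

Because a single such unit draws one linear boundary, it handles OR but can never separate XOR, which is the limitation raised on the next slide.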

  10. Classical Neural Networks • Minsky and Papert: what were they thinking? • Mathematical proof that the perceptron model was of limited usefulness • Classes of problems which perceptrons could not handle • Negative impact on funding • Narrow analysis of the model • Incapable of learning the XOR - wrong • Incorrectly postulating that multi-layer perceptrons would be incapable of the XOR

  11. Classical Neural Networks • The Quiet Years, Grossberg, Kohonen and Anderson • Hopfield • Introduced non-linearities • ANNs could solve constrained optimization problems • Rumelhart and McClelland • Parallel Distributed Processing • Backpropagation • Interdisciplinary nature of neural net research
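As a counterpoint to the XOR limitation above, here is a small hedged sketch of backpropagation training a two-layer network on XOR. The layer sizes, learning rate, and iteration count are arbitrary illustrations, and convergence can vary with the random initialization.

```python
import numpy as np

# Hedged sketch: a tiny two-layer network trained with backpropagation on XOR,
# the problem a single-layer perceptron cannot separate. Layer sizes, learning
# rate and iteration count are arbitrary; convergence depends on the random init.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)             # forward pass, hidden layer
    out = sigmoid(h @ W2 + b2)           # forward pass, output layer
    d_out = (out - y) * out * (1 - out)  # backpropagate the squared error
    d_h = d_out @ W2.T * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

print(np.round(out.ravel()))  # typically [0. 1. 1. 0.] after training
```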

  12. Neuroscience • Structure of the neocortex • Learning, pattern recognition and synapses • Mountcastle • Columnar model of the Neocortex • Learning associated with construction of cell assemblies related to the formation of pattern associations • Neuroplasticity

  13. Neuroscience • The Biology • Neocortex >50% of human brain • Locus of: perception, language, planned behavior, declarative memory, imagination, planning • Extremely flexible/generic • Repetitive structure • The Neocortex • Hierarchy of cortical regions • Region - region connectivity • Cortical - thalamic connectivity • Cortical layers: cell types and connectivity

  14. Neuroscience [Figure: the six layers of the neocortex - Layer 1, Layer 2, Layer 3 (A, B), Layer 4, Layer 5 (A, B), Layer 6 (A, B)]

  15. Electrocardiogram • Electrocardiogram (ECG) records the electrical activity of the heart over time • Breakthrough by Willem Einthoven in 1901 • Electrodes placed in a standard pattern • The ECG displays the voltage between pairs of these electrodes • Immediate results • Assigned the letters P, Q, R, S and T to the various deflections

  16. Electrocardiogram • Measurement of the flow of electrical current as it moves across the conduction pathway of the heart • Recorded over time • Represents different phases of the cardiac cycle

  17. Electrocardiogram

  18. Hypothesis • Application of the HTM model, once correctly designed and configured, will provide a greater success rate in the classification of complex waveforms • Without pre-processing or feature extraction, using a visual process applied to the actual waveform images

  19. Research • Task Description • Create an image dataset of each waveform group for classification • Determine, through organized experiments, an optimized HTM • Apply optimized HTM to the classification of waveforms using images, devoid of any pre-processing or feature extraction

  20. Hierarchical Temporal Memory Overview • Each node performs a similar algorithm • Each node learns • 1) Common spatial patterns • 2) Common sequences of spatial patterns • (use time to form groups of patterns with a common cause) • “Names” of groups passed up • - Many-to-one mapping, bottom to top • - Stable patterns at top of hierarchy • Modeled as an extension of a Bayesian network with belief propagation • Creates a hierarchical model (time and space) of the world [Figure: hierarchy of memory nodes]

  21. Hierarchical Temporal Memory • Structure of an HTM network for learning invariant representations for the binary images world. • This network is organized in 3 levels. Input is fed in at the bottom level. Nodes are shown as squares. • The top level of the network has one node, the middle level has 16 nodes and the bottom level has 64 nodes. • The input image is of size 32 pixels by 32 pixels. This image is divided into adjoining patches of 4 pixels by 4 pixels as shown. Each bottom-level node’s input corresponds to one such 4x4 patch.
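A small illustrative sketch of the tiling just described: a 32 x 32 image cut into non-overlapping 4 x 4 patches, giving the 64 bottom-level inputs. The function name is made up for this example.

```python
import numpy as np

# Illustrative sketch of the tiling just described: a 32 x 32 input image is cut
# into non-overlapping 4 x 4 patches, one per bottom-level node (8 x 8 = 64 nodes).
def tile_into_patches(image, patch=4):
    rows, cols = image.shape
    return [image[r:r + patch, c:c + patch]
            for r in range(0, rows, patch)
            for c in range(0, cols, patch)]

image = np.zeros((32, 32), dtype=np.uint8)   # stand-in for one binary input image
patches = tile_into_patches(image)
print(len(patches), patches[0].shape)        # 64 (4, 4)
```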

  22. Hierarchical Temporal Memory

  23. Hierarchical Temporal Memory This figure illustrates how nodes operate in a hierarchy; we show a two-level network and its associated inputs for three time steps. This network is constructed for illustrative purposes and is not the result of a real learning process. The outputs of the nodes are represented using an array of rectangles. The number of rectangles in the array corresponds to the length of the output vector. Filled rectangles represent ‘1’s and empty rectangles represent ‘0’s.

  24. Hierarchical Temporal Memory This input sequence is for an “L” moving to the right. The level-2 node has already learned one pattern before the beginning of this input sequence. The new input sequence introduced one additional pattern to the level-2 node.

  25. Hierarchical Temporal Memory • (A) An initial node that has not started its learning process. • (B) The spatial pooler of the node is in its learning phase and has formed 2 quantization centers. • (C) The spatial pooler has finished its learning process and is in the inference stage. The temporal pooler is receiving inputs and learning the time-adjacency matrix. • (D) A fully learned node where both the spatial pooler and temporal pooler have finished their learning processes.
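The two learning stages in (B)-(D) can be sketched roughly as follows: a toy node memorizes distinct input patterns as quantization centers, and, once spatial learning is finished, counts which center follows which to build the time-adjacency matrix. Class and method names are invented for illustration and are not Numenta's API.

```python
import numpy as np

# Rough sketch of the stages in (B)-(D): a spatial pooler that memorizes distinct
# patterns as quantization centers, then a temporal pooler that counts transitions
# between centers to build the time-adjacency matrix.
class ToyNode:
    def __init__(self):
        self.centers = []        # memorized spatial patterns (quantization centers)
        self.adjacency = None    # time-adjacency counts
        self.prev = None

    def learn_spatial(self, pattern):
        if not any(np.array_equal(pattern, c) for c in self.centers):
            self.centers.append(pattern.copy())

    def finish_spatial(self):
        n = len(self.centers)
        self.adjacency = np.zeros((n, n))

    def learn_temporal(self, pattern):
        # assumes the pattern was already seen during spatial learning
        idx = next(i for i, c in enumerate(self.centers)
                   if np.array_equal(pattern, c))
        if self.prev is not None:
            self.adjacency[self.prev, idx] += 1   # count the observed transition
        self.prev = idx

# Tiny demo with two alternating 4x4 patterns.
node = ToyNode()
seq = [np.eye(4), np.ones((4, 4)), np.eye(4)]
for p in seq:
    node.learn_spatial(p)
node.finish_spatial()
for p in seq:
    node.learn_temporal(p)
print(len(node.centers))   # 2 quantization centers
print(node.adjacency)      # transitions 0->1 and 1->0 counted once each
```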

  26. Hierarchical Temporal Memory

  27. Temporal Grouping Hidden Markov Model - 1

  28. Temporal Grouping HMM - 2

  29. Temporal Grouping HMM - 3

  30. Experimental Design • HTM Design, Parameters and Structure • Determine the number of hierarchies to be used • Image size in pixels influences the sensor layer • Image size in pixels broken down into prime factors to determine the layer 1 and layer 2 array configurations • Determine the number of “iterations” of viewed images at each layer • Small learning and unknown datasets used
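The prime-factor bookkeeping mentioned above can be sketched as below: factoring each image dimension (for example, the 96 x 120 pixel beat images used later) lists the divisors available when choosing how many nodes to place along each axis at layers 1 and 2. The exact selection rule used in the study is not spelled out here, so this only enumerates the options.

```python
# Sketch of the dimension bookkeeping described above: prime factors of each image
# dimension give the node counts that divide that axis evenly at layers 1 and 2.
def prime_factors(n):
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

# The individual beat images used later are 96 x 120 pixels.
print(prime_factors(96))   # [2, 2, 2, 2, 2, 3]
print(prime_factors(120))  # [2, 2, 2, 3, 5]
```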

  31. Hierarchical Temporal Memory • Memorization of the input patterns • Learning transition probabilities • Temporal Grouping • Degree of membership of input pattern in temporal group • Belief Propagation
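A hedged sketch of the temporal-grouping step listed above: quantization centers that frequently follow one another in time are merged into one group, here by flood-filling over a thresholded, symmetrized time-adjacency matrix. This is a simplification for illustration, not Numenta's exact greedy grouping algorithm; the node's output can then be read as the degree of membership of the current input in each group.

```python
import numpy as np

# Hedged sketch of temporal grouping: centers that frequently follow one another
# in time end up in the same group (connected components of the thresholded,
# symmetrized time-adjacency matrix). A simplification, not Numenta's algorithm.
def temporal_groups(adjacency, threshold=1):
    sym = adjacency + adjacency.T
    n = sym.shape[0]
    group_of = [-1] * n
    groups = []
    for start in range(n):
        if group_of[start] != -1:
            continue
        stack, members = [start], []
        while stack:                          # flood-fill over strong transitions
            i = stack.pop()
            if group_of[i] != -1:
                continue
            group_of[i] = len(groups)
            members.append(i)
            stack.extend(j for j in range(n)
                         if sym[i, j] >= threshold and group_of[j] == -1)
        groups.append(members)
    return groups, group_of

adj = np.array([[0, 3, 0],
                [2, 0, 0],
                [0, 0, 5]])
groups, _ = temporal_groups(adj)
print(groups)  # [[0, 1], [2]]: centers 0 and 1 are time-adjacent, 2 stands alone
```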

  32. Experimental Design • Waveform Datasets • Individual beats broken down by classification and grouped (SN, LBBB, RBBB) • Teaching and unknown datasets randomly created • Teaching sets of 50, 90 and 100 images used • Multiple sets created • Teaching vs. Unknown Datasets • Traditional ratios 1:1 or 2:1, teaching to unknown • With the HTM model, the ratio was 1:3, teaching to unknown
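An illustrative sketch of drawing the teaching and unknown sets at the 1:3 teaching-to-unknown ratio noted above; the file names and the split function are placeholders, not the study's actual dataset tooling.

```python
import random

# Illustrative sketch of a random split at the 1:3 teaching-to-unknown ratio.
def split_dataset(images, teach_fraction=0.25, seed=42):
    items = list(images)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * teach_fraction)   # 1 part teaching : 3 parts unknown
    return items[:cut], items[cut:]

beats = [f"sn_beat_{i:03d}.png" for i in range(200)]   # hypothetical beat images
teaching, unknown = split_dataset(beats)
print(len(teaching), len(unknown))  # 50 150
```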

  33. ECG Waveform Images - Sinus

  34. ECG Waveform Images

  35. Individual ECG Beat Images (Left Bundle Branch Block, Normal Sinus, Right Bundle Branch Block) • All images were sized to 96 x 120 pixels

  36. ECG Series Waveform Images (Left Bundle Branch Block, Right Bundle Branch Block, Sinus)

  37. Results – Individual Beat Classification • Learning • Smaller number of teaching images • Diverse images produce greater classification pct • Overtraining not evident, saturation may exist • RAM influences performance • Waveform Datasets • Diversity • Noise : approx. 87 pct w/o inclusion in teaching set • Average > 99 percent classification • Average differentiation of images by class approx. 99 pct

  38. NUPIC Model Results • 48 object categories producing 453 training images (32 x 32 pixels) • 99.3 percent of the training images were correctly classified • Of the “distorted set”, only 65.7 percent were correctly classified within their categories

  39. Individual Beat Results Percent Classified by Model

  40. Results by Dataset

  41. Results by HTM Model

  42. IR Spectra – Sample IR spectra

  43. Results – IR Spectra Classification pct > 99

  44. Gait Waveforms (ALS, Control, Huntington’s) • With a limited teaching and unknown set, classification > 98 pct

  45. References (Short List) • [4] Computational Models of the Neocortex. http://www.cs.brown.edu/people/tld/projects/cortex/ • [6] Department of Computer Science, Colorado State University. http://www.cs.colostate.edu/eeg/?Summary • [8] George, Dileep and Jaros, Bobby. The HTM Learning Algorithms. Numenta, Inc. March 1, 2007. • [11] Hawkins, Jeff and George, Dileep. Hierarchical Temporal Memory: Concepts, Theory, and Terminology. Numenta, Inc. 2006. • [12] Hawkins, Jeff. Learn like a Human. http://spectrum.ieee.org/apr07/4982 • [15] Swartz Center for Computational Neuroscience, University of California San Diego. http://sccn.ucsd.edu/eeglab/downloadtoolbox.html • [16] Turing, A. M. “Computing Machinery and Intelligence”. Mind, New Series, Vol. 59, No. 236 (Oct. 1950), pp. 433–460.
