
Symbolic vs Subsymbolic, Connectionism (an Introduction)

H. Bowman (CCNCS, Kent)


Presentation Transcript


  1. Symbolic vs Subsymbolic, Connectionism (an Introduction) H. Bowman (CCNCS, Kent)

  2. Overview • Follow up to first symbolic – subsymbolic talk • Motivation, • clarify why (typically) connectionist networks are not compositional • introduce connectionism, • link to biology • activation dynamics • learning algorithms

  3. Recap

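  4. A (Rather Naïve) Reading Model • [Figure: slot-coded network mapping ORTHOGRAPHY to PHONOLOGY. Orthography units are letter.slot pairs A.1, B.1, …, Z.1 through A.4, B.4, …, Z.4 (SLOT 1 is labelled); phonology units are phoneme.slot pairs /p/.1, /b/.1, …, /u/.1 through /p/.4, /b/.4, …, /u/.4.]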

  5. Compositionality • Plug constituents in according to rules • Structure of expressions indicates how they should be interpreted • Semantic Compositionality, “the semantic content of a (molecular) representation is a function of the semantic contents of its syntactic parts, together with its constituent structure” [Fodor & Pylyshyn, 1988] • Symbolists argue compositionality is a defining characteristic of cognition

  6. Semantic Compositionality in Symbol Systems • Meanings of items plugged in as defined by syntax • M[ X ] denotes meaning of X • M[ John loves Jane ] = ‹slot 1› M[ loves ] ‹slot 2›, with M[ John ] plugged into slot 1 and M[ Jane ] into slot 2

  7. Semantic Compositionality Continued • Meanings of atoms constant across different compositions • M[ Jane loves John ] = ‹slot 1› M[ loves ] ‹slot 2›, with M[ Jane ] plugged into slot 1 and M[ John ] into slot 2
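A minimal Python sketch of the plug-in reading of M[ John loves Jane ] on the two slides above. The names `ATOM_MEANINGS` and `meaning`, and the choice to represent a meaning as a tuple of uppercase tags, are illustrative assumptions and not part of the talk. The point it tries to show is that the same atom meanings are reused in both sentences and that only the syntax decides which slot each one fills.

```python
# Hypothetical atom meanings; the strings stand in for whatever M[...] denotes.
ATOM_MEANINGS = {
    "John": "JOHN",
    "Jane": "JANE",
    "loves": "LOVES",
}


def meaning(sentence: str) -> tuple:
    """Return M[subject verb object] as (VERB, SUBJECT, OBJECT), built only from atom meanings."""
    subject, verb, obj = sentence.split()
    # The syntax (word order) decides which slot each atom meaning is plugged into.
    return (ATOM_MEANINGS[verb], ATOM_MEANINGS[subject], ATOM_MEANINGS[obj])


m1 = meaning("John loves Jane")
m2 = meaning("Jane loves John")
print(m1)
print(m2)
```

The two results differ only in the order of the plugged-in constituents, while M[ John ], M[ Jane ] and M[ loves ] are identical in both.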

  8. The Sub-symbolic Tradition

  9. Rate Coding Hypothesis • Biological neurons fire spikes (pulses of current) • In artificial neural networks, • nodes reflect populations of biological neurons acting together, i.e. cell assemblies; • activation reflects rate of spiking of underlying biological neurons.

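  10. Activation in Classic Artificial Neural Network Model • [Figure: node j receives inputs x1, x2, …, xn through weights w1j, w2j, …, wnj; it integrates them (weighted sum) to give the net input hj, then applies a sigmoidal function to give the activation value yj, which is also the node's output.] • Positive weights: Excitation • Negative weights: Inhibition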
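The activation of a single node can be sketched in a few lines of Python. This is an illustration under assumptions: the slide only says "sigmoidal", so the logistic function is one possible choice, there is no bias term, and the input and weight values are made up.

```python
import math


def sigmoid(h: float) -> float:
    """Logistic sigmoid (an assumed choice), squashing any net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-h))


def node_activation(inputs: list[float], weights: list[float]) -> float:
    """Activation y_j of node j: sigmoid of the weighted sum h_j of its inputs."""
    h_j = sum(w * x for w, x in zip(weights, inputs))  # net input
    return sigmoid(h_j)


x = [1.0, 0.0, 1.0]       # example inputs x1..x3 (made up)
w_j = [2.0, -1.0, -0.5]   # positive = excitatory, negative = inhibitory (made up)
y_j = node_activation(x, w_j)
print(round(y_j, 3))
```

Here the net input is 2.0 - 0.5 = 1.5: the excitatory weight on the first input outweighs the inhibitory weight on the third, so the activation ends up above 0.5.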

  11. Sigmoidal Activation Function • Saturation: unresponsive at high net inputs • Threshold: unresponsive at low net inputs • Responsive around net input of 0
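One way to make "responsive" concrete is to look at the slope of the activation function. The sketch below assumes the logistic sigmoid (the slide does not name a specific function) and prints its value and slope at a low, a zero, and a high net input.

```python
import math


def sigmoid(h: float) -> float:
    return 1.0 / (1.0 + math.exp(-h))


def sigmoid_slope(h: float) -> float:
    """Derivative of the logistic sigmoid: how much the output moves per unit of net input."""
    y = sigmoid(h)
    return y * (1.0 - y)


for h in (-6.0, 0.0, 6.0):
    print(h, round(sigmoid(h), 4), round(sigmoid_slope(h), 4))
```

The slope peaks at a net input of 0 and is close to zero at both extremes, which matches the threshold and saturation regions on the slide.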

  12. Characteristics • Nodes homogeneous and essentially dumb • Input weights characterize what a node represents / detects • Sophisticated (intelligent?) behaviour emerges from interaction amongst nodes

  13. Learning • directed weight adjustment • two basic approaches, • Hebbian learning, • unsupervised • extracting regularities from environment • error-driven learning, • supervised • learn an input to output mapping
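As a rough illustration of the unsupervised case, here is the plain Hebb rule in Python: each weight grows in proportion to the product of its input and the node's output. The function name `hebbian_update`, the learning rate, and the toy patterns are assumptions for illustration. The plain rule has no normalisation, so weights would grow without bound under continued training.

```python
def hebbian_update(weights, inputs, output, lr=0.1):
    """Plain Hebb rule: strengthen each weight in proportion to input * output."""
    return [w + lr * x * output for w, x in zip(weights, inputs)]


weights = [0.0, 0.0, 0.0]
# Made-up environment: inputs 0 and 1 are always active together, input 2 never is.
patterns = [[1.0, 1.0, 0.0]] * 5
for x in patterns:
    y = 1.0  # assume the node is active whenever this pattern is present
    weights = hebbian_update(weights, x, y)
print(weights)
```

No target output is supplied anywhere; the weights simply come to reflect which inputs co-occur with the node's activity. An error-driven rule, by contrast, needs a target for every pattern.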

  14. Example: Simple Feedforward Network • Use term PDP (Parallel Distributed Processing) • [Figure: three layers of units, Input → Hidden → Output] • weights initially set randomly • trained on a set of input-to-output patterns • error-driven: for each input, adjust weights according to the extent to which the output is in error
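The following is a small, self-contained sketch of such a network in plain Python: random initial weights, an input, hidden and output layer, and an error-driven update (back-propagation of squared error) applied pattern by pattern. The layer sizes, learning rate, bias handling, seed and the toy pattern set are all assumptions chosen to keep the example short; they are not taken from the talk.

```python
import math
import random


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


def forward(x, w_hidden, w_out):
    """Input -> hidden -> output; the last entry of each weight row is a bias weight."""
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    output = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_out]
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One error-driven update for one pattern; modifies the weights in place."""
    hidden, output = forward(x, w_hidden, w_out)
    xb, hb = x + [1.0], hidden + [1.0]
    # Output deltas: error scaled by the sigmoid slope.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden deltas: output deltas passed back through the (pre-update) output weights.
    d_hid = [h * (1.0 - h) * sum(d * w_out[k][j] for k, d in enumerate(d_out))
             for j, h in enumerate(hidden)]
    for k, d in enumerate(d_out):
        for j, v in enumerate(hb):
            w_out[k][j] += lr * d * v
    for j, d in enumerate(d_hid):
        for i, v in enumerate(xb):
            w_hidden[j][i] += lr * d * v


def total_error(patterns, w_hidden, w_out):
    """Summed squared error over the whole pattern set."""
    return sum((t - o) ** 2
               for x, target in patterns
               for t, o in zip(target, forward(x, w_hidden, w_out)[1]))


random.seed(0)  # fixed seed so the sketch is repeatable
n_in, n_hid, n_out = 2, 3, 1
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]

# Made-up pattern set: the output should copy the first input bit.
patterns = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [0.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]

error_before = total_error(patterns, w_hidden, w_out)
for _ in range(1000):
    for x, target in patterns:
        train_step(x, target, w_hidden, w_out)
error_after = total_error(patterns, w_hidden, w_out)
print(error_before, error_after)
```

Everything the network ends up "knowing" comes from the four patterns and the random starting weights; nothing else is built in.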

  15. Error-driven Learning • can learn any (computable) input-output mapping (modulo local minima) • delta rule and back-propagation • network learning completely determined by patterns presented to it
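For the single-layer case, the delta rule can be sketched as follows. This version assumes a sigmoid output unit and squared error, so the update includes the sigmoid slope; the learning rate, number of epochs, and the logical-OR pattern set are made-up choices for illustration.

```python
import math


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


def delta_rule_step(weights, x, target, lr=0.5):
    """Delta rule for one sigmoid unit: w_i += lr * (target - y) * y * (1 - y) * x_i."""
    y = sigmoid(sum(w * v for w, v in zip(weights, x)))
    delta = (target - y) * y * (1.0 - y)
    return [w + lr * delta * v for w, v in zip(weights, x)]


# Made-up patterns for logical OR; the final 1.0 in each input is a bias term.
patterns = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
            ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 1.0)]

weights = [0.0, 0.0, 0.0]
for _ in range(2000):
    for x, target in patterns:
        weights = delta_rule_step(weights, x, target)

outputs = [sigmoid(sum(w * v for w, v in zip(weights, x))) for x, _ in patterns]
print([round(o, 2) for o in outputs])
```

Note that a weight only changes when its input is non-zero, since the update is multiplied by x_i. That property matters for the slot-coding argument later in the talk.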

  16. Example Connectionist Model • “Jane Loves John” difficult to represent in PDP models • Word reading as an example • orthography to phonology • Words of four letters or fewer • Need to represent order of letters; otherwise, e.g., slot and lots would be coded the same • Slot coding
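A short Python sketch of why letter order has to be represented, and of how slot coding does it. The function names, the left-aligned placement of short words, and the 26-letter lowercase alphabet are assumptions for illustration.

```python
import string

ALPHABET = string.ascii_lowercase
N_SLOTS = 4  # words of four letters or fewer


def slot_code(word: str) -> list[int]:
    """One unit per (letter, slot) pair: the unit for letter c in slot s is on iff word[s] == c."""
    vector = [0] * (N_SLOTS * len(ALPHABET))
    for slot, letter in enumerate(word):
        vector[slot * len(ALPHABET) + ALPHABET.index(letter)] = 1
    return vector


def letter_set_code(word: str) -> list[int]:
    """Order-free alternative: one unit per letter, regardless of position."""
    return [1 if letter in word else 0 for letter in ALPHABET]


print(letter_set_code("slot") == letter_set_code("lots"))  # True: order is lost
print(slot_code("slot") == slot_code("lots"))              # False: order is kept
```

The price of slot coding is that the same letter in two different slots is represented by two entirely separate units.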

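  17. A (Rather Naïve) Reading Model • [Figure: slot-coded network mapping ORTHOGRAPHY to PHONOLOGY. Orthography units are letter.slot pairs A.1, B.1, …, Z.1 through A.4, B.4, …, Z.4 (SLOT 1 is labelled); phonology units are phoneme.slot pairs /p/.1, /b/.1, …, /u/.1 through /p/.4, /b/.4, …, /u/.4.]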

  18. Pronunciation of a as an example • Illustration 1: assume a “realistic” pattern set, • a pronounced differently, • in different positions • with different surrounding letters (context), e.g. mint - pint • (both built into the patterns) • frequency asymmetries, • how often a appears at each position in the language determines how well it is pronounced at that position • strange prediction: if a child has only seen a in positions 1 to 3, it reaches a state in which (broadly) it can pronounce a in positions 1 to 3, but not at all in position 4; that is, it cannot even guess at the pronunciation, i.e. it produces random garbage! • labelling is externally imposed: there is no requirement that the label a is interpreted the same in different slots • in symbol systems, every occurrence of a is interpreted identically
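The "strange prediction" can be reproduced in a deliberately stripped-down Python sketch. It models only the four input units for the letter a (one per slot) and four output units for its phoneme (one per slot), with no hidden layer, and trains them with a delta rule. The network size, seed, learning rate and training schedule are all assumptions for illustration, not the model from the talk.

```python
import math
import random


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


N_SLOTS = 4
random.seed(1)
# weights[i][j]: from input unit "a in slot i+1" to output unit "/a/ phoneme in slot j+1".
# Only the units for the letter a are modelled; this is a deliberate simplification.
weights = [[random.uniform(-0.5, 0.5) for _ in range(N_SLOTS)] for _ in range(N_SLOTS)]
initial_slot4 = list(weights[3])


def one_hot(slot):
    return [1.0 if i == slot else 0.0 for i in range(N_SLOTS)]


def train_on_a_in(slot, lr=0.5):
    """Delta-rule update for the pattern 'a in this slot -> /a/ phoneme in the same slot'."""
    x, target = one_hot(slot), one_hot(slot)
    for j in range(N_SLOTS):
        y = sigmoid(sum(weights[i][j] * x[i] for i in range(N_SLOTS)))
        delta = (target[j] - y) * y * (1.0 - y)
        for i in range(N_SLOTS):
            weights[i][j] += lr * delta * x[i]  # zero whenever input unit i is off


for _ in range(2000):
    for slot in (0, 1, 2):  # the learner only ever sees a in positions 1 to 3
        train_on_a_in(slot)

print(weights[3] == initial_slot4)  # weights from "a in slot 4" never moved
```

Because the unit for a in slot 4 is never active during training, its outgoing weights keep their random initial values, so whatever the network outputs for a in position 4 is determined by that initial noise and not by anything learned in positions 1 to 3.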

  19. Contextual influences can be beneficial, for example, • reflecting irregularities, e.g. mint – pint • pronouncing non-words, e.g. wug • Nonetheless, highly non-compositional: there is no sense in which constituent representations are plugged in • can only recognise (and pronounce) a in specific contexts, but not at all in others • surely there is a sense in which we learn individual (substitutable) grapheme – phoneme mappings and then plug them in (modulo contextual influences).

  20. Illustration 2: assume an artificial pattern set in which a is mapped in each position to the same representation. • (assuming enough training) in a sense, a is represented similarly in all positions • but, • not actually identical, • random initial weight settings imply different (although similar) hidden layer representations • perhaps glossed over by thresholding at output • still a strange learning prediction: the network reaches states in which it can recognise a in some positions, but not at all in others • also, the amount of training needed in each position is exorbitant • the fact that the network can pronounce a in position i does not help it learn a in position j; it starts from scratch in each position, each of which is different and separately learned

  21. Connectionism & Compositionality • Principle: • with PDP nets, contextual influence inherent, compositionality the exception • with symbol systems, compositionality inherent, contextual influence the exception • in some respects neural nets generalise well, but in other respects generalise badly. • appropriate: global regularities across patterns extracted (similar patterns treated similarly) • inappropriate: with slot coding, component representations not reused

  22. Connectionism & Compositionality • alternative connectionist models may do better, but not clear that any is truly systematic in sense of symbolic processing • alternative approaches, • localist models, e.g. Interactive Activation or Activation Gradient models • O’Reilly’s spatial invariance model of word reading? • Elman nets – recurrence for learning sequences.

  23. References • Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum. • Bowers, J. S. (2002). Challenging the widespread assumption that connectionism and distributed representations go hand-in-hand. Cognitive Psychology, 45, 413-445. • Evans, J. S. B. T. (2003). In Two Minds: Dual Process Accounts of Reasoning. Trends in Cognitive Sciences, 7(10), 454-459. • Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28, 3-71. • Hinton, G. E. (1990). Special Issue of Journal Artificial Intelligence on Connectionist Symbol Processing (edited by Hinton, G. E.). Artificial Intelligence, 46(1-4). • O'Reilly, R. C., & Munakata, Y. (2000). Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain. Cambridge, MA: MIT Press. • McClelland, J. L. (1992). Can Connectionist Models Discover the Structure of Natural Language? In R. Morelli, W. Miller Brown, D. Anselmi, K. Haberlandt & D. Lloyd (Eds.), Minds, Brains and Computers: Perspectives in Cognitive Science and Artificial Intelligence (pp. 168-189). Norwood, NJ: Ablex Publishing Company. • McClelland, J. L. (1995). A Connectionist Perspective on Knowledge and Development. In J. J. Simon & G. S. Halford (Eds.), Developing Cognitive Competence: New Approaches to Process Modelling (pp. 157-204). Mahwah, NJ: Lawrence Erlbaum. • Page, M. P. A. (2000). Connectionist Modelling in Psychology: A Localist Manifesto. Behavioral and Brain Sciences, 23, 443-512. • Pinker, S., Ullman, M. T., McClelland, J. L., & Patterson, K. (2002). The Past-Tense Debate (Series of Opinion Articles). Trends in Cognitive Sciences, 6(11), 456-474.
