
Quantum Interaction 2013 Leicester July 25-27, 2013

Contextual Query Using Bell Tests. Joao Barros, Zeno Toffano, Youssef Meguebli and Bich-Liên Doan, SUPELEC (École Supérieure d’Électricité), France. Quantum Interaction 2013, Leicester, July 25-27, 2013. Quantum Interaction Research at SUPELEC.


Presentation Transcript


  1. Contextual Query Using Bell Tests. Joao Barros, Zeno Toffano, Youssef Meguebli and Bich-Liên Doan, SUPELEC (École Supérieure d’Électricité), France. Quantum Interaction 2013, Leicester, July 25-27, 2013

  2. Quantum Interaction Research at SUPELEC • Research activity initiated in 2011 at the initiative of • Bich-Liên DOAN (Computer Science: Information Retrieval, Semantic Web), Dept. of Computer Science. • Zeno TOFFANO (physicist, lectures on Quantum Mechanics; research: Solid State Physics, NMR, Lasers, Fiber Optics Telecommunications), Dept. of Telecommunications. • 2 PhD students • Joao BARROS (MSc: Theoretical Physics), at the heart of this research (funding from « Fondation SUPELEC »). • Youssef MEGUEBLI (MSc: Computer Science): opinion-based Information Retrieval. • We undertook some preliminary investigations in the form of tests (arXiv:1207.4328). • We emphasize the experimental « Quantum-like » approach.

  3. Preliminary Investigations: Poll tests on polysemy in a foreign language • The test assesses correlations in a foreign language (Chinese here). • It aims to show the role of the polysemy of words. The question was to quantify the correlation between the different meanings of the proposed words. • 4 people (all Chinese) were interviewed to give their opinion scores.

  4. Preliminary Investigations: Poll tests on heterogeneous media • The test proposes nine musical excerpts. • The question is to rate from 0 to 10 whether these excerpts fall under the category "rock" or "blues". 4 people were interviewed. • The sum of the scores for both categories is very rarely equal to 10, indicating that the chosen categories are not mutually exclusive. • Interference effects between concepts of different media (over-extension/under-extension).

  5. Preliminary Investigations: Words belonging to 2 categories: correlation • Tests on words belonging to the category Fruit, to the category Vegetable, or to both (the questions are independent). • We define a « Bell-like » correlation parameter. • In this analysis we observed no violation of the Bell inequality (the parameter stayed below 2).

  6. Preliminary Investigations: First approach: « Heuristic Quantum-like HAL model » • Our goal here is to classify texts according to a context-related search criterion using the HAL algorithm (Hyperspace Analogue to Language). • We create a symmetrical HAL matrix (more discussion hereafter). • A query has been undertaken: "A AND B". • The texts refer to the word Tomato (which can be considered either as a Fruit or as a Vegetable). Three documents were selected from the Internet using related keywords (for the complete discussion see arXiv:1207.4328). • The score p given by the algorithm corresponds to the probability for this text to be in first position in our request. The average value of the test is defined as E = 2p − 1 (E ranges from −1 to +1). • A Bell parameter (CHSH type) is defined and calculated from these average values. • We observe a Bell inequality violation (a value above 2) in document 2.

  7. Bell Inequalities • Long story: • The field of Bell inequality violations (Bell 1964) and entanglement has fascinated many scientists throughout the last decades. An interesting historical narrative is “How the Hippies Saved Physics” by David Kaiser, W. W. Norton (Physics World 2012 Book of the Year). • Much debate • classical and non-classical behaviour • entanglement • local and non-local, contextual and non-contextual • more than Quantum, non-local boxes… • Experiments demonstrating Bell inequality violation • 1969 Clauser, Horne, Shimony and Holt: first experimentally testable proposal; 1972 Freedman and Clauser: first experiment • 1982 A. Aspect (Orsay, France) on polarized photons: definitive proof • Entanglement with spins (NMR, Rydberg atoms…) • Towards the realization of a Quantum Computer • A new field: Quantum Information • Entanglement is at the heart of this field because it is seen as a potential “resource” for computing (lower complexity) and coding (secure cryptography)

  8. The Bell CHSH Inequality cases • The CHSH (Clauser, Horne, Shimony, Holt) form of the Bell parameter, Sbell, is proposed for tests with two binary outcomes, +1 or −1, which suits query answers (YES/NO). It is defined as a combination of four expectation values [formula not preserved in the transcript] • built from four tests, where each expectation value is that of the outcome of two mutual tests. Sbell can never exceed 4 in absolute value. • Classical, local, separable: Sbell lies between −2 and +2. We could write |Sbell| ≤ 2. • Quantum: the case 2 < |Sbell| ≤ 2√2 is achieved with bipartite quantum entangled states. The value 2√2 is called the Tsirelson bound and is a limit for quantum systems. • No-signalling: the case between 2√2 and 4 is called the “no-signalling” region. The maximum value 4 can be attained with logical probabilistic constructions often named non-local PR boxes.
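The three regimes listed on this slide can be sketched in a few lines of Python. This is only an illustration: the sign placement in `chsh` is one common CHSH convention and is an assumption, since the slide's own formula is not preserved, and the function names are hypothetical. The relation E = 2p − 1 is the one given on slide 6.

```python
import math

TSIRELSON = 2 * math.sqrt(2)  # quantum limit of the CHSH parameter


def expectation(p_yes):
    """Mean of a +1/-1 test answered YES with probability p_yes (E = 2p - 1)."""
    return 2 * p_yes - 1


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """One common CHSH combination of four expectation values.

    Assumption: the minus sign sits on the last term; other conventions
    move it to a different term without changing the bounds.
    """
    return e_ab + e_ab2 + e_a2b - e_a2b2


def regime(s, tol=1e-9):
    """Name the region in which a CHSH value falls."""
    magnitude = abs(s)
    if magnitude <= 2 + tol:
        return "classical"
    if magnitude <= TSIRELSON + tol:
        return "quantum"
    if magnitude <= 4 + tol:
        return "no-signalling"
    raise ValueError("not reachable with +1/-1 outcomes")
```

For instance, four perfectly correlated tests give `chsh(1, 1, 1, 1) == 2`, which sits exactly on the classical bound, while `chsh(1, 1, 1, -1) == 4` is the algebraic maximum.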

  9. HAL and QI research • We investigate the relationships between words within a document; these relationships can be formed by creating a “semantic space” using the Hyperspace Analogue to Language (HAL) introduced by Lund and Burgess (1996). • The HAL algorithm does not require any explicit human a-priori judgment. In the procedure a HAL lexical co-occurrence matrix is built with a "window" representing a span of words passed over the corpus being analyzed. • Operationally, two words are considered as co-occurring when they appear in the same floating window. The size of this window is a few words left and right of the word in question. • Similar approach: LSA (Latent Semantic Analysis) also builds matrices in semantic space. • Darányi, Wittek: physical analogy between the semantic space of HAL and Quantum Theory, where each word can be associated with a given energy (in analogy with spectral emission lines in atoms corresponding to transition energies). • Bruza: HAL used for analogies with Quantum Theory for activating associations of concepts.

  10. The HAL Matrix Semantic Space • The matrix is built with a "window" representing a span of words passed over the corpus being analyzed. The width of this window can be varied. • Words within the window are recorded as co-occurring with a weight that decreases with the number of other words separating them within the window (word distance measure). • The information contained in a row is the sum of co-occurrences for words appearing before the word; the information contained in a column is the sum of co-occurrences for words appearing after the word. • We used a symmetric real positive matrix obtained as the sum of the HAL matrix and its transpose (equivalent to running HAL backwards). • All words are considered, and simple plurals are treated as singular words. Lower and upper case letters are not distinguished. Words having the same origin are treated as different (for example “battle” and “battling” are distinct).

  11. Document « Orange »: construction of the HAL Matrix • Symmetric matrix: sum of two HAL matrices (forward and backward). • Repeated words strengthen the associated vector (see “orange” and “the” in the example below). • The rows and columns of the symmetric co-occurrence matrix constitute vectors in a high-dimensional space. • The dimensionality of the space is determined by the number of columns in the matrix (context vectors). • TEXT example with a window spanning 3 words (l = 3): "The colour orange takes its name from the orange fruit"
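The construction on slides 10 and 11 can be sketched as follows for the example sentence. This is a minimal sketch under stated assumptions: the weight `window - dist + 1` is the linear ramp usually quoted for HAL and may differ from the weighting the authors used, and `hal_matrix` is a hypothetical name.

```python
def hal_matrix(words, window):
    """Return (dictionary, symmetric HAL matrix) for a list of words.

    Rows collect co-occurrences with words appearing BEFORE the row word,
    as described on slide 10.
    """
    vocab = list(dict.fromkeys(words))  # non-repeated words, first-seen order
    index = {word: i for i, word in enumerate(vocab)}
    size = len(vocab)
    hal = [[0] * size for _ in range(size)]
    for pos, word in enumerate(words):
        for dist in range(1, window + 1):
            if pos - dist < 0:
                break
            earlier = words[pos - dist]
            # Assumed weight: adjacent words score `window`, then 1 less per step.
            hal[index[word]][index[earlier]] += window - dist + 1
    # Forward plus backward pass: the matrix summed with its transpose.
    symmetric = [[hal[i][j] + hal[j][i] for j in range(size)]
                 for i in range(size)]
    return vocab, symmetric


text = "The colour orange takes its name from the orange fruit"
words = text.lower().split()
vocab, sym = hal_matrix(words, window=3)
```

With these assumptions the entry linking "orange" and "the" accumulates contributions from both occurrences of each word, which is the strengthening effect the slide mentions.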

  12. Quantum model for HAL: Vector definition • We attribute to each document an associated vector. • The vector state of the document is the linear sum of all the word vectors it contains. Each word vector state is extracted from the rows of the symmetric HAL matrix. • We are interested in analyzing how two words are connected within a document, namely word A and word B. • The two associated word vectors define a plane in the semantic space. • We will consider the projection of the document vector state on the plane spanned by these two word vectors. • The resulting normalized state vector represents the reduced document state vector.

  13. Quantum model for HAL: Vector Orthonormalization • To obtain the reduced state vector we take the two word vectors and normalize them, obtaining two new vectors that form a basis which is, in general, non-orthogonal. • We apply the Gram-Schmidt orthogonalization process to this basis and obtain two orthonormal bases, one associated with each word, that describe the plane formed by the original word vectors. • By projecting the document vector on one of these bases, we obtain its projection onto this plane. Renormalizing this projection gives the desired reduced state vector. • The reduced state vector can be decomposed on both orthonormal bases. • The four coefficients (two per basis) are obtained by projecting the state vector on the basis vectors and then normalizing to unity.
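A possible reading of this orthonormalization step, written with plain Python lists, is sketched below. The helper names are hypothetical, and the sketch assumes the two word vectors are not parallel (otherwise the second Gram-Schmidt vector has zero length).

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def orthonormal_basis(first, second):
    """Gram-Schmidt on two vectors; the first one fixes the first axis."""
    e1 = normalize(first)
    overlap = dot(e1, second)
    e2 = normalize([b - overlap * a for a, b in zip(e1, second)])
    return e1, e2


def reduced_coefficients(doc, first, second):
    """Coefficients of the normalized projection of `doc` onto the plane,
    expressed in the basis that starts from `first`."""
    e1, e2 = orthonormal_basis(first, second)
    return normalize([dot(e1, doc), dot(e2, doc)])


# Toy vectors (illustrative only, not data from the talk).
word_a, word_b, doc = [1, 0, 0], [1, 1, 0], [1, 2, 3]
coeff_a = reduced_coefficients(doc, word_a, word_b)  # basis of word A
coeff_b = reduced_coefficients(doc, word_b, word_a)  # basis of word B
```

Both coefficient pairs describe the same reduced state vector; they differ only because each basis takes a different word as its first axis.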

  14. Quantum model for HAL: Query Operators • We want to define query operators. • Two query operators are defined, one for each word, with • outcome +1 for the state that corresponds to the word meaning we are interested in, • outcome −1 in the orthogonal direction. • When applied to the reduced document state vector defined before, such an operator keeps the component along the word direction and reverses the sign of the orthogonal component. • This action is analogous to the spin Pauli matrix σ_z in the basis associated with the word. • Other query operators can be defined. We choose a second pair, again one for each word, which acts on the reduced state vector by exchanging its two components in the word basis. • This action corresponds to switching the components and is equivalent to the spin Pauli matrix σ_x.

  15. Quantum model for HAL: Query Operator basis representation • We choose the basis associated with one of the two words and write all the operators with respect to this basis. • We use the transformation matrix between the two bases, whose entries depend on the scalar product of the two normalized word vectors, here a positive number smaller than 1. • We obtain the matrix form of the operators in the chosen basis [matrices not preserved in the transcript].
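The change of basis can be illustrated as follows. The sketch assumes both bases come from Gram-Schmidt started at the respective word vector, in which case the second basis, written in the first, is (c, s) and (s, −c), with c the scalar product of the normalized word vectors and s = √(1 − c²). The authors' own matrices are not preserved, so the sign conventions here are an assumption and the names are hypothetical.

```python
import math

SIGMA_Z = [[1, 0], [0, -1]]  # +1 along the word direction, -1 orthogonal
SIGMA_X = [[0, 1], [1, 0]]   # exchanges the two components


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]


def basis_change(overlap):
    """Columns are the second word's basis vectors written in the first
    word's basis (assumed Gram-Schmidt convention)."""
    c = overlap
    s = math.sqrt(1 - c * c)
    return [[c, s], [s, -c]]


def in_first_basis(operator, overlap):
    """Express an operator defined in the second basis in the first basis."""
    t = basis_change(overlap)
    return matmul(matmul(t, operator), transpose(t))


# Illustrative overlap of 0.6 between the two normalized word vectors.
b_z = in_first_basis(SIGMA_Z, 0.6)
b_x = in_first_basis(SIGMA_X, 0.6)
```

Under this convention the transformed operators keep the eigenvalues +1 and −1, so each still squares to the identity.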

  16. Bell parameter calculation using Query Operators • Bell tests are usually a proof of the non-separability of the combination of two different systems. • We define a parameter that can be understood as the sense associated with a word A in a document in correlation with the sense of another word B. • The Bell parameter is a combination of quantum mean values of different query operators, which can be considered as measuring devices. • We use specific operators associated with words A and B [operator definitions not preserved in the transcript]. • This particular operator choice is inspired by the usual example that maximizes the violation of the Bell inequalities. • We calculate quantum mechanical mean values over the reduced state vector using the Born rule.
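The Born-rule mean value mentioned in the last bullet can be sketched for a real two-component state. This only illustrates the mean value of a single operator; how the authors combine operators into their Bell parameter is not preserved here, and the function names are hypothetical.

```python
SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]


def apply(operator, state):
    """Matrix-vector product for a 2x2 operator and a 2-component state."""
    return [sum(operator[i][k] * state[k] for k in range(2)) for i in range(2)]


def mean_value(operator, state):
    """Quantum mean value <state|operator|state> for a real, normalized state."""
    image = apply(operator, state)
    return sum(a * b for a, b in zip(state, image))


# Illustrative normalized state (0.6, 0.8), not data from the talk.
state = [0.6, 0.8]
p_plus = state[0] ** 2                  # Born probability of the +1 outcome
mean_z = mean_value(SIGMA_Z, state)     # equals 2 * p_plus - 1
mean_x = mean_value(SIGMA_X, state)     # equals 2 * state[0] * state[1]
```

The σ_z mean reproduces the relation E = 2p − 1 used in the preliminary investigations, with p the probability of the +1 answer.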

  17. Bell parameter calculation: Q-HAL algorithm • Flow diagram of the Quantum HAL algorithm: • Input: document and window size l. • Construction of a “clean” (no punctuation marks) sequence of words, including any repeated words: the Doc list. • Construction of the “Dictionary”, the sequence of non-repeated words: the Dic list. • Construction of a primitive HAL matrix: for each word of the Doc list a window of length l is associated and the scores of all the words within it are collected in a matrix. The entry for each score is determined by the position of the words in the Dic list. • The complete HAL matrix is obtained by summing this matrix with its transpose. • Normalization of each row vector. • Determination of the state of the system by summing over all vectors and normalizing. • Calculation of the expected values of the defined operators and of the Bell parameter. • New window size l+1, and repeat. • Plot. • The algorithm was implemented in the Python programming language with the string module and pylab. • The approach presented here can be perceived as an experiment done on objects outside the domain of physics.
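The text-handling steps of this flow (Doc list, Dic list, row normalization, document state) might look as follows. This is a sketch under assumptions: it sums each dictionary word's row once, it does not merge simple plurals, and the function names are hypothetical. The symmetric matrix is passed in ready-made so that only these steps are shown.

```python
import math
import string


def clean_words(text):
    """Doc list: lower-cased words with ASCII punctuation removed."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def dictionary(doc_list):
    """Dic list: non-repeated words in order of first appearance."""
    return list(dict.fromkeys(doc_list))


def document_state(symmetric):
    """Normalize each row, sum the rows, then normalize the sum.

    Assumption: every dictionary word contributes its row exactly once,
    and at least one row is non-zero.
    """
    total = [0.0] * len(symmetric)
    for row in symmetric:
        norm = math.sqrt(sum(x * x for x in row))
        if norm == 0:
            continue  # a word with no co-occurrences adds nothing
        for j, x in enumerate(row):
            total[j] += x / norm
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total]


doc_list = clean_words("The colour orange, takes its name; from the orange fruit.")
dic_list = dictionary(doc_list)
state = document_state([[0, 3], [3, 0]])  # toy 2-word matrix
```

Repeating these steps for l, l+1, and so on would give one state per window size, from which the mean values and the Bell parameter are then computed.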

  18. Query 1: word “Reagan” in the context of word “Iran” • The Bell parameter as a function of the window size starts from zero and increases until it reaches a maximum (the Tsirelson bound), then drops again (for 3 documents). • The document “Iran” always gives a constant value of 2; here one of the words (“Reagan”) is missing. • This suggests that each document has an optimal HAL window size that maximizes the Bell parameter. • The “sooner” a peak appears, the less interaction, in the sense of window length, is needed to get a higher correlation between the two words. Bearing this in mind, the document “Iran-Contra affair” is clearly the one selected by the model.

  19. Query 2: Test on the polysemy of the word “orange” • Test on the polysemy of the word “orange” and associated concepts. In this example we are interested in the ambiguity between the meanings color and fruit. We also associate the concept of juice. • The query “Orange - Fruit” presents its first peak for the document “Orange color”, the second for the document “Orange fruit”, then one for the document “Orange juice” and, very far away, one for the document “Juice” (the window sizes of the peaks are not preserved in the transcript). • Regarding the peaks we find the expected order for the documents: “Orange juice”, “Orange Fruit”, “Juice” and “Orange Color”.

  20. Query: pathological text example • We made two-word queries on « pathological » documents consisting of texts with a repeating periodic structure based on the same original document. • The curves still peak at the Tsirelson bound and also present other effects, probably due to the repetition period. • Queries on words A1 and A100 in a text of 5000 words, for text repetition periodicities of 100, 150 and 200.

  21. Comments on Results • The results show a Bell parameter that peaks at the maximal value Sbell = 2√2 (the Tsirelson bound). • We found that the Bell parameter is strongly dependent on the HAL window size. There is an optimal window size that maximizes Sbell. • A possible explanation, reminiscent of what was already noticed by Bruza: • if the window size is set too large, spurious co-occurrence associations are represented in the matrix; • if the window size is too small, relevant associations may be missed. • Comparing different documents, the one whose peak appears first seems to be the most relevant.
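The "first appearing peak" criterion could be turned into a ranking along the following lines. The curves below are made-up numbers for illustration only, not results from the talk, and the function names and the choice of 2 as threshold (the classical bound) are assumptions.

```python
def first_peak(values, threshold=2.0):
    """Index of the first local maximum strictly above `threshold`, or None."""
    for i, value in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i + 1 < len(values) else float("-inf")
        if value > threshold and value >= left and value > right:
            return i
    return None


def rank_documents(curves, threshold=2.0):
    """Sort document names by how early their first peak appears.

    `curves` maps a document name to Bell parameter values for
    increasing window sizes; documents without a peak come last.
    """
    def key(name):
        peak = first_peak(curves[name], threshold)
        return (peak is None, peak if peak is not None else 0)
    return sorted(curves, key=key)


# Hypothetical curves, one value per window size.
curves = {
    "doc_late": [0.0, 0.5, 1.2, 2.8, 1.0],
    "doc_early": [0.0, 1.0, 2.8, 1.5],
    "doc_flat": [2.0, 2.0, 2.0, 2.0],
}
ranking = rank_documents(curves)
```

A document whose curve stays at the classical value 2, as reported for the document missing one of the query words, never produces a peak under this rule and is ranked last.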

  22. Some Conclusions and Perspectives • The main feature in relation to Quantum Theory explored in this work is the violation of the Bell inequalities, which can be related to entanglement and nonlocality. • The results always show a correlation between the two words, with a Bell inequality violation up to the maximal value of 2√2 (the Tsirelson bound). • We introduced a new tool which has connections with Quantum Theory: Query Operators. • It is not clear how to interpret the Bell inequality violation here, nor what the meaning is of the optimal window length that maximizes the Bell parameter. • HAL constitutes a good « playground » for doing Quantum-like experiments. • We believe that it should be possible, after much experimentation on different documents, to introduce new families of query observables adapted to different purposes and contexts in Information Retrieval. • Can entanglement give a measure of query relevance?

  23. Thank You
