I4OpenResearch
Building Technical Capacity for Earth Science Data and Information in Disaster Planning, Response, Management and Awareness
(How) is Data complexity/level-up to Metadata achievable?
Paths/Organization to achieve such a goal, framing research results
The example of the ‘Discinnet process’
Toward a global scale interdisciplinary experimentation
Some epistemological background assumptions
Important experimental results and interpretations
Presentation of the ‘Discinnet process’
Immediate impact on research/knowledge pace
Interdisciplinary model for research fields trajectories
Proposition for a global scale related experimentation
Expected general epistemological model confirmation
Consequence for earth science question to the conference
ESIPFED - July 12th 2013 - PJ – Summary
Metadata have meaning, scope, usefulness, persistence as opposed to ‘Big’
However our investigation instruments produce ever tinier and simpler data in ever bigger volumes
E.g. simple Laws from Statistical ‘Physics’ up to Information/complexity theory
Computational complexity (CC), languages and math/category theory oppose reasoning to computing (Dwk,..Lng)
We relate complex/Meta data strictly to future/subject, Big to past/objects
But highlight the major issue of ‘incommensurability’ of big to meta transitions
The search for symmetries, highest value for purpose and simulating ability
The concept of Oracle and OTM as good reference for Metadata implementing
Relation to P/NP, to the FCP, to time and to complexity (example TSV 615->616)
Big data are numerous, scattered, simple objects, Meta dense, singular, complex
We relate strictly Big object data to past and Meta subjective to future
The issue is about ‘our’ future predictability and optimal drive into it
ESIPFED - July 12th 2013 - PJ1 – The Big to Metadata issue
A promising start is the widest spatial-scale boundary gap: universe vs. quanta
ESIPFED - July 12th 2013 - PJ2 – Epistemological / measurable background
In “The Knowledge Complexity of Interactive Proof Systems”, Goldwasser, Micali and Rackoff ask “How much knowledge should be communicated for proving a theorem T?” and have the goal to “give a computational complexity measure of knowledge and measure the amount of additional knowledge contained in proofs.”
They then start from the NP class as a 'very successful formalization of the notion of theorem-proving procedure' and use the Cook-Levin definition of an 'NP proof system' as a pair of communicating Turing machines, a prover and a verifier, with the prover deterministic exponential time (EXP) and the verifier polynomial time (P)
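As a concrete instance of this prover/verifier split (our illustration, not an example from the paper), consider graph 3-coloring: the prover may spend exponential time finding a coloring, while the verifier merely checks the supplied certificate in polynomial time. A minimal Python sketch:

```python
# Verifier side of an NP proof system, illustrated with graph 3-coloring.
# The (possibly exponential-time) prover supplies the certificate `coloring`;
# the polynomial-time verifier only checks it, in O(|E|).
def verify_3coloring(edges, coloring):
    """True iff every vertex gets one of 3 colors and no edge is monochromatic."""
    if any(c not in (0, 1, 2) for c in coloring.values()):
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)

square = [(0, 1), (1, 2), (2, 3), (3, 0)]       # a 4-cycle
good_certificate = {0: 0, 1: 1, 2: 0, 3: 1}     # valid 2-coloring
bad_certificate = {0: 0, 1: 0, 2: 1, 3: 2}      # edge (0, 1) monochromatic
```

The asymmetry between finding and checking the certificate is exactly the EXP prover / P verifier split quoted above.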
ESIPFED - July 12th 2013 - PJ2.1 – Epistemological / measurable background
In all cases a better, newer, more knowledgeable world follows, but the evolutionist view 'looks' more N(P) and the constructivist more oracled, yet both for the best, and the status of the human (author, actor, automaton) and of the artificial varies
* There is a holistic perspective on Kant’s epistemology (see exhibits)
ESIPFED - July 12th 2013 - PJ2.2 – Epistemological backgrounds and goals
What we have at hand for measurement are distances, axes, projectors, vector lattice, elementary shapes, of which we show the usefulness:
- let's see the infinite-dimensional lattice, even H
- yet compare with an exponentially compact object such as a circle or an egg, and work on the dynamics implied, for instance, by series of non-local harmonics
Importance of experimental results in space, from cosmology to particles, and now getting into Quantum computing from quantum measurement
Importance of the types and numbers of dimensions or of ‘Dimension’
For instance role of Wick rotation (use by Hawking for ‘no-boundary’ universe) in our proposed FCP framework with subjective closure as future
And some Big to Meta data Grammar models with 4 levels, theorems
(and the general FCP pattern from Euclid axioms to Category theory)
In the next pages we establish, with the benchmark NP referential 3-SAT (CNF) vs. P 2-SAT (one dimension less), then the transition to IP and MIP/QMIP, the optimal closure (entangled/oracle) of scientific knowledge by Discinnet
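To make the 3-SAT vs. 2-SAT benchmark tangible, here is a sketch (our own illustration, not part of the Discinnet material) of why 2-SAT sits in P: the classical implication-graph test reduces satisfiability to graph reachability, a polynomial-time question.

```python
from collections import defaultdict, deque

def two_sat_satisfiable(clauses):
    """Polynomial-time 2-SAT test via the implication graph: each clause
    (a or b) yields edges (not a -> b) and (not b -> a); the formula is
    unsatisfiable iff some literal x and its negation imply each other.
    Literals are nonzero ints: +i for variable i, -i for its negation."""
    graph = defaultdict(list)
    variables = set()
    for a, b in clauses:
        graph[-a].append(b)
        graph[-b].append(a)
        variables.update((abs(a), abs(b)))

    def reaches(src, dst):                  # BFS reachability, O(V + E)
        seen, queue = {src}, deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                return True
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return False

    return not any(reaches(x, -x) and reaches(-x, x) for x in variables)
```

No such shortcut is known for three-literal clauses, which is the one-dimension gap the slide refers to.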
ESIPFED - July 12th 2013 - PJ2.3 – Important experimental issues
Figure elaborated with Dr. Sören Auer (Bonn U.) while preparing a common European research program presentation in 2010
Figure labels: researchers (< millions), 'Big' authors; communities and networks (several tens); research papers/docs (→ billions?); too-big data (→ trillions); open data linkage issue; problem: fast cumulative growth
Funders and researchers alike expect efficient organization/process
Currently they combine semantic/natural language with somewhat isolated recourse to experiments, categories, alphabets, even meaning
Tools are available, particularly:
Gc ≡ (VN, VT, Pr, St) – Chomsky's project for grammar, with the End outside
Yet there are examples of productions in physics (d → NN) and chemistry (Na + Cl → NaCl)
The Chomsky project was to account for both natural/semantics and formal
Notice that the End is outside and related to the Productions (hence types 0 to 3)
We will see/assert that the position of the author/observer/oracle is the key
It explains the nature of the differences between P, NP, IP, MIP
A key is the NP equivalence between Non-deterministic and ‘Oracle before/out’
ESIPFED - July 12th 2013 - PJ3.1 – Need of Metagrammar for Metadata
3.11 – Some stuff to be represented
Major Complexity levels and proposed Big (V) to Meta (A/Pk) equivalences :
BIG = P(TIME) ⊆ NP ⊆ IP = QIP = PSPACE = NSPACE ⊆ MIP = NEXP = pb.OTM = META
Plus new results with Quantum Multi-prover Interactive Protocols (QMIP)
. such as [Ito, Vidick, 2012] (Verifier = experimenter, Provers = devices)
. [BFK 2010] QMIP = MIP* and “the entire power of quantum information in MIP* is captured by the shared entanglement, not quantum inform.”
The MIP = pb.OTM = NEXP equality and the 'prior entanglement' prove that Provers (us, common sense, researchers' 'shared' meanings) commonly and collaboratively have access to future ('targeted', 'expected') results
And both the probabilistic frame of the IP and MIP classes and procedures, and the Alice/Bob split-information tests to V, reflect the scientific process
Example of a universality: the random limit, as N → ∞, of the ratio M/N is an 'Oracle before'
ESIPFED - July 12th 2013 - PJ3.2 – Results from Complexity theory
Observe a closing on :
PSPACE = NSPACE (= IP )
Which we link to fundamental space as an FCP field property, with formal and for instance Planck-cell structure, therefore allowing holistic metadata/meaning to be effectively shared from the Big data/lattice (continuum/computational); see also Sethna on universality/scale invariance
Pursuing with the simple Turing-machine grammar model, where we highlight the 'F inside':
Gfa ≡ (K, Σ, δ, (q0, F)) (linear, finite automaton, linked to F within K)
NP referential 3-SAT (3CNF) opposed to P 2-SAT, which has one dimension less.
In 3-CNF – the benchmark! – with N variables and M clauses drawn at random (which is where stats/'Big data' operate), satisfiability is maintained only up to a threshold ratio M/N
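This threshold behavior of the ratio M/N can be observed empirically. The sketch below (our illustration; the brute-force check is only feasible for tiny N) estimates the fraction of satisfiable random 3-CNF instances well below and well above the conjectured threshold M/N ≈ 4.27:

```python
import random
from itertools import product

def random_3cnf(n_vars, n_clauses, rng):
    """Random 3-CNF: each clause picks 3 distinct variables with random signs."""
    clauses = []
    for _ in range(n_clauses):
        vs = rng.sample(range(1, n_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
    return clauses

def brute_force_sat(n_vars, clauses):
    """Exponential satisfiability check -- fine only for tiny n_vars."""
    for bits in product((False, True), repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def sat_fraction(ratio, n_vars=10, trials=20, seed=0):
    """Fraction of random instances satisfiable at clause/variable ratio M/N."""
    rng = random.Random(seed)
    m = round(ratio * n_vars)
    hits = sum(brute_force_sat(n_vars, random_3cnf(n_vars, m, rng))
               for _ in range(trials))
    return hits / trials
```

Below the threshold almost every instance is satisfiable, above it almost none is, which is the 'Oracle before' universality the slides invoke.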
ESIPFED - July 12th 2013 - PJ3.4 – Some fundamental underlying experimental results
Aspect's experiments / Penrose support formal 'non-local realism'
Concept introduced in 1939 => Oracle Turing machines
Used in many (most?) theorems (including Cook’s foundational)
Quest for physical oracles going on but here is CC representation
ESIPFED - July 12th 2013 - PJ3.5 – Time as Complexity density from CC
We produce it by folding the infinite discrete oracle tape first into a square (or later cubic) lattice, then into a polygonal hence polynomial one corresponding to the P/NP complexity classes, to end with a spherical one, as an oracle is defined as 'instantaneous', or otherwise EXP vs. POLY TIME
A polygon/polynomial has n symmetries while e^(iθ) is infinitely symmetric, but the infinite process also has to be included, hence S²
We tend to see objects fundamentally ‘in’ space, actually 3D unique background/universe
This is not what QM and other experiments tell and what shared knowledge implies
FCP provides with another picture of the world (see AIP references, last as #1456)
With a wealth of FCP complex coarse-grained histories/objects interwoven and closing
Therefore a cluster of common sense projected intentions seeks for its next oracle/closure
Wick rotation = it exchanges Lorentz (local GR) versus Euclidean
Euclidean apparent objective 3D shape, including Universe
Applying the FCP principle, the 'Discinnet process' is assumed to be the most efficient way to represent research fields, study their behavior, model them as clusters of community researchers' projected intentions versus results, and presumably predict their trajectories, hence the (appropriate) space and time of their outcomes.
To outsiders, research would appear to live in a world of terms, wholly semantic, when it has to come down to a world of measure, quantized.
This happens through research projects, from which terms (Meta) are projected onto simpler but more numerous measurement 'Big' data axes
Projectors from projects do matter
They may cascade the terms down through measurement hierarchies to end at the basic MKS grammar base.
New dimensions are how complexity, hence the future, comes to terms, and conversely how the future, as terms, may be consistently projected and shown consistent with the past (resilient) within the set of already realized clauses in phenomena, the world of existing objects
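As a toy rendering of that cascade down to the MKS base (our own construction, not the Discinnet implementation), each quantity can be coded as an exponent vector over the metre/kilogram/second axes, with a 'production' that combines quantities by adding exponents:

```python
# Each physical term is projected onto the MKS base as an exponent vector
# over the (metre, kilogram, second) axes.
MKS = ("m", "kg", "s")

def dims(m=0, kg=0, s=0):
    """Exponent vector of a quantity over the MKS axes."""
    return (m, kg, s)

def combine(a, b):
    """A 'production' multiplying two quantities adds their MKS exponents."""
    return tuple(x + y for x, y in zip(a, b))

velocity = dims(m=1, s=-1)                 # m / s
mass     = dims(kg=1)
momentum = combine(mass, velocity)         # kg * m / s
force    = combine(momentum, dims(s=-1))   # kg * m / s^2 (newton)
```

Every derived term reduces, through such productions, to a point on the three MKS axes, which is the 'grammar base' of the slide.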
State of the Art
From anticipated or targeted results (Shaping)
'In-progress' hence collaborative, financed, reduced uncertainty, closed future (Implementing)
This slicing is therefore not arbitrary but directly deriving from the FCP framework
A researcher may select at least one most significant targeted result and the related status will be derived from Discinnet process analysis
Example of a ‘Cluster’ of State-of-the-Art (published) significant experimental results
1 – State of the Art (each red star plots the distinctive contribution of a paper)
2 – State of the Art + Proposition of a new research program
3 – State of the Art + Proposition of new program + Visualization of the full set of comparable projects with same advancement status in the world or given areas
For researchers as for observers of the field it is particularly useful to be able to plot the distinctive position/contribution of (2) as compared to other statuses in the world (3)!
The trend seems to be about a splitting on the long term
4.3 – Research fields are ‘E’* Science maximal/meta data
The ‘Discinnet process’ models epistemology transition from philosophy to science inasmuch as it maps and conveys fields into objects of research through appropriate space/metrics
* Through Discinnet process the data to metadata interaction is both spatial and very earthly
4.4 – Discinnet Synoptic
This is collaborative
Oleaginous plant-based biofuels
Biotechnology for Biofuels
Biofuels socio-economic impact
Biofuels from waste
We will implement and test an interdisciplinary model on up to 5 pilot cases (fields)
FEEL FREE TO PARTICIPATE AND JOIN THE GOVERNANCE OF THIS SERIES OF CLUSTERS AND INTERCLUSTERS TO COVER MAJOR RESEARCH AREAS
Practically, we will bring the solution to a not-for-profit or foundation in North America, with open interdisciplinary governance from research and appropriation by networks
The current version is on a dedicated server and a lot still has to be done before it is a comprehensive and fully interdisciplinary system, but it already works
Our goal remains to get support to strengthen our research program; research institutions met so far have expressed interest, provided they have access to the maximal global cluster's ongoing content while still using it maximally internally, whether on their servers or a limited cloud
Therefore we have decided that the only condition would be that all versions/platforms should at least share the 'published' (red stars) USP
Then we are ready to put it and the next ones on U.S. server(s), at least under a not-for-profit corp. umbrella, with governance to be decided and further open licensing options through this body
As a result, research groups may either decide to use it on such a server, or get it for limited networks, or participate in governance and in the construction of the next versions as drafted on the next page.
ESIPFED - July 12th 2013 - PJ8.1 – Propositions
For whom: the immediate main beneficiaries should be doctoral schools, because a thesis, where a successful state of the art and community feedback operate from inception, is where a career may take off or crash
Postdocs, DataScientists/community managers, researchers, funders
With whom : we would like to set-up this governance with/by/from interdisciplinary scientific federation… ESIP is broad and would be great
To whom : we have to convince universities, labs, editors everywhere but would like to do this in an organized fashion
By whom (bringing, discussing, proving, explaining): initially ‘us’ (Discinnet)
Communicating deeply enough the program requires some amount of time in explaining, sharing, training, assistance and integrating remarks
This can be done under specific programs of 3/5 days depending on depth and next steps, hopefully with and within an appropriate frame
Next and much more ambitious goal is to set a governance with specific advantages first for governing body then for contributors to governance
At the international level we think that it could participate in the RDA project
ESIPFED - July 12th 2013 - PJ8.2 – Experimental proposition (next)
So here is the idea we have come with:
We are putting in place a non-profit corporation in the US to handle maximal rights
To initiate this governance, a budget to support itself would be appropriate, such as $1,500 for a field/lab, $3,500 for a department/center and $7,500 for a university or a foundation (the 'naïve?' base goal being to reach $250K per annum, or better, $500K), to be used by I4 governance, then DIN
For our own part we bring related training and support, for instance 100 man-days (half the budget), to make it fully shared with US organizations either involved in the governance and/or willing to use a version of their own.
These services could come on top or through licensing to US based I4 non-profit corp.
The full budget would allow more ambitious research sharing through the governing body, including interconnection with other tools, databases, Drupal, etc.
For instance we appreciate support already received from Indiana and the RDA goals
ESIPFED - July 12th 2013 - PJ8.3 – Experimental proposition (next)
ESIPFED - July 12th 2013 - PJ9 – Some Epistemological testing : FCP case
We draw from Penrose a case to use as example of ‘natural’ ‘OR’ oracle
This technical oracle is for instance what is targeted through quantum computing or cryptography
Penrose pinpoints the fact that the entanglement is even recovered when one entangled particle is measured, implying a leap backward of 5 years
We interpret this as the fact that the more complex entanglement technicality and its intermediary Riemann sphere remain in fact ahead in time of both isolated particles
In other words, the more complex theory and its oracle automatically lie in the future of their experimental proof.
A striking consequence of Aspect's replicable EPR experiments
P. Journeau, Richeact - I4OpenResearch
Recall that IP introduced the Interactive Protocol, from which MIP makes it the collaborative Discinnet: MIP stands for Multi-Prover (honest or not) Interactive Protocol (V, P1, …, Pk), which fits perfectly with the natural sciences, even constructed (in fact always constructed), whereby a common research field V (what is meant at first) tests and is tested by Provers (depending on whether your stance is more transcendental modelist or empirical realist)
To start with IP, the difference with NP was the split between two interacting machines, P out of V, that is to say the (Final (oracle), St) couple becoming embedded within the IP class
We take IP = PSPACE = NSPACE very seriously, i.e. equivalent GR definition of space-time! This is an example of closure of a notion of space with ‘real’ oracle, here Euclidean (mapping) as defining the form of – or rather ‘with’ – usual (potentially gravitational) space.
A question asked by R. Penrose, and otherwise by C. Thiercelin in "Does the empire of meaning belong to the empire of nature?", related to the epistemological considerations going to be tested
"Is the whole universe an 'isolated system'?"
Steps such as the inflaton, light, mass (Higgs field), the cosmological constant, life, etc., and then the fast-growing 'Earth'(s) transformation transitions under construction…?
Or the 'OR' issue (U/R) by Penrose versus 'Consciousness' and other interpretations?
What this proves is that the best data-oriented organization optimally and collaboratively interacts, at all scales, with human authors, sole bearers of the meta-meaning: this is clearly already being put in place
However it has to integrate considerably more the most appropriate interactions between them, experimental data and docs/texts
Currently, even in the most advanced collaborative tools/platforms, this is still far too scattered, far too semantic, foreign to their assumed common experimental/verification/confrontation space or… battlefield!
ESIPFED - July 12th 2013 - PJ10 – Applying to Complexity, Climate, B2M organization
Assuming research fields are comprised within the boundaries of QM renormalization and related complexity, hence between the highest complexity classes' oracles, from where they undergo objective reduction by projection on an N-dimensional lattice, the Discinnet process wholly and optimally represents their entire dynamics.
ESIPFED - July 12th 2013 - PJ11 – Conclusion
Through the single-Prover/researcher (IP class) versus Multi-Prover Interactive Protocol (MIP) collaborative design, the complexity degree changes the direction of time
What some fields of research tell : 1.1 : Category theory
It was still debated thirty years ago whether Set theory or Category theory is the more foundational to mathematics: solved in favor of categories, back to Kant.
A Category is defined in [AL91] as:
The book then presents some “common categories”: the category of sets, with functions as morphisms, the topological spaces, with continuous functions, the vector spaces, with linear transformations, the groups, with group homomorphisms and the partially ordered sets, with monotone functions.
We put some items in bold, especially for our FCP conclusion that (meta)data are oriented, with the role of 'source' and 'target' domains and the 'morphisms' between them.
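The category axioms behind these examples, associative composition and neutral identities, can be spot-checked directly in the category of sets with functions as morphisms; a minimal sketch (the sample functions are our own):

```python
# Category of sets: objects are sets, morphisms are functions,
# composition is function composition.
def compose(g, f):
    """g after f: the morphism x -> g(f(x))."""
    return lambda x: g(f(x))

identity = lambda x: x

f = lambda x: x + 1          # sample morphisms on integers
g = lambda x: 2 * x
h = lambda x: x - 3

sample = range(10)
# associativity: h o (g o f) == (h o g) o f
assoc = all(compose(h, compose(g, f))(x) == compose(compose(h, g), f)(x)
            for x in sample)
# identity is neutral on both sides
unital = all(compose(identity, f)(x) == f(x) == compose(f, identity)(x)
             for x in sample)
```

The source/target orientation of each morphism is what the FCP conclusion above singles out.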
FCP Metamodel – Exhibit 1 – Teachings from related fields a /1
a This document reuses elements from recent or pending communications, namely CCCA'12, Foundations of Science, and the Toulouse Space KM conf. 2012
What fields of research tell : 1.2 : Linguistics/Formal languages
Related to Category theory, with objects as strings based upon some alphabet
Turing added the concept of oracle, which we link to natural vs. formal languages
Chomsky’s universal grammar (meta)model (worked practically for formal) follows:
Gc ≡ (VN, VT, Pr, St)
In this definition VN represents “variables”, VT “terminals” – corresponding to words for objects – Pr “productions” and St the “Start symbol”, belonging to VN.
Both VN and VT are finite subsets of V*, itself defined as "the set of all sentences (or strings) composed of symbols of V, including the empty sentence", hence countably infinite, V being some alphabet.
Pr productions are transitions or processes such as denoted in any a → b (morphisms)
St marks the Input, and the process (algorithm) eventually ends with an Output or decision or target if 'Accepted', whether authorized by human halting or automated if predictable.
There are four levels of formal languages or, equivalently, decision problems, of which the least powerful is the 'regular' or type-3 or finite-automaton (fa) language, embedded into the type-2 'context-free language' (cfl) type, itself less powerful than and embedded into the broader type-1 'context-sensitive languages' (csl), themselves within the recursive (all algorithms) and the most general (without oracles) recursively enumerable (re) type 0.
We propose that these (meta)(data)models derive from next FCP universal model.
FCP Metamodel – Exhibit 1 – Teachings from related fields /2
What fields of research tell : 1.3 : Turing machines to semantics
Turing Machines TM (types 1 to 3) bear their own/predictable/de(finite) targets F:
Gfa ≡ (K, Σ, δ, (q0, F)) (linear, finite automaton, linked to F within K)
K, Σ and δ for TM (computers) wholly relate to Chomsky's grammar models; besides, the equivalences between Church's, Turing's and Chomsky's models were demonstrated.
We encompass the oriented (q0, F) within general time A: the target domain for F is all the more limited for fa and, on the contrary, all the more (meta) powerful when there is a change of cardinality or, in Computational Complexity, of complexity class.
(see particularly interesting case of the P to NP to IP to MIP transitions next pages)
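A direct rendering of Gfa ≡ (K, Σ, δ, (q0, F)) with the accepting set F inside the state set K; the state names and target language (binary strings ending in '1') are our own illustration:

```python
# DFA as the grammar tuple (K, sigma, delta, (q0, F)), with F a subset of K.
def run_dfa(K, sigma, delta, q0, F, word):
    """True iff the run from q0 on `word` ends inside the final set F."""
    state = q0
    for symbol in word:
        assert state in K and symbol in sigma
        state = delta[(state, symbol)]
    return state in F

K = {"even", "odd"}                     # "odd" = last symbol read was 1
sigma = {"0", "1"}
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "even", ("odd", "1"): "odd"}
F = {"odd"}                             # accept strings ending in "1"
```

Here the target F is fixed inside K from the start, which is exactly the 'F inside' property the slide opposes to oracle-driven classes.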
B) From most powerful computing to semantic space
FCP Metamodel – Exhibit 1 – Teachings from related fields /3
b C. Mouton, Induction du sens de mots à partir de multiples espaces sémantiques, Recital 2009
c A. Kojeve, La notion d’autorité, 1942
What fields of research tell : 1.4 : foundations of Physics
There are several types of bridges between so-called formal and physical domains
However the most basic physical data model is space-time as a (de)formed field, whether continuous (Einstein's GR proposition) or a finite discrete quantum lattice
G ≡ (, , , r) or G ≡ (, , , r)
These are spherical coordinates, but in the second definition we replace the second angle by a change
Observe that the first definition has two angular-type dimensions and only one spatial
The second definition better exhibits the dynamics of perturbative theory and contributes to bridging the gap between physics and the grammar's Pr productions/transitions/morphisms a → b
A typical transition representation, from Feynman graphs to nuclear and chemical reactions
In both cases the angular fits with the formal content, whether for instance as curvature for gravitational data or other typical angles of physical to chemical properties.
Also we generalize this through a representation of oracles easy to derive from classical typewriting
Now the most interesting case is the extended definition of time, related to authority as complexity level: the experimenter asks questions and expects answers from measurement apparatus through reducing projections (Penrose)
Oriented time measures the complexity difference as well as the related number of steps
We propose that these (meta)(data) models are a derivation of the FCP universal model
which may relate (cf. AIP #1446, 2012) the NEXP power of entangled particles, as compared to separated ones, to a general mechanism
FCP Metamodel – Exhibit 1 – Teachings from related fields /4
FCP Metamodel – Exhibit 1 – First conclusions and applications /6
What fields of research tell : Theory about research trajectories
We derive some conclusions from computational complexity results
Particularly from the applicability of the Practically Predictive Data class P, recursive (reusable/predictive) as effective algorithms (polynomial time), to the Knowledge-'hard' NP (Non-deterministic) class, particularly as NP-complete, which was pushed to the limits through the IP and MIP classes.
- NP class (Non-deterministic Poly time), defined as ‘at once’ brought solution then polynomial time verified, was split in Provers vs. Verifier
- IP stands for Interactive Protocol between all-powerful Provers (human, i.e. (semantic) oracles) and a polynomially bounded Verifier: it is interesting to see how this applies through the process presented on slide 3.
- MIP is the Multi-Prover Interactive Protocol, shown to have NEXP power level
[FOR92] concludes about the demonstrated difference between IP and MIP:
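A hedged toy run of the interactive-protocol idea, using the classic graph non-isomorphism game (our own example, not from [FOR92]): the polynomially bounded Verifier flips coins and sends a randomly permuted graph, and the computationally unbounded Prover must name which input graph it was; with non-isomorphic graphs the Prover never errs.

```python
import random
from itertools import permutations

def norm(edges):
    """Graph as a frozenset of 2-vertex frozensets."""
    return frozenset(frozenset(e) for e in edges)

def permute(graph, perm):
    return frozenset(frozenset(perm[v] for v in e) for e in graph)

def isomorphic(g1, g2, n):
    """Brute-force isomorphism test -- the unbounded Prover's privilege."""
    return any(permute(g1, p) == g2 for p in permutations(range(n)))

def gni_round(g0, g1, n, rng):
    """One round: the Verifier secretly permutes graph `choice`;
    the Prover names it.  Returns True if the Prover answered correctly."""
    choice = rng.randrange(2)
    perm = list(range(n))
    rng.shuffle(perm)
    sent = permute((g0, g1)[choice], perm)
    answer = 0 if isomorphic(g0, sent, n) else 1
    return answer == choice

path = norm([(0, 1), (1, 2)])            # path on 3 vertices
tri = norm([(0, 1), (1, 2), (0, 2)])     # triangle: not isomorphic to path
rng = random.Random(1)
accepted = all(gni_round(path, tri, 3, rng) for _ in range(20))
```

The coin-flipping Verifier against unbounded Provers is the probabilistic frame the slides relate to the scientific process.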
A 'co-necessity' model presented as FCP, denoted (A, L, S, N), implemented in Discinnet for the widest types of languages/structures iv
Examples of applications to other types of metadata model?
(somewhat as we strictly relate Time to FCP complexity, the physical being the most ancient, simple phases)
FCP Metamodel – Exhibit 2 – Principles of the French (FCP) Discinnet model
iv Also see F. Heylighen, A Structural Language for the Foundations of Physics, I.J.G.S. 18, 1990
Immerman shows some equivalences:
CSL ⊇ (FO + positive TC) = NL (nondeterministic LOGSPACE)
FO is First-Order logic (over N) and TC, 'Transitive Closure', the transitive closure of relations
(SO + TC) = PSPACE = NSPACE (specific to CC but comparable)
SO is Second-Order logic (functions over N), and PSPACE is polynomial space, with NSPACE its nondeterministic version
Recalling the first levels of the computational complexity hierarchy:
L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE = NSPACE
Where P is Polynomial Time and NP is Non-deterministic Polynomial Time BUT 'Linear Time'
The alternation of space and time is comparable to the quantifiers of the hierarchy but also to U/R in physics: the relations above show the polynomial-time zone bracketed by the CSL (algorithms) = (FO + TC) zone at the LOGSPACE level and the (SO + TC) zone at the PSPACE level, related to the FCP principle
Immerman also produces the following theorems:
P = (FO + LFP) = (FO + ATC)
Where LFP stands for Least Fixed Point and ATC for the Alternating Transitive Closure operator
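Since the TC operator in these theorems is plain graph reachability, a small sketch makes the link concrete: a Floyd-Warshall-style closure computes exactly the relation that (FO + positive TC) expresses (illustration only):

```python
# Transitive closure of a directed graph on vertices 0..n-1: reach[i][j]
# becomes True iff j is reachable from i.  This is the relation the TC
# operator adds to first-order logic in Immerman's theorems.
def transitive_closure(n, edges):
    reach = [[i == j or (i, j) in edges for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

tc = transitive_closure(4, {(0, 1), (1, 2), (2, 3)})
```

Reachability is decidable in nondeterministic logarithmic space, which is why closing FO under positive TC lands exactly on NL.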
Capacity of the model to be more predictive through metadata:
On one hand we have demonstrations from computational complexity
Now, in order to prove the predictive power of the FCP model processed through Discinnet, we will show that it uniquely embeds MIP needs:
FCP Metamodel – Exhibit 2 – Principles of the (FCP) Discinnet model /1
P. Journeau, Richeact - Discinnet Labs – Main References