This presentation introduces a logical framework for probabilistic relational models (PRMs), illustrated with a research domain. It defines relations such as Institute, Researcher, Paper, Author, and Cites, and shows how dependencies and uncertainties over them affect attributes such as a researcher's salary and the likelihood of a citation. The dependency types are intra-relational and inter-relational; the uncertainty types are reference, existence, and identity uncertainty. The framework also gives syntactic rules for DependsOn predicates and defines aggregates, for example the number of papers a researcher has published.
Example
• Institute(InstId,Type)
• Researcher(RId,Area,Salary,InstId)
• Paper(PaperId,Topic)
• Author(RId,PaperId)
• Cites(PaperId1,PaperId2)
Types of Uncertainty
• intra-relational dependency: a researcher’s salary depends on their research area
• inter-relational dependency: a researcher’s salary depends on the type of institute they work at
• reference uncertainty: a paper’s author is more likely to be a researcher in the same area as the paper
• existence uncertainty: a citation between two papers is more likely to exist if they are on the same topic
• identity uncertainty: the authors of two distinct papers are more likely to be the same individual if the author names are similar and the co-authors are the same
DependsOn Predicates
• Examples:
• DependsOn1(Salary; Area) ← Researcher(RId,Area,Salary,InstId)
• DependsOn1(Salary; Area, Type) ← Researcher(RId,Area,Salary,InstId), Institute(InstId,Type)
Rules for DependsOn
• DependsOn predicates occur only in the heads of clauses
• The body of a DependsOn clause may contain extensional predicates and built-in predicates
• Every descriptive attribute A must appear as the first argument of a DependsOn predicate
• If there is more than one DependsOn predicate for a particular attribute, we require that, for each corresponding key, only one DependsOn clause matches
Aggregates
• A researcher’s salary depends on the number of publications they have:
• Count^{RId}_{Author;PaperId}(RId,CntPapers)
• This takes the Author relation, Author(RId,PaperId), groups by RId, and takes the count
• Equivalent to:
• select RId, count(PaperId) as CntPapers from Author group by RId
• More general form:
• Aggr^{key}_{predicate;Aggr-Variable-List}(Key,AggrVal)
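The count aggregate can be tried directly by running the SQL from this slide. The sketch below is only an illustration: it loads the Author facts from the sample KB into an in-memory SQLite table, and the function name `count_papers` is hypothetical.

```python
import sqlite3

# Author(RId, PaperId) facts taken from the sample KB in these slides.
AUTHOR_ROWS = [(101, 301), (102, 301), (101, 302)]


def count_papers(rows):
    """Return {RId: CntPapers} by grouping Author(RId, PaperId) on RId."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Author (RId INTEGER, PaperId INTEGER)")
    conn.executemany("INSERT INTO Author VALUES (?, ?)", rows)
    # Same query as on the slide: group by the key, count the aggregated variable.
    cursor = conn.execute(
        "SELECT RId, COUNT(PaperId) AS CntPapers FROM Author GROUP BY RId"
    )
    result = dict(cursor)
    conn.close()
    return result


paper_counts = count_papers(AUTHOR_ROWS)
print(paper_counts)
```

Each resulting pair (RId, CntPapers) plays the role of one ground instance of the Count aggregate, which a DependsOn clause for Salary could then use as a parent.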
Syntax
• Predicates – ordinary predicates, aggregates, DependsOn
• Clauses – Key Constraints, DependsOn Clauses
• CPDs
Semantics
• Attribute Uncertainty
• Background theory provides instantiations for both the primary key and foreign keys through a set of partially instantiated extensional predicates
A Sample KB
Researcher-Inst(101)
Researcher-Inst(102)
Researcher-Institute-Inst(101,201)
Researcher-Institute-Inst(102,201)
Institute-Inst(201)
Paper-Inst(301)
Author-Inst(101,301)
Author-Inst(102,301)
Paper-Inst(302)
Author-Inst(101,302)
Cites-Inst(301,302)
Intensional Predicates to Introduce RVs
• ∃Area, Salary Researcher(RId,Area,Salary,InstId) ← Researcher-Institute-Inst(RId,InstId)
• ∃Type Institute(InstId,Type) ← Institute-Inst(InstId)
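As a rough illustration of how these clauses introduce random variables, the sketch below enumerates one variable per descriptive attribute and key found in the sample KB. The representation of a random variable as an (attribute, key) pair and the function name `ground_random_variables` are assumptions made for this example, not part of the formalism.

```python
# Key facts from the sample KB (primary and foreign keys only).
RESEARCHER_INSTITUTE_INST = [(101, 201), (102, 201)]
INSTITUTE_INST = [201]


def ground_random_variables(researcher_keys, institute_keys):
    """List one (attribute, key) pair per descriptive attribute and key."""
    rvs = []
    for rid, _inst_id in researcher_keys:
        # Researcher has two descriptive attributes: Area and Salary.
        rvs.append(("Area", rid))
        rvs.append(("Salary", rid))
    for inst_id in institute_keys:
        # Institute has one descriptive attribute: Type.
        rvs.append(("Type", inst_id))
    return rvs


random_variables = ground_random_variables(
    RESEARCHER_INSTITUTE_INST, INSTITUTE_INST
)
print(random_variables)
```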
Dependency Graph
• Convert each numbered DependsOn statement to a general binary relation
• DependsOn(A_i; …, A_j, …) ← …
• Let V_i and V_j be instantiations of A_i and A_j
• We add V_i < V_j
• We require < to be acyclic
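The acyclicity requirement can be checked mechanically once the ground dependencies are known. The following is a minimal sketch using Python's standard `graphlib` module. It assumes the graph is given as a mapping from each ground variable to the set of variables it depends on; the function name and the example dependencies for Salary are illustrative only.

```python
from graphlib import CycleError, TopologicalSorter


def dependency_order(parents):
    """Return the variables with parents listed first, or None on a cycle.

    `parents` maps each ground variable to the set of variables it depends on.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError:
        return None


# Illustrative ground dependencies: Salary depends on Area and on Type.
ground_parents = {
    ("Salary", 101): {("Area", 101), ("Type", 201)},
    ("Salary", 102): {("Area", 102), ("Type", 201)},
}
order = dependency_order(ground_parents)

# A two-variable cycle violates the acyclicity requirement.
cyclic = dependency_order({"A": {"B"}, "B": {"A"}})
print(order, cyclic)
```

Any order returned this way is one in which each variable can be sampled after all of its parents.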
Reference Uncertainty
• Paper(PaperId,Topic), Venue(VenueId,Area), PublishedIn(PaperId,VenueId)
• Paper-Inst(301), Paper-Inst(302), Venue-Inst(stoc), Venue-Inst(focs), Venue-Inst(icse), Venue-Inst(pldi), Venue-Inst(isca)
• ∃Venue PublishedIn(301,Venue)
• ∃Venue PublishedIn(302,Venue)
• VenueKeys = { VenueId | Venue-Inst(VenueId) }
• VenueKeys = {stoc,focs,icse,pldi,isca}
FKDependsOn
• FKDependsOn(VenueId;Area;Topic) ← PublishedIn(PaperId,VenueId), Paper(PaperId,Topic), Venue(VenueId,Area)
• General form:
• FKDependsOn(variable;<partition-variable-list>;<parents>)
• Once a partition is chosen for a variable, the key is chosen uniformly from that partition.
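The two-stage choice described here (first a partition, then a key drawn uniformly within it) might be sketched as follows. The area assigned to each venue and the probabilities in `EXAMPLE_CPD` are invented for illustration and do not come from the slides; `sample_venue` is a hypothetical helper name.

```python
import random

# Hypothetical Area value for each venue key (not given in the slides).
VENUE_AREA = {
    "stoc": "theory",
    "focs": "theory",
    "icse": "software",
    "pldi": "software",
    "isca": "architecture",
}

# Hypothetical CPD: P(venue Area | paper Topic). The numbers are made up.
EXAMPLE_CPD = {
    "theory": {"theory": 0.8, "software": 0.1, "architecture": 0.1},
    "software": {"theory": 0.1, "software": 0.7, "architecture": 0.2},
}


def sample_venue(topic, cpd, venue_area, rng):
    """Pick a partition (an Area) given the Topic, then a key uniformly."""
    dist = cpd[topic]
    areas = list(dist)
    # Stage 1: choose the partition according to the CPD for this parent value.
    area = rng.choices(areas, weights=[dist[a] for a in areas])[0]
    # Stage 2: choose the foreign key uniformly within the chosen partition.
    partition = [v for v, a in venue_area.items() if a == area]
    return rng.choice(partition)


rng = random.Random(0)
venue = sample_venue("theory", EXAMPLE_CPD, VENUE_AREA, rng)
print(venue)
```

Partitioning by Area keeps the CPD small: it needs one entry per Area value rather than one per venue key.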
Ensuring Coherence
• We require that the parents and the variables that define the partition come before the fk variable in the dependency graph
• Also, any dependencies based on the fk must occur after the fk is determined.
Existence Uncertainty
• CiteExists(PaperId1,PaperId2,Exists)
• Cites(PaperId1,PaperId2) ← CiteExists(PaperId1,PaperId2,True)
• DependsOn(Exists;Topic1,Topic2) ← CiteExists(PaperId1,PaperId2,Exists), Paper(PaperId1,Topic1), Paper(PaperId2,Topic2)
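To make the role of the Exists variable concrete, the sketch below samples Exists for every ordered pair of distinct papers and keeps the pairs where it comes out True, mirroring the Cites clause above. It simplifies the CPD to depend only on whether the two topics match; that simplification, the probabilities, the paper topics, and the name `sample_cites` are all assumptions for illustration.

```python
import random

# Hypothetical P(Exists = True), keyed by whether the two topics are equal.
EXAMPLE_P_CITE = {True: 0.3, False: 0.02}


def sample_cites(paper_topics, p_cite, rng):
    """Sample Exists for each ordered paper pair; return pairs that are True."""
    cites = set()
    for p1, topic1 in paper_topics.items():
        for p2, topic2 in paper_topics.items():
            if p1 == p2:
                continue  # a paper does not cite itself in this sketch
            # Exists depends on Topic1 and Topic2 (here: only on equality).
            if rng.random() < p_cite[topic1 == topic2]:
                cites.add((p1, p2))
    return cites


# Hypothetical topics for the two papers in the sample KB.
topics = {301: "databases", 302: "databases"}
sampled = sample_cites(topics, EXAMPLE_P_CITE, random.Random(0))
print(sampled)
```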