NLP Interchange Format (NIF)

Presented by :

Swaran Lata

Email :

Dated:1st March 2013

Paradigm shift in the evolution of internet

  • “Internet is the network of networks.”

Web 1.0

Web 2.0

Web 3.0

Web 1.0

  • The first stage was about linking web pages and sharing information through them

  • Hyperlinks on the Web came into wide use from 1993, when the Mosaic browser was released

  • Characteristics

    • Personal Web pages

    • Static web pages

    • HTML based sites

    • HTML forms sent via email

    • Use of framesets

    • The main type of connection was dial-up, with about 50 kbit/s of bandwidth

    • Read only content

  • E.g. YouTube (business paradigm shift on the web)

    • Rebecca Black and Justin Bieber became international stars overnight

    • Dhanush’s “Kolaveri Di” became an international hit

Web 1.0 era

[Figure: Web 1.0 era. Labels: HTML static web pages; Content Management Systems.]


Web 2.0

  • Web 1.0 graduated into Web 2.0 during 2003-06

  • Web 2.0 is about user-generated content and the read-write web. People are consuming as well as contributing information through blogs

  • Concept of “prosumer” i.e. minimal differentiation between producer and consumer of content

  • Examples

    • Social networking sites

    • Blogs

    • Wikis

    • Video sharing sites

    • Hosted services

    • Web applications

    • Mashups

    • Folksonomies

Web 2.0 era

[Figure: Web 2.0 era. Label: RSS feed.]

Web 3.0

  • Will be a metaverse

  • Will be a web development layer that includes characteristics such as:

    • TV-quality open video

    • 3D simulations

    • augmented reality

    • human-constructed semantic standards

    • pervasive broadband, wireless, and sensors

  • A time when "the internet swallows the television."

  • Web 3.0 will allow users to sit back and let the Internet do all of the work for them

Web 3.0 (contd.)

  • Web 3.0 technologies (Semantic Web) include:

    1. Artificial intelligence

    2. Automated reasoning

    3. Cognitive architecture

    4. Composite applications

    5. Distributed computing

    6. Knowledge representation

    7. Ontology (computer science)

    8. Recombinant text

    9. Scalable vector graphics

    10. Semantic Web

    11. Semantic Wiki

    12. Software agents

Web 3.0 era

[Figure: Web 3.0 era. Labels: Better search engines; Linked data; Machine-readable data.]

What is the Semantic Web

  • Web of data

  • The Semantic Web is an extension of the current Web

  • It gives information a well-defined meaning, enabling computers and people to work in cooperation

  • Framework for sharing and reusing of data

  • Correlation of data with real world objects

Important components of Semantic Web

  • Major components:

    • Resource Description Framework (RDF)

    • Web Ontology Language (OWL)

    • Linked Data

    • Vocabulary

    • SPARQL

    • Simple Knowledge Organization System (SKOS)

Resource Description Framework (RDF)

  • A graph-based data model used to describe resources; RDF/XML is one of its serializations

  • Resources can include entities, concepts, properties and relations

  • Captures the metadata about the “externals” of a document

  • Data is expressed as RDF triples (subject, predicate, object), which can be serialized in several notations or drawn as a graph
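
As a minimal sketch of the triple model, the snippet below stores a few metadata statements about this presentation as (subject, predicate, object) tuples and prints them as N-Triples-style lines. The example.org document URI and the `to_ntriples` helper are assumptions made for illustration; the property URIs come from the Dublin Core element set.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The example.org URI is an illustrative assumption, not a real resource.
DOC = "http://example.org/docs/nif-slides"
DC = "http://purl.org/dc/elements/1.1/"

triples = [
    (DOC, DC + "title", '"NLP Interchange Format (NIF)"'),
    (DOC, DC + "creator", '"Swaran Lata"'),
    (DOC, DC + "date", '"2013-03-01"'),
]


def to_ntriples(statements):
    """Serialize triples as N-Triples-style lines (URIs in <>, literals as given)."""
    lines = []
    for s, p, o in statements:
        # Objects that start with a double quote are treated as literals.
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

The same three statements could equally be written in RDF/XML or drawn as a graph with one node and three outgoing edges; only the notation changes, not the data.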

Web Ontologies (OWL)

  • An ontology is an explicit specification of a conceptualization.

  • An ontology consists of a set of axioms which place constraints on sets of individuals (called "classes") and the types of relationships permitted between them.

  • OWL is used to define and instantiate Web ontologies.

  • OWL is a family of knowledge representation languages for authoring ontologies.

  • OWL differs from an XML schema in that it is a knowledge representation, not a message format.

  • Documents from different domains can be merged together to answer a user query.
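
To make the idea of axioms over classes more concrete, here is a small sketch of subclass reasoning. It is not an OWL reasoner, only a transitive walk over a hand-written subclass table; the `ex:` class names and both helper functions are hypothetical.

```python
# Hypothetical subclass axioms: each class maps to its direct superclass.
SUBCLASS_OF = {
    "ex:Word": "ex:String",
    "ex:Sentence": "ex:String",
    "ex:Noun": "ex:Word",
}


def superclasses(cls):
    """Follow subclass axioms transitively and return all superclasses."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def belongs_to(individual_class, target_class):
    """True if an individual of individual_class is also in target_class."""
    return (target_class == individual_class
            or target_class in superclasses(individual_class))


print(superclasses("ex:Noun"))
print(belongs_to("ex:Noun", "ex:String"))
print(belongs_to("ex:Sentence", "ex:Word"))
```

This kind of inference is what lets data from different domains be merged: a query for `ex:String` individuals also finds resources that were only declared as `ex:Noun`.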

Linked Data and its components

  • Linked Data describes a method of publishing structured data that makes it more useful and easier to understand.

  • Linked Data publishes data on the web in such a way that it is machine readable.

  • The data being linked may be as diverse as databases maintained by two organisations in different geographical locations, or heterogeneous systems within one organisation that have not easily interoperated at the data level.


  • URIs are used to identify things.

  • Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.

  • Provide useful information about the thing in the standard formats such as RDF/XML.

  • Include links to other, related URIs to improve discovery of other related information on the Web.
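
The second and third principles can be sketched with Python's standard library: a client asks for an RDF description of a thing by sending an HTTP request with an `Accept` header. The URI below is a hypothetical placeholder and `build_rdf_request` is an illustrative helper, so the request is only constructed, not sent.

```python
from urllib.request import Request

# Hypothetical HTTP URI identifying a "thing" (principles 1 and 2).
THING_URI = "http://example.org/resource/Hindi_WordNet"


def build_rdf_request(uri):
    """Prepare a request that asks the server for an RDF/XML description."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


request = build_rdf_request(THING_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
# Sending it with urllib.request.urlopen(request) would dereference the URI;
# that step is left out because example.org does not serve this resource.
```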

Linked Open Data (LOD2) Technology

  • The LOD2 stack is an integrated distribution of aligned tools that support the life-cycle of Linked (Open) Data, from extraction and authoring/creation through enrichment, interlinking and fusing to visualization and maintenance.

    The life-cycle comprises, in particular, the following stages:

  • Extraction of RDF from text, XML and SQL

  • Querying and Exploration using SPARQL

  • Authoring of Linked Data using a Semantic Wiki

  • Semi-automatic link discovery between Linked Data sources

  • Knowledge-base Enrichment and Repair
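
The querying stage can be illustrated without a SPARQL engine. The sketch below is a plain triple-pattern matcher over an in-memory set, where `None` plays the role of a SPARQL variable; it is a simplified stand-in for SPARQL, and the `ex:` names and the `match` helper are hypothetical.

```python
# A tiny in-memory triple store; the prefixed names are hypothetical.
GRAPH = {
    ("ex:doc1", "rdf:type", "ex:Document"),
    ("ex:doc1", "ex:mentions", "ex:SemanticWeb"),
    ("ex:doc2", "rdf:type", "ex:Document"),
    ("ex:doc2", "ex:mentions", "ex:WordNet"),
}


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts like a SPARQL variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly: SELECT ?doc WHERE { ?doc ex:mentions ex:WordNet }
docs = [s for s, _, _ in match(GRAPH, p="ex:mentions", o="ex:WordNet")]
print(docs)
```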

Linked Open Data (LOD2) Project

  • NLP2RDF is a LOD2 project that is developing the NLP Interchange Format (NIF).

  • NIF aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations.

  • The output of NLP tools can be converted into RDF and used in the LOD Stack.

What is NIF

  • NLP Interchange Format (NIF) is an RDF/OWL-based format that allows several Natural Language Processing (NLP) tools to be combined and chained in a flexible, light-weight way.

    The core of NIF consists of three parts:

    1. A set of URI recipes, used to create unique and potentially stable URIs to anchor annotations in documents.

    2. A vocabulary, which can represent Strings, Words and Sentences as RDF resources.

    3. Transformations for the programmatic usage of the Ontologies of Linguistic Annotations (OLiA).
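
A URI recipe can be sketched as a function that turns a document URI plus character offsets into a fragment URI. The version below is loosely modelled on an offset-based recipe; the exact fragment syntax (`offset_<begin>_<end>_<snippet>`) and the `offset_uri` helper are assumptions for illustration and may differ from the NIF specification.

```python
from urllib.parse import quote


def offset_uri(doc_uri, text, begin, end, context_chars=20):
    """Build an offset-style fragment URI for the substring text[begin:end]."""
    if not 0 <= begin < end <= len(text):
        raise ValueError("offsets must satisfy 0 <= begin < end <= len(text)")
    # A short, URL-encoded snippet makes the URI readable and easier to verify.
    snippet = quote(text[begin:end][:context_chars])
    return f"{doc_uri}#offset_{begin}_{end}_{snippet}"


TEXT = "NIF anchors annotations in documents."
uri = offset_uri("http://example.org/doc.txt", TEXT, 0, 3)
print(uri)
```

Because the URI is derived only from the document and the offsets, two independent tools that annotate the same span arrive at the same URI, which is what makes their outputs mergeable.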

Important Components of NIF

  • Structural Interoperability: URI recipes are used to anchor annotations in documents with the help of fragment identifiers. The URI recipes are complemented by two ontologies (String Ontology and Structured Sentence Ontology), which are used to describe the basic types of these URIs (i.e. String, Document, Word, Sentence) as well as the relations between them.

  • Conceptual Interoperability: The Structured Sentence Ontology (SSO) was especially developed to connect existing ontologies with the String Ontology and thus attach common annotations to the text fragment URIs. The NIF ontology can easily be extended and integrates several NLP ontologies.

  • Access Interoperability: A REST interface description for NIF components and web services allows NLP tools to interact on a programmatic level.

NIF – Integration Architecture

[Figure: NIF integration architecture. Labels: NIF Wrapper (three instances); RDF Model.]


Associated Standards

  • Web Ontology Language (OWL)

  • NLP

  • Linked Data

  • RDF

How NIF Helps Meet the NLP Requirements of the Web

  • All URIs created by the mentioned URI recipes should be typed with the respective OWL class.

  • In each returned NIF model there should be at least one URI that relates to the document as a whole.

  • Every other annotated String should be related to the URI given to the Document with a property that is a sub-property of str:subString.

  • For each annotation, a reference model should be used, so that the annotations are machine-interpretable.
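
The first three requirements can be checked mechanically. The sketch below validates a NIF-like model held as a list of triples. Only `str:subString` is named in the text above; the class name `str:Document`, the fragment URIs and the `check_nif_model` helper are assumptions for illustration. Sub-properties of `str:subString` are not resolved, and the fourth requirement (reference models) is not checked.

```python
RDF_TYPE = "rdf:type"
# Prefixed names are illustrative; only str:subString appears in the text.
DOCUMENT_CLASS = "str:Document"
SUBSTRING = "str:subString"


def check_nif_model(triples):
    """Return a list of violations of the three structural requirements."""
    subjects = {s for s, _, _ in triples}
    typed = {s for s, p, _ in triples if p == RDF_TYPE}
    documents = {s for s, p, o in triples
                 if p == RDF_TYPE and o == DOCUMENT_CLASS}
    linked = {o for s, p, o in triples if p == SUBSTRING and s in documents}

    problems = []
    if not documents:
        problems.append("no URI refers to the document as a whole")
    for uri in sorted(subjects - typed):
        problems.append(f"{uri} has no rdf:type")
    for uri in sorted(subjects - documents - linked):
        problems.append(f"{uri} is not linked to a document via {SUBSTRING}")
    return problems


model = [
    ("doc#offset_0_37", RDF_TYPE, DOCUMENT_CLASS),
    ("doc#offset_0_3", RDF_TYPE, "str:String"),
    ("doc#offset_0_37", SUBSTRING, "doc#offset_0_3"),
]
print(check_nif_model(model))      # complete model
print(check_nif_model(model[:2]))  # same model without the subString link
```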

How NLP Tools Are Integrated with NIF Models

  • NLP tools can be integrated with NIF if an adapter is created that can parse a NIF model into the tool's internal data structure and output the results as a NIF serialization.

    An NLP pipeline can then be formed by either:

  • Passing the NIF RDF model from tool to tool

  • Passing the text to each tool and then merging the NIF outputs into one large model

    The URI recipes of NIF are designed to make it possible to have zero overhead and use only one triple per annotation.
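
The second pipeline style (pass the text to each tool, then merge) can be sketched with two toy "tools" that each emit one triple per annotation. Because both derive their URIs from the same offsets, merging is a plain set union. The tool functions, the `ex:` names and the shortened offset fragment are hypothetical.

```python
def word_spans(text):
    """Yield (begin, end, word) for each whitespace-separated word."""
    pos = 0
    for word in text.split():
        begin = text.index(word, pos)
        pos = begin + len(word)
        yield begin, pos, word


def tokenizer(doc_uri, text):
    """Toy tool 1: one typing triple per word."""
    return {(f"{doc_uri}#offset_{b}_{e}", "rdf:type", "sso:Word")
            for b, e, _ in word_spans(text)}


def capital_tagger(doc_uri, text):
    """Toy tool 2: one tagging triple per capitalised word."""
    return {(f"{doc_uri}#offset_{b}_{e}", "ex:tag", "ex:Capitalised")
            for b, e, w in word_spans(text) if w[0].isupper()}


DOC, TEXT = "http://example.org/doc.txt", "NIF chains NLP tools"

# Each tool sees the raw text; the outputs are merged into one model.
merged = tokenizer(DOC, TEXT) | capital_tagger(DOC, TEXT)
for triple in sorted(merged):
    print(triple)
```

Neither tool needs to know about the other: annotations on the same span meet at the same subject URI in the merged model.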

The Structure of WordNet

[Figure: structure of WordNet as an RDF graph. Node and edge labels: wn:word, wn:wordSense, wn:synset, wn:hasSense, wn:inSynset, rdf:type, wn:lexical.]

  • Word senses are related to other word senses, e.g. antonym

  • Synsets are related to other synsets, e.g. hypernym, hyponym
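
Reading the figure labels as class and property names, the word, word sense and synset layers can be sketched as triples. The spelling of the `wn:` names is a guess based on those labels, and the `ex:` entries and the `synsets_of` helper are made up for illustration; none of this is taken from an actual WordNet RDF release.

```python
# Class and property names follow the figure labels (spelling assumed);
# the ex: entries are invented sample data.
wordnet = [
    ("ex:word-hot", "rdf:type", "wn:word"),
    ("ex:word-hot", "wn:hasSense", "ex:sense-hot-1"),
    ("ex:sense-hot-1", "rdf:type", "wn:wordSense"),
    ("ex:sense-hot-1", "wn:inSynset", "ex:synset-hot"),
    ("ex:sense-hot-1", "ex:antonym", "ex:sense-cold-1"),
    ("ex:synset-hot", "rdf:type", "wn:synset"),
]


def synsets_of(word, triples):
    """Follow word -> hasSense -> wordSense -> inSynset -> synset."""
    senses = {o for s, p, o in triples if s == word and p == "wn:hasSense"}
    return sorted(o for s, p, o in triples
                  if s in senses and p == "wn:inSynset")


print(synsets_of("ex:word-hot", wordnet))
```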

How WordNet is related to the Semantic Web/RDF

  • Database of the different lexical and semantic relations between Hindi words

[Figure labels: Linked Data; Hindi WordNet.]

XML Model

  • XML is a tree-structured document

    • Nodes

      • Element nodes

        • Children can be ordered

        • Recursive elements (parts under parts)

      • Attribute nodes

      • Mandatory or optional

    • Edges

      • Sub-element edges

      • Attribute edges

      • IDRef edges

    • Constraints

      • References

      • Value restrictions, OneOf

      • Cardinality

  • Trees are more flexible than tables

    • Any number of nodes can be added anywhere without breaking the model
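
The tree model can be tried out with Python's standard `xml.etree.ElementTree` module. The sample document (parts under parts, with a `ref` attribute acting as an IDREF-style edge) and the `depth` helper are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

XML = """
<part id="p1" name="engine">
  <part id="p2" name="piston" optional="true"/>
  <part id="p3" name="valve">
    <part id="p4" name="spring"/>
  </part>
  <uses ref="p2"/>
</part>
"""

root = ET.fromstring(XML)


def depth(element):
    """Nesting depth of <part> elements (recursive elements)."""
    return 1 + max((depth(child) for child in element.findall("part")),
                   default=0)


# Element nodes keep document order; attributes are name -> value pairs.
print([child.get("name") for child in root.findall("part")])
print(depth(root))

# An IDREF-style edge: <uses ref="..."> points at the element with that id.
by_id = {el.get("id"): el for el in root.iter("part")}
print(by_id[root.find("uses").get("ref")].get("name"))

# Adding a node anywhere still leaves a well-formed tree.
ET.SubElement(by_id["p4"], "part", {"id": "p5", "name": "coil"})
print(depth(root))
```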

Future work

  • Converting WordNet to RDF format

  • Linking WordNet with other ontologies, such as:

    - Library

    - ISSN

  • Matching WordNet against a generic ontology

  • Proliferation of Semantic Web/Linked Data through creating awareness

  • Development of Semantic Web/Linked Data resources for use in WordNet

  • Exploring opportunities for implementing the Semantic Web in Indian languages

Thanks & Questions