Semantic Web

Introduction
Semantics is the study of meaning in language: how signs and symbols are interpreted by agents or communities within the same or different contexts.
The word came into English through French and has its origins in the Greek “semantikos”, meaning “significant”.

The word “web” originally means a complex network of fine threads constructed by a spider. The WWW (World Wide Web, or W3) is a system of interconnected servers that host specially formatted documents. These documents are written in HTML (Hypertext Markup Language), which links them to other documents and to audio, video and graphics. They can also carry metadata, that is, data that describes other data.

The semantic web is an effort to enhance the web so that the required knowledge can be extracted from linked documents across the internet or any network of computers. Its main goal is to retrieve the data or knowledge inside the documents, not just the interlinked documents themselves.

The current web is designed for machine-to-human communication, not machine-to-machine communication.

An alternative approach is to represent web content in a form that machines can process.

Intelligent techniques can then be used so that machines retrieve the correct information and data.

The World Wide Web, which runs on top of the internet, was invented in 1989 by the computer scientist Sir Timothy John Berners-Lee, also known as TimBL.

WEB 1.0
It is the read-only web. The first web allowed us to search for information and read it. It consisted of static HTML pages that users could only read. The main goal of Web 1.0 was to provide information to any user, at any time, anywhere in the world.

WEB 2.0
According to Berners-Lee's description, we are currently working in the “Read-Write” web. With Web 2.0, we can contribute content of our own and interact with content created by other users. It changed the web in a very short time. Examples include Wikipedia, YouTube, LinkedIn, blogs and social media such as Facebook, all of which rely on user submissions. In Web 2.0, users have far more opportunity to contribute to and interact with content.

WEB 3.0
It is a web of data, which Berners-Lee described as the “Read-Write-Execute” web. It can be thought of as a combination of semantic markup and web services. It reduces the communication gap between humans and computerized applications. A major challenge for the web is presenting information to users in context. Currently, software applications cannot attach context to data, so it is difficult for them to decide what is relevant.

For example, if a user searches for “apple”, how can the system tell whether the user means the fruit, the mobile phone, or other electronic devices that share the same name?

Examples of Web 3.0 are WolframAlpha and Alexa devices.

  1. WolframAlpha is a computational intelligence search engine.

For example, if we search for “India” in the Google search engine, it displays news and sites related to India.
If we search for “India” in WolframAlpha, it gives a more detailed analysis of India.

Result of Google

Result of Wolframalpha
  2. Alexa uses speech recognition and artificial intelligence to turn spoken requests into actions.

Web 3.0 uses dynamic applications, interactive services and “machine-to-machine” interaction. It refers to the next generation of the web, which aims for better communication and helps end users with their daily needs.

WEB 3.0 (Semantic Web)

As shown in the figure above, the semantic web is divided into separate layers. Each layer is described briefly below.

URI: Uniform Resource Identifier

A URI is a string in a standardized form that identifies a specific resource (such as a document) on the internet. One subset of URIs is the URL, which gives the location of a document, for example: http://www.abc.org/

Another subset of URIs is the URN, which identifies a resource without implying its location on the web.

Machines need a unique, consistent way to identify a concept or a specific thing.
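The difference between a URL and a URN can be seen by splitting each into its parts. The sketch below uses Python's standard urllib.parse module; the URN "urn:isbn:0123456789" is an illustrative assumption built from the ISBN example in this article, not an identifier taken from a real catalogue.

```python
from urllib.parse import urlparse

# A URL names a resource by where it lives (scheme + host + path).
url = urlparse("http://www.abc.org/")

# A URN names a resource without saying where it lives.
# This URN is a made-up illustration, not a registered identifier.
urn = urlparse("urn:isbn:0123456789")

print("URL:", url.scheme, url.netloc, url.path)
print("URN:", urn.scheme, repr(urn.netloc), urn.path)
```

The URL has a network location (www.abc.org), while the URN has none: its meaning comes entirely from the naming scheme.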

Examples: Standard identifiers

ISBN (International Standard Book Number): for example 0-123-45678-9. It is a unique numeric identifier for a book, commonly printed as a bar code. It is used by publishers and booksellers, and for stock control.
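Part of what makes such identifiers machine-friendly is that they can be checked automatically. As a minimal sketch, the function below validates the ten-digit ISBN form using its standard check-digit rule (weights 10 down to 1, sum divisible by 11). The function name is hypothetical, and the thirteen-digit ISBN form is not covered.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Check a ten-digit ISBN using the modulo-11 check-digit rule."""
    chars = [c for c in isbn if c not in "- "]  # ignore separators
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char in "0123456789":
            value = int(char)
        elif char in "Xx" and position == 9:
            value = 10  # "X" stands for 10, allowed only as the check digit
        else:
            return False
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


print(is_valid_isbn10("0-123-45678-9"))  # the example above
print(is_valid_isbn10("0-123-45678-8"))  # same number with a wrong check digit
```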

ISMN (International Standard Music Number): It is a thirteen-digit identifier for printed music, standardized by ISO.

ISAN (International Standard Audiovisual Number): It is a voluntary numbering system for identifying audiovisual works. It provides a unique, internationally recognized and permanent number for each audiovisual work registered in the ISAN system.

An internationalized generalization of the URI is the “Internationalized Resource Identifier” (IRI), which allows characters from the Unicode character set.

UNICODE:
It is a standard for encoding international character sets, which allows text in all human languages to be read and written on the internet using one standard format.
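As a small illustration, the sketch below takes sample strings in three different scripts, lists the Unicode code point of each character, and round-trips each string through UTF-8. UTF-8 is only one of the encodings defined for Unicode and is chosen here as an assumption; the sample strings are arbitrary.

```python
# Sample strings in three scripts (arbitrary choices for illustration).
samples = ["Web", "वेब", "网络"]

for text in samples:
    code_points = [f"U+{ord(ch):04X}" for ch in text]  # one entry per character
    encoded = text.encode("utf-8")                      # text -> bytes
    decoded = encoded.decode("utf-8")                   # bytes -> text
    print(text, code_points, len(encoded), "bytes", decoded == text)
```

Every string survives the round trip unchanged, which is what allows one format to carry text in any language.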

XML:  Extensible Markup Language
The Extensible Markup Language (XML) layer includes XML namespaces and XML Schema definitions, which provide the common syntax used in the semantic web.

XML is a general-purpose markup language for documents containing structured information. A document is made up of elements, which can be nested and which can carry attributes and content.
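To make the terms element, attribute and content concrete, here is a minimal sketch that parses a small XML document with Python's standard xml.etree.ElementTree module. The document itself (a book with a title and an author) is a made-up example, not taken from any real dataset.

```python
import xml.etree.ElementTree as ET

# A made-up document: <book> has an attribute and two nested elements.
document = """
<book isbn="0-123-45678-9">
  <title>Example Title</title>
  <author>
    <name>Example Author</name>
  </author>
</book>
""".strip()

root = ET.fromstring(document)

print(root.tag)                       # element name
print(root.attrib["isbn"])            # attribute value
print(root.find("title").text)        # content of a child element
print(root.find("author/name").text)  # content of a nested element
```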

Resource Description Framework (RDF)

R - Resources (pages, documents, ideas: everything that can have a URI)
D - Description (attributes, features and relations of the resources)
F - Framework (models, languages and syntaxes for these descriptions)

RDF is the core data representation format for the semantic web.
It is a framework for representing information about resources in the form of a graph. It was primarily intended for representing metadata about WWW resources, such as the title, author, and modification date of a web page, but it can be used for storing any other data.

RDF is based on subject-predicate-object triples, which together form a graph of data.
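The triple model can be sketched without any RDF library by storing each statement as a (subject, predicate, object) tuple. In the sketch below, the page and person URIs under http://www.abc.org/ and the date value are hypothetical; the title, creator and date properties come from the Dublin Core element set, which is a common choice for this kind of metadata but is an assumption here.

```python
from collections import defaultdict

EX = "http://www.abc.org/"               # hypothetical namespace for our resources
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set

# Each statement is one (subject, predicate, object) triple.
triples = [
    (EX + "page1", DC + "title", "Semantic Web"),
    (EX + "page1", DC + "creator", EX + "alice"),
    (EX + "page1", DC + "date", "2020-01-01"),  # made-up modification date
]

# Grouping triples by subject gives the outgoing edges of each graph node.
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))

for predicate, obj in graph[EX + "page1"]:
    print(predicate, "->", obj)
```

Note that the object of the creator triple is itself a URI, so it can become the subject of further triples. That is how separate statements link up into a graph.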

Web Ontology Language (OWL)

An ontology is a way to describe taxonomies and classifications of simple and complex networks of concepts.

OWL is a language for representing rich and complex knowledge about things, groups of things and the relations between things. Its aim is to facilitate interoperability among web content by using vocabularies and formats that allow automatic machine processing.

More detailed ontologies can be created with the Web Ontology Language (OWL).

OWL has three sublanguages:

  1. OWL Lite for taxonomies and simple constraints.
  2. OWL DL for full description logic support.
  3. OWL Full for maximum expressiveness.
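The taxonomies mentioned above rest on subclass relations, and a reasoner can follow those relations to infer facts that were never stated directly. The sketch below is not OWL itself: it is a plain Python illustration of that one inference, with hypothetical class names and the simplifying assumption that each class has at most one direct superclass.

```python
# Hypothetical taxonomy: each class maps to its single direct superclass.
subclass_of = {
    "Smartphone": "ElectronicDevice",
    "ElectronicDevice": "Product",
    "Apple": "Fruit",
    "Fruit": "Food",
}


def superclasses(cls: str) -> list[str]:
    """Follow subclass links upward; subclass relations are transitive."""
    result = []
    seen = {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:  # guard against accidental cycles in the data
            break
        seen.add(cls)
        result.append(cls)
    return result


print(superclasses("Smartphone"))
print(superclasses("Apple"))
```

Nothing in the data says that a smartphone is a product, yet the function derives it. This is the kind of conclusion an ontology lets a machine draw.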

Tools that support OWL include:

  1. Apache Jena (Java)
  2. GraphDB (Java, C#)

SPARQL Protocol and RDF Query Language (SPARQL)

SPARQL is the query language for the semantic web.

We can use it for the following tasks:

  1. Pulling values from structured and semi-structured data.
  2. Exploring data by querying unknown relationships.
  3. Performing complex joins across different datasets in a single query.

SPARQL is an SQL-like language whose queries are built from RDF triple patterns (subject, predicate and object). It is both a query language and a protocol for accessing RDF data.
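The idea of matching triple patterns and joining the results can be sketched in plain Python. This is not a SPARQL engine, only an illustration of the principle: None plays the role of a query variable, and the "ex:" identifiers are hypothetical names invented for the example.

```python
# A tiny hypothetical dataset of (subject, predicate, object) triples.
triples = [
    ("ex:India", "ex:type", "ex:Country"),
    ("ex:India", "ex:capital", "New Delhi"),
    ("ex:France", "ex:type", "ex:Country"),
    ("ex:France", "ex:capital", "Paris"),
    ("ex:Paris", "ex:type", "ex:City"),
]


def match(s=None, p=None, o=None):
    """Return all triples that fit the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Pattern 1: which subjects are countries?
countries = [s for s, _, _ in match(p="ex:type", o="ex:Country")]

# Pattern 2, joined on the subject: what is each country's capital?
capitals = {c: o for c in countries for _, _, o in match(s=c, p="ex:capital")}

print(capitals)
```

A real SPARQL query would express both patterns in one WHERE clause and let the engine perform the join.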

A SPARQL endpoint accepts queries and returns results via HTTP. “Generic” endpoints can query any RDF data that is accessible via the web, while “specific” endpoints are used to query particular datasets.
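Because the exchange runs over HTTP, a query can be sent as an ordinary GET request with the query text in a "query" parameter. The sketch below only builds such a request and does not send it. The endpoint address http://www.abc.org/sparql is a hypothetical placeholder, and asking for JSON results through the Accept header is one possible choice.

```python
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request

ENDPOINT = "http://www.abc.org/sparql"  # hypothetical endpoint address
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

# The query text travels URL-encoded in the "query" parameter.
url = ENDPOINT + "?" + urlencode({"query": query})

# Build (but do not send) the HTTP request, asking for JSON results.
request = Request(url, headers={"Accept": "application/sparql-results+json"})

print(request.get_method())
print(request.full_url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlparse(url).query)["query"][0] == query)
```

Passing the request to urllib.request.urlopen would send it. That step is left out here because the endpoint is not real.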

The results of a SPARQL query can be returned in the following formats:

XML, JSON, RDF and HTML
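As an example of handling one of these formats, the sketch below reads a response in the SPARQL JSON results layout, where "head" lists the variable names and "results" holds one binding per row. The response text is hand-written for illustration and does not come from a real endpoint.

```python
import json

# Hand-written sample response in the SPARQL JSON results layout.
response_text = """
{
  "head": {"vars": ["country", "capital"]},
  "results": {
    "bindings": [
      {
        "country": {"type": "uri", "value": "http://www.abc.org/India"},
        "capital": {"type": "literal", "value": "New Delhi"}
      }
    ]
  }
}
"""

data = json.loads(response_text)
variables = data["head"]["vars"]

# Flatten each binding into a simple {variable: value} dictionary.
rows = [
    {var: binding[var]["value"] for var in variables if var in binding}
    for binding in data["results"]["bindings"]
]

for row in rows:
    print(row)
```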

Dataset: Friend of a Friend (FOAF)

FOAF is a standard RDF vocabulary for describing people, their relationships with other people, and their activities and objects.

FOAF lets groups of people describe their social networks in a decentralized way, so it does not require any centralized database system.

Computers use FOAF profiles to find details about a specific object or person. This is similar to how a social networking site such as Facebook can show a friends list and a list of friends' friends.
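Finding friends and friends of friends comes down to following "knows" links between people. The sketch below stores those links as triples using the FOAF knows property; the people (alice, bob, carol, dave) are hypothetical, and short names stand in for the full URIs a real profile would use.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical people; real FOAF data would identify them by URI.
triples = [
    ("alice", FOAF_KNOWS, "bob"),
    ("alice", FOAF_KNOWS, "carol"),
    ("bob", FOAF_KNOWS, "carol"),
    ("bob", FOAF_KNOWS, "dave"),
]


def friends(person: str) -> set[str]:
    """People this person is stated to know."""
    return {o for s, p, o in triples if s == person and p == FOAF_KNOWS}


def friends_of_friends(person: str) -> set[str]:
    """People known by this person's friends, excluding direct friends."""
    direct = friends(person)
    indirect = set()
    for friend in direct:
        indirect |= friends(friend)
    return indirect - direct - {person}


print(sorted(friends("alice")))
print(sorted(friends_of_friends("alice")))
```

Each person's links could live in a separate file on a separate server, and the same traversal would still work once the files are fetched, which is why no central database is needed.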

Proof & Trust

Both of these layers are still under research. In any application, a proof is constructed according to predefined rules, and all parties need to be able to validate that proof for the data in question.

The proof layer needs to explain the results that are returned and the mechanism that produced them. The internal mechanism of the reasoning system also needs to be verifiable.

Cryptography

Cryptography can be used to make the information in all of the above layers trusted and reliable.

Digital signatures can be used to verify the origin of the information supplied by each agent in the different layers.
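The sign-then-verify pattern can be sketched with Python's standard library. Note the substitution: the standard library has no public-key signatures, so this sketch uses an HMAC, a message authentication code based on a shared secret key. It is not a true digital signature, which would use a private key to sign and a public key to verify, but the workflow is the same. The key and the message are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-shared-key"  # hypothetical key, for illustration only


def sign(message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message (stand-in for a signature)."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message), tag)


statement = b"ex:India ex:capital 'New Delhi'"
tag = sign(statement)

print(verify(statement, tag))                          # untouched statement
print(verify(b"ex:India ex:capital 'Mumbai'", tag))    # altered statement
```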

Fig2: Web evolution

The figure above shows how the web has changed gradually over time.

It covers the past, present and future of the web. Currently we are working in Web 2.0 and moving toward Web 3.0.
