Artificial Intelligence for Big Data

RDF—the universal data format

With the background of Ontologies and their significance in the big data world, let us look at a universal data format for the schematic representation of Ontologies. One of the most widely adopted frameworks is the Resource Description Framework (RDF). RDF was first published as a W3C recommendation in 1999, and the revised set of specifications has been a W3C recommendation since 2004. RDF provides a structure for describing identified things, entities, or concepts in a form designed to be read and interpreted by computers.

There is a critical need to identify an entity or concept uniquely and universally. One of the most popular ways of doing so in the information science field is the use of Uniform Resource Identifiers (URIs). We are familiar with website addresses, which are represented as Uniform Resource Locators (URLs). A URL both names a resource and says how to locate it: the hostname in the URL resolves to an IP address on the internet. A URI is very similar to a URL (every URL is in fact a URI), with the difference that a URI may or may not point to something that can be retrieved from the web. Given this distinction, the URIs that represent real-world objects must be unambiguous. Any URI should refer exclusively to either a web resource or a real-world object and should never be used to represent both at the same time, in order to avoid confusion and ambiguity.

Here is a basic example that describes the https://www.w3schools.com/rdf resource:
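As a minimal sketch of what such a description can look like, the Python snippet below holds a small RDF/XML document in a string and reads it back with the standard xml.etree.ElementTree module. The ex prefix, its http://example.org/terms# namespace, and the title and topic values are assumptions made for illustration; they are not taken from the resource itself.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical description: the "ex" vocabulary and both property
# values are placeholders chosen for this sketch.
RESOURCE_RDF = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="https://www.w3schools.com/rdf">
    <ex:title>RDF Tutorial</ex:title>
    <ex:topic>Resource Description Framework</ex:topic>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(RESOURCE_RDF)

# ElementTree exposes namespaced names as "{namespace-uri}local-name".
description = root.find(f"{{{RDF_NS}}}Description")
about = description.get(f"{{{RDF_NS}}}about")

# Collect the property elements as local-name -> literal value.
properties = {child.tag.split("}")[1]: child.text for child in description}

print(about)
print(properties)
```

The rdf:about attribute carries the URI of the resource being described, and each child element states one property of that resource.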

When defining RDF data, the following considerations apply:

  • Define a simple data model
  • Define formal semantics
  • Use extensible URI-based vocabulary
  • Preferably use an XML-based syntax 

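The basic building block of RDF is a triple that consists of a subject, a predicate, and an object. A set of triples constitutes an RDF graph, in which subjects and objects are nodes and each predicate is a directed edge from its subject to its object.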
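As a minimal sketch, a triple and a graph can be modeled in plain Python as a named tuple and a set. The Triple class, the objects helper, the book URI, and the literal values are all hypothetical names and placeholders introduced for this illustration; only the book namespace comes from this section.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Namespace of the book vocabulary used in this section.
BOOK = "http://www.artificial-intelligence.big-data/book#"
# Hypothetical URI identifying one book (an assumption for illustration).
book_uri = "http://www.artificial-intelligence.big-data/book/example-book"

# An RDF graph is a set of triples; the literal values are placeholders.
graph = {
    Triple(book_uri, BOOK + "author", "Example Author"),
    Triple(book_uri, BOOK + "company", "Example Publisher"),
    Triple(book_uri, BOOK + "year", "2018"),
}


def objects(graph, subject, predicate):
    """Return all objects stated for a subject and predicate, sorted."""
    return sorted(
        t.object for t in graph
        if t.subject == subject and t.predicate == predicate
    )


print(objects(graph, book_uri, BOOK + "author"))
```

Using a set mirrors the fact that an RDF graph has no ordering and no duplicate statements. A dedicated RDF library would normally replace this hand-rolled structure.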

Let us look at an example of a database of books and represent it with RDF/XML:
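The sketch below is a hedged reconstruction of such a document, held in a Python string so that it can be checked and flattened into triples with the standard library. The namespaces and the author, company, and year properties follow the description in this section. The two book URIs and all literal values are placeholders, and to_triples is a hypothetical helper that handles only this simple shape of RDF/XML, not the full syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Book URIs and literal values are placeholders, not real catalog data.
BOOKS_RDF = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:book="http://www.artificial-intelligence.big-data/book#">
  <rdf:Description
      rdf:about="http://www.artificial-intelligence.big-data/book/example-one">
    <book:author>Example Author One</book:author>
    <book:company>Example Publisher</book:company>
    <book:year>2017</book:year>
  </rdf:Description>
  <rdf:Description
      rdf:about="http://www.artificial-intelligence.big-data/book/example-two">
    <book:author>Example Author Two</book:author>
    <book:company>Example Publisher</book:company>
    <book:year>2018</book:year>
  </rdf:Description>
</rdf:RDF>
"""


def to_triples(rdf_xml):
    """Flatten each rdf:Description into (subject, predicate, object) tuples.

    Covers only literal-valued property elements inside rdf:Description
    elements that carry rdf:about.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # Tags look like "{namespace-uri}local-name"; the predicate URI
            # is the namespace URI followed by the local name.
            namespace, local_name = prop.tag[1:].split("}")
            triples.append((subject, namespace + local_name, prop.text))
    return triples


for triple in to_triples(BOOKS_RDF):
    print(triple)
```

Each rdf:Description yields one triple per property element, so the two placeholder books produce six triples in total.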

The first line of the RDF document is the XML declaration. The XML declaration is followed by the root element of the RDF document, <rdf:RDF>.

The xmlns:rdf namespace declaration specifies that the elements with the rdf prefix are from the http://www.w3.org/1999/02/22-rdf-syntax-ns# namespace. XML namespaces are used to provide uniquely named elements and attributes in an XML document.

The xmlns:book namespace declaration specifies that the elements with the book prefix are from the http://www.artificial-intelligence.big-data/book# namespace.

The <rdf:Description> element contains the description of the resource identified by the rdf:about attribute.

The elements <book:author>, <book:company>, <book:year>, and so on are properties of the resource.

W3C provides an online validator service (https://www.w3.org/RDF/Validator/), which checks the syntax of an RDF document and generates tabular and graphical views of it.