Graph Databases

In this series of blog posts I will discuss all things related to graph databases. Coming from a relational world, I was drawn to graph databases primarily for the following reasons:

  • Flexible schema
  • Linked data (a.k.a. the semantic web)
  • Analytics

I will begin by formally defining a graph.

A graph is just a collection of vertices and edges—or, in less intimidating language, a set of nodes and the relationships that connect them. Graphs represent entities as nodes and the ways in which those entities relate to the world as relationships. This general-purpose, expressive structure allows us to model all kinds of scenarios.
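As a minimal sketch of that definition, a graph can be written down as nothing more than a set of nodes and a set of edges. The node names and the neighbors helper below are hypothetical, chosen only for illustration, and the edges are assumed to be directed (source, target) pairs.

```python
# Hypothetical example data: these names are illustrative, not from a real dataset.
nodes = {"alice", "bob", "carol"}

# Assumption: each edge is a directed (source, target) pair of node names.
edges = {("alice", "bob"), ("bob", "carol")}


def neighbors(node, edges):
    """Return the set of nodes reached by following one edge out of `node`."""
    return {target for source, target in edges if source == node}


print(neighbors("alice", edges))
```

Even this bare structure supports the basic graph operation of asking what a node is connected to, which is the operation the richer models below build on.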

Graph Space

One way to slice the Graph space is to look at the graph models employed by the various technologies. There are three dominant graph data models:

  • Property graph
  • Resource Description Framework (RDF) triples
  • Hypergraphs

Another way to slice the graph space is by the workload the technology targets:

  • Graph technology used for online transaction processing (OLTP)
  • Graph technology used for online analytical processing (OLAP)

Property Graph

  • It contains nodes and relationships
  • Nodes contain properties (key-value pairs)
  • Relationships are named and directed, and always have a start and end node
  • Relationships can also contain properties (key-value pairs)
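The four points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node and Relationship classes, the outgoing helper, and the sample data are hypothetical and do not correspond to the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with an identifier and key-value properties."""
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship that always has a start and an end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
alice = Node("alice", {"name": "Alice", "age": 34})
acme = Node("acme", {"name": "Acme Corp"})
works_at = Relationship("WORKS_AT", alice, acme, {"since": 2015})


def outgoing(node, relationships):
    """Return the relationships whose start node is `node`."""
    return [rel for rel in relationships if rel.start is node]


for rel in outgoing(alice, [works_at]):
    print(rel.start.node_id, rel.name, rel.end.node_id, rel.properties)
```

Note that the relationship carries its own properties (here, since), which is the feature that distinguishes the property graph model from the RDF model.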

The following diagram illustrates a Property Graph.

RDF

RDF provides a general, flexible method to decompose any knowledge into small pieces, called triples, with some rules about the semantics (meaning) of those pieces.

  • A fact is expressed as a triple of the form (Subject, Predicate, Object).
  • Subjects, predicates, and objects are given as names for entities, whether concrete or abstract, in the real world. Unlike the nodes and relationships of a property graph, triples carry no properties of their own.
  • Names are in the format of URIs, which are opaque and global.
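To make the triple form concrete, here is a small sketch that stores facts as (subject, predicate, object) tuples of URI strings and looks them up by pattern. The http://example.org/ namespace, the sample facts, and the match helper are all assumptions made for illustration; a real RDF store would expose a query language rather than this function.

```python
# Hypothetical namespace: example.org is used purely for illustration.
EX = "http://example.org/"

# Each fact is a (subject, predicate, object) triple of URIs.
triples = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "worksAt", EX + "acme"),
    (EX + "bob", EX + "worksAt", EX + "acme"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return {
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }


# Everyone who works at Acme, according to the sample facts.
for s, p, o in sorted(match(triples, predicate=EX + "worksAt", obj=EX + "acme")):
    print(s)
```

Because every position in a triple is a global name, two datasets that use the same URIs can be merged by simply taking the union of their triple sets.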

The following diagram illustrates an RDF graph.

RDF triples fall under the general category of graph databases because they deal in data that, once processed, tends to be logically linked.

Since RDF typically integrates a web of data drawn from many sources, the notion of provenance is essential. An optional fourth element was therefore introduced to extend the triple. This element is referred to as the ‘context’, and the resulting four-part statement is referred to as an RDF quad.
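As a rough sketch of how the context element supports provenance, the example below attaches an optional fourth value to each statement and groups the statements by where they came from. The URIs, the source names, and the group_by_context helper are hypothetical; None stands in for a statement that has no context.

```python
from collections import defaultdict

# Hypothetical namespace and source names, used purely for illustration.
EX = "http://example.org/"

# Each statement is (subject, predicate, object, context); the context is optional.
quads = [
    (EX + "alice", EX + "worksAt", EX + "acme", EX + "graphs/hr-feed"),
    (EX + "alice", EX + "worksAt", EX + "globex", EX + "graphs/web-crawl"),
    (EX + "bob", EX + "worksAt", EX + "acme", None),  # no context recorded
]


def group_by_context(quads):
    """Map each context to the set of triples asserted in that context."""
    grouped = defaultdict(set)
    for s, p, o, context in quads:
        grouped[context].add((s, p, o))
    return dict(grouped)


by_context = group_by_context(quads)
print(by_context[EX + "graphs/hr-feed"])
```

With the context in place, two conflicting statements about the same subject can be kept side by side and traced back to their respective sources.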

In my next post I will discuss the properties of graph databases.

About atiru

Product Strategist and architect for harnessing value from data.
This entry was posted in Topics related to Graph Databases and Compute, Linked Data (RDF).