In this series of blog posts I will discuss all things related to graph databases. Coming from the relational world, I was drawn to graph databases primarily for the following reasons:
- Flexible schema
- Linked data, a.k.a. the semantic web
I will begin by formally defining a graph.
A graph is just a collection of vertices and edges—or, in less intimidating language, a set of nodes and the relationships that connect them. Graphs represent entities as nodes and the ways in which those entities relate to the world as relationships. This general-purpose, expressive structure allows us to model all kinds of scenarios.
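To make the definition concrete, here is a minimal sketch in Python. The node names are made up for illustration and do not come from any particular graph database; the graph is simply a set of vertices plus a set of directed edges.

```python
# Hypothetical example data: three entities and how they relate.
vertices = {"alice", "bob", "acme"}

# Each edge is a (start, end) pair, so direction matters.
edges = {
    ("alice", "bob"),
    ("alice", "acme"),
    ("bob", "acme"),
}


def neighbors(vertex):
    """Return the vertices reachable from `vertex` by one outgoing edge."""
    return sorted(end for start, end in edges if start == vertex)


print(neighbors("alice"))  # ['acme', 'bob']
```

Real graph databases store far more than this, but the core structure is the same: entities, and the connections between them.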
One way to slice the graph space is to look at the graph models employed by the various technologies. Two of the dominant graph data models are:
- Property graph
- Resource Description Framework (RDF) triples
Another way to slice the graph space is by the kind of workload the technology targets:
- Graph technology used for OLTP
- Graph technology used for OLAP
A property graph has the following characteristics:
- It contains nodes and relationships
- Nodes contain properties (key-value pairs)
- Relationships are named and directed, and always have a start and end node
- Relationships can also contain properties (key-value pairs)
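The characteristics above can be sketched as two small Python data classes. This is only an illustration under my own assumptions: the class names `Node` and `Relationship`, the field names, and the sample data are hypothetical and not the API of any specific graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    # Nodes carry properties as key-value pairs.
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    name: str    # relationships are named
    start: Node  # and directed: they always have a start node...
    end: Node    # ...and an end node
    # Relationships can carry properties too.
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
alice = Node(1, {"name": "Alice"})
bob = Node(2, {"name": "Bob"})
knows = Relationship("KNOWS", alice, bob, {"since": 2010})

print(f"{knows.start.properties['name']} -[{knows.name}]-> {knows.end.properties['name']}")
```

Note that the `since` property lives on the relationship itself, which is the main thing that sets this model apart from RDF triples.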
The following diagram illustrates a Property Graph.
RDF provides a general, flexible method to decompose any knowledge into small pieces, called triples, with some rules about the semantics (meaning) of those pieces.
- A fact is expressed as a triple of the form (Subject, Predicate, Object).
- Subjects, predicates, and objects are given as names for entities, whether concrete or abstract, in the real world. Unlike in a property graph, the triples have no properties.
- Names are in the format of URIs, which are opaque and global.
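As a rough sketch of the triple form, the snippet below stores a few facts as (subject, predicate, object) tuples in plain Python. The `Triple` type, the `objects` helper, and every URI under `http://example.org/` are assumptions made for illustration; a real RDF store or library would look different.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace: names are global URIs.
EX = "http://example.org/"

triples = [
    Triple(EX + "alice", EX + "knows", EX + "bob"),
    Triple(EX + "alice", EX + "worksFor", EX + "acme"),
    Triple(EX + "bob", EX + "worksFor", EX + "acme"),
]


def objects(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [
        t.obj for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects(EX + "alice", EX + "knows"))  # ['http://example.org/bob']
```

Each fact is a standalone statement, and there is nowhere on the triple itself to attach extra key-value pairs.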
The following diagram illustrates an RDF graph.
RDF triple stores fall under the general category of graph databases because they deal in data that, once processed, tends to be logically linked.
Because RDF typically integrates a web of data, the notion of provenance is essential. An optional fourth element was therefore introduced to extend the triple. This element is referred to as the ‘context’, and the resulting four-part statement is referred to as an RDF quad.
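A small Python sketch shows how the optional fourth element supports provenance. Everything here is hypothetical: the `Quad` type, the `from_context` helper, and the example URIs (including the source names) are my own illustration, not part of any RDF library.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: Optional[str] = None  # optional fourth element: where the fact came from


# Hypothetical namespace and sources.
EX = "http://example.org/"

quads = [
    Quad(EX + "alice", EX + "worksFor", EX + "acme", EX + "source/hr-system"),
    Quad(EX + "alice", EX + "worksFor", EX + "globex", EX + "source/old-crawl"),
    Quad(EX + "alice", EX + "knows", EX + "bob"),  # no context recorded
]


def from_context(context):
    """Return only the statements that were asserted in the given context."""
    return [q for q in quads if q.context == context]


for q in from_context(EX + "source/hr-system"):
    print(q.obj)  # http://example.org/acme
```

When two sources disagree, as the two `worksFor` statements do here, the context lets you decide which one to trust.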
In my next post I will discuss Properties of Graph Databases.