Graph Database using Java [OrientDB and Gremlin]


What is a Graph Database?

In computing, a graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data. A graph database is any storage system that provides index-free adjacency.
More info: Graph Database
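
To make "index-free adjacency" a little more concrete, here is a minimal plain-Java sketch. It is only an illustration of the idea: the class and method names (TinyGraph, Node, Relation, connect, namesReachedBy) are made up for this example and have nothing to do with how OrientDB actually stores records. Each node keeps direct references to its own edges, so moving to a neighbour never needs a lookup in a global index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not OrientDB's storage format.
class TinyGraph {

    static class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relation> out = new ArrayList<>(); // edges leaving this node
        final List<Relation> in = new ArrayList<>();  // edges arriving at this node
    }

    static class Relation {
        final String label;
        final Node from;
        final Node to;

        Relation(String label, Node from, Node to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    // Creates a node carrying a single "Name" property.
    static Node node(String name) {
        Node n = new Node();
        n.properties.put("Name", name);
        return n;
    }

    // Creates an edge and registers it on both of its endpoints.
    static Relation connect(Node from, String label, Node to) {
        Relation r = new Relation(label, from, to);
        from.out.add(r);
        to.in.add(r);
        return r;
    }

    // Follows the node's own edge references only; no global index is consulted.
    static List<String> namesReachedBy(Node start, String label) {
        List<String> names = new ArrayList<>();
        for (Relation r : start.out) {
            if (r.label.equals(label)) {
                names.add((String) r.to.properties.get("Name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Node tiger = node("Tiger");
        Node deer = node("Deer");
        connect(tiger, "hunts", deer);
        System.out.println(namesReachedBy(tiger, "hunts")); // prints [Deer]
    }
}
```

The cost of finding the neighbours here depends only on how many edges the starting node has, not on the total size of the graph, which is the property the definition above refers to.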

There are many graph databases available. Well-known and commonly used ones include Neo4j, OrientDB and AllegroGraph.

For my project I experimented with OrientDB. This tutorial focuses on setting up OrientDB and using it from Java.

A brief introduction to OrientDB:
OrientDB is an open source NoSQL database management system written in Java. Its major advantage over other graph databases is that it is a multi-model database: it supports graph, document, key/value and object models, and relationships are managed as in graph databases, with direct connections between records. Very few platforms provide such model flexibility.

More info: OrientDB; NoSQL; Multi-model Database

Implementation

Note: I have used Gremlin as the query language (Gremlin is a graph traversal language specialized for property graphs). One can also use OrientDB’s SQL queries instead.

A great Tutorial to Learn OrientDB: http://pettergraff.blogspot.it/2014/01/getting-started-with-orientdb.html
A great Tutorial to Learn Gremlin: http://gremlindocs.com/

Code snippet to implement OrientDB in Java using Gremlin as a querying Language:

Installing Libraries:
1. Install OrientDB: download

2. Add the following libraries to your project from the ‘lib’ folder inside the OrientDB installation folder:
gremlin-* (all jars)
orientdb-* (all jars)
commons-* (all jars)
pipes-*
blueprints-core-*
concurrentlinkedhashmap-lru-*

This tutorial assumes that the database has already been created in OrientDB (this is very easy to do, and is covered in the tutorials mentioned above). If not, you can work with the example database that ships with OrientDB: GratefulDeadConcerts

Code Snippet:

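import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.impls.orient.OrientGraph;

// The default "reader" user is read-only, so connect as a user that is allowed to write.
OrientGraph graph = new OrientGraph("plocal:/databases/Animal_Data", "admin", "admin");

// Adding vertices and setting properties:
Vertex vanimal1 = graph.addVertex("class:Animal");
vanimal1.setProperty("Name", "Tiger");

Vertex vanimal2 = graph.addVertex("class:Animal");
vanimal2.setProperty("Name", "Deer");

Vertex vEmpty = graph.addVertex(null); // Creates a vertex in class V (the superclass of all vertices)

// Creates an edge labelled "hunts" from vanimal1 to vanimal2.
// Every edge class derives from E (the superclass of all edges).
Edge ehunts = graph.addEdge(null, vanimal1, vanimal2, "hunts");

// OrientGraph is transactional: commit to persist the changes.
// Call graph.shutdown() once you have finished working with the database.
graph.commit();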

Similarly, ‘getters’ such as getVertices() and getProperty() can be used to fetch vertices and properties.

SQL Queries: http://orientdb.com/docs/last/Tutorial-Java.html

Gremlin Snippets:

Defining a pipeline

GremlinPipeline pipe = new GremlinPipeline();

1. Fetching the properties of a vertex

Vertex temVert;
Iterable<Vertex> vertices = graph.getVertices("Name", "Lion");

This returns all vertices whose “Name” property is “Lion”.
In my case there is only one such vertex, so I take that vertex into a Vertex object.

Iterator<Vertex> it = vertices.iterator(); // needs java.util.Iterator
if (it.hasNext()) // checks whether any vertex was returned
{
    temVert = it.next();
    String liveLocation = temVert.getProperty("Lives").toString();
}

2. Getting the paths to all one-hop neighbours

Iterator<Vertex> lions = graph.getVertices("Name", "Lion").iterator(); // needs java.util.Iterator
if (lions.hasNext())
{
    temVert = lions.next();
    pipe.start(temVert).bothE().bothV().simplePath().property("Name").path();
    for (Object path : pipe) {
        System.out.println(path);
    }
}

The pipe can be broken down as follows:
start() – sets the starting vertex.
bothE() – emits both the incoming and the outgoing edges of the current vertex.
Note: To follow edges in one direction only, use inE() for incoming edges and outE() for outgoing edges.

bothV() – emits both endpoint vertices of each edge, that is, its tail (outgoing) vertex and its head (incoming) vertex.
To select only one endpoint, use inV() for the head vertex or outV() for the tail vertex, as required (for example when traversing a tree or any other directed graph).

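The two steps above can be shortened by using both()
e.g.

pipe.start(temVert).both().simplePath().property("Name").path();

This fetches all the vertices adjacent to the current vertex through incoming or outgoing edges. Because both() moves directly from vertex to vertex, the edges do not appear in the resulting path.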

simplePath() – keeps only those paths in which no element is repeated. Since bothV() also emits the starting vertex itself, this step drops the paths that lead straight back to the start.

property(“Name”) – selects the relevant property of the current element. Without it, the path would end with the vertex itself, which is printed as its id.

path() – returns the whole path traversed from the start of the pipe. Without it, only the final result would be returned, which for the snippet above is the “Name” property of every vertex one hop away from the starting vertex.
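
To see why simplePath() matters after bothE().bothV(), the following plain-Java sketch models the steps by hand. It is only a rough model under my own assumptions: OneHopModel, Link, bothE and oneHopPaths are names invented for this example, vertices are represented by their names, and nothing here calls the Gremlin or OrientDB API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the pipeline steps. All names are hypothetical;
// this does not call the Gremlin or OrientDB API.
class OneHopModel {

    // A directed edge: tail --label--> head
    static class Link {
        final String tail;
        final String label;
        final String head;

        Link(String tail, String label, String head) {
            this.tail = tail;
            this.label = label;
            this.head = head;
        }

        @Override
        public String toString() {
            return tail + "-" + label + "->" + head;
        }
    }

    // Models bothE(): every edge that touches the vertex, in either direction.
    static List<Link> bothE(List<Link> links, String vertex) {
        List<Link> touching = new ArrayList<>();
        for (Link link : links) {
            if (link.tail.equals(vertex) || link.head.equals(vertex)) {
                touching.add(link);
            }
        }
        return touching;
    }

    // Models bothE().bothV().path(), optionally with simplePath() applied.
    static List<List<String>> oneHopPaths(List<Link> links, String start, boolean simplePath) {
        List<List<String>> paths = new ArrayList<>();
        for (Link link : bothE(links, start)) {
            // bothV() emits both endpoints, so one of them is the start vertex again.
            for (String vertex : Arrays.asList(link.tail, link.head)) {
                if (simplePath && vertex.equals(start)) {
                    continue; // this path would repeat the start vertex
                }
                paths.add(Arrays.asList(start, link.toString(), vertex));
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Link> links = Arrays.asList(
                new Link("Lion", "hunts", "Deer"),
                new Link("Tiger", "competes", "Lion"));

        System.out.println(oneHopPaths(links, "Lion", false).size()); // prints 4
        System.out.println(oneHopPaths(links, "Lion", true));
    }
}
```

Without the filter, each of the two edges yields two paths, one of which simply returns to “Lion”. With the filter, only the paths ending at “Deer” and “Tiger” remain, which is what one usually wants from a one-hop neighbour query.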

Similarly, for two-hop paths:

pipe.start(temVert).bothE().bothV().bothE().bothV().simplePath().property("Name").path();

OR

pipe.start(temVert).both().both().simplePath().property("Name").path();
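
The same idea extends to any number of hops. As a hedged illustration of what chaining both() with simplePath() amounts to, here is a small plain-Java sketch that enumerates n-hop paths over an undirected neighbour map. HopPaths, simplePaths and extend are hypothetical names for this example only, and the neighbour map is assumed to be already symmetric; it is not how Gremlin is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of both() repeated n times followed by simplePath().
// Hypothetical names; this does not call the Gremlin API.
class HopPaths {

    // Returns every path of exactly `hops` steps from `start` that repeats no vertex.
    static List<List<String>> simplePaths(Map<String, List<String>> neighbours,
                                          String start, int hops) {
        List<List<String>> result = new ArrayList<>();
        extend(neighbours, new ArrayList<>(Arrays.asList(start)), hops, result);
        return result;
    }

    private static void extend(Map<String, List<String>> neighbours, List<String> path,
                               int hopsLeft, List<List<String>> result) {
        if (hopsLeft == 0) {
            result.add(new ArrayList<>(path)); // copy, because `path` is reused
            return;
        }
        String last = path.get(path.size() - 1);
        for (String next : neighbours.getOrDefault(last, Collections.<String>emptyList())) {
            if (path.contains(next)) {
                continue; // simplePath(): no vertex may appear twice
            }
            path.add(next);
            extend(neighbours, path, hopsLeft - 1, result);
            path.remove(path.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        // Undirected edges: Lion-Deer, Lion-Tiger, Deer-Grass
        Map<String, List<String>> neighbours = new HashMap<>();
        neighbours.put("Lion", Arrays.asList("Deer", "Tiger"));
        neighbours.put("Deer", Arrays.asList("Lion", "Grass"));
        neighbours.put("Tiger", Arrays.asList("Lion"));
        neighbours.put("Grass", Arrays.asList("Deer"));

        System.out.println(simplePaths(neighbours, "Lion", 2)); // prints [[Lion, Deer, Grass]]
    }
}
```

In this model the two-hop routes Lion, Deer, Lion and Lion, Tiger, Lion are discarded because they revisit the start, which mirrors what simplePath() does at the end of the pipes above.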

There are many other ways to achieve…

For more examples use: http://gremlindocs.com/

For more information about OrientDB, refer to its manual: Manual

More References:
http://www.fromdev.com/2013/09/Gremlin-Example-Query-Snippets-Graph-DB.html
https://github.com/orientechnologies/orientdb/wiki
http://devdocs.inightmare.org/introduction-to-orientdb-graph-edition/
