### What is a Graph anyway?

Many people associate “graph” with a line chart or something similar. That is not what we mean here. A graph is a mathematical concept, first described by Leonhard Euler in 1736.

The picture in your head should look more like this:

(Public Domain, https://commons.wikimedia.org/w/index.php?curid=614057)

The graph above is directed (note the arrowheads) and acyclic (no path leads back to the node where it started), which is the kind of graph we are interested in here.
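To make "directed" and "acyclic" concrete, here is a minimal sketch in Python that checks whether a set of directed edges contains a cycle. The function name `is_acyclic` and the edge-list representation are assumptions made for illustration; the check uses Kahn's topological sort, which is one common technique among several.

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph, given as (source, target) pairs, has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for source, target in edges:
        successors[source].append(target)
        indegree[target] += 1

    # Repeatedly remove nodes that have no incoming edges left.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # If some nodes could never be removed, they sit on a cycle.
    return removed == len(nodes)
```

If every node can eventually be removed, the graph is acyclic; any node that is left over is part of a cycle, including a node that points to itself.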

Graphs are very good at representing connected data. Representing the real world comes down to communicating two perspectives:

• Structure (connectedness) and

• Meaning (naming and other definitions)

So, even if your physical target is not a graph database, it makes perfect sense to borrow the paradigms of the (property) graph to create platform-independent representations of data models.

At the bottom of this page you will find references to the theoretical background of property graphs.

*For now, think of property graphs as mathematically well-founded (in graph theory), just as the relational model (in the Dr. Ted Codd style) is founded on mathematics (set theory). Going graph does not imply a lack of precision.*

### Concepts of the Property Graph Model

The concept map below explains the most important concepts used in the property graph context:

Both Nodes and Relationships can (should) have “names” (formally called labels for nodes and types for relationships), just like concepts and their relationships have in the diagram above.

Relationships are directed, which is visualized by the arrowheads.

Both Nodes and Relationships may be associated with Properties, which are key/value pairs such as Color: Red. At the data model level, we call the key the "Property Name".
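The concepts above can be sketched as a small data structure. This is an illustration only, not the API of any graph product: the class names `Node` and `Relationship`, the labels `Person` and `Car`, the relationship type `OWNS` and all property values other than Color: Red are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                       # the "name" of the node
    properties: dict = field(default_factory=dict)   # Property Name -> value


@dataclass
class Relationship:
    type: str                                        # the "name" of the relationship
    start: Node                                      # direction: from start ...
    end: Node                                        # ... to end (the arrowhead)
    properties: dict = field(default_factory=dict)   # relationships can carry properties too


# Hypothetical example data
car = Node("Car", {"Color": "Red"})
person = Node("Person", {"Name": "Alice"})
owns = Relationship("OWNS", start=person, end=car, properties={"Since": 2020})
```

Note that the direction lives in the relationship itself (`start` and `end`), and that properties are attached to nodes and relationships in exactly the same way.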

### Solution Level (Logical) Models

Here is an example of a directed graph representation of the good old Microsoft Northwind data model:

Together, nodes and relationships explain the context very well. Nodes represent entity types, which I prefer to call types of business objects. Edges, better known as relationships, represent the connectedness and, because of their names, bring semantic clarity and context to the nodes. Named relationships are a great help when looking into functional dependencies, for example. And the graph representation communicates structure a lot better than an ER diagram does.

If necessary, we can add the properties discretely like this:

Property graphs are similar to concept maps in that there is no normative style (as there is in UML, for example). The style used above is the author's preference, so feel free to find your own. If you communicate well with your readers, you have accomplished what is necessary.

The labeled property graph model is the best general-purpose data model paradigm that we have today.

The important things are the names and the structure (the nodes and the relationships). The properties supplement the solution structure by adding content. Properties are basically just names as well, but they can also signify “identity” (the general idea of a key at the data model level). Identities may be shown in italics (or with some other special effect of your liking), and uniqueness can be signaled using a bold font, for example:

For more details see the book about Graph Data Modeling.

### Physical Models for Graph Databases

Since our solution data model is already a property graph, moving it to a property graph platform is very easy. There are some physical aspects that should be taken into consideration:

- Uniqueness and identity
- Data types
- Properties on relationships
- A relationship may point to different types of business objects, as when “Part” and “Waste” are both “part of.” This is both a strength and a weakness: a strength because it reflects the way people think about reality, and a weakness because people may get confused if the semantics are unclear. These considerations are very good reasons for defining a solution data model, even if your physical data model is going to be based on a “flexible” or even non-existent schema.
- Ordered, linked lists, possibly including time series, which can also be handled elegantly in a graph representation (for example, with a relationship of type “next” or “previous”).
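The last point, an ordered list expressed through “next” relationships, can be sketched as follows. This is a minimal illustration, assuming relationships are stored as plain `(source, type, target)` tuples; the function `follow_next`, the node ids and the `MEASURED_BY` relationship type are hypothetical.

```python
def follow_next(start, relationships):
    """Walk the chain of NEXT relationships from start; return node ids in order."""
    next_of = {src: dst for src, rel_type, dst in relationships if rel_type == "NEXT"}
    ordered = [start]
    seen = {start}
    while ordered[-1] in next_of:
        nxt = next_of[ordered[-1]]
        if nxt in seen:  # guard against an accidental cycle in the data
            break
        ordered.append(nxt)
        seen.add(nxt)
    return ordered


# Hypothetical time series: the storage order of the relationships does not matter,
# because the ordering is carried by the NEXT relationships themselves.
readings = [
    ("reading-2", "NEXT", "reading-3"),
    ("reading-1", "NEXT", "reading-2"),
    ("reading-1", "MEASURED_BY", "sensor-A"),
]

print(follow_next("reading-1", readings))
```

The design choice here is that order is part of the structure rather than a sort key: inserting an element means rewiring two relationships instead of renumbering the rest of the list.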

### Graph Theoretical Background

On the GitHub site for the openCypher project you can find a good, formal definition based on concepts from mathematical graph theory. Basically, a property graph in the sense used here is a directed, vertex-labeled, edge-labeled multigraph with self-edges, where edges have their own identity. In the property graph paradigm, the term node denotes a vertex, and relationship denotes an edge. See Wikipedia’s definitions for reference:

• Directed multigraph

• Labeled multigraph
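The definition above can be illustrated with a small sketch. It is not taken from openCypher or any other project; the class `PropertyGraph`, its methods and the example data are assumptions chosen to show three points of the definition: edges are directed, parallel edges and self-edges are allowed, and every edge has its own identity.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PropertyGraph:
    nodes: dict = field(default_factory=dict)   # node id -> set of labels
    edges: dict = field(default_factory=dict)   # edge id -> (source, type, target)
    _ids: count = field(default_factory=count, repr=False)

    def add_node(self, node_id, *labels):
        self.nodes[node_id] = set(labels)

    def add_edge(self, source, rel_type, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        edge_id = next(self._ids)               # identity is independent of the endpoints
        self.edges[edge_id] = (source, rel_type, target)
        return edge_id


g = PropertyGraph()
g.add_node("alice", "Person")
g.add_node("bob", "Person")
first = g.add_edge("alice", "KNOWS", "bob")
second = g.add_edge("alice", "KNOWS", "bob")    # parallel edge: same endpoints, new identity
loop = g.add_edge("alice", "KNOWS", "alice")    # self-edge
```

Because edges are keyed by their own id rather than by their endpoints, two relationships between the same pair of nodes remain distinct, which is what makes this a multigraph.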

The Apache TinkerPop™ project is also based on very similar concepts.

There are several physical implementations of property graph technology. One of the best known is Neo4j from Neo Technology; most of the other vendors are listed on the TinkerPop site. Neo has a good introduction to property graphs here: https://neo4j.com/developer/graph-database/

There is much more about property graphs for data modeling in the book about Graph Data Modeling: