Interactive Visualizations for Dynamic and Multivariate Networks. Free, online, and open source.


Data Formats


The Vistorian imports data formatted as tables in CSV (comma-separated values, https://en.wikipedia.org/wiki/Comma-separated_values) files. Each CSV file contains one table. You can export CSV files from

If you have edited or compiled your data files manually, you can use the free online tool CSVLint to check if your CSV file is properly formatted.
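If you prefer a quick local check before uploading, a few lines of Python can catch the most common problem in hand-edited files: rows with a different number of fields than the header row. This is only a minimal sketch and not a replacement for CSVLint; the function name check_csv_shape is made up for this example.

```python
import csv
import io


def check_csv_shape(text):
    """Return a list of problems found in CSV text (empty list: none found)."""
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    if not rows:
        return ["file is empty"]
    problems = []
    expected = len(rows[0])  # the header row defines the column count
    for row_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != expected:
            problems.append(
                f"row {row_no}: expected {expected} fields, found {len(row)}"
            )
    return problems


good = "PersonA,PersonB,Relation type\nBob,Anton,Work\nBob,Charles,Friends\n"
bad = "PersonA,PersonB,Relation type\nBob,Anton\n"
print(check_csv_shape(good))
print(check_csv_shape(bad))
```

An empty list means that every row has as many fields as the header. It does not mean that the content of the fields is correct.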

This page explains what information you can put in your CSV files so that The Vistorian can visualize it. There are some rules about which information goes where. Once you have formatted your network into one or more tables, you can upload them through the import wizard in The Vistorian.

A table for a simple network with link types can look like so:

| PersonA | PersonB | Relation type |
| --- | --- | --- |
| Bob | Anton | Work |
| Bob | Charles | Friends |
| Charles | Maria | Work |

Troubleshooting Issues with Data Formatting and Dataset Files

For issues and errors related to data file formatting and upload, please check our troubleshooting page.

Terminology

This document and the Visualization Manual follow general network terminology. However, there are many terms in use, and several of them can mean the same thing.

A network (or graph) is a set of nodes (or actors, or vertices, etc.) and links (or relations, or edges, or arcs, etc.). Attributes are values associated with nodes and links. Our network model currently supports the following information:

Data Tables

The Vistorian knows three types of tables: link tables, node tables, and location tables. Depending on your network, you need to provide between one and all three of them (explained later).

Both node and link tables can express the same network, but in many cases one of the two formats is more convenient for your data. In some cases, however, you have to provide a node table, a link table, or even both. The differences between node and link tables, and which table(s) you need for your specific data, are explained below.

Link Tables

Link tables are the most common way to format network information. Each row represents a link in the network with the link’s attributes in columns. Using link tables, you can specify a variety of link attributes such as

In link tables, you cannot specify attributes for nodes, such as a node type.

The following examples show how to provide the above information in link tables.

The simplest link table that creates a very simple network has only two columns (attributes), one that specifies the source node of a link and one that specifies the target node:

Sender Receiver
Bob Anton
Bob Charles
Charles Maria
Maria Anton

The corresponding CSV-file that can directly be imported into The Vistorian looks like this:

Sender, Receiver 
Bob, Anton 
Bob, Charles  
Charles, Maria  
Maria, Anton 
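To see how such a file maps to a network, the following Python sketch reads the CSV text above into a list of (source, target) pairs and derives the set of nodes from it. It only illustrates the format and does not show how The Vistorian itself reads the file; the names LINK_CSV and read_links are made up for this example.

```python
import csv
import io

# The CSV text from the example above.
LINK_CSV = """Sender, Receiver
Bob, Anton
Bob, Charles
Charles, Maria
Maria, Anton
"""


def read_links(text):
    """Parse a two-column link table into (source, target) pairs."""
    # Strip every cell so that spaces after commas do not end up in names.
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text))
        if row
    ]
    header, body = rows[0], rows[1:]
    src, dst = header.index("Sender"), header.index("Receiver")
    return [(row[src], row[dst]) for row in body]


links = read_links(LINK_CSV)
# Every name that appears as a source or target becomes a node.
nodes = sorted({name for link in links for name in link})
print(links)
print(nodes)
```

Note that the nodes are never listed separately: they are implied by the names that appear in the two columns.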

The two required fields in every link table are the source node and the target node of each link (Sender and Receiver in the example above).

Additional information for each link can be added as additional columns to the link table. The following table models a social network of persons who live in and move between cities. The links represent money transactions over several years, together with the type of each transaction.

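| Sender | Sender Location | Receiver | Receiver Location | Amount | Year | Type |
| --- | --- | --- | --- | --- | --- | --- |
| Bob | Rome | Charles | Lisbon | 10 | 1801 | Loan |
| Bob | Paris | Charles | Lisbon | 14 | 1803 | Gift |
| Bob | Rome | Charles | Lisbon | 3 | 1810 | Purchase |
| Bob | Rome | Anton | London | 2 | 1801 | Purchase |
| Anton | London | Bob | London | 5 | 1810 | Loan |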

This table contains one row per transaction. Besides sender and receiver, each transaction has a time (Year), a weight (Amount), a type (Type), as well as the current locations of the persons involved in the transaction. Note that in row 2 (in 1803), Bob is sending the money from Paris, not from Rome.
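The following Python sketch shows one way to read the transaction table above into typed records, and how the per-row location columns let you trace where a person was at a given time. The CSV text is the table above written with commas. The Link class and its field names are assumptions made for this example and are not part of The Vistorian.

```python
import csv
import io
from dataclasses import dataclass

TRANSACTIONS_CSV = """Sender,Sender Location,Receiver,Receiver Location,Amount,Year,Type
Bob,Rome,Charles,Lisbon,10,1801,Loan
Bob,Paris,Charles,Lisbon,14,1803,Gift
Bob,Rome,Charles,Lisbon,3,1810,Purchase
Bob,Rome,Anton,London,2,1801,Purchase
Anton,London,Bob,London,5,1810,Loan
"""


@dataclass
class Link:
    source: str
    source_location: str
    target: str
    target_location: str
    weight: float  # the Amount column
    year: int      # the Year column
    link_type: str


def read_transactions(text):
    """Turn each CSV row into one Link record."""
    return [
        Link(
            source=row["Sender"],
            source_location=row["Sender Location"],
            target=row["Receiver"],
            target_location=row["Receiver Location"],
            weight=float(row["Amount"]),
            year=int(row["Year"]),
            link_type=row["Type"],
        )
        for row in csv.DictReader(io.StringIO(text))
    ]


links = read_transactions(TRANSACTIONS_CSV)

# Collect (year, place) pairs for Bob from every row he appears in.
bob_places = sorted(
    {(l.year, l.source_location) for l in links if l.source == "Bob"}
    | {(l.year, l.target_location) for l in links if l.target == "Bob"}
)
print(bob_places)
```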

The additional columns / attributes can express the following information about the network:

When you specify geographic locations, the application searches for their coordinates online. The following guidelines help the visualization find the right coordinates:

A workaround lets you specify antique or imprecise place names using location tables (see below).

In summary, the standard template for a link table is as follows; the order of columns does not matter. The mandatory columns are Source Node and Target Node.

| ID | Source Node | Target Node | Weight | Time | Link Type | Source Location | Target Location |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | | | | | | | |

When time is associated with a link in a dynamic network (or temporal, or longitudinal network), a link can mean one of two things: an event or a lasting relationship.

For visualization, the main difference is that each event is treated as a different link in the network, while a lasting relationship is treated as the same link with varying attributes.

For example, when you select a time period, e.g. 1801-1806, event links are shown as different links between nodes. The number of links between the same two nodes indicates the number of events, e.g. the number of letters sent between the two nodes in the selected period. In contrast, if the links are lasting relationships, only one link is shown between the nodes, showing the average weight during the selected time period.

To specify an event in your link table, you can stick to the above format. To specify a lasting relationship, you have to add IDs to your links. An ID (identifier) is a number shared by all rows that describe the same link. The following table shows an example with two lasting relationships (IDs 0 and 1) and the change of their respective weight (Money ($k)) over four years each.

ID Sender Receiver Money($k) Year
0 Anton Bob 100 1801
0 Anton Bob 100 1802
0 Anton Bob 30 1803
0 Anton Bob 10 1804
1 Anton Charles 10 1801
1 Anton Charles 20 1802
1 Anton Charles 30 1803
1 Anton Charles 100 1804

The first four links are semantically the same relationship: money sent between Anton and Bob (over several years). In the visualization, selecting the entire time period 1801-1804 will show

Of course, you can create a network with multiple relationships by using one ID per relationship. The following table shows two relationships between Anton and Bob and three relationships between Anton and Charles.

ID Sender Receiver Money($k) Year
0 Anton Bob 100 1801
0 Anton Bob 100 1802
1 Anton Bob 30 1803
1 Anton Bob 10 1804
2 Anton Charles 10 1801
3 Anton Charles 20 1802
3 Anton Charles 30 1803
4 Anton Charles 100 1804
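The difference between events and lasting relationships can be sketched in a few lines of Python, using the rows of the table above. The sketch follows the description given here (one link per row for events, one link per ID with the average weight for relationships). It is not The Vistorian's actual implementation, and the function names are made up for this example.

```python
from collections import defaultdict
from statistics import mean

# (ID, sender, receiver, money in $k, year), copied from the table above.
ROWS = [
    (0, "Anton", "Bob", 100, 1801),
    (0, "Anton", "Bob", 100, 1802),
    (1, "Anton", "Bob", 30, 1803),
    (1, "Anton", "Bob", 10, 1804),
    (2, "Anton", "Charles", 10, 1801),
    (3, "Anton", "Charles", 20, 1802),
    (3, "Anton", "Charles", 30, 1803),
    (4, "Anton", "Charles", 100, 1804),
]


def as_events(rows, start, end):
    """Ignore IDs: every row in the period counts as a separate link."""
    counts = defaultdict(int)
    for _link_id, sender, receiver, _money, year in rows:
        if start <= year <= end:
            counts[(sender, receiver)] += 1
    return dict(counts)


def as_relationships(rows, start, end):
    """One link per ID, weighted by the average over the selected period."""
    weights = defaultdict(list)
    for link_id, _sender, _receiver, money, year in rows:
        if start <= year <= end:
            weights[link_id].append(money)
    return {link_id: mean(values) for link_id, values in weights.items()}


print(as_events(ROWS, 1801, 1804))
print(as_relationships(ROWS, 1801, 1804))
print(as_relationships(ROWS, 1801, 1802))
```

For the full period 1801-1804, the event reading gives four links for each pair of persons, while the relationship reading gives five links, one per ID. Narrowing the period to 1801-1802 leaves only the relationships with IDs 0, 2 and 3.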

Summary: A single link table is sufficient to specify a network. Link tables contain one line per link, allowing you to specify attributes on links as well as the locations of the respective nodes at the time of the relationship. Some information in link tables is redundant, as in the example just above. While there are technically more sophisticated data and table formats, link tables are human readable and editable with common spreadsheet applications.

Node Tables

Node tables contain one line (row) per node in the network. Node tables can be used in two ways:

Node Tables for Network Creation

Genealogies are a common format used by historians. The table below specifies a genealogy where each row contains the immediate family relationships of a person:

CHILD MOTHER FATHER GOD-FATHER GOD-MOTHER PLACE-OF-BIRTH
Bob Celine Charles Dave Eve Paris
Ana Fannie Gerd Mike Dianne London
Celine Maria João Pedro Ana Lisbon

This node table can be interpreted to contain one column per link. In other words, we want the network to contain one link between each node in the child column and each node in the associated columns: Mother, Father, God-father, God-mother. We could even create a node for each name in the Place-of-birth field, if desired.

When you use node tables as in the example above, the visualization interprets the column name (e.g., Mother) as the link type and colors links of different types differently.
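The reading of a node table as "one column per link" can be sketched in Python as follows, using the genealogy above written as CSV. The function node_table_to_links is made up for this example, and the link direction (from the child to the related person) is an assumption. How The Vistorian orients these links internally is not described here.

```python
import csv
import io

GENEALOGY_CSV = """CHILD,MOTHER,FATHER,GOD-FATHER,GOD-MOTHER,PLACE-OF-BIRTH
Bob,Celine,Charles,Dave,Eve,Paris
Ana,Fannie,Gerd,Mike,Dianne,London
Celine,Maria,João,Pedro,Ana,Lisbon
"""


def node_table_to_links(text, node_column, relation_columns):
    """Create one (node, related node, link type) triple per filled cell."""
    links = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in relation_columns:
            target = row[column].strip()
            if target:  # skip empty cells
                # The column name doubles as the link type.
                links.append((row[node_column].strip(), target, column))
    return links


links = node_table_to_links(
    GENEALOGY_CSV, "CHILD", ["MOTHER", "FATHER", "GOD-FATHER", "GOD-MOTHER"]
)
for link in links:
    print(link)
```

Three rows with four relation columns each yield twelve links. Ana appears both as a child (row 2) and as a god-mother (row 3), so both rows refer to the same node.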

In short, a node table used individually has the following template. Bold headers are required columns.

| NODE | NODE TYPE | RELATION_1 | RELATION_2 | RELATION_3 |
| --- | --- | --- | --- | --- |

Finally, some networks may require both node and link tables:

A node table and a link table are related through node names, i.e., the node names in the node table must match the names of the source and target nodes in the link table. The visualization looks up the names internally to create the network. The following two tables show a simple example of a social network where each node has a profession.

Link table:

SENDER RECEIVER
Ana Charles
Charles Bob

Node table:

NAME PROFESSION
Ana Lawyer
Bob Merchant
Charles Accountant
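Because the two tables are connected only through names, a typo or a different spelling silently breaks the connection. The following Python sketch compares the names used in the two example tables above; the function unmatched_names is made up for this example and is not part of The Vistorian.

```python
# The two example tables from above.
LINKS = [("Ana", "Charles"), ("Charles", "Bob")]
NODES = {"Ana": "Lawyer", "Bob": "Merchant", "Charles": "Accountant"}


def unmatched_names(links, nodes):
    """Report names that appear in only one of the two tables."""
    used = {name for link in links for name in link}
    known = set(nodes)
    return {
        "missing_from_node_table": sorted(used - known),
        "unused_in_link_table": sorted(known - used),
    }


print(unmatched_names(LINKS, NODES))
# A link that mentions a person without a row in the node table:
print(unmatched_names(LINKS + [("Ana", "Dave")], NODES))
```

The comparison is exact, so "Ana" and "ana" (or "Ana " with a trailing space) count as different names.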

Location Tables

The Vistorian automatically creates location tables if you have specified any ‘source_locations’ or ‘target_locations’ in the link table. If these names are modern English names (e.g. ‘Kaliningrad’, ‘Rome’, etc.), our database will find the geo-coordinates. You need to be connected to the internet in order to retrieve coordinates.

If you are using places which are unlikely to be registered in any modern street atlas, you may want to provide The Vistorian with the names and geo-coordinates you have. In this case, you can upload your own location table.

USERNAME LONGITUDE LATITUDE
Atlantis 13.4 45.2

When you assign the location table in The Vistorian, you will see a fourth column, called ‘geoname’. This is an internal value that you do not have to provide. It’s the modern name The Vistorian is looking for in the geoname database. It is separate from the ‘username’ so you can display your alternative name (e.g. ‘Koenigsberg’) instead of the modern and English name (‘Kaliningrad’).
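If you keep your own place names and coordinates in a script or a spreadsheet export, a location table like the one above can be written with a few lines of Python. This is a sketch under assumptions: the column headers are taken from the example above, the coordinates are assumed to be decimal degrees, and the function name write_location_table is made up.

```python
import csv
import io


def write_location_table(places):
    """Build CSV text from a mapping of name -> (longitude, latitude)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["USERNAME", "LONGITUDE", "LATITUDE"])
    for name, (lon, lat) in places.items():
        # Assumption: decimal degrees, so swapped or mistyped values
        # often fall outside the valid ranges and can be caught here.
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"coordinates out of range for {name!r}")
        writer.writerow([name, lon, lat])
    return buffer.getvalue()


print(write_location_table({"Atlantis": (13.4, 45.2)}))
```

The range check cannot detect every mistake (13.4 and 45.2 are both valid as either longitude or latitude), so it is still worth checking the result on the map.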

If you find that any coordinates are wrong you can either:

In both cases, if you find your location to be wrong (e.g. visualized on the map), you can either:

Finding the Right Format

Which tables you need and which information needs to go where depends on the specific case. There might also be many cases that are not currently supported by our visualizations, e.g. locations associated with links. In this case, please send us an email: benj.bach@gmail.com.

This section gives some brief guidelines to find the right tables and formatting.

Currently Work-In-Progress or Unsupported Scenarios

The following cases are currently under development. If you are interested in a particular case, please let us know and/or join the development team.

Formatting Time

| INPUT | EXAMPLE | DESCRIPTION |
| --- | --- | --- |
| YYYY | 2014 | 4 or 2 digit year |
| YY | 14 | 2 digit year |
| Y | -25 | Year with any number of digits and sign |
| Q | 1..4 | Quarter of year. Sets month to first month in quarter. |
| M MM | 1..12 | Month number |
| MMM MMMM | Jan..December | Month name in locale set by moment.locale() |
| D DD | 1..31 | Day of month |
| Do | 1st..31st | Day of month with ordinal |
| DDD DDDD | 1..365 | Day of year |
| X | 1410715640.579 | Unix timestamp |
| x | 1410715640579 | Unix ms timestamp |
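Hand-compiled time columns often mix several notations, while a pattern built from the tokens above (for example YYYY-MM-DD) describes only one. The following Python sketch rewrites a few notations into YYYY-MM-DD before export. It rests on assumptions: the list of input formats is hypothetical, and the correspondence between Python's strptime codes (%Y, %m, %d) and the tokens YYYY, MM and DD is ours, not something The Vistorian defines.

```python
from datetime import datetime

# Hypothetical notations found in a hand-compiled time column.
KNOWN_INPUT_FORMATS = ["%d/%m/%Y", "%d %B %Y", "%Y-%m-%d"]


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or raise ValueError if nothing matches."""
    for fmt in KNOWN_INPUT_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # try the next notation
        return parsed.date().isoformat()
    raise ValueError(f"unrecognized date: {text!r}")


print(normalize_date("14/07/1803"))
print(normalize_date("14 July 1803"))
```

Day-first input (14/07/1803) is assumed here; a column written month-first needs "%m/%d/%Y" instead, and the two cannot be told apart automatically for days up to 12. Month names such as "July" are matched in the locale of the Python process.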