Jan 18, 2013
 

More and more often, I receive emails asking for general advice on network analysis (sometimes called “linked data analysis”) and visualization.

Readers of this blog might be interested in my recommendations.

[disclaimer to network analysts and dataviz-ers reading: hey, there is so much more out there, agreed! This is just a personal and partial list...]


Tools for linked data analysis:

Excel for very basic statistical counts,
Pajek (http://pajek.imfm.si/doku.php) for manipulating the network if it is huge (for example, if you want to filter out some nodes from a network with 100,000 nodes),
R (http://www.r-project.org/) for more advanced statistics,
R again for modeling, with the statnet package: http://statnet.csde.washington.edu/index.shtml
UCINET (https://sites.google.com/site/ucinetsoftware/home) for network-oriented statistics and modeling, though I think the network should remain small (20,000 nodes or fewer?).
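To make "basic counts" and "filtering out some nodes" a bit more concrete, here is a minimal sketch in plain Python. It is not how Pajek, R or UCINET do it internally; the edge list, the function names and the degree threshold are all made up for illustration.

```python
from collections import Counter

# Hypothetical edge list: each pair is an undirected link between two nodes.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]


def degree_counts(edge_list):
    """Count how many links touch each node."""
    degree = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return degree


def filter_by_min_degree(edge_list, min_degree):
    """Keep only links whose two endpoints both have at least min_degree links.

    Degrees are computed once, on the original network; this is a single
    pass, not an iterative pruning.
    """
    degree = degree_counts(edge_list)
    keep = {node for node, d in degree.items() if d >= min_degree}
    return [(u, v) for u, v in edge_list if u in keep and v in keep]


degree = degree_counts(edges)
filtered = filter_by_min_degree(edges, 2)
print(dict(degree))
print(filtered)
```

For a network with 100,000 nodes the same idea applies, but a dedicated tool from the list above will be more comfortable to work with.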

Tools for viz of linked data:

NodeXL (http://nodexl.codeplex.com/) if you are more comfortable with Excel-based software (but less powerful than Gephi IMHO; disclaimer: I am a member of the Gephi Community support team)
For other forms of viz, it all depends on your intentions. If your final product should be viewed in a web browser, then look at JavaScript libraries like D3 (http://d3js.org), Sigma.js (http://sigmajs.org), Processing.js (http://processingjs.org) and so many others.
The greatest flexibility comes from having a data visualizer with programming skills on board: that would give you access to the wonders of Processing (www.processing.org) and other forms of data visualization. Processing is quite popular, so it is not a stretch to think you could find a resource person for it.
Finally, though it is actually a first step: how do you get from your data, probably a CSV file or something similar containing thousands of lines, to a network?
Surprisingly, there are few tools (perhaps none, even) for doing this. For this reason I created Eonydis, a small program to transform transactional data (such as a financial transaction between A and B, happening on day x) into a dynamic network. A dynamic network is simply a network which contains information about time. Download it here:
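To give an idea of what "transactional data to dynamic network" means, here is a minimal sketch in plain Python. It is not the way Eonydis works; the column names (source, target, date, amount), the sample rows and the function names are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical transactional data: one row per transaction between two parties.
SAMPLE_CSV = """source,target,date,amount
A,B,2013-01-02,100
B,C,2013-01-02,40
A,B,2013-01-05,25
C,A,2013-01-09,60
"""


def read_transactions(text):
    """Parse CSV text into a list of (source, target, day, amount) tuples."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append((row["source"], row["target"],
                     date.fromisoformat(row["date"]), float(row["amount"])))
    return rows


def build_dynamic_network(transactions):
    """Group transactions into directed edges; each edge keeps its dated events."""
    edges = defaultdict(list)
    for source, target, day, amount in transactions:
        edges[(source, target)].append((day, amount))
    for events in edges.values():
        events.sort()  # chronological order within each edge
    return dict(edges)


network = build_dynamic_network(read_transactions(SAMPLE_CSV))
for (source, target), events in sorted(network.items()):
    days = ", ".join(day.isoformat() for day, _ in events)
    print(f"{source} -> {target}: {len(events)} transaction(s) on {days}")
```

Each edge carries the dates at which it was active, which is the "information about time" that makes the network dynamic. Exporting this structure to a file format that a visualization tool can read is a separate step, not covered here.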

People who can help

=> If you are primarily interested in network analysis, not the viz:
subscribe to the mailing list specializing in social networks (anyone can) and post a question asking for help. It will surely bring in proposals, especially if you have a budget!
Try also this LinkedIn group on social network analysis:
=> If you are primarily interested in the viz:

- you can get help on specific issues on the Gephi forum, which is generally quite active (I’m on it! ;-) ): http://forum.gephi.org

- you can contract professional dataviz specialists: a very focused place to post your request is this Google group:
- finally, I have a consultancy which can help you define the specs of your project:
* which technologies to use to maximize impact and lower barriers (cost, maintenance, compatibility on devices, …),
* which dataviz agency / freelancer would best fit your project,
* suggestions of possible extensions (maybe your datasets are even richer than you thought?)

Books that could be on your shelves:

These books are not free – but let’s imagine you borrow them from your local library?
The reference book for network analysis is still the one by Wasserman and Faust:
But it is a rather technical reference book. To get started, you might prefer this textbook, which takes NodeXL as its primary tool and focuses on social networks, but still gives you the essentials for linked data in general:
For visualization, Andy Kirk is a trusted person in the dataviz community and he has just published a book (on dataviz in general):
Another book, also by a leading figure in the community (not focused on linked data):

http://books.google.nl/books?id=CB9XRIv9oigC&dq=Visualize-This-FlowingData-Visualization-Statistics

Training

In the new trend of Massive Open Online Courses (MOOCs) you have two relevant (free) courses by the best scholars in their field:
Lada Adamic on Coursera: https://www.coursera.org/course/sna
and
Katy Börner from Indiana University: http://ivmooc.cns.iu.edu/

Other lists of tools:

I found these two lists on data analysis and visualization particularly useful:

Next

Was this post helpful? Follow me on Twitter for more frequent news on these topics and more! => @seinecle.

Clement
