Nerd Tools: SWIG

tl;dr: Go check out SWIG.

This past week David Ediger and I gave a brief introduction to STINGER to some folks from GTRI who are using a version of STINGER in the DARPA ADAMS Project. For anyone who is curious, STINGER is a dynamic data structure for storing streaming graph data that can represent temporal and semantic information. Over the past six months or so, David and I, along with other members of the lab, have spent a lot of time developing and improving STINGER to be robust and to provide parallel performance on x86 and the Cray XMT. The GTRI research team was a highly technical audience and seemed very comfortable with the C API and our introduction to the data structure. After the presentation we even threw in some live code demos (usually a mistake, but in this case it worked out really well; we were even able to replicate the results we had shown in the presentation). I left the meeting feeling really good about STINGER, both where it is and our future plans for it.

STINGER came up again later that day during another meeting for the DARPA SMISC Project. The team consists of a number of computer scientists and sociology researchers, so I started down the road of explaining the C API and what it was capable of, only to realize that I was quickly losing everyone at the table as I got into the details of fine- and coarse-grained parallelism and the CUDA + MPI version that I’m developing. They explained that they mostly deal with machine learning libraries and database connectors in higher-level languages like Java and Python (Weka and SciPy), and that they may need to instantiate complex data for each edge or vertex dynamically. I had forgotten that not everyone is in High Performance Computing and wants to crank out the fastest, most efficient code reasonably possible. We started to think that it might be necessary to hire a programmer to write a wrapper for STINGER just so that it could talk to their algorithms and data sources and vice versa.

This meeting was a tipping point for me. We recently had an extremely talented intern join our group for the summer, and we put her to work trying to connect STINGER to a visualization suite. At the time we had chosen Gephi, which is open-source and written entirely in Java. I had explored what it would take to write a JNI wrapper for STINGER in an attempt to shoehorn it into Gephi as the internal graph representation (I didn’t want to give our intern a task that I didn’t think I could accomplish myself) and found that while it would take some time and effort, it could be done. After the SMISC meeting, I was certain that for STINGER to gain wider acceptance within the social network analysis community (and the broader graph theory community), it really needed a higher-level interface too. By this point we had decided that Gephi’s internal graph representation and GUI structures were too tightly entwined for us to use it and were looking at other options, but all of them seemed to use Java or Python.

As it turns out, creating a Python interface to a C library is pretty similar to the JNI setup. I started to think about how I could write scripts to build the necessary connectors so that we could change the underlying C code easily without having to put much work into updating the wrappers. Once I realized that this was probably doable, I figured it must have been done before. Sure enough, there is an incredibly powerful tool called SWIG (Simplified Wrapper and Interface Generator) that does just what I was looking for. SWIG takes C and C++ declarations (typically your existing headers, pulled in through a small interface file) and generates wrapper code that you compile into a library that can be imported into your target high-level language. The targets include Perl, PHP, Python, Tcl, Ruby, C#, Common Lisp, D, Go, Java (including Android), Lua, Modula-3, OCaml, Octave and R. It can even pass complex data types like structures, pointers, and unbounded C arrays of both basic and user-defined types across the language boundary, and designs like singletons work without any trouble. I experimented with it for a few hours that evening on increasingly complicated test designs until I was satisfied that it could handle STINGER. By the time I went in the next morning, STINGER had two new build targets, one for Java and one for Python. It now even has a working Python demo that builds a random STINGER graph and runs connected components on it. I also built a filtering iterator construct that works in C, Python, and Java and lets you explore the graph in a way similar to the C-only traversal macros.
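
The Python demo itself is not reproduced here. To give a rough idea of what it computes, here is a minimal pure-Python sketch that builds a random undirected graph and labels its connected components by repeatedly propagating the smallest label across edges. It uses a plain dict of sets as a stand-in for STINGER, and the names random_graph and connected_components are made up for this sketch; they are not part of the STINGER C API or the SWIG-generated bindings.

```python
import random


def random_graph(num_vertices, num_edges, seed=0):
    """Build an undirected graph as {vertex: set(neighbors)}.

    This dict of sets is only a stand-in for a STINGER graph.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(num_vertices)}
    for _ in range(num_edges):
        u = rng.randrange(num_vertices)
        v = rng.randrange(num_vertices)
        if u != v:  # skip self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def connected_components(adj):
    """Label every vertex with the smallest vertex id in its component."""
    label = {v: v for v in adj}
    changed = True
    while changed:  # sweep over all edges until no label changes
        changed = False
        for u in adj:
            for v in adj[u]:
                if label[v] < label[u]:
                    label[u] = label[v]
                    changed = True
    return label


if __name__ == "__main__":
    graph = random_graph(num_vertices=20, num_edges=15)
    labels = connected_components(graph)
    print(len(set(labels.values())), "components")
```

With real bindings, the inner loop over neighbors is where a wrapped iterator or traversal call would go, which is also where the cost of crossing the language boundary would show up.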

In the end, I found SWIG to be a truly impressive and extremely useful tool. It will definitely have a permanent place in my development toolbox, and STINGER now has new interface options to aid its adoption in the graph community.

  • Rohit Banga

    Awesome find … that (hopefully) puts an end to our regular C vs other programming languages debates

    • robmc2049

      Haha, maybe. When calculating connected components, the C iterators have about a 50% overhead compared to the traversal macros, and the Python bindings are 15 to 18x slower than C.