I want to play with it a little; the visualization for the kernel must be amazing. It will also be interesting to see how the interactions changed from 2.4 to 2.6 with the new versioning policies.
What do diagrams, maps and blueprints have in common?
All of them are used to communicate a message, or in other words, to inform. In fact, the three of them are different kinds of the same concept: they are visualizations.
But what is visualization?
Visualization, or information visualization, is any technique or process used to present data in a form that makes it easy to build a mental model of it. The form in which the data is presented varies heavily depending on the characteristics of the data, from a simple icon indicating the women's restroom to a complex, interactive three-dimensional representation of molecules.
Although these processes have existed since the dawn of humankind, beginning with the prehistoric cave paintings, it was with the invention of computer graphics that visualization improved by leaps and bounds. With the use of computers, the size of tractable data sets grew dramatically, and so larger and more detailed sources of data began to be used. Computers also made it easy to generate animations and even provided the means to build interactive visualizations that can be manipulated and explored in real time.
But, of course, there are still problems with the existing visualization techniques. If the data set is large or complex, it may be very difficult to generate a comprehensible representation. A particular case of this situation is dynamic information, where the data is constantly being updated, either by changing its values or by changing the relationships in the set. And the catch is that without a good visualization we cannot properly understand how these systems work.
This was the problem that Ben Fry worked on for his Master's thesis at the MIT Media Lab, titled Organic Information Design. Ben approached the creation of dynamic visualizations by studying simple organisms as examples of decentralized and adaptive systems. By identifying the features that make these systems effective, and taking into account the quirks of the human brain and how we perceive the world, Ben developed a completely new and useful kind of visualization which, as you may guess, he named Organic Information Design.
I can't continue without noting that Ben Fry, together with Casey Reas, co-founded Processing, an open-source language and environment to program visualizations and interactions.
One example of a large and complex data set is the commit history of a software project. In almost any software project, all the source files are kept in what is called a repository, a kind of database that keeps track of all the changes made to the files: modifications, additions, deletions, moves, etcetera. In a relatively big software project this means tracking the changes made by tens of programmers over thousands of files. Open-source projects are a great source for this type of data. Take, for example, Python's Subversion repository and try to find out who the most active contributors are, or whether there is any relationship between developers and the kinds of files they work on, and do all this over the whole life of the project. Not an easy task, but a great example for understanding the power of visualizations.
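To make those two questions concrete, here is a minimal Python sketch of how they might be answered with plain counting. Everything in it is an assumption for illustration: the commit records are made-up sample data (not taken from any real repository), and the function names `commits_per_author` and `file_types_per_author` are hypothetical. In practice the records would first have to be parsed from the repository log.

```python
from collections import Counter, defaultdict
from pathlib import PurePosixPath

# Made-up sample data: (author, date, paths changed in that commit).
commits = [
    ("alice", "2000-01-03", ["src/parser.c", "doc/intro.tex"]),
    ("alice", "2000-01-05", ["src/lexer.c"]),
    ("bob", "2000-02-11", ["doc/intro.tex", "doc/reference.tex"]),
    ("carol", "2000-03-20", ["lib/sort.py", "src/parser.c"]),
]


def commits_per_author(commits):
    """Count how many commits each author made."""
    return Counter(author for author, _date, _files in commits)


def file_types_per_author(commits):
    """Count, per author, how often each file extension was touched."""
    types = defaultdict(Counter)
    for author, _date, files in commits:
        for path in files:
            ext = PurePosixPath(path).suffix or "(none)"
            types[author][ext] += 1
    return types


if __name__ == "__main__":
    # Authors ordered from most to least active.
    print(commits_per_author(commits).most_common())
    # Which kinds of files each author tends to work on.
    for author, exts in file_types_per_author(commits).items():
        print(author, dict(exts))
```

Even on a real log, tables like these only give static totals. They say nothing about how the activity shifts over the life of the project, which is exactly the part a visualization can show.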
Michael Ogawa, a Ph.D. student at UC Davis, has developed an organic software visualization targeted at the commit history of software projects and has tested it on some open-source programs. This visualization is called code_swarm and has been developed using Processing.
In his own words:
This visualization, called code_swarm, shows the history of commits in a software project. A commit happens when a developer makes changes to the code or documents and transfers them into the central project repository. Both developers and files are represented as moving elements. When a developer commits a file, it lights up and flies towards that developer. Files are colored according to their purpose, such as whether they are source code or a document. If files or developers have not been active for a while, they will fade away. A histogram at the bottom keeps a reminder of what has come before.
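The behavior described in that quote can be imitated with a toy model. The Python sketch below is not code_swarm's actual implementation; it is only an assumption-laden illustration of the two ideas of "lighting up and flying towards the developer" and "fading away". The `Node` class, the `commit` and `step` functions, and the `FADE` and `PULL` constants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A file or a developer drawn on screen."""
    x: float
    y: float
    brightness: float = 0.0  # 0 = invisible, 1 = fully lit


FADE = 0.9  # hypothetical: fraction of brightness kept each frame
PULL = 0.2  # hypothetical: fraction of the remaining gap closed each frame


def commit(file_node, dev_node):
    """A commit lights up both the file and the developer."""
    file_node.brightness = 1.0
    dev_node.brightness = 1.0


def step(file_node, dev_node):
    """Advance one frame: the file drifts towards the developer, both fade."""
    file_node.x += (dev_node.x - file_node.x) * PULL
    file_node.y += (dev_node.y - file_node.y) * PULL
    file_node.brightness *= FADE
    dev_node.brightness *= FADE


if __name__ == "__main__":
    dev = Node(0.0, 0.0)
    source_file = Node(10.0, 0.0)
    commit(source_file, dev)
    for frame in range(5):
        step(source_file, dev)
        print(frame, round(source_file.x, 3), round(source_file.brightness, 3))
```

With no further commits, the file keeps approaching the developer while both grow dimmer, so inactive elements gradually disappear from view.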
Returning to the Python commit history, watch the code_swarm visualization.
It shows in a really straightforward way how Guido developed Python completely alone at the beginning, how different people tend to work on different kinds of files, how the prominence of developers changes, and how the project grew spectacularly in the year 2000. All these facts are completely invisible when you look at the raw commit history!
Besides the Python commit history, Michael Ogawa has uploaded visualizations for the commit histories of Eclipse and Apache. It's interesting to see the differences between the patterns of these projects, even though all of them are absolutely successful.
So when developing any program, it is important to bear in mind that a good visualization can make the difference between comprehension of the data and total perplexity.
Hello! My name is Eduard Giménez and this is my humble weblog.
I'm a programmer settled in Barcelona, trying to figure out my way. Meanwhile I spend my time studying and trying to solve some problems. My interests range from operating systems, complexity and algorithms to economics, psychology and photography. You can find more about me in my LinkedIn profile.