Since it opened in May of this year, I had wanted to visit the BigBangData exhibition in Barcelona, but for various reasons I hadn’t found the time; a curious thing, considering that I live very close to the CCCB, where the event is held. Anyway, last Sunday I visited the exhibition and, broadly speaking, I was pleased with its format and content. I must admit, however, that I had some doubts beforehand about how the organizers would manage to present a complete picture of everything related to the world of Big Data in a condensed form that is accessible to any audience. Now I can say it’s a very well-made exhibition that combines several elements to help visitors understand the growth in the volume of data we are currently experiencing: plenty of interactive apps, descriptions of relevant projects, typical and novel visualisations, use cases, video interviews with experts, and many reflections on the power of data in diverse areas of society. So I highly recommend visiting this exhibition, which will be open in Barcelona until October 26th. Madrid will be its next stop, from February 25th, 2015.
Today, the data explosion is an unstoppable phenomenon with unexpected consequences. Accordingly, the exhibition starts with the proliferation of data centers worldwide and the growth of bandwidth in global communications. Here it’s interesting to look at the submarine cable map showing the intercontinental fiber-optic connections. All this is driven by a recurrent concept (a buzzword, maybe) throughout the exhibition: “datification”, i.e., the use of data as the key element that will drive business in the future as a new form of value, in a simple analogy with our dependence on oil today. Other fundamental concepts are added to the narrative to explain what makes all this possible, for instance the increase in data storage capacity and processing power, and the use of new data analysis techniques. Concepts such as correlation, prediction, pattern, metadata, data mining, aggregation, geolocation, and algorithms are already very present in our lives. In this part of the exhibition, an obvious but fundamental idea in data analysis arises: “Don’t let the trees keep you from seeing the wood”; that is, having a lot of data doesn’t mean having useful information, and it’s all too easy to get lost in a tangle of data, so we need methodologies and proven analysis processes that allow us to find the gold nugget among many rocks. Moreover, it’s necessary to step back in order to see things correctly and grasp the real dimension of a problem.
As could not be otherwise, the introduction to data visualisation starts with two emblematic examples that let us appreciate the power of representing a problem graphically in order to find patterns that help to solve it. The first historical example presented in the exhibition is the cholera map by John Snow (1854), which permanently changed how we look at an outbreak, and the second is the flow map by Charles Joseph Minard (1869), which shows the path of Napoleon’s troops across the Russian Empire of Alexander I. The latter is sometimes considered the first example of data visualisation. Another example mentioned is the Königsberg Bridge Problem by Euler (1736), which is considered the starting point of graph theory and network analysis. A book that has an important place in the exhibition is “Visual Complexity: Mapping patterns of information” (2013) by Manuel Lima (website), which gathers many of the best-known visualisations.
Many other interesting data visualisations and interactive apps are also presented. Some examples are Flight Patterns (2006, by Aaron Koblin), Barcelona cruise passenger behaviour (2012, by Telefónica), Russian tourism in Catalonia (2012, by Telefónica), and Barcelona commercial footprints (2013, by BBVA). There is also a visitor data analytics system (by Counterest) based on a camera and a facial recognition algorithm, which makes it possible to obtain, among other things, the number of visitors at a site, their gender, and their average visit time. This example prompts me to comment on the use of this technology. Although the approach presented by Counterest is to monitor, for example, a store or a supermarket in order to build a customer profile and thereby improve its product offering, it’s also true that the use of personal images works against the privacy and anonymity of the consumer. In this regard, I remember a news story from 2012 in which a mobile app called SceneTap (website / news) caused some controversy because a similar system was used in a bar to determine the gender and approximate age of its customers, and all that information could then also be made available through social networks. It’s easy to imagine various ways in which this information could be used, positively or negatively. However, the use of this kind of algorithm (and similar technologies) isn’t new. In fact, universities worldwide have been working on pattern recognition mechanisms for years, but now, with all the resources available (processing, communications, storage, etc.), it’s possible to apply them on a massive scale, so a new scenario has opened up and the rules of the game are still fuzzy. Maybe it’s time to apply some type of regulation, although I understand it’s a complicated topic, because restrictions can hold back innovation and ethical considerations are often at a disadvantage compared with business interests.
Another group of apps is related to sentiment analysis, i.e. the identification and extraction of subjective information from different sources (e.g. Twitter, Facebook) by using techniques such as NLP, text analysis, semantic analysis, etc. An interesting app here is “We feel fine”, a project that explores human emotions on a global scale. Another project that caught my attention was “Prime Numeric: Live Remix of the UK Leaders Debate” (2010) by SoSo Limited, which applies LIWC (Linguistic Inquiry and Word Count) text analysis libraries to track things like the emotions or keywords used by candidates in a debate, then tries to assess the accuracy or vagueness of their statements, and finally indicates the degree of credibility of each candidate. Now, I wonder how it would work with our politicians in Spain? Better not to answer.
Another topic in the exhibition was how data can be used to improve democracy in terms of transparency, the promotion of open data policies, and data as a social asset. The Civio Foundation is present with some projects that highlight the responsibility to inform citizens correctly and transparently about topics of public interest; its slogan is very revealing: “Bye Obscurity, Hello Democracy”. Some of its projects are Where do my taxes go? and The Pardonometer (Indultometro in Spanish). I would also like to mention a project called Afghanistan: The War Logs by The Guardian, which shows insurgent fatalities and cleared devices (IED attacks) in this troubled region. In general, there is a clear emphasis on the use of data as a key element for more rigorous reporting, and there is a special nod to data journalism, which is very much in vogue today.
Finally, it isn’t my intention to list every application or visualisation in the exhibition; I only wanted to give an overall view of the event and, of course, recommend it. As a final point, there is an issue raised at the end of the route that I would like to highlight: “the tyranny of data-centrism”, or putting data culture at the center of decision-making. A poster says that there are many possibilities associated with the analysis of massive datasets, but that there are also risks and a latent danger of thinking that the answer to every problem always lies in data, adding that “values as subjectivity and ambiguity are especially important at the time when it’s easy to believe that all solutions are computable”. By the way, as a curiosity, there is an installation called “24 HRs in photos” by Erik Kessels, formed by a mound of printed photographs corresponding to the images uploaded to Flickr over a 24-hour period: really a dump of photos, which is also an invitation to reflect on the use of our personal photos.