Data Blast

Data, Telecom, Maths, Astronomy, Origami, and so on

Next Stop Dublin: Public Libraries, Supermarkets and Voronoi Diagrams


I’ve been living in Dublin for only a couple of weeks, and I’d like to write a post related to the city. In these few weeks I’ve visited some places that pleasantly surprised me, for example the Trinity College Library with its “Book of Kells”, the huge Phoenix Park with its deer, and the Science Gallery with its interesting temporary exhibitions. In the surroundings of the city I visited the Celtic Boyne Valley (including Trim Castle, the “Braveheart” castle) and had the opportunity, for the first time, to face the “Irish bog” on Seahan mountain near Tallaght. So I’d simply like to say that I’m delighted with the city and its people. Moreover, it’s a very active city for IT, with several meetups worth considering, such as DublinR, Python Ireland, Hadoop User Group Ireland, DublinKind, and Big Data Developers Dublin. A special mention goes to Chapters Bookstore, a great find.

[Image: collageDub]

Dublin Data

As a newcomer to the city, I wanted to know where some key sites, such as supermarkets and public libraries, are located. So I set out to build a map of these locations together with their Voronoi diagram, in order to visualize the area of coverage or influence of each point. According to Wolfram MathWorld, a Voronoi diagram is “a partitioning of a plane with points into convex polygons such that each polygon contains exactly one generating point and every point in a given polygon is closer to its generating point than to any other. A Voronoi diagram is sometimes also known as a Dirichlet tessellation. The cells are called Dirichlet regions, Thiessen polytopes, or Voronoi polygons”.

To find GPS coordinates for the supermarkets, I used a Python script that connects to the Yelp API v2. I don’t know what the problem with the Yelp API is, but I could only gather 1000 of the 1153 points that the Yelp search browser reports, and of these, 442 supermarkets are really in the Dublin City area. For the public libraries I used the “geopy” package, which geolocates a query to an address and coordinates. In both cases, I must say that some places do not appear exactly at their real position, but as a proof of concept that’s OK for me. As the Dublin City area I considered the five areas described on the city website:

  1. Central Area: This includes Broadstone, North Wall, East Wall, Drumcondra, Ballybough and the north city centre.
  2. North Central Area: This includes Kilbarrack, Raheny, Donaghmede, Coolock, Clontarf and Fairview.
  3. North West Area: This includes Cabra, Ashtown, Finglas, Ballymun, Santry, Whitehall, Glasnevin, the Phoenix Park and parts of Phibsborough.
  4. South Central Area: This includes Ballyfermot, Inchicore, Crumlin, Drimnagh, Walkinstown, The Liberties and the south west inner city.
  5. South East Area: This includes Rathmines, Rathgar, Terenure, Ringsend, Irishtown, Pearse Street and the south east inner city.
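
The Voronoi definition quoted above boils down to a nearest-generator rule: a location belongs to the cell of whichever generating point is closest to it. As a minimal sketch in base R, assuming made-up coordinates (the `sites` table and the `nearest_site` helper are hypothetical, not part of the original script):

```r
# Toy generating points (hypothetical coordinates, not real library locations)
sites <- data.frame(
  name = c("A", "B", "C"),
  long = c(-6.30, -6.25, -6.20),
  lat  = c(53.35, 53.38, 53.33),
  stringsAsFactors = FALSE
)

# Return the index of the generating point closest to (long, lat).
# Planar Euclidean distance on raw long/lat is used here, which ignores
# the Earth's curvature; that is acceptable only at city scale.
nearest_site <- function(long, lat, sites) {
  d2 <- (sites$long - long)^2 + (sites$lat - lat)^2
  which.min(d2)
}

# Which cell does this query location fall into?
query <- c(long = -6.26, lat = 53.36)
idx <- nearest_site(query[["long"]], query[["lat"]], sites)
sites$name[idx]
```

Checking every location against every generator like this is slow for large inputs, but it is a handy way to sanity-check the polygons produced by a proper Voronoi routine.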

Additionally, and again as a proof of concept, I got two datasets from Dublinked (Open Data) and AIRO with information about Primary and Post-Primary schools in Dublin City (2013-2014 census). My idea was, for example, to know how many students are studying in a particular area of the city, or how many students are assigned to, say, a specific library (Voronoi polygon). In the Post-Primary schools dataset, the school coordinates are Irish Grid eastings and northings (a Transverse Mercator projection, EPSG:29902) rather than latitude and longitude, so it’s necessary to transform them to GPS coordinates (e.g. from CRS("+init=epsg:29902") to CRS("+init=epsg:4326")). The datasets also contain information about school ethos and separation by gender, but I was only interested in the total values. In this Github, you can find the kml and csv files. An example:

library(deldir)
library(ggplot2)
library(ggmap)
library(sp)
library(rgdal)
library(maptools)

#Load data with GPS coordinates for Public Libraries in Dublin City 
df <- read.csv("t_lib.csv",header = TRUE, sep = ",",stringsAsFactors=FALSE)

# Voronoi data
vor <- deldir(df$long, df$lat)

# Creating Voronoi polygons
w = tile.list(vor)
polys = vector(mode='list', length=length(w))
for (i in seq(along=polys)) {
 pcrds = cbind(w[[i]]$x, w[[i]]$y)
 pcrds = rbind(pcrds, pcrds[1,])
 polys[[i]] = Polygons(list(Polygon(pcrds)), ID=as.character(i))
 }
SP = SpatialPolygons(polys)
voro = SpatialPolygonsDataFrame(SP, data=data.frame(x=df$long,y=df$lat, row.names=sapply(slot(SP, 'polygons'), function(x) slot(x, 'ID'))))

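# Build one data frame with the vertices of every polygon,
# tagged with the name of the library that generates it
pvor1 <- data.frame()
for (i in seq_along(polys)) {
  pvor2 <- as.data.frame(SP@polygons[[i]]@Polygons[[1]]@coords[, 1:2])
  names(pvor2) <- c("V1", "V2")  # column names used by geom_polygon below
  pvor2$ID <- df$name[i]
  pvor1 <- rbind(pvor2, pvor1)
}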

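# Plotting: points, polygons and segments
# Note: recent ggmap versions may need a Google API key (register_google()) before get_map() works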
dub_map <- get_map(location = "Dublin", zoom = 11)
ggmap(dub_map) + geom_point(aes(x = long, y = lat), data = df, colour = "blue", size = 3)+
geom_polygon(aes(x=V1, y=V2,group=ID,fill=ID),data=pvor1, alpha=0.3)+
ggtitle("Voronoi Polygons for Public Libraries in Dublin City")+geom_segment(
 aes(x = x1, y = y1, xend = x2, yend = y2),
 size = 1,
 data = vor$dirsgs,
 linetype = 1,
 color= "#FFB958")

[Image: Voronoi_Dublin]

In this RPubs you can find the RMarkdown file. PS: Donaghmede Library has zero students because this library is outside the Dublin City area according to the boundary defined (North Central kml), so the surrounding schools were filtered out.

Other plots:

[Images: plot1_1, plot1, plot2, plot3, plot_schools_dens]

It’s also possible to generate kml files for the points, polygons and segments and load them into Google Maps.
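
The EPSG:29902 to EPSG:4326 conversion mentioned for the Post-Primary schools dataset can be sketched as follows. This is only an illustration under assumptions: it uses the "sf" package instead of the sp/rgdal calls in the script above (rgdal has since been retired from CRAN), and the `schools` table with its two rows is made up. EPSG:29902 is the Irish Grid, whose true origin at easting 200000, northing 250000 makes a convenient sanity check.

```r
library(sf)

# Hypothetical school table with Irish Grid eastings/northings in metres.
# The first row is the true origin of the Irish Grid, which lies at
# roughly 8 degrees West, 53.5 degrees North.
schools <- data.frame(
  school = c("grid_origin", "example_school"),
  x = c(200000, 315000),
  y = c(250000, 234000),
  stringsAsFactors = FALSE
)

# Declare the source CRS (Irish Grid), then reproject to WGS84 long/lat
pts_ig  <- st_as_sf(schools, coords = c("x", "y"), crs = 29902)
pts_wgs <- st_transform(pts_ig, 4326)

# Pull the reprojected coordinates back into plain columns
lonlat <- st_coordinates(pts_wgs)
schools$long <- lonlat[, "X"]
schools$lat  <- lonlat[, "Y"]
schools
```

After this step the schools share the long/lat columns used for the libraries, so both can be drawn on the same map.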

Comments:

I’d like to comment that the “deldir” R package uses Lee and Schachter’s algorithm for the Delaunay triangulation; however, it would be interesting to apply an algorithm (e.g. a modified Fortune’s algorithm) that can generate, say, a weighted Voronoi diagram, since in reality each library has different resources and opening hours, so it’s possible to use metrics other than plain Euclidean distance. In fact, an interesting next step would be to review “power diagrams”, which are a generalization of Voronoi diagrams.
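
To give an idea of what a power diagram changes, here is a minimal base R sketch. In a power diagram each site i has a weight w_i, and a point p belongs to the site that minimises the power distance |p - s_i|^2 - w_i; with equal weights this is the ordinary Voronoi rule. All names, coordinates, weights and student counts below are made up for illustration, and the sketch assigns schools to libraries directly instead of constructing the polygons.

```r
# Hypothetical libraries: a larger weight (e.g. longer opening hours)
# enlarges a library's cell
libs <- data.frame(
  name   = c("Lib1", "Lib2"),
  x      = c(0, 10),
  y      = c(0, 0),
  weight = c(0, 40),
  stringsAsFactors = FALSE
)

# Hypothetical schools with total student counts
schools <- data.frame(
  x        = c(2, 4, 6, 8),
  y        = c(0, 0, 0, 0),
  students = c(100, 200, 300, 400)
)

# Assign each point to the library minimising |p - s_i|^2 - w_i
assign_power <- function(px, py, libs, weights = libs$weight) {
  vapply(seq_along(px), function(k) {
    pd <- (libs$x - px[k])^2 + (libs$y - py[k])^2 - weights
    which.min(pd)
  }, integer(1))
}

# Ordinary Voronoi assignment (all weights zero) versus weighted assignment
plain    <- assign_power(schools$x, schools$y, libs, weights = rep(0, nrow(libs)))
weighted <- assign_power(schools$x, schools$y, libs)

# Total students per library under each rule
totals_plain <- tapply(schools$students,
                       factor(libs$name[plain], levels = libs$name), sum)
totals_weighted <- tapply(schools$students,
                          factor(libs$name[weighted], levels = libs$name), sum)
totals_plain
totals_weighted
```

In this toy setup the boundary between the two libraries moves from x = 5 to x = 3 once Lib2 gets its weight, so the school at x = 4 switches from Lib1 to Lib2.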

As a last comment, I want to recommend the book “Longitude” by Dava Sobel. I know it’s old (1995), but it is also one of the reasons why I wrote this post; it was a kind of inspiration. In short, it’s the true story of a lone genius who solved the greatest scientific problem of his time: measuring longitude at sea. It’s a story with a clear scientific background, from which you can learn different concepts related to navigation and geography. Moreover, it’s a story of overcoming adversity, and of how jealousy, egos and ignorance hinder scientific progress.
