How I Used Python and Open Data to Build an Interactive Map of Edinburgh’s Beergardens

With summer finally arriving, I wanted to find out where a good place for enjoying a nice chilled drink (alcoholic or non-alcoholic) outside would be in my hometown Edinburgh. So I combined an open data set about chair permits with some geo-coding and created an interactive map of places with outside seating in Edinburgh.

Background and Project Description

During the last few years, UK government has been working on open sourcing its data and Edinburgh City Council is no exception. At https://edinburghopendata.info, you can find a list of data sets containing information about many aspects of public life (event though some files could admittedly do with some updating). For example this page hosts a file containing details about chair and table permits for the year 2014. Luckily, an up-to-date version can be found here. Note that while the file structure is structurally the same for both files, the headers are different, so if you want to look at the historic data, you will need to adapt the code below accordingly. The file contains names and addresses of premises which have permission to put out chairs as well as some additional information. This file forms the basis of this project, which is divided into four parts:

get and load the permit file
use the open street map API to get the latitude and longitudes for each establishment as well as premise category
clean and bin the premise categories
plot the premises on a map using folium

Without further ado, let’s get started. The full notebook can be found on my GitHub.

Step 0: Setting up

First, we import the libraries.

import pandas as pd
import requests
import wget

import folium
from folium.plugins import MarkerCluster

Step 1: Getting Data

We use wget to download the file and read it into a pandas data frame. Make sure to set the encoding since the file contains special characters (lots of Cafes on the list).

filename = wget.download("http://www.edinburgh.gov.uk/download/downloads/id/11854/tables_and_chairs_permits.csv")

df0 = pd.read_csv(filename, encoding = "ISO-8859-1")
df0.head()

	Permit	Premises Name	Premises Address	Start date	End date	Ext	Table Area
0	5.0	The Standing Order	62-66 George Street	01/04/2019	31/03/2020	22:00	6.0m x 1.5m & 6.0m x 1.2m
1	6.0	Bushwick Bar and Grill	7-11 East London Street	01/06/2019	01/09/2019	22:00	7.85m x 1.8m and 2.5m x 1.8m
2	7.0	La Barantine	202 Bruntsfield Place	01/01/2019	31/12/2019	NaN	2.0m x 1.2m
3	10.0	The Inverleith	10 Bowhill Terrace	18/04/2019	31/08/2019	NaN	4.1m x 3.1m
4	12.0	Angels Share Hotel	7-11 Hope Street	01/05/2019	30/04/2020	22:00	4.8m x 1.4m and 1.5m x 1.4m and 1.5m x 1.4m

A quick look at the data reveals that there are a few duplicates in the data. They are mainly due to multiple permits with different start and end dates. A good way of cleaning would be to filter on dates, but frankly I don’t care that much at this point, so I just keep the premise names and addresses and drop the duplicates. (Note: The file also contains information about the table area, which I might revisit at some point in the future). After dropping the duplicates, we are left with 389 rows with premise names and addresses.

# dropping duplicate entries
df1 = df0.loc[:, ['Premises Name', 'Premises Address']]
df1 = df1.drop_duplicates()

# in 2012: 280
print(df1.shape[0])

A remark on the side: In the summer 2014, there were only 280 premises with a chair and table permit. Open air culture is indeed taking off and this is the data to proof it :)

Step 2: Getting latitudes and longitudes for each premise

If we want to visualize premises on a map, addresses are not enough, we need GPS coordinates. There are different APIs, which allow you to query for an address and will return latitudes and longitudes (a process called geocoding. On possibility is to use the Google Maps API, but it comes with caveats. The OpenStreetMap API provides the same functionality, but is free to use and the results are decent enough for my purpose.

We use the pandas map function to obtain the API response for each row. After querying the API, we drop all rows, where we did get not get a response. Again, I am not too botherered about the few premises (about 20) which I am loosing, there are plenty left.

def query_address(address):
    """Return response from open streetmap.
    
    Parameter:
    address - address of establishment
    
    Returns:
    result - json, response from open street map
    """
    
    url = "https://nominatim.openstreetmap.org/search"
    parameters = {'q':'{}, Edinburgh'.format(address), 'format':'json'}
    
    response = requests.get(url, params=parameters)
    # don't want to raise an error to not stop the processing
    # print address instead for future inspection
    if response.status_code != 200:
        print("Error querying {}".format(address))
        result = {}
    else:
        result = response.json()
    return result

df1['json'] = df1['Premises Address'].map(lambda x: query_address(x))

# drop empty responses
df2 = df1[df1['json'].map(lambda d: len(d)) > 0].copy()
print(df2.shape[0])

Looking at the json fields in the response, we find that in addition to the coordinates, the API also returns a field named ‘type’, which contains the type of premise at this address. I add this information to the data frame together with the coordinates.

# extract relevant fields from API response (json format)
df2['lat'] = df2['json'].map(lambda x: x[0]['lat'])
df2['lon'] = df2['json'].map(lambda x: x[0]['lon'])
df2['type'] = df2['json'].map(lambda x: x[0]['type'])

The most frequent premise types are cafes, pubs, restaurants, tertiary and houses:

df2.type.value_counts()[:5]

cafe          84
pub           69
restaurant    66
tertiary      33
house         27
Name: type, dtype: int64

Step 3: Assigning Premise Categories

I am mostly interested in distinguishing between two types of premises: the ones that sell coffee and are more likely to be open during the day (like coffee shops and bakeries) and the ones that sell beer and are more likely to open in the evenings (like pubs and restaurants). I therefore want to sort my premises into three categories:

Category 1: day-time places (coffee shops, bakeries, delis, ice-cream)
Category 2: pubs, restaurants, fast-food and bars
Category 3: everything else

To do this I have two sources of information: the premise name and the type returned by OpenStreetMap. Looking at the data, we find that the type is good first indicator, but also that many places are labeled incorrectly or not at all. I therefor apply a two-step approach: i) Assign the category based on the OpenStreetMap type ii) Clean up the data using its name, where this steps overwrites step i). To clean up the data, I decided to overrule the OpenStreetMap classification if the premise name contains certain key elements (such as ‘cafe’, ‘coffee’ or similar for coffee shops and ‘restaurant’, ‘inn’ or similar for restaurant and pubs). This misclassifies for example Cafe Andaluz as coffee shop, but works decently well in most cases. Particularly it seems to most keep to the pattern of classifying as coffee shops places, which are likely to be open during the day, so it works for my purpose. Of course, with fewer than 400 entries, one could manually go through the list and assign the correct category to each and every one of the entries. However, I am interested in creating a process, which can be easily transfered to other places, therefore a manual intervention specifically taylored to Edinburgh’s scenery is not suitable.

Step 3a: Assigning Premise Categories According to OpenStreetMap Type

def define_category(mytype):
    if mytype in ['cafe', 'bakery', 'deli', 'ice_cream']:
        category = 1
    elif mytype in ['restaurant', 'pub', 'bar', 'fast_food']:
        category = 2
    else:
        category = 3
    return category

# assign category according to OpenStreetMap type
df2['category'] = df2['type'].map(lambda mytype: define_category(mytype))

Step 3b: Overwriting Categories According to Premise Name

def flag_premise(premisename, category):
    """Flag premise according to its name.
    
    Parameter: 
    premisename - str
    
    Returns:
    ans - boolean
    """
    prem = str(premisename).lower()
    if ((category == 'coffeeshop' and ('caf' in prem 
                                       or 'coffee' in prem 
                                       or 'Tea' in str(premisename) 
                                       or 'bake' in prem 
                                       or 'bagel' in prem 
                                       or 'roast' in prem))
         or 
        (category == 'restaurant' and ('restaurant' in prem 
                                       or 'bar ' in prem 
                                       or 'tavern' in prem 
                                       or 'cask' in prem 
                                       or 'pizza' in prem
                                       or 'whisky' in prem
                                       or 'kitchen' in prem
                                       or 'Arms' in str(premisename)
                                       or 'Inn' in str(premisename) 
                                       or 'Bar' in str(premisename)))):
        ans = True
    else:
        ans = False
    return ans

# flag coffee shops and restaurants according to their names
df2['is_coffeeshop'] = df2['Premises Name'].map(lambda x: flag_premise(x, category='coffeeshop'))
df2['is_restaurant'] = df2['Premises Name'].map(lambda x: flag_premise(x, category='restaurant'))

A quick inspection shows that the reassignement seems reasonable:

# show some differences between classification by name and by type returned by the API
df2.loc[(df2.is_coffeeshop) & (df2.type != 'cafe'), ['Premises Name', 'type']].head(10)

	Premises Name	type
19	One20 Wine Café	bar
27	Café Habana	theatre
38	Southern Cross Café	fast_food
73	Café Rouge	restaurant
95	Café Andaluz	restaurant
114	The Manna House Bakery	tertiary
115	Crumbs Café	house
146	Cafe Renroc	residential
185	Hard Rock Café	tertiary
227	Snax Café	house

I reassign the category for the premises flagged as restaurant or coffee-shop. Should a premise have been flagged as both, the coffee shop category takes precedence:

# re-set category if flagged as restaurant or coffeeshop through name
df2.loc[df2.is_restaurant, 'category'] = 2
df2.loc[df2.is_coffeeshop, 'category'] = 1

Step 4: Visualization

Finally, we use Python’s Folium package to visualize our results as markers on a map. Adding the individual points to MarkerClusters allows us to summarize the symbols into groups if too many symbols are in the same region. Creating a separate cluster for each category allows us to use the LayerControl option to toggle each of the categories individually. We use the ‘fa’ prefix to use the font-awesome (instead of the standard glyphicon) symbols.

# central coordinates of Edinburgh
EDI_COORDINATES = (55.953251, -3.188267)
  
# create empty map zoomed in on Edinburgh
map = folium.Map(location=EDI_COORDINATES, zoom_start=12)

# add one markercluster per type to allow for individual toggling
coffeeshops = MarkerCluster(name='coffee shops').add_to(map)
restaurants = MarkerCluster(name='pubs and restaurants').add_to(map)
other = MarkerCluster(name='other').add_to(map)

# add coffeeshops to the map
for chairs in df2[df2.category == 1].iterrows():
    folium.Marker(location=[float(chairs[1]['lat']), float(chairs[1]['lon'])], 
                  popup=chairs[1]['Premises Name'],
                 icon=folium.Icon(color='green', icon_color='white', icon='coffee', angle=0, prefix='fa'))\
    .add_to(coffeeshops)
    
# add pubs and restaurants to the map
for chairs in df2[df2.category == 2].iterrows():
    folium.Marker(location=[float(chairs[1]['lat']), float(chairs[1]['lon'])], 
                  popup=chairs[1]['Premises Name'],
                 icon=folium.Icon(color='blue', icon='glass', prefix='fa'))\
    .add_to(restaurants)
    
# add other to the map
for chairs in df2[df2.category == 3].iterrows():
    folium.Marker(location=[float(chairs[1]['lat']), float(chairs[1]['lon'])], 
                  popup=chairs[1]['Premises Name'],
                 icon=folium.Icon(color='gray', icon='question', prefix='fa'))\
    .add_to(other)
    
# enable toggling of data points
folium.LayerControl().add_to(map)    
    
display(map)

Supplementary Step 5: Saving the map to png

I wanted to have a screen shot of the map to be able to imbed the static version into my Medium post (which does not accept the dynamic version). The best way to get a static version (which is not just taking a screenshot) which I have found is to save the map in HTML format and then use Selenium to save a screenshot of the HTML. This is how this can be done (credits to this stackoverflow post for the Selenium part).

Note: In order to get the following to work you need to install the geckodriver. Download the file from here and put it into /usr/bin/local (for Linux machines).

import os
import time
from selenium import webdriver

# save map
fn = 'beergarden_happiness_map.html'
tmpurl = 'file:///{path}/{mapfile}'.format(path=os.getcwd(),mapfile=fn)
map.save(fn)

# download screenshot of map
delay = 5
browser = webdriver.Firefox()
browser.get(tmpurl)
# give the map tiles some time to load
time.sleep(delay)
browser.save_screenshot('{mapname}.png'.format(mapname=fn.split('.')[0]))
browser.quit()

Summary

In this post we downloaded an open data set containing chair and table permits from the Edinburgh Council. We then used the Open Street Map API to obtain the types and GPS positions for the premises based on their address. After some additional data cleaning based on the premise names, we binned the premises into the three categories “coffee shop”, “pub/restaurant” and “other” and plotted them on an interactive map, which we saved in HTML format and subsequently converted to png format.

Conclusion

We now have a working beer garden and open air coffee shop map of Edinburgh and can enjoy the summer sitting outside with a nice iced-coffee or an ice-cold beer - Prost! :)

Map put to good use: A chilled post-work drink in the sun :)

Share on

Twitter Facebook Google+ LinkedIn

How I Used Python and Open Data to Build an Interactive Map of Edinburgh’s Beergardens

Jessica Walkenhorst

Background and Project Description

Step 0: Setting up

Step 1: Getting Data

Step 2: Getting latitudes and longitudes for each premise

Step 3: Assigning Premise Categories

Step 3a: Assigning Premise Categories According to OpenStreetMap Type

Step 3b: Overwriting Categories According to Premise Name

Step 4: Visualization

Supplementary Step 5: Saving the map to png

Summary

Conclusion

Share on

Leave a Comment

You May Also Enjoy

Building and Deploying a Meetup Organizer App for London’s PyData Meetup

Why You Shouldn’t Deploy Streamlit Applications from Your Main GitHub Account and What You Should Do Instead

How to use fastai, Gradio and Hugging Face to Save an Ugly Duckling from a Youth of Misery

We Are Beating Child Mortality - How to Create Interactive and Animated Visualizations in Python