Chris Brownlie

Data Scientist & Statistical Officer - Department for Education

Aug 2019 - Present
Data Scientist (Higher Statistical Officer)
- Creating complex R Shiny applications for interactive analysis
- Developing SQL Server databases for the collection and storage of data
- Creating interactive dashboards using Microsoft PowerBI
- Modelling complex funding using multiple data sources

Sep 2018 - Aug 2019
Data Scientist
- Providing monthly financial forecasts for an £18bn spend
- Transferring complex funding models from Excel into R
- Creating an R package for survey analysis

Sep 2017 - Sep 2018
Graduate Data Engineer
- Experience manipulating and exploring a complex SQL Server Data Warehouse
- Extensive learning of SQL and R
- Development of interactive applications for monitoring Data Warehouse processes

Editor - Data Slice

March 2020 - Present
Editor
- 500+ subscribers
- An average of over 4k views per article
- Data Slice is an online publication I founded to host my blog and to combat the devaluation of Data Science articles on Medium
- Articles focus on interesting datasets or topics and always aim to provide novel insights
- See 'Articles' section for examples

Writer - Towards Data Science

April 2019 - Present
Writer
- 400+ followers of my personal profile
- 50k+ views across all 12 articles
- Towards Data Science is the largest Data Science publication on Medium with over 8m monthly viewers
- My first blog post in April 2019 gained several thousand views, after which I was asked to contribute to TDS
- See 'Articles' section for examples

Sep 2019 - Sep 2021
MSc Data Science - University of Sheffield
- Studied part-time whilst working as a Data Scientist at the DfE
- Key modules include: Data Mining, Data Visualisation, Researching Social Media & Database Design
- Awarded funding by the Department for Education Analytical Community

Sep 2014 - Jul 2017
BSc Economics - University of Nottingham
- Key modules: Econometrics, Mathematical Modelling, Development Economics
- Campus Ambassador for Nottingham Economics and Finance Society
- Awarded Gainsborough Prize 2017 for best submission to the Nottingham Economic Review

Mel Rugby

Mel is the name of both a Twitter bot which I created for the Rugby World Cup 2019 to tweet score predictions for matches, and the Machine Learning project which produced the predictions.

I started this project shortly before the Rugby World Cup in 2019 and developed it in R. There are four core parts to the framework:

The Scraper
The first part of the project uses various webscraping packages in R (mainly httr, xml2, rvest and RSelenium) to scrape a variety of rugby match data from across the internet. This includes match results and statistics (since the scraper was last run), upcoming fixtures and team announcements for upcoming games.
The Feature Extractor
In this part of the project, the raw data is transformed into a selection of structured, readable tables. Additional features are also extracted such as: team form, individual player form, a hybrid world ranking and relative strength of forwards vs backs.
The Models
I used a combination of models to produce the best results. After investigating, I found the best approach was to first use a decision tree classifier to identify 'high-scoring games', where the match is likely to be very one-sided. Upcoming fixtures were then fed into a model trained on previous matches of the same type. For example, if an upcoming game is predicted to be a high-scoring game, the result is predicted using a model for high-scoring games. This helped to deal with the issue of historic data being imbalanced towards low-scoring games.
Mel
The final stage of the framework is a series of scripts which take outputs from previous steps, format them and then tweet them from the @mel_rugby Twitter account. This includes tweeting predictions for upcoming games as well as results for previous games and updates on the accuracy of the framework.

All four parts of the framework were combined into a single master script, meaning it could be scheduled to run regularly and required no human intervention.
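
The two-stage idea described under 'The Models' can be sketched roughly as follows. This is only an illustration under several assumptions: the project itself was written in R, whereas this sketch uses plain Python; the single feature `ranking_gap` is hypothetical; a one-threshold rule (a depth-1 decision tree) stands in for the decision tree classifier; and a simple least-squares slope stands in for the per-type models, which are not specified above.

```python
from dataclasses import dataclass


@dataclass
class Match:
    ranking_gap: float   # hypothetical feature: home ranking points minus away
    margin: float        # final points margin (home minus away)
    high_scoring: bool   # label: was this a one-sided, high-scoring game?


def fit_stump(matches):
    """Depth-1 decision tree: choose the |ranking_gap| threshold with fewest errors."""
    candidates = sorted({abs(m.ranking_gap) for m in matches})

    def errors(threshold):
        return sum((abs(m.ranking_gap) >= threshold) != m.high_scoring for m in matches)

    return min(candidates, key=errors)


def fit_slope(matches):
    """Least-squares slope through the origin: margin ~ slope * ranking_gap."""
    num = sum(m.ranking_gap * m.margin for m in matches)
    den = sum(m.ranking_gap ** 2 for m in matches)
    return num / den if den else 0.0


def fit_two_stage(history):
    """Stage 1: classify the match type. Stage 2: fit one model per match type."""
    threshold = fit_stump(history)
    high = [m for m in history if m.high_scoring]
    low = [m for m in history if not m.high_scoring]
    return threshold, {True: fit_slope(high), False: fit_slope(low)}


def predict_margin(model, ranking_gap):
    """Route the fixture to the model trained on its predicted match type."""
    threshold, slopes = model
    is_high = abs(ranking_gap) >= threshold
    return is_high, slopes[is_high] * ranking_gap


# Made-up history, purely for illustration
history = [
    Match(1, 3, False), Match(-2, -5, False), Match(3, 6, False),
    Match(-4, -10, False), Match(12, 40, True), Match(-15, -45, True),
]
model = fit_two_stage(history)
print(predict_margin(model, 14))  # routed to the high-scoring model
print(predict_margin(model, 2))   # routed to the low-scoring model
```

Fitting a separate model per match type means the rarer high-scoring games are not swamped by the far more common low-scoring ones, which is the imbalance issue mentioned above.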

surveyr

'surveyr' is an R package I developed from scratch to enable quick and easy analysis of survey responses. It is available on my GitHub.

This was developed as part of the Data Science Labs programme at the Department for Education. During the programme I was presented with a problem that colleagues in the Department were experiencing.

The problem was that for large surveys, colleagues were struggling to analyse the responses, either taking small samples and analysing by hand or contracting the work out for several thousand pounds.

I approached this by using my expertise in Text Analytics and Natural Language Processing to develop a package which would allow other analysts in the Department to quickly and easily analyse survey responses, even if they had no experience with text analysis.
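
As a rough illustration of the kind of text analysis involved, the sketch below counts the most common terms across free-text responses. It is a generic example written in Python, not surveyr's actual interface (the package is written in R and its functions are not described here); the function name `top_terms`, the tiny stop-word list and the sample responses are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from surveyr)
STOP_WORDS = {"the", "a", "an", "and", "of", "on", "to", "is", "in",
              "for", "it", "was", "were", "too", "more"}


def top_terms(responses, n=3):
    """Return the n most frequent non-stop-word terms across all responses."""
    counts = Counter()
    for text in responses:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up survey responses
responses = [
    "The funding guidance was unclear",
    "More guidance on funding, please",
    "Guidance is too long",
]
print(top_terms(responses, 2))  # [('guidance', 3), ('funding', 2)]
```

Even a simple frequency count like this gives a quick view of recurring themes across all responses, rather than relying on a small hand-read sample.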

Chris Brownlie

Data Scientist & Writer

Nottingham, UK

Articles:

3.5 Years of a Relationship, in Whatsapp Messages

An Automated Framework for Predicting Sports Results

A Game of Words
