Intro

Hello there! I currently work as a senior data scientist in Capital One's internal data science consultancy.

I specialise in building supervised and unsupervised classification and regression models, timeseries analysis, computer vision and NLP. This work has been applied to business problems in a huge range of fields, from understanding fraudulent spend patterns and the features that drive customers to default, to classifying the contents of an oil tanker.

My current and previous industry experience covers tech-startup environments in commodities and the financial services sector. I was formerly a postdoctoral research fellow in machine learning applications to high-energy astrophysics.

Please enjoy navigating this webpage for links to some of my publicly available GitHub projects. Most of my previous work is not available for public release, but I have prepared a few example projects. These cover NLP applications to sentiment and intent analysis problems, timeseries forecasting, and other more general ML problems, and tend to be in the form of Jupyter notebooks. Several of these projects are fully developed Python packages installable from PyPI with Python 3.7. Please do get in touch to suggest any improvements or report bugs.

GitHub Projects

Please see below for short descriptions of, and links to, some of my publicly available model-building work.


Timeseries Analysis

A timeseries is just a quantity (a number) recorded at regular intervals. Stock markets, seismographs, dollar-to-euro conversions and daily virus cases are all examples of timeseries. Because of the massive variety of industrial applications, timeseries analysis tends to crop up again and again! Its two main applications are forecasting and anomaly detection.
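To make forecasting and anomaly detection concrete, here is a minimal sketch using only the Python standard library. The function names, the window sizes, the threshold and the `daily_cases` numbers are all made up for illustration; real projects would normally use a dedicated forecasting library rather than this toy approach.

```python
from statistics import mean, stdev


def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` points."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return mean(series[-window:])


def flag_anomalies(series, window=5, threshold=3.0):
    """Return indices of points that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical daily counts with one obvious spike at index 7.
daily_cases = [10, 11, 9, 10, 12, 11, 10, 50, 11, 10]

next_value = moving_average_forecast(daily_cases)
spikes = flag_anomalies(daily_cases)
print("forecast for the next day:", next_value)
print("anomalous indices:", spikes)
```

Note how the spike drags the naive forecast upwards. That sensitivity is one reason anomaly detection and forecasting tend to be tackled together.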


  • An intro to timeseries modelling on Medium


  • Divinity
  • This is a timeseries forecasting package similar to Facebook Prophet, with a few tweaks and customisations from my time in astrophysics. It is still very much in the R&D phase, with the aim of automating away as many of the manual choices in a timeseries forecast as possible.


    Computer Vision

    This branch of data science is concerned with training models to make classifications from an input image (e.g. whether an image is of a dog or a cat). These problems typically use neural nets to make classifications or to predict the probability of an image belonging to each of several classes.
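As a rough illustration of how a network's raw outputs become class probabilities, the sketch below applies a softmax function to a pair of scores. The scores and class names are invented for the example; in practice they would come from the final layer of a trained network.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps the exponentials numerically stable.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of a two-class image classifier.
class_names = ["dog", "cat"]
raw_scores = [2.0, 0.5]

probabilities = softmax(raw_scores)
predicted = class_names[probabilities.index(max(probabilities))]
print(dict(zip(class_names, probabilities)))
print("predicted class:", predicted)
```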


  • Vessel Detection
  • In a previous life I was interested in identifying vessels in satellite images. A mock-up of the type of work involved is provided here. The approach uses a convolutional neural net (ConvNet), trained on a publicly available Kaggle dataset, to identify whether or not an image contains a vessel.


    Natural Language Processing

    A fancy-sounding name for what is simply teaching a computer to understand text. I have previous industry experience with applications of NLP, some of which I have converted into publicly available code, using non-classified online data to illustrate the methods and concepts.


  • Sentiment Analysis
  • Sentiment analysis is often used to interpret reviews and understand customer satisfaction. Someone may write either a 'good' or a 'bad' review, and we might want to track the relative occurrences of each over time to understand whether some business process is improving or worsening from an end-user perspective.


  • Intent Analysis
  • This tends to be used for things like chat-bots, where we need to automatically classify customer queries and direct them appropriately. These are typically multi-class problems.
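To show what a multi-class intent problem looks like in code, here is a minimal sketch of a bag-of-words naive Bayes classifier written with the standard library only. The training queries, the intent labels and the function names are all hypothetical, and this is not the method used in the linked projects. The same idea applies to sentiment analysis with just two labels ('good' and 'bad').

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_examples)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical customer queries and intents.
training_data = [
    ("i lost my card", "lost_card"),
    ("my card was stolen", "lost_card"),
    ("what is my balance", "balance"),
    ("show my account balance", "balance"),
    ("i want to close my account", "close_account"),
    ("please close this account", "close_account"),
]

model = train(training_data)
print(classify("i think my card is lost", model))
print(classify("close my account", model))
```

A chat-bot would then route each query to the team or workflow matching the predicted label.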

    About

    Contact
