Bayesian prediction models for customer default, Hidden Markov Models for fraud detection (Oakam Loans Ltd., contract completed)
Use of probabilistic models to detect internal fraud, identifying low-probability events from minimal information.
General and specialised statistical models (e.g. logistic regression & Bayesian logistic regression for external risk prediction)
Clustering models using Hadoop, Hive and Pig on AWS for utility data (Advizzo Ltd., contract completed)
Development of an Amazon Web Services big data platform to perform machine learning on electrical smart meter data and to construct nearest neighbour behaviour patterns
Development of a general risk strategy and assessment of customer risk level (Auto Service Finance Ltd., contract ongoing)
Extraction of credit reference agency data (XML data extraction using R) to build customer risk monitors.
Statistical advice on the correct application of statistical techniques to risk assessment, for example the comparison of default and non-default populations.
Developing & building customer behaviour models using R & SQL.
Development of models and protocols to measure levels of statistical uncertainty from a variety of scientific instruments.
Development of numerical algorithms for smart electrical metering system signal disaggregation.
Customer segmentation and time-series analysis for customer energy forecasting (Matlab and R)
Experienced customer segmentation modeller (mobile SMS and web campaigns). Designed advanced techniques for targeted marketing.
Created analysis tools for customer and transactional data (SQL and R)
e.g. logistic regression coupled with a neural network for improved actionable insight
Experience of managing and manipulating very large transactional datasets.
Theoretical and practical knowledge of regression (generalised linear modelling) and clustering (k-means, CHAID, etc.), supported by advanced Excel and VBA skills.
Statistician delivering leading-edge analytical solutions for mobile SMS/web-based direct marketing campaigns.
Optimised SMS/web message generation using advanced general linear modelling techniques.
Language processing and predictive modelling of text.
Development of campaign analysis algorithms (in R)
Optimised constrained space experimental designs for best message generation.
General linear model sample-size determination to assess message importance.
Successful incorporation of campaign analysis algorithms into a Java-based marketing analytics platform, reducing campaign results generation time.
Supporting the Head of Strategy and Pricing Development Manager in developing the overall customer and product roadmap for both paid new media advertiser e-services and print advertiser products.
Ownership and development of online advertising Return On Investment (ROI) models and ongoing model tracking using advanced Excel spreadsheet modelling.
Support customer Lifetime Value & profitability calculation and prediction.
Development of customer insight across multiple product holdings using the following to recommend compelling customer propositions for both online and print advertiser products:
e.g. using a mixture of logistic, linear and GLM regression, with SQL for data retrieval (>1 million rows).
Cross-sell, up-sell and retention/attrition.
Pathway analysis and propensity modelling.
Profiling, segmentation and trend analysis.
Customer behaviour modelling.
Development of customer insight through the use of techniques such as:
CHAID, clustering including Trees and Principal Components Analysis.
Logistic, linear, multiple, multivariate, general and generalised regression.
Advanced spreadsheet modelling (including VBA)
Development of complex statistical models using R/Splus/SPSS for market data modelling.
Supporting business impact modelling and financial scenario analysis, business case / business model development.
Application of advanced statistical techniques solving non-routine problems.
e.g. used probabilistic principal components analysis to impute missing values in market data.
Insurance Services Office
Delivered leading-edge expertise in modelling techniques and end-to-end modelling/analytics projects.
Full life-cycle experience of implementation, analysis and calibration of complex mathematical/statistical models.
Application of statistical modelling to personal insurance claims data/business data using a variety of techniques including:
Piecewise regression, clustering and segmentation to derive cost/benefit analysis and the identification of KPIs.
Generalised linear modelling for propensity analysis.
Data analysis using SQL, including data cleansing (ETL) and advanced Excel.
Management of statistical/BI projects.
Presentation of results both internally (up to and including board level) and externally (e.g. RBS, Zurich, and Liverpool-Victoria). Mentoring staff on all aspects of statistical modelling/statistics.
System Analytics and Applications, Ahura Scientific
Initially joined Ahura Scientific (a Massachusetts-based US defense and security company) to research and develop the embedded statistical and decision support software for their newest integrated analytical system for the non-invasive detection of counterfeit drugs, impurity assessment, etc. I also worked on their unknown chemicals and explosives system.
Work focused on the development and practical application of multivariate statistical procedures for the identification of explosive materials in unknown binary mixtures using Raman spectroscopy, multivariate statistical geometry and Bayesian analysis techniques. Algorithm quality assessment and optimisation were performed through the use of decision theory and extensive data simulation. Also carried out evidence-based decision and real-time error propagation work.
My work at LGC Ltd. (Europe’s leading biotechnology and life sciences laboratory) focused on the application of both routine and advanced statistical and mathematical models to experimental problems within the laboratory.
Development of software (VBA) for the propagation of uncertainties in x and y for a linear model, error propagation using Monte Carlo resampling and bootstrapping to determine between-laboratory precision.
Investigated the coupling of a logistic regression model with binomial distribution theory to obtain a qualitative proficiency scoring model.
Developed a 3-D visualisation routine (Matlab) for protein separation investigation using multidimensional mass spectrometry.
Application of chemical metrology and multivariate statistics to problems within the field of bioinformatics leading to the development of a new method for simultaneous chemical identification, later published (awarded an honorarium by the Government Chemist).
In 6 years at LGC Ltd., I published 5 papers/publications, all of which involved a high degree of modelling.
Developed a cost-reducing mechanism through the application of neural networks to experimental designs, and designed a new and highly generalised numerical method to estimate multivariate regression uncertainties, again published in a high-quality peer-reviewed journal.
Minimisation of reference material stability uncertainty using robust linear modelling (M and MM-robust estimators) and the Mandel-Paule weighting algorithm (modelling performed using R).
During 2006–2008 I applied my modelling/programming skills to the fields of metabonomics and proteomics (publication awaiting journal submission). The result was a comprehensive untargeted approach for the identification of biochemically interesting compounds using Orthogonal Partial Least Squares Regression with controlled false discovery rates.
Education and Training
Data Science & Analytics
Royal Holloway, University of London
e.g. design, construction & implementation of a database system
Programming (R, Matlab, Java core)
e.g. coded single-layer BP neural net & regression decision tree
Real-time & batch machine learning
e.g. coded expert learner systems
Hidden Markov Models
Large scale data storage and processing (MongoDB, Hadoop, MapReduce)
e.g. data-mining & visualisation of Gb data sets
e.g. integer programming, non-integer programming, heuristic & non-heuristic algorithms
Dissertation: Quantitative Structure-Activity (QSAR) Modelling with Conformal Prediction
PhD Analytical Chemistry
Specialised in the application of chemometric theory. Supervisors: Dr. Hywel Evans and Prof. P. Worsfold
BSc (Hons) Ocean Science & Astronomy
Third year dissertation supervision by Dr. Hywel Evans