Will Jobs

Data scientist with expertise in NLP, computer vision, and traditional ML who excels at communicating with all levels of the business

EXPERIENCE

Travelers Insurance
Data Scientist
Apr 2024 – Present
  • Leading the development of an LLM application to create coverage analyses for reported losses
  • Trained transformer models (BERT, etc.) for multi-label classification of text in general liability lawsuit documents and leveraged LLMs to identify and extract key information
  • Supported model monitoring efforts by creating an ETL data pipeline to move data from Elasticsearch into Snowflake
  • Managed a data science intern and provided mentorship to junior data scientists
Senior Associate Data Scientist
Jul 2022 – Apr 2024
  • Trained and deployed a neural network on aerial imagery to identify damage to residential properties after severe wind events
  • Led the migration of a business-critical modeling codebase from SAS to Python and from on-prem to AWS. Technical requirements included handling over 100M auto policies while improving runtime
  • Evaluated the feasibility of using tree-based models trained on features derived from aerial imagery to improve price segmentation in an auto insurance product
Associate Data Scientist
Jun 2021 – Jul 2022
  • Used double generalized linear models to determine appropriate pricing factors for millions of personal insurance policies
  • Trained an ensemble model to identify cross-selling opportunities for existing customers as part of a data science competition, placing 2nd out of 30 teams
ROI Solutions
Database Developer
Feb 2018 – Jan 2020
  • Used an AutoML platform (DataRobot) to train models to improve the response rate to a client’s direct mail campaign by over 20%
  • Refactored and improved a legacy Oracle PL/SQL codebase underlying a CRM application serving dozens of non-profit clients
  • Worked with client-facing teams to identify user needs and to diagnose and fix bugs in the application
The Cadmus Group
Associate
Aug 2011 – Feb 2018
  • Conducted analyses and created visualizations using Python, R, SQL, and Tableau in projects related to climate change, hydraulic fracturing, drinking water contaminants, greenhouse gas emissions, and home energy use
  • Created a web app incorporating Tableau dashboards for the Army Corps of Engineers to prioritize assets most vulnerable to climate change
  • Developed database applications in support of a variety of EPA programs

EDUCATION

University of Massachusetts, Amherst
Master of Science in Statistics (3.98 GPA)
Aug 2019 – May 2021

Certificate in Statistical and Computational Data Science. Coursework included regression, Bayesian statistics, machine learning, neural networks, natural language processing (NLP), survival analysis, design of experiments, and visualization

Vassar College
Bachelor of Arts in Chemistry, Minor in Computer Science (3.99 GPA)
Aug 2006 – May 2010

General Honors, Departmental Honors, Phi Beta Kappa Society. Publications in ACS Omega and Biophysical Journal

PROJECTS

Dog Breed Classifier | Python, fastai, Streamlit, deep learning
  • Created a web application using Streamlit to classify photos of dogs into one of 150 AKC-recognized breeds
  • Used fastai to fine-tune a ResNet convolutional neural network on 20,000 images of dogs scraped from the internet, achieving a test set accuracy of 53.8%
  • Developed a high-level Python API to greatly simplify the process of downloading public comments from Regulations.gov. The project has attained 20 stars on GitHub as of July 2024
  • Abstracted away the complex pagination scheme and layers of requests while also handling API request limits

TECHNICAL SKILLS

Languages: Python, R, SAS, SQL, PL/SQL (Oracle), JavaScript, HTML, CSS, VBA, Java, C++

Cloud: Amazon Web Services (AWS Certified Cloud Practitioner), Google Cloud Platform (GCP)

Data Science & Machine Learning: LLMs, PyTorch, AWS SageMaker, AWS Bedrock, fastai, DataRobot

Visualization: Tableau, ggplot2, Matplotlib, D3.js

Other: Git, Linux, MS Office