Capital One

During my time at Capital One, I was on the Data Infrastructure team, where I built an internal tool for company analysts. I was also lucky enough to work on a second project with the machine learning team.

Query Parser (Data Infrastructure Team)

Built an internal tool that lets Capital One employees run cross-database queries using Apache Drill and Facebook Presto
Developed plugin scripts to connect MySQL, PostgreSQL, Snowflake, Redshift, S3, and CSV files to the Drill and Presto codebases so that a single SQL query can span every data source and return a unified result set to the user
Wrote shell scripts to automate the entire Drill setup on an AWS Elastic Compute Cloud (EC2) instance
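
To illustrate what "a single SQL query across databases" means, here is a minimal sketch in Python. It is not the Drill or Presto setup itself: it uses SQLite's ATTACH DATABASE as a stand-in for storage plugins, and the database, table, and column names (billing, users, invoices) are hypothetical.

```python
import sqlite3


def federated_demo():
    """Join tables that live in two separate databases with one SQL statement."""
    # The main connection plays the role of the first data source.
    conn = sqlite3.connect(":memory:")
    # Attaching a second, independent in-memory database stands in for
    # registering another storage plugin (hypothetical name: billing).
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute("CREATE TABLE main.users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (user_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.users VALUES (?, ?)", [(1, "ana"), (2, "ben")]
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 20.0), (1, 5.0), (2, 7.5)],
    )

    # One query spans both databases and returns a single result set.
    rows = conn.execute(
        """
        SELECT u.name, SUM(i.amount)
        FROM main.users AS u
        JOIN billing.invoices AS i ON i.user_id = u.id
        GROUP BY u.name
        ORDER BY u.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(federated_demo())
```

In Drill or Presto the same idea applies at a larger scale: each source is addressed by a plugin or catalog prefix, and the engine handles the join across them.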

Jira Bot (ML Team Project)

Built an internal Slack bot that connects Capital One employees based on their skill sets and project similarities
Used tf-idf, an unsupervised text-weighting technique, to group related projects so developers could reach out to one another for advice
Compared the results against the Paragraph Vector (Doc2Vec) neural network model and determined that tf-idf was better suited to Jira use cases
Python, Gensim, tf-idf, Paragraph Vector Model, Doc2Vec
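
The tf-idf matching described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the Gensim-based implementation used in the project; the function names and the sample project descriptions are hypothetical, and whitespace tokenization is an assumed simplification.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict of term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs, index):
    """Return the index of the document most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vec), j)
        for j, vec in enumerate(vectors)
        if j != index
    ]
    return max(scores)[1]


if __name__ == "__main__":
    projects = [
        "fraud detection model training pipeline",
        "slack bot for jira tickets",
        "training pipeline for credit fraud model",
        "jira ticket search bot",
    ]
    print(most_similar(projects, 0))
```

Terms shared by many projects get a low weight and rare shared terms get a high one, which is why tf-idf tends to work well on short, keyword-heavy text such as ticket descriptions.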