Background
I have over 10 years of industry experience in data science and engineering across sectors ranging from hardware to education, with expertise in LLMs and natural language processing, machine learning operations (MLOps), and user experience optimization through experimentation and iterative product development. My experience spans large companies such as Google, Shutterfly, and Discovery Education as well as innovative startups in healthcare and education.
My journey into data science began during my economics coursework and research at UC Berkeley, where I discovered my passion for uncovering patterns in data. Since then, I’ve kept building my skills through online courses, hands-on projects, and keeping up with new technologies in the rapidly evolving AI landscape.
Current Focus & Expertise
As a Senior Applied AI Engineer, I specialize in building production-ready AI systems and evaluation pipelines. My recent work includes developing custom evaluation frameworks for chained LLM workflows, Elasticsearch RAG, and LLM-as-a-Judge methodologies. I’ve built multiple AWS-hosted AI APIs to enhance content metadata generation and improve search functionality.
Previously, as a Senior Data Scientist, AI (recognized with the Employee Innovation Spotlight Award in 2023), I led company-wide LLM implementation and served as the sole NLP expert. I deployed multi-page applications on AWS ECS for human-in-the-loop learning platforms and implemented agentic workflows using ChatGPT and Llama2 with advanced prompt engineering and chaining techniques.
Technology Stack
My current toolkit includes Python as my primary language, with extensive experience in:
- Large Language Models: ChatGPT, Llama2, SBERT, BART, custom prompt engineering and chaining
- Cloud & Infrastructure: AWS (ECS, SageMaker, Lambda, API Gateway, S3, DynamoDB), Docker
- Search & Retrieval: Elasticsearch, RAG systems, knowledge graphs (Neo4j)
- Machine Learning: scikit-learn, transformers, pytorch, tensorflow, SetFit framework
- NLP: spaCy, NLTK, transformers, keybert, custom multi-label classification systems
- Visualization: matplotlib, seaborn, plotly
- Statistics: statsmodels, custom statistical functions for experiment evaluation
- Data Processing: pandas, numpy, SQL
I’ve developed reusable frameworks for A/B testing, statistical test selection, and automated data analysis pipelines that I’ve open-sourced and shared with the community.
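To give a flavor of the kind of check an A/B testing framework runs, here is a minimal sketch of a two-sided, pooled two-proportion z-test using only the Python standard library. It is an illustration under stated assumptions, not code from the open-sourced frameworks mentioned above: the function name `two_proportion_ztest` and the sample counts are hypothetical.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Hypothetical helper for illustration only. Uses the pooled
    standard error, which assumes equal rates under the null.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")

    p_a = conv_a / n_a
    p_b = conv_b / n_b

    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")

    z = (p_b - p_a) / se

    # Two-sided p-value from the standard normal distribution:
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100/1000 conversions in control, 130/1000 in variant.
z, p = two_proportion_ztest(100, 1000, 130, 1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

In practice a library routine such as the proportions z-test in statsmodels would likely be a better choice than a hand-rolled version; the sketch only shows the arithmetic behind it.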
For Fun
When I’m not building AI systems or analyzing data, you’ll find me outdoors, snowboarding in the mountains during winter or playing Ultimate Frisbee in the park during summer. When I’m inside, I love nothing more than a good book (my most recent books) or playing a strategy board game with friends (my all-time favorite is Brass: Birmingham). I also love building furniture (my first Instructables), and while my SLR camera is a little dusty these days, I used to be quite into photography (my photo collection).