MS Data Science @ Columbia University

Arjun Varma

 
Open to Summer 2026 internships — Data Scientist/ML Engineer/Quant intern

Advanced Data Science Associate Consultant @ ZS Associates | Building intelligent systems with ML, Deep Learning & AI for Fortune 500 healthcare clients.

About Me

Passionate about transforming data into actionable insights

I'm an Advanced Data Science Associate Consultant who builds intelligent systems that drive real business impact. I'm currently pursuing my Master's in Data Science at Columbia University and bring 3+ years of experience from ZS Associates, where I've worked with Fortune 500 healthcare clients on ML platforms, predictive analytics, and LLM-powered solutions.

I work primarily in Python, SQL, PySpark, and modern ML frameworks. My projects range from organization-wide analytics platforms to early cancer detection models that can potentially save lives.

Education

Columbia University

New York, NY

Master of Science in Data Science

Aug 2025 - Dec 2026

  • TA for Business Analytics II: Foundations of AI at Columbia Business School
  • Volunteer at Columbia Disability Services

Vellore Institute of Technology

Vellore, India

B.Tech in Electronics & Communication Engineering

Jul 2018 - May 2022

  • GPA: 4.0/4.0 (WES Evaluated)
  • Special Achiever Award & Merit Scholarship

Career Journey

From engineering student to data scientist serving Fortune 500 healthcare clients

2018

Started B.Tech at VIT

Electronics & Communication Engineering

Began undergraduate studies at Vellore Institute of Technology

2022

Graduated with 4.0 GPA

Special Achiever Award & Merit Scholarship

Completed B.Tech with perfect GPA (WES Evaluated), received academic honors

2022

Joined ZS Associates

Decision Analytics Associate

Started career in analytics, building PySpark/SQL pipelines for Fortune 500 healthcare clients

2024

Fast-Track Promotion

Associate Consultant (4 cycles vs typical 5)

Promoted early due to accelerated performance; received Expert Associate and Insight Illuminate Awards

2024

Hackathon Top 10%

Lateral Transfer to Data Science

Selected for Data Science vertical after top finish in company-wide hackathon

2025

Advanced Data Science Associate Consultant

ML Platform & LLM Projects

Led an org-wide analytics platform, piloted a RAG-based LLM for FDA documents, and built a BTC early detection model

2025

Started MS at Columbia

Data Science

Pursuing Master's in Data Science; TA for Business Analytics II at Columbia Business School

?

Summer 2026 — Open to opportunities

Work Experience

3+ years of building data-driven solutions at scale

Advanced Data Science Associate Consultant

ZS Associates

Pune, India | Feb 2025 - Jun 2025

  • Worked in Performance Analytics, Forecasting, and Data Science teams for Fortune 500 healthcare clients; collaborated with US-based stakeholders
  • Built and deployed an organization-wide analytics + ML platform consolidating multiple data sources to surface real-time KPIs by territory and product; partnered with PMs and marketing heads for a >$10B revenue oncology portfolio
  • Drove adoption by 1,000+ sales reps and HQ leaders, replacing Excel reports and cutting prep time from days to minutes
  • Piloted a retrieval-augmented LLM to turn FDA approval documents into concise briefs for commercial teams
Award Winner

Decision Analytics Associate Consultant

ZS Associates

Pune, India | Jul 2024 - Jan 2025

  • Led a 5-member team on a strategic initiative to overhaul legacy business rules and modernize processes, saving ~50 hrs/mo and improving first-pass quality to >99%
  • Built and productionized Positive-Unlabeled (PU) learning models at a Fortune 500 organization to systematically infer missing categorical labels in transactional data
  • Implemented automated model drift checks and unit testing to ensure long-term reliability
  • Finished in the top ~10% of a company-wide hackathon; selected for lateral transfer into the Data Science vertical
  • Received Client Contraste Award for outstanding client outcomes and feedback
Fast-Track Promotion
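
To illustrate the kind of automated drift check mentioned in the bullets above, here is a minimal sketch. It assumes the Population Stability Index (PSI) as the drift metric and a 0.2 alert threshold; both are common conventions chosen for this example, not details of the production system, and the function names `psi` and `has_drifted` are hypothetical.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a recent sample.

    Assumption: equal-width bins derived from the baseline sample.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant baseline

    def shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def has_drifted(expected: Sequence[float], actual: Sequence[float],
                threshold: float = 0.2) -> bool:
    """Flag drift when PSI exceeds the (assumed) alert threshold."""
    return psi(expected, actual) > threshold
```

A scheduled job could call `has_drifted(training_scores, recent_scores)` on each feature or on the model's output scores and raise an alert when it returns `True`.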

Decision Analytics Associate

ZS Associates

Pune, India | Feb 2022 - Jun 2024

  • Engineered PySpark/SQL pipelines integrating multiple data sources to deliver brand performance insights across multiple products
  • Defined patient-cohort inclusion/exclusion rules robust to missing/miscoded fields; yielded consistent, audit-ready analytics
  • Drove reporting and ad-hoc analytics that surfaced care gaps and market opportunities, informing key brand strategies across multiple new launches
  • Promoted to Associate Consultant in 4 cycles (typical: 5) via accelerated performance; received Expert Associate and Insight Illuminate Awards

Featured Projects

From ML models predicting cancer to LLM-powered chatbots

Featured Case Study

BTC Cancer Early Detection

Anomaly Detection & Predictive Analytics

ZS Associates | Jan 2025 - May 2025

Developed an ML model to predict monthly Biliary Tract Cancer (BTC) diagnoses from a pool of 250M patients. Addressed a critical 45-day delay in claims data and improved performance using advanced clustering techniques.

  • 250M patient pool analysis
  • Advanced clustering techniques
  • Industry conference presentation
XGBoost, K-means, NLP Clustering, SHAP, MLflow, PySpark

Financial RAG Chatbot

LLM & Information Retrieval

Columbia University | Nov 2025 - Dec 2025

Built an LLM-powered RAG chatbot that answers questions about company financials from SEC filings. Implemented Streamlit UI + FastAPI backend with ChromaDB semantic retrieval.

  • 4.5/5 quality score via OpenEval
  • SEC filings integration
  • Semantic search with ChromaDB
Python, LangChain, ChromaDB, FastAPI, Streamlit, GPT-4
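
The retrieval step of a RAG chatbot like this one can be sketched in a few lines. This is a toy illustration, not the project's code: it ranks text chunks by cosine similarity over simple term counts, standing in for the learned embeddings and vector index that ChromaDB would provide. The helper names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    # Return the k chunks most similar to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: List[str], k: int = 2) -> str:
    # Place the retrieved context ahead of the question for the LLM.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real pipeline, the string returned by `build_prompt` would be sent to the LLM, and `retrieve` would be replaced by a query against the vector store.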

Scene-AI

Computer Vision & Deep Learning

Personal Project | 2025

AI-powered scene understanding and analysis application deployed on Railway. Leverages modern ML techniques for intelligent scene recognition and processing.

Python, PyTorch, Railway, REST API

Agricultural Product Classification

RAG & Classification System

Columbia University | Aug 2025 - Oct 2025

Built a retrieval-augmented generation (RAG) product-classification system for a Series B East African agtech company. Achieved 99% holdout accuracy using GPT-4.

  • 99% holdout accuracy
  • Real-time REST API
  • Compliance & risk alerts
Python, GPT-4, RAG, REST API, Dashboard

Technical Skills

Technologies and tools I use to bring ideas to life

Programming

Python
SQL
C++
R

Analytics & ML

PyTorch
Scikit-learn
Pandas
NumPy

Big Data & MLOps

PySpark
Databricks
MLflow
AWS

Tools & Platforms

Git
Jupyter
Streamlit
Docker

Also experienced with

Deep Learning, Data Engineering, ETL Pipelines, SHAP, BeautifulSoup, matplotlib, S3, EMR, Athena, SageMaker, Jira, Confluence, LangChain, ChromaDB, RAG, LLMs

Fantasy Premier League

Avid FPL player with Top 1% finishes for 4 consecutive years. Combining data analysis passion with sports strategy!

Top 1%, 4 Years

Get in Touch

Interested in collaborating or have a question? Feel free to reach out!

Contact Information

Email

av3342@columbia.edu

Phone

(347) 987 9427

Location

New York, NY
