Daksha Ladia

ML Engineer | Building Scalable AI Systems

About

ML + Backend Engineer pursuing an MS in Computer Science at UMass Amherst. Currently a Graduate Student Researcher at Allen Institute. Former Software Engineer at Microsoft on the Bing Ads team, building ML models for ad ranking and optimization serving millions of users globally.

Love building reliable, scalable AI systems that bring research into production, with a focus on LLMs and AI security.

Selected Projects

KYC Identity Verification System

End-to-end identity verification system using Fireworks AI for document understanding. Automated extraction from passports and driver's licenses with >95% field accuracy. Includes classification, validation, and complete audit trail for regulatory compliance.
Multi-Agent Browser Automation Platform

Vision LLM-powered system navigating web apps (Notion, Asana) via screenshot analysis and DOM parsing. Modular framework with multi-LLM provider support, hybrid state detection, and automated dataset generation.
Recomm AI - Privacy-Preserving Recommendation System

Shopping recommendation system using Differential Privacy and Federated Learning. Validated against Membership Inference Attacks, misalignment, jailbreaking, and prompt injection attacks on LLMs.
On-Device Privacy-First AI Companion

Multimodal (voice + text) AI companion running entirely on-device. Implemented federated learning for personalized adaptation to tone and mood without data leaving the device.
Healthcare Document Processing Automation

GPT-4o + Google Vision API pipeline for document classification, OCR, and unstructured-to-structured conversion. 99% time reduction in document processing.
Enhanced Information Retrieval via Query Synthesis

Document retrieval pipeline with segmentation, pseudo-query generation, and LLM fine-tuning. 4% precision improvement over BM25 baseline.
Question Answering Bot

LangChain-powered Q&A bot that extracts answers from PDF and JSON documents. Automated question-answer pair generation with context-aware responses using OpenAI API.
LLM Bias Analysis Research

Statistical analysis of pronoun and occupational biases in LLMs. Published research with mitigation strategies. Submitted to COLM'25.
Real-Time GitHub Trending Prediction

Apache Kafka streaming pipeline processing thousands of events/second. Neural network models predict repository trends for proactive discovery.
AI SlackBot

Intelligent bot for conversation summarization and Q&A from Slack channel history, improving team productivity.

Publications

Virtual Community: An Open World for Humans, Robots, and Society

Multi-agent embodied AI simulation platform built with Unity and Python. Generated benchmarks for speed and resolution across robots, agents, and scenes. Accepted at ICLR 2026.
Do Different Models Differ in their Bias Towards Occupational Preferences?

Comprehensive analysis of pronoun and occupational biases in LLMs using statistical tests. GPT-4o: <5% non-preferred pronoun selection; Qwen2.5: 100% positional bias. Submitted to COLM'25.

Research Interests

Context Engineering • Post-Training & Alignment • LLM Inference • Multi-Modal LLMs • Multi-Agent Systems • Bias Detection & Mitigation • Privacy-Preserving ML

Work Experience

Graduate Student Researcher | Allen Institute | Present

Working on semantic-driven language model pre-training, exploring novel approaches to improve language understanding through semantic representations and structured knowledge integration.
Machine Learning Engineer Intern | System1 | May 2025 - Aug 2025

Built multi-agent AI pipeline for NL-to-SQL generation with RAG-based feedback loop, achieving 98% success rate. Designed comprehensive evaluation framework for assessing query accuracy and system reliability.
Personal Project: Automation of Healthcare Document Processing Platform | Freelance | June 2024 - Aug 2024

Built and sold an enterprise healthcare document processing platform to a medical insurance company. Architected end-to-end AI pipeline using GPT-4o and Vision API for document classification and OCR, reducing processing time by 99% and enabling high-accuracy processing of thousands of documents per day.
Software Engineer | Microsoft, Bing Ads | June 2020 - July 2024
- Optimized selective sampling for ad ranking system: 15% runtime improvement, 9% cost savings, 14% efficiency gain
- Built automated Regression Tester Tool from scratch, saving 90% developer validation time
- Enhanced ML models for ad-click prediction: 2-4% AUC improvements across US and international markets
- Led strategic hyperparameter tuning across global markets, increasing user engagement by 4%
Data Science Intern | FarmGuide (Acquired by DeHaat) | Jan 2020 - May 2020

Developed ML models for crop identification and health monitoring with 96% accuracy. Built forecasting algorithms using ARIMA and Prophet for environmental insights and crop recommendations.
Software Engineering Intern | Microsoft, Bing Ads | May 2019 - July 2019

Analyzed seasonality patterns in advertiser campaigns using time-series modeling. Optimized bid pricing algorithms for peak holiday periods.

Recognition

Innovation & Excellence Award

Microsoft (FY20-21) — Pipeline migration project improving operational efficiency
Judge & Mentor

MIT Hacks 2025 (Judge) • HackHarvard 2024 (Mentor) • She Hacks DTU 2021 (Mentor)
Reviewer

Grace Hopper Celebration 2025 — Application review across multiple verticals
MIT Policy Hackathon 2024

Proposed interpretable solution for SpO2 measurement error analysis
Top 1% National Exams

Inspire Scholarship from Government of India (2015)
