Research Engineer
Lotus Health AI
Location: Lotus Health San Francisco
Employment Type: Full-time
Location Type: On-site
Department: Engineering
Founding Research Engineer
Location: San Francisco (On-site)
Compensation: $180,000 – $220,000 + variable equity based on cash compensation
Technologies: Python, PyTorch, JAX, Transformers, vLLM, Triton or CUDA, PostgreSQL or pgvector, DuckDB, AWS, Modal, Ray, MLflow or Weights & Biases, LangGraph or other agent frameworks
About Us
Lotus Health is a groundbreaking primary care app that integrates your medical records, AI, and real doctors to provide free, personalized healthcare and prescriptions.
Our team includes ex-founders and engineers who have built and scaled consumer apps to millions of users, generating over $100M in annual revenue. Lotus is backed by Kleiner Perkins and by clinicians at Harvard and Stanford.
Role Overview
We are hiring a Founding Research Engineer to design, prototype, and ship the core reasoning and agentic systems that power Lotus. You will turn messy health data into accurate, cited, and actionable guidance for patients and clinicians. The work blends applied research with product engineering: dataset curation, model training and evaluation, retrieval and tool use, safety and alignment, and putting breakthroughs into production. You will work side by side with clinicians and the founding team, and your work will reach patients quickly.
Responsibilities
Research and Modeling
Build and iterate on agentic workflows that use retrieval, function calling, and planning to answer health questions with citations and uncertainty awareness
Fine-tune and distill models for summarization, extraction, classification, and dialogue grounded in medical data
Develop abstention, routing, and fallback strategies that favor safety and correctness
Datasets and Labeling
Curate high-quality datasets from EHR, claims, labs, devices, and chat logs with rigorous de-identification
Design synthetic data and clinician-in-the-loop labeling pipelines that reflect real clinical use
Evaluation and Safety
Own the eval stack for medical correctness, hallucination, bias, safety, latency, and cost
Build automated red teaming and regression suites tied to clinical guidelines and source citations
Systems and Production
Ship research to production with robust observability, feature flags, and rollback plans
Optimize inference with batching, quantization, LoRA or QLoRA, and vLLM or TensorRT where appropriate
Collaborate with data and product engineers on retrieval, storage schemas, and lineage so every claim is explainable
What You Bring:
Strong Python and deep experience with PyTorch or JAX
Track record of shipping applied ML or LLM features to real users, not just prototypes
Hands-on experience with RAG, prompt and tool design, evaluation harnesses, and error analysis
Solid SQL plus comfort with vector stores and retrieval patterns
Product sense, pragmatism, and the ability to reduce complex systems to simple, reliable components
Willingness to work on-site in San Francisco
Bonus Points:
Experience with clinical ontologies and standards such as SNOMED CT, ICD, LOINC, RxNorm, FHIR, and NCPDP
Background in RLHF or RLAIF, distillation, program-of-thought, and structured tool use
Experience building speech or multimodal pipelines for medical settings
Contributions to open source, published work, or well-known eval frameworks
Deep familiarity with observability stacks such as Sentry and Langfuse, and with containerized deployments such as Docker or ECS
Why Lotus:
Join a team redefining how healthcare information is understood and acted upon. As a founding member you will set research direction, shape the model and agent architecture, and see your work improve care for real people at scale. You will work with exceptional engineers, clinicians, and researchers to build accessible, free primary care for everyone.
If this sounds like you, we would love to meet.