Responsibilities
- Build, deploy, and maintain end-to-end machine learning models that detect policy-violating actors, remove harmful content, and keep users safe.
- Own medium-sized projects end-to-end, from problem scoping and data collection to training, deployment, monitoring, and iteration.
- Contribute to scalable inference and data pipelines (e.g., Spark, Kubernetes), including preprocessing, batch and real-time inference, and post-processing components.
- Apply standardized performance metrics, testing protocols, and evaluation processes to measure model effectiveness and identify risks.
- Continuously assess and refine deployed models using user feedback, business-impact signals, and emerging policy and ethical considerations.
- Collaborate closely with Data Scientists, Data Engineers, Product Managers, Backend Engineers, and the AI Platform team to ship coordinated user-safety improvements.
- Stay current on advances in AI/ML, particularly LLMs and evaluation methods, and apply appropriate techniques to detection and user-safety problems.
What We're Looking For
- Strong programming skills: Proficiency in Python and SQL, comfort with at least one ML stack (e.g., PyTorch, Hugging Face Transformers), and a working understanding of data pipelines.
- Domain expertise: Solid understanding of machine learning, deep learning, and emerging AI techniques. Track record of building, debugging, and fine-tuning ML models for user-facing products. Experience with content classification, fraud detection, or related Trust & Safety problems is a plus.
- System design & architecture: Experience training and deploying ML models in production, plus a working understanding of distributed computing for inference and training.
- AI application: Hands-on experience with AI agents, RAG, structured outputs, and function/tool calling, along with the judgment to know when to reach for an agent or LLM versus a classical ML approach.
- Evaluation frameworks for ML and LLM systems: Comfort designing and running evaluations to measure model effectiveness, fairness, and safety across both classical ML and LLM systems. Familiarity with golden datasets, LLM-as-judge patterns, precision/recall/F1, false positive rates, hallucination rates, and offline/online metrics.
- Cloud and data platform proficiency: Hands-on experience with at least one cloud environment (GCP, AWS, or Azure). Familiarity with Databricks, Ray, or Kubeflow is a plus.
- Data engineering knowledge: Comfort handling large datasets, including cleaning, preprocessing, and storage, and contributing to batch and streaming pipelines orchestrated with tools like Databricks or Argo.
- Project ownership: Demonstrated ability to drive medium-sized projects to completion, identify and unblock yourself on technical issues, and ship measurable outcomes with limited oversight.
- Collaboration and communication skills: The ability to work effectively in a team and communicate complex ideas clearly with individuals from diverse technical and non-technical backgrounds.
- Strong written communication: The ability to convey complex ideas and technical knowledge through clear, well-structured writing.
- 2+ years of experience as a machine learning engineer, applied scientist, or data scientist (depending on education).
- 1+ years of experience applying end-to-end machine learning models in an industry setting, including data collection, model training, deployment, and monitoring.
- Experience integrating or evaluating LLMs in real-world applications with appropriate baseline metrics and evaluation methodologies is a plus.
- Familiarity with at least one ML infrastructure component (feature store, training environment, model serving, observability, workflow orchestrator).
- Previous exposure to Trust & Safety, fraud detection, content classification, or compliance is preferred but not required.
- Demonstrated use of modern AI tooling in your day-to-day workflow (e.g., Cursor, Claude Code, Codex).
- A degree in computer science, engineering, or a related field (or equivalent practical experience).