This job is for an Applied AI Engineer (Eval-driven) at an early-stage company focused on eradicating cost overruns in construction. The mission is to tackle a $3 trillion problem that impacts cities and builder-client relationships. The team values collaboration and innovation, working together to build AI-powered solutions that help builders maintain control over project costs.
You'll be responsible for
📊
Defining evaluation problems
Establishing success criteria, failure modes, datasets, labeling guidelines, and score functions to ensure effective evaluation processes.
🔧
Building and maintaining an evaluation harness
Creating regression tests, edge-case suites, and quality dashboards to prevent backsliding and ensure consistent performance.
🔄
Implementing workflow systems end-to-end
Owning the full pipeline, from data through model/LLM components, post-processing, and acceptance testing, and iterating until it meets evaluation thresholds.
Skills you'll need
🐍
Strong Python skills
Practical experience in shipping ML/AI systems beyond experimentation, ensuring robust and reliable solutions.
📈
Experience designing evals for ML/LLM systems
Demonstrated ability in offline metrics, gold sets, error analysis, regression testing, and monitoring to ensure high-quality outputs.
⚙️
Comfort with data science and engineering tasks
Ability to handle data wrangling, feature/label design, model/LLM iteration, and productionization effectively.