Engineering & Built Environment Experts (25)
all cities, MO 25 • On-site • Posted 7 hours ago
Business Services & Consulting
About the Role
We are building a benchmark dataset to evaluate AI models on professional document understanding and instruction following within the Engineering & Built Environment domain.
Tasks consist of complex, multi-step requests grounded in real-world workspace files (technical drawings, project specifications, engineering reports), web search, and code execution - each paired with a clearly defined ground truth output and an objective evaluation rubric.
You will be responsible for authoring tasks that test an AI's ability to interpret engineering documentation, follow multi-step instructions, and produce precise, well-structured outputs.
We expect a minimum commitment of 15-20 hours per week.
Ideal candidates have 3+ years of hands-on experience in one or more of the following sub-domains:
- Mechanical engineering
- Civil engineering
- Industrial engineering
- Architecture