Mathematics Model Prompt Evaluator Job at SaidGig, Remote

  • SaidGig
  • Remote

Job Description

Role Overview

Expert mathematicians are invited to author and verify high-quality open-ended prompts for AI model evaluation. In this role, you will craft and review challenging, unambiguous mathematical problems across core subdomains, assessing AI reasoning quality and helping establish rigorous evaluation standards for frontier language models.

Task Types

You will be assigned one of two task types:

  • Authoring Task: Create 5 original, open-ended prompts from your assigned subdomain at varying difficulty levels (undergraduate, advanced undergraduate, or graduate/professional). Prompts should require human judgment to evaluate the quality of the AI's response, such as chain-of-thought reasoning or proof construction.
  • Verification Task: Review 5 authored prompts for clarity, scope alignment, difficulty accuracy, and uniqueness. Edit prompts and difficulty ratings where needed.
Mathematics Subdomains Covered

Probability & Statistics, Algebra (including Linear Algebra), Ordinary/Partial Differential Equations & Dynamical Systems, Geometry, Graph Theory, Number Theory.

Key Responsibilities
  • Author clear, unambiguous, open-ended mathematical prompts that elicit evaluable AI responses.
  • Verify prompts are within the scope of the assigned subdomain and correctly rated for difficulty.
  • Ensure all 5 prompts in a task are sufficiently distinct from one another with varying difficulty levels.
  • Apply expert judgment to assess the depth and quality of mathematical reasoning required.
  • Edit prompts and difficulty assignments where standards are not met.
Ideal Qualifications
  • Master's degree or higher in Mathematics, Applied Mathematics, Statistics, or a closely related field.
  • 2–6 years of professional or research experience in a quantitative field.
  • Strong command of graduate-level mathematical concepts including proof writing, analysis, and formal reasoning.
  • Experience in academic research, mathematical competition design, or quantitative industry roles is a plus.
  • Excellent written English and ability to craft precise, well-scoped technical questions.
Work Terms

Expected commitment: 10+ hours/week. Asynchronous, fully remote work.

Job Tags

Remote job
