Soapbox Job Board

Olympiad-grade data creation
* Author math/physics/compsci problems and multi-step reasoning tasks (proofs, multi-hop, ToT, chain-of-thought).
* Produce fully worked solutions, hints, rubrics, and counter-examples; deliver in LaTeX + JSON/Parquet.

Stepwise reasoning & annotation
* Decompose solutions into verifiable steps; tag dependencies, error types, and alternate paths.
* Build evaluator prompts and scoring functions for exact, rubric-based, and partial-credit grades.

Model training & evaluation
* Curate instruction-tuning sets and reasoning curricula; run small/medium-scale fine-tunes.
* Design benchmarks and leaderboards with reproducible scoring, leakage tests, and stats reports.

Cheat-resistant talent vetting (for existing contributor pools)
* Create **secure, variant-rich assessments** to evaluate math/logic ability and proof writing.
* Techniques we use:
  * Parallel forms & dynamic variants (same construct, different instances; per-candidate seeding).
  * Hidden gold & cross-item invariants to detect copy/paste or tool overreliance.
  * Time-to-solve and process evidence (scratch reasoning, step logs, LaTeX diffs) reviewed by experts.
  * Style & consistency checks (statistical signals + human review) to flag likely AI-generated text.
  * Closed reference sets and controlled tool policies; honor-tests that require original derivations.
* Output: ranked shortlists, skills profiles, red flags, and targeted upskilling plans.

Red-teaming & safety
* Adversarial prompts for math/logic hallucinations, spec violations, and content safety; failure taxonomies and fixes.
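
As an illustration of the "parallel forms & dynamic variants" technique with per-candidate seeding, here is a minimal Python sketch. It is not the company's actual tooling: the function names (`candidate_seed`, `linear_equation_variant`), the `demo-secret` salt, the item ID, and the choice of a linear-equation construct are all hypothetical, picked only to show how one construct can yield a stable, different instance per candidate with a hidden gold answer.

```python
import hashlib
import random


def candidate_seed(candidate_id: str, item_id: str, secret: str = "demo-secret") -> int:
    """Derive a stable integer seed from candidate and item identifiers.

    `secret` is a hypothetical salt so seeds cannot be guessed from IDs alone.
    """
    digest = hashlib.sha256(f"{secret}:{candidate_id}:{item_id}".encode()).hexdigest()
    return int(digest[:16], 16)


def linear_equation_variant(candidate_id: str, item_id: str = "lin-eq-01") -> dict:
    """Same construct (solve a*x + b = c), different instance per candidate."""
    rng = random.Random(candidate_seed(candidate_id, item_id))
    a = rng.randint(2, 9)        # nonzero coefficient, so the solution is unique
    x = rng.randint(-10, 10)     # hidden gold answer, never shown to the candidate
    b = rng.randint(-20, 20)
    c = a * x + b                # constructed so that x solves the equation exactly
    return {
        "prompt": f"Solve for x: {a}x + ({b}) = {c}",
        "gold": x,
        "params": (a, b, c),
    }


if __name__ == "__main__":
    for cid in ("cand-001", "cand-002"):
        variant = linear_equation_variant(cid)
        print(cid, "|", variant["prompt"])
```

Because the seed is derived from the candidate and item IDs, regenerating a variant for the same candidate reproduces the same instance, which keeps grading against the hidden gold answer reproducible. Identical answers submitted for different instances would then be one possible copy/paste signal.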


Open jobs at Poindexter Labs

This company does not have jobs relevant to this job board at this time.

To view all their jobs, visit their website.