Making impactful research more rigorous — and rigorous research more impactful
David Reinstein · Founder & Co-Director, unjournal.org
Economics Research Away Day · University of Exeter — Friday 19 June 2026
The Unjournal — in brief
This deck was prepared for a one-time talk (University of Exeter, 19 June 2026). For the maintained, reusable version, see the generic Unjournal talk →
A grant-funded nonprofit that commissions experts to publicly evaluate and rate research, and assesses “Pivotal Questions” for stakeholders.
We aim to make impactful research more rigorous, make academic work more useful, support open science & transparency, and improve peer review — aligning research incentives with truth-seeking and social value.
Focus: economics, policy & quantitative social science with potential for global impact.
The 17th-century journal became a useful filter. It shouldn’t be the only signal we coordinate on (nor the only output).
A journal-era filter still governs careers and research credibility
We already disseminate ourselves: working papers, arXiv, SSRN, RePEc, dynamic docs and web pages
So what does a journal really “sell” now? An evaluation. A stamp.
The biggest cost isn’t fees or paywalls: it’s the game
Average economics paper: 3–4 submissions before it’s placed
Reviewer time alone: a back-of-envelope ~$150M/year in econ.
The biggest cost: authors’ time — reformatting, resubmitting, journal-shopping, strategising (“spin it as a hamburgers economics paper for the American Hamburger Journal”) instead of just improving the work
★ THE PUBLISH-OR-PERISH SLOT MACHINE ★
insert: 1 finished paper · ~6 months / spin
1 · pick an arm (which journal?)
AER · EJ · JDE
2 · pull the lever → wait ~6 months →
✗ reject · ↻ R&R · ✓ accept
PAYOUT: one line on your CV
Careers staked on a noisy, slow spin — each spin ~6 months; placement takes 2–6 years.
“Playing this game diverts us from producing the most credible, useful research.”
“Published — so stop bothering me about it”
Journals take one format: ~30 static pages (+ a 200-page appendix)
Publication says “done” → slice off the next paper
Little room for improvement, error-correction, building in place
Separating evaluation from publishing → a world of benefits
Decouple the evaluation from the 30-page PDF and it becomes a citable, first-class object — DOI, metadata, discoverable — instead of a one-shot stamp in a “PDF prison”.
Research as a living document
Evaluation isn’t chained to a 30-page PDF:
Any format — dynamic, interactive, replicable documents
Improve or extend in place → ask for further evaluation
Open evaluation feeds open science & replication
An interactive specification-curve / “multiverse” document — far easier to build now, and far more useful than static pages.
Versus a traditional economics journal: ~1–3 years (often 24+ months to acceptance).
Self-reported evaluator effort ≈ 8–32 hours per evaluation. The target above is from our process docs; we track the dates but haven’t yet published a measured median end-to-end.
How it works
Find / receive the research
Prioritise for decision-relevance (as a team)
Recruit an evaluation manager → ~2 paid expert evaluators
Reports + ratings + author response (evaluators may adjust)
Manager synthesis → publish the package, with a DOI
Evaluators paid, named or anonymous.
A short narrated walk-through of the Unjournal evaluation process · youtu.be/ZCSeAmzMB50
We prioritise research for impact-potential
Prioritisation is triage, not evaluation
First question: will better evidence here change real decisions?
We do prioritise influential, widely-read work, but we don’t chase the merely clever
How the triage runs
Suggestor & assessor each write a short motivating discussion
The assessor rates decision-relevance & potential impact (0–100 vs the pool)
The whole team votes (5-point approval scale)
Management finalises and liaises with authors
How the team votes
Every candidate paper gets a team vote on impact-potential — Strong Yes / Weak Yes / Unsure / Weak No / Strong No, with vote counts and an average. This is the actual voting board (Coda).
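The vote summary described above (counts per option plus an average) can be sketched in a few lines of Python. This is an illustrative sketch only: the numeric coding of the five options (+2 down to -2), the function name `summarise_votes`, and the example votes are assumptions made for illustration, not the actual configuration of the Coda voting board.

```python
from collections import Counter

# Assumed numeric coding of the 5-point approval scale (not taken from the deck).
VOTE_SCORES = {
    "Strong Yes": 2,
    "Weak Yes": 1,
    "Unsure": 0,
    "Weak No": -1,
    "Strong No": -2,
}


def summarise_votes(votes):
    """Return (counts per option, mean score) for one candidate paper."""
    unknown = [v for v in votes if v not in VOTE_SCORES]
    if unknown:
        raise ValueError(f"Unrecognised vote labels: {unknown}")
    tally = Counter(votes)
    # Report every option, including those that received no votes.
    counts = {label: tally.get(label, 0) for label in VOTE_SCORES}
    # With no votes cast there is no meaningful average.
    average = sum(VOTE_SCORES[v] for v in votes) / len(votes) if votes else None
    return counts, average


# Hypothetical votes for a single paper, for illustration only.
example_votes = ["Strong Yes", "Weak Yes", "Weak Yes", "Unsure", "Weak No"]
counts, average = summarise_votes(example_votes)
print(counts)
print(f"average score: {average:.2f}")
```

A symmetric coding around zero makes the sign of the average easy to read (positive leans "yes"), but a 1 to 5 coding would rank the candidate papers identically.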
Some considerations
What an evaluation gives you
We don’t accept/reject or assign a tier, so we benchmark instead. Two halves, equally important:
Published evaluations only, selected for decision-relevance; sorted by rating midpoint. Dots = evaluator medians; bars = stated uncertainty (where given).
Benchmarking existing signals: a known currency
Predicted vs. merited journal tier (0–5). A translation layer — not an endorsement of placement as the right endpoint.
A profile, not a single score
Three evaluated papers profiled across every criterion: percentile rank (top) and journal tier (bottom). A score on each dimension, with the strengths and weaknesses a single number would hide.
Overall ratings by research area
Every published evaluation’s overall percentile rating, grouped by research area (✗ marks the area median).
2024–25 Evaluator Prize · 1st Evaluation of “Water Treatment & Child Mortality: a meta-analysis”
“Very influential.”— GiveWell water team (Teryn Mattox); they had been weighing commissioning their own replication. The eval informs chlorination grantmaking.
“Thorough and thoughtful… extensive write-up and precise recommendations.”— the paper’s authors, who revised the framing in response.
Each square = 1 of 57 tracked papers, by combined evidence tier. Green = LLM-analysed, shaded by share of major post-evaluation changes attributed to our feedback · blue = manually-confirmed update · orange = mixed / weak signal · grey = not yet assessed. LLM attribution via Claude Opus 4.6 — indicative, human verification ongoing.
The people behind The Unjournal
Our management team (7) runs the process and sets standards; field specialists (~60) source and prioritise research:
David Reinstein: Founder · Co-Director
Anirudh Tagat: Co-Director
Gavin Taylor: Management
Bob Kubinec: Management
Hansika Kapoor: Management
Ryan Briggs: Management
Alexander Herwix: Management
The management team (7) · the advisory board (16) and field-specialist teams follow ↓
Each paper is evaluated by ~2 domain experts, often matched from our evaluator pool:
Our advisory board — methodologists, forecasters & meta-science researchers across economics, statistics, and policy.
Field-specialist teams (8 areas)
Development economics: Anirudh Tagat · Ryan Briggs · Michael Wiebe · Nathan Fiala · Emmanuel Orkoh · Robert Kubinec · Masyhur Hilmy · Wayne Sandholtz · Lee Crawfurd · Yannick Dupraz · Leena Bhattacharya · William Seitz
Global health & well-being: Jake Eaton · Rosie Bettle · Charlotte Lane · Shobhit Kulshreshtha · Jonah Goldberg · Valentin Klotzbücher · Priya Lall · Francesco Ramponi · Sarah Reynolds
Economics, welfare & governance: David Reinstein · Julian Jamison · Tabaré Capitán · Joel Christoph · Andrei Potlogea · Greg Sasso · Brian Weber · Daniel Horn · Moritz Hennecke · Seth Benzell
Innovation, meta-science, social impact of technology: Daniela Cialfi · Jordan Dworkin · Kris Gulati · Andrew Kao · Gavin Taylor · Gary McDowell
Environmental economics: Tanya O’Garra · Ben Balmford
Animal welfare (markets, attitudes): Josh Tasoff · Kevin Kuruc · Florian Habermacher · Nicolas Treich · Ash Mader · Brinda Poojary
Catastrophic risks, AI governance & safety: David Manheim · Anca Hanea · Alexander Herwix · Tristan Williams
These specialists span many universities and institutions worldwide — a good chance some are already in your department. Two — Julian Jamison and Ben Balmford — are here at Exeter.
4 · Pivotal Questions
The Pivotal Questions project
From single papers → identifying stakeholders’ specific ‘operationalized’ questions that matter:
What would change key decisions, and what research evidence informs this?
What do experts believe now — and how uncertain?
Researchers + Practitioners + Stakeholders
Including Founders Pledge & Animal Charity Evaluators; participants from Coefficient Giving & more
Health, wellbeing & development — Jamison, Medina-Lara
cost-effectiveness, WELLBYs, the RCTs we prioritise
Econometrics & methods — Clarke
meta-science; calibrating our ratings
Open research & reproducibility — Kripfganz
public disciplinary judgement alongside repositories & compliance
Development — Roy, Dyer, R. Banerjee
the field RCTs we prioritise
AI, technological change & labour — Hauser (IDSAI), Digital Economy
AI’s societal & labour-market impact — a fast-growing Unjournal priority
Ways to engage and adopt this
Join our team or evaluator pool
Suggest research (and pivotal questions)
Use our outputs & data
Bring students in
Recognise better signals
Evaluate — paid
Staff · postdocs · advanced PhDs
Paid (~$450 avg) for work you partly do already
Faster & more visible than a report that vanishes into a journal
Named or anonymous; counts as service; citable (DOI)
A referee report you’re proud of becomes a public, citable output.
Submit or suggest research
Authors
Submit a working paper → credible public evaluations + ratings
The journal path stays open — get feedback & a public signal before it resolves
Suggest others’ high-impact work (anonymously if you like)
Why request public evaluation?
A public commitment, and a signal. “I’m willing to have this evaluated openly.” Feedback now, a public signal now; the journal path stays open.
From our author-reluctance model: selecting into evaluation already shifts beliefs (p0 → p_D, a separating signal); a favourable public evaluation then clears the gatekeeper threshold c.
Try the model live
Interactive: adjust the prior, the selection effect, and the evaluation’s informativeness.
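The two-step belief shift in the author-reluctance model (prior p0, belief p_D after the author opts in, then a further update after a favourable evaluation, compared against the threshold c) can be illustrated with a simple Bayesian sketch. The deck does not state the model's functional form, so this assumes each step is an odds-form Bayes update driven by a likelihood ratio. The names `selection_lr` and `evaluation_lr`, the function names, and all numbers are hypothetical and are not taken from the interactive model.

```python
def bayes_update(prior, likelihood_ratio):
    """Posterior probability of high quality after one piece of evidence.

    likelihood_ratio = P(evidence | high quality) / P(evidence | low quality).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def belief_path(p0, selection_lr, evaluation_lr, c):
    """Trace beliefs: prior -> after opting in -> after a favourable evaluation."""
    # Step 1: choosing public evaluation is itself informative (p0 -> p_D).
    p_d = bayes_update(p0, selection_lr)
    # Step 2: a favourable evaluation shifts beliefs further.
    p_eval = bayes_update(p_d, evaluation_lr)
    return {
        "p0": p0,
        "p_D": p_d,
        "p_after_eval": p_eval,
        "clears_c": p_eval >= c,  # does the belief reach the gatekeeper threshold?
    }


# Hypothetical parameter values, chosen only to illustrate the mechanics.
result = belief_path(p0=0.30, selection_lr=1.5, evaluation_lr=3.0, c=0.60)
for name, value in result.items():
    print(name, value)
```

Setting `selection_lr` to 1 switches off the selection effect, which is one way to see how much of the final belief comes from opting in rather than from the evaluation itself.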
Evaluation unlocks credibility — wherever you’re from
A Harvard paper? Already trusted. A strong paper from Exeter or Ball State? Unjournal evaluation opens the chest — public, structured evidence that travels independently of institutional prestige.
When is requesting worth it?
Most valuable when your work is strong but under-credited — or sitting just ‘below the bar’
If a commitment to open evaluation becomes a positive signal, you’ll want in early
May be less of a clear win when the work already clears the bar and it’s a sensitive career moment
Timing concerns? Talk to us — we can embargo or schedule
See what economists, funders & practitioners actually care about: a methodological conversation that sharpens your own work and professional understanding
Get involved with real peer review; get feedback on your evaluation from us, and often from the authors
Visibility within a network of funders, grantmakers & impact-minded researchers
Potential RA / fellowship opportunities: evaluation, meta-research analysis, Pivotal-Questions project support
Use our outputs
Evaluation packages & prioritisation outputs → a vetted evidence base to build on, teach, cite, discuss, and engage with
Public evaluations → possible REF / grant / esteem evidence?
The ratings dataset → meta-analysis; we are open to field-experiment collaborations on the evaluation process
Visibility to research users
Funders & nonprofits read these evaluations
Some use them in grantmaking and methodology
A route to feedback, uptake, and sometimes collaboration
A way to put careful work in front of people who actually use evidence.
Recognise better signals
A strong public evaluation is evidence of quality and usefulness — not just venue: multidimensional ratings (with uncertainty) + expert reports & discussion + an author response + a citable DOI.
For research leaders & managers: encouraging engagement signals a commitment to rigour, transparency & innovation, and opens the research-impact channel (our funder & practitioner network, incl. Pivotal Questions).
6 · Looking ahead
AI makes evaluation more important
A flood of plausible-looking papers generated by AI tools, which may or may not be correct or useful
More need for efficient, transparent evaluation — and connection to real stakeholders & impact
Scalable code / data checks
Current ~consensus: keep a human in the loop for the final decisions and vetting
Not “does it fit a top-5 template” — but “is it true, and does it matter?”
How does AI evaluation compare to humans?
One exploratory pilot · ~45 papers
“Frontier” (Jan. 2026) LLM vs. our human ratings: modest rank agreement (r ≈ 0.3)
Human–human agreement still exceeds human–LLM
On written critiques: LLMs catch ~¾ of human concerns, but ~half their flags aren’t substantive
Not yet a substitute? But it’s an open question.
We’re further exploring AI prioritisation, research reasoning, and alignment in this domain
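The comparison measures mentioned above can be sketched as follows. The deck reports "rank agreement (r ≈ 0.3)" without naming the statistic, so this assumes a Spearman rank correlation. The critique measures are assumed to be (a) the share of human concerns that the LLM also raised and (b) the share of LLM flags that a human judged substantive. All function names and data below are toy values invented for illustration; they are not the pilot's data or code.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


def critique_measures(human_concerns, llm_flags, substantive_flags):
    """Return (share of human concerns caught, share of LLM flags judged substantive)."""
    human, llm, substantive = set(human_concerns), set(llm_flags), set(substantive_flags)
    if not substantive <= llm:
        raise ValueError("substantive_flags must be a subset of llm_flags")
    coverage = len(human & llm) / len(human) if human else None
    substantive_share = len(substantive) / len(llm) if llm else None
    return coverage, substantive_share


# Toy ratings for five hypothetical papers (not pilot data).
human_ratings = [80, 65, 90, 50, 70]
llm_ratings = [70, 75, 60, 55, 85]
print("rank agreement:", spearman(human_ratings, llm_ratings))

# Toy critique labels for one hypothetical paper.
coverage, substantive_share = critique_measures(
    human_concerns={"attrition", "clustering", "external validity"},
    llm_flags={"attrition", "clustering", "typo in table 2", "unclear title"},
    substantive_flags={"attrition", "clustering"},
)
print("coverage:", coverage, "substantive share:", substantive_share)
```

The two critique measures are kept separate on purpose: an LLM flag can be substantive without matching any human concern, so a low match rate alone would not show that the flags are noise.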
Spent too long on “why journal peer review is broken” / its limitations — crowded out time for “what we do”
The lengthy diagnosis generated a lot of defensive reactions (“journals provide more than just a stamp”) — audience pushed back on that framing
ACTION: Trim or move the journal-critique section; lead earlier with what UJ actually does
Framing to fix
Remove or soften the “just a stamp” language — it reads as dismissive and triggered audience friction
Funders and practitioners noted they don’t necessarily know what the highest-priority research areas are — address this more explicitly (UJ helps surface that)
Researchers noted they want independence from funder direction — acknowledge this tension honestly
Navigation / slide order problems
Audience asked process questions that were answered in later slides — had to jump around
ACTION: Move “what is an evaluation and what are the ratings” earlier in the deck (currently too far back)
Audience questions / suggestions worth acting on
“Who are the evaluators?” — came up prominently; needs a stronger early answer
Ben Zarankin: consider whether evaluators should specialize in the evaluator role more explicitly — worth exploring / referencing in the deck
For the next iteration — DONE, ported to generic-talk.qmd (19 June 2026)
These lessons have now been incorporated into generic-talk.qmd (the maintained, reusable deck — live at https://uj-talk.netlify.app):
This Exeter deck is now frozen; the generic deck is canonical going forward.