The Unjournal

Making impactful research more rigorous — and rigorous research more impactful

David Reinstein · Founder & Co-Director, unjournal.org

Economics Research Away Day · University of Exeter — Friday 19 June 2026

The Unjournal — in brief

This deck was prepared for a one-time talk (University of Exeter, 19 June 2026). For the maintained, reusable version, see the generic Unjournal talk →
  • A grant-funded nonprofit that commissions experts to publicly evaluate and rate research, and assesses “Pivotal Questions” for stakeholders.
  • We aim to make impactful research more rigorous, make academic work more useful, support open science & transparency, and improve peer review — aligning research incentives with truth-seeking and social value.
  • Focus: economics, policy & quantitative social science with potential for global impact.
  • Output: unjournal.pubpub.org

Good to be back at Exeter

A typical day lecturing at Exeter uni.

1 · The limitations of journal peer-review

An old system, still running the show

The 17th-century journal became a useful filter. It shouldn’t be the only signal we coordinate on (nor the only output).

  • A journal-era filter still governs careers and research credibility
  • We already disseminate ourselves: working papers, arXiv, SSRN, RePEc, dynamic docs and web pages
  • So what does a journal really “sell” now? An evaluation. A stamp.

The biggest cost isn’t fees or paywalls: it’s the game

  • Average economics paper: 3–4 submissions before it’s placed
  • Reviewer time alone: a back-of-envelope ~$150M/year in econ.
  • The biggest cost: authors’ time — reformatting, resubmitting, journal-shopping, strategising (“spin it as a hamburgers economics paper for the American Hamburger Journal”) instead of just improving the work
★ THE PUBLISH-OR-PERISH SLOT MACHINE ★
insert: 1 finished paper · ~6 months / spin
1 · pick an arm (which journal?)
AER · EJ · JDE
2 · pull the lever → wait ~6 months →
reject · R&R · accept
PAYOUT: one line on your CV
Careers staked on a noisy, slow spin — each spin ~6 months; placement takes 2–6 years.

“Playing this game diverts us from producing the most credible, useful research.”

“Published — so stop bothering me about it”

  • Journals take one format: ~30 static pages (+ a 200-page appendix)
  • Publication says “done” → slice off the next paper
  • Little room for improvement, error-correction, building in place

Separating evaluation from publishing → a world of benefits

Decouple the evaluation from the 30-page PDF and it becomes a citable, first-class object — DOI, metadata, discoverable — instead of a one-shot stamp in a “PDF prison”.

Research as a living document

Evaluation isn’t chained to a 30-page PDF:

  • Any format — dynamic, interactive, replicable documents
  • Improve or extend in place → ask for further evaluation
  • Open evaluation feeds open science & replication

An interactive specification-curve / “multiverse” document — far easier to build now, and far more useful than static pages.

2 · What The Unjournal is

unjournal.org  ·  info.unjournal.org  ·  unjournal.pubpub.org

We are not a journal, we don’t “publish papers”

A non-profit commissioning open evaluation of publicly-hosted research with potential for global impact.

  • We commission and pay for expert evaluation; authors can also publish in a journal
  • Multiple evaluations + structured ratings + author response: public, with DOIs
  • Credible, citable peer review, not tied to a journal’s accept/reject


Funders: Survival & Flourishing Fund (largest)
Long-Term Future Fund
EA Infrastructure Fund

Why journal-independent evaluation can succeed now

Some research users want to know more than “which journal published it”, and they want faster, deeper feedback.

  • Funders & research users who need evidence — e.g. Coefficient Giving, Survival & Flourishing Fund
  • They want credible expert judgment, transparent reasoning, quantified beliefs & uncertainty
  • Decision-relevance and value of information

↓ “but doesn’t this need everyone to move at once?”

Solving the coordination problem

  • Academics ~broadly agree open evaluation is better — but can’t move first alone
  • Funding & grantmaker incentives can tip the balance
  • We’re working to be highly visible — so evaluations & ratings are seen before conventional journals/reviewers weigh in
  • Building a bridge, not asking you to jump off.

Fear of Standing Out → Fear of Missing Out

Making it discoverable where it counts

Unjournal evaluations are indexed in Google Scholar — surfacing with the working paper, not years later.  search “source:unjournal” →

Fast public evaluation/feedback: an increasing priority

One round of public evaluation → a credible output now.

  • A publicly citable signal after one round
  • Versus a traditional journal: 6+ months, R&R at best, then maybe accepted after substantial revisions
  • Fast-moving topics can miss the decision window

AI capabilities · AI’s impact on labour markets · policy windows.

↓ how long does it actually take?

How long does it take?

Target ~2–3 months · prioritisation → published package

  • Recruit ~2 evaluators — ~1–2 weeks
  • Evaluations (reports + ratings) — 5+ weeks  (~3-week turnaround target each)
  • Author response — ~2 weeks  (longer if revising)
  • Total target: ~7–10 weeks

Versus a traditional economics journal: ~1–3 years (often 24+ months to acceptance).

Self-reported evaluator effort ≈ 8–32 hours per evaluation. The target above is from our process docs; we track the dates but haven’t yet published a measured median end-to-end.
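
As a rough check on the stage targets above, the sketch below simply adds them up, assuming the stages run one after another. The stage names are illustrative labels, and because "5+ weeks" has no stated upper bound, the value of 6 used here is an assumption rather than a figure from our process docs.

```python
# Stage targets in weeks as (low, high), taken from the timeline slide.
# "5+ weeks" is open-ended; the upper value of 6 is an assumption.
STAGES = {
    "recruit_evaluators": (1, 2),
    "evaluations": (5, 6),
    "author_response": (2, 2),
}


def total_weeks(stages):
    """Sum the low and high ends, assuming stages run sequentially."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = total_weeks(STAGES)
print(f"Sequential total: {low} to {high} weeks")
```

Under these assumptions the sequential total sits inside the ~7–10 week target. Overlapping stages would shorten it, and a revision by the authors would lengthen it.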

How it works

  1. Find / receive the research
  2. Prioritise for decision-relevance (as a team)
  3. Recruit an evaluation manager → ~2 paid expert evaluators
  4. Reports + ratings + author response (evaluators may adjust)
  5. Manager synthesis → publish the package, with a DOI

Evaluators paid, named or anonymous.  ▶ 2-min explainer · ↓ full workflow & video

Our workflow

Watch the 2-minute explainer


A short narrated walk-through of the Unjournal evaluation process  ·  youtu.be/ZCSeAmzMB50

We prioritise research for impact-potential

Prioritisation is triage, not evaluation

  • First question: will better evidence here change real decisions?
  • We do prioritise influential, widely-read work, but we don’t chase the merely clever

↓ how the triage actually runs · how the team votes

How the triage runs

Suggestor & assessor each write a short motivating discussion; the assessor rates decision-relevance & potential impact (0–100 vs the pool); the whole team votes (5-point approval scale); management finalises and liaises with authors.

How the team votes

Every candidate paper gets a team vote on impact-potential — Strong Yes / Weak Yes / Unsure / Weak No / Strong No, with vote counts and an average. This is the actual voting board (Coda).
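
A minimal sketch of how the vote counts and average on such a board could be tallied. The numeric coding of the five labels (+2 to −2) is an assumption made for illustration, since the slide names the labels but not the scoring. The example votes are made up.

```python
from collections import Counter
from statistics import mean

# Hypothetical numeric coding of the five labels (not taken from the
# actual voting board).
SCORES = {
    "Strong Yes": 2,
    "Weak Yes": 1,
    "Unsure": 0,
    "Weak No": -1,
    "Strong No": -2,
}


def tally(votes):
    """Return (count per label, average score) for one candidate paper."""
    unknown = [v for v in votes if v not in SCORES]
    if unknown:
        raise ValueError(f"Unrecognised vote labels: {unknown}")
    counts = Counter(votes)
    # Report every label, including those nobody chose.
    ordered = {label: counts.get(label, 0) for label in SCORES}
    average = mean(SCORES[v] for v in votes) if votes else None
    return ordered, average


# Made-up votes, for illustration only.
example_votes = ["Strong Yes", "Weak Yes", "Weak Yes", "Unsure", "Weak No"]
counts, average = tally(example_votes)
print(counts)
print(average)
```

A symmetric coding around zero keeps "Unsure" neutral. Any other monotone coding would give the same ordering of labels but a different average.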

Some considerations

What an evaluation gives you

We don’t accept/reject or assign a tier, so we benchmark instead. Two halves, equally important:

The substance

  • Detailed referee-style reports
  • The authors’ response
  • Editorial summary

The structured ratings

  • Percentile ranking
  • Journal-tier equivalent (0–5)
  • Nine criteria, with quantified uncertainty
  • Claim identification and assessment

All public, citable, and comparable — see the evaluator interface →

Inside the evaluation form

What an evaluator actually fills in — every rating elicited with a 90% credible interval, not a point score.
Percentile rating  ·  Methods: justification, reasonableness, validity, robustness
Scale: 0 · 25 · 50 · 75 · 100
Midpoint 72  ·  90% CI 66 – 78  (tight — a confident rating; one of nine criteria)
Journal-tier rating (0–5)  — elicited as two separate ratings, each with its own 90% CI
“Should” — normative merit  → Midpoint 3.8, 90% CI 2.7 – 4.6 (wide — more uncertain)
“Will” — predicted placement  → Midpoint 3.2, 90% CI 2.5 – 3.9
Scale: 0 · 1 · 2 · 3 · 4 · 5
0 won’t publish · 1 OK · 2 marginal-B · 3 top-B · 4 marginal-A · 5 top-A (top-5: AER, QJE, Econometrica). Non-integers encouraged; gap between “should” and “will” = the placement lottery.
Claim identification & assessment — evaluators pull out the paper’s key claims and rate, for each, the strength of evidence and its implications.
Rebuilt from the live instrument · open the evaluator form →
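
To make the structure of one form entry concrete, here is a small sketch of a rating stored as a midpoint plus a 90% credible interval on a bounded scale. The `Rating` class and its field names are hypothetical and are not the schema of the live instrument. The numbers are the worked example from this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """A midpoint with a 90% credible interval on a bounded scale."""
    midpoint: float
    lower: float
    upper: float
    scale_min: float
    scale_max: float

    def __post_init__(self):
        # The interval must contain the midpoint and stay on the scale.
        if not (self.scale_min <= self.lower <= self.midpoint
                <= self.upper <= self.scale_max):
            raise ValueError(
                "need scale_min <= lower <= midpoint <= upper <= scale_max"
            )

    @property
    def width(self):
        """Interval width: wider means the evaluator is less certain."""
        return self.upper - self.lower


# Values copied from the example on the slide.
methods = Rating(72, 66, 78, 0, 100)   # percentile, one of nine criteria
should = Rating(3.8, 2.7, 4.6, 0, 5)   # journal tier: normative merit
will = Rating(3.2, 2.5, 3.9, 0, 5)     # journal tier: predicted placement

# Gap between "should" and "will" midpoints.
gap = round(should.midpoint - will.midpoint, 2)
print(methods.width, round(should.width, 2), round(will.width, 2), gap)
```

Storing the interval with the midpoint lets downstream analysis weight or filter ratings by stated uncertainty instead of treating every score as equally confident.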

3 · What we’ve done

Where we are now

57 evaluation packages on PubPub

100+ expert evaluations

180+ evaluators (120+ PhDs, ~40 profs)

~$450 avg evaluator payment

1,000+ structured ratings recorded

40+ field specialists

ISSN 3071-2173 · 501(c)(3) · DOIs

Founded 2022, public since 2023.

Every rating comes with a credible interval

Published evaluations only, selected for decision-relevance; sorted by rating midpoint. Dots = evaluator medians; bars = stated uncertainty (where given).  ratings dashboard →  ·  ↓ a bridge to journal tiers

Benchmarking existing signals: a known currency

Predicted vs. merited journal tier (0–5). A translation layer — not an endorsement of placement as the right endpoint.

A profile, not a single score

Three evaluated papers profiled across every criterion — percentile rank (top) and journal tier (bottom). A score on each dimension, with the strengths and weaknesses a single number would hide.  explore the live dashboard →

Overall ratings by research area

Every published evaluation’s overall percentile rating, grouped by research area (✗ marks the area median).  dashboard →

What we’ve evaluated — 57 packages by area

Global health & wellbeing: 15
Development & governance: 10
Economics, welfare & policy: 7
Environment & climate: 6
Meta-science & methods: 6
Animal welfare & markets: 5
Catastrophic & long-term risk: 4
AI & emerging tech: 2
Behaviour & attitudes: 2
Published packages (n=57). Health, development, environment & applied micro — Exeter’s wheelhouse.

A concrete example: an award-winning evaluation

2024–25 Evaluator Prize · 1st place · Evaluation of “Water Treatment & Child Mortality: a meta-analysis”

“Very influential.” — GiveWell water team (Teryn Mattox); they had been weighing commissioning their own replication. The eval informs chlorination grantmaking.
“Thorough and thoughtful… extensive write-up and precise recommendations.” — the paper’s authors, who revised the framing in response.

Read it →

Do authors find it useful?

Across tracked evaluations

  • 19 of 57 tracked evaluations drew an author response (16 formal)
  • Of 22 closely assessed: 15 showed a positive signal; ~a third were substantively revised
  • For 8 papers we compared drafts — a median ~22% of changes traced to our feedback (LLM-assisted)
  • Author survey (n≈8): quality rated 30–90; one respondent called it “as good as a standard referee report, or better.”

Author responses · author survey

Did authors adapt? All 57, tracked

Each square = 1 of 57 tracked papers, by combined evidence tier. Green = LLM-analysed, shaded by share of major post-evaluation changes attributed to our feedback · blue = manually-confirmed update · orange = mixed / weak signal · grey = not yet assessed.  LLM attribution via Claude Opus 4.6 — indicative, human verification ongoing.

The people behind The Unjournal

Our management team (7) runs the process and sets standards; field specialists (~60) source and prioritise research:

David Reinstein · Founder & Co-Director
Anirudh Tagat · Co-Director
Gavin Taylor · Management
Bob Kubinec · Management
Hansika Kapoor · Management
Ryan Briggs · Management
Alexander Herwix · Management
The management team (7) · the advisory board (16) and field-specialist teams follow ↓

Each paper is evaluated by ~2 domain experts, often matched from our evaluator pool:

180+ evaluators in the pool

½+ are economists

½+ hold doctorates

40+ field specialists · 8 areas

↓ the advisory board · field-specialist teams

The advisory board

Our advisory board — methodologists, forecasters & meta-science researchers across economics, statistics, and policy.

Field-specialist teams (8 areas)

Development economics: Anirudh Tagat · Ryan Briggs · Michael Wiebe · Nathan Fiala · Emmanuel Orkoh · Robert Kubinec · Masyhur Hilmy · Wayne Sandholtz · Lee Crawfurd · Yannick Dupraz · Leena Bhattacharya · William Seitz

Global health & well-being: Jake Eaton · Rosie Bettle · Charlotte Lane · Shobhit Kulshreshtha · Jonah Goldberg · Valentin Klotzbücher · Priya Lall · Francesco Ramponi · Sarah Reynolds

Economics, welfare & governance: David Reinstein · Julian Jamison · Tabaré Capitán · Joel Christoph · Andrei Potlogea · Greg Sasso · Brian Weber · Daniel Horn · Moritz Hennecke · Seth Benzell

Psychology, behavioral science, attitudes: Hansika Kapoor · Jonathan Berman · Mattie Toma · Carina Ines Hausladen · Hannah Metzler

Innovation, meta-science, social impact of technology: Daniela Cialfi · Jordan Dworkin · Kris Gulati · Andrew Kao · Gavin Taylor · Gary McDowell

Environmental economics: Tanya O’Garra · Ben Balmford

Animal welfare (markets, attitudes): Josh Tasoff · Kevin Kuruc · Florian Habermacher · Nicolas Treich · Ash Mader · Brinda Poojary

Catastrophic risks, AI governance & safety: David Manheim · Anca Hanea · Alexander Herwix · Tristan Williams

These specialists span many universities and institutions worldwide — a good chance some are already in your department. Two — Julian Jamison and Ben Balmford — are here at Exeter.

4 · Pivotal Questions

The Pivotal Questions project

From single papers → identifying stakeholders’ specific ‘operationalized’ questions that matter:

  • What would change key decisions, and what research evidence informs this?
  • What do experts believe now — and how uncertain?
  • Researchers + Practitioners + Stakeholders
  • Including Founders Pledge & Animal Charity Evaluators; participants from Coefficient Giving & more

Beliefs on our platforms · overview · workshops: cultured-meat · wellbeing

How a Pivotal Question works

A stakeholder’s decision-relevant question → curated evidence → expert evaluation, with a parallel experts-&-practitioner workshop → synthesis & public output.  overview →

What 10 workshop forecasts looked like

n = 10 expert forecasts from our cultured-meat workshop — medians with 80% credible intervals.

Voices from the workshops

Cultured-meat workshop — Oana Kubinyecz on cell-line cost drivers (top). Wellbeing workshop — Matt Lerner’s DALY ↔︎ life-satisfaction comparison & Michael Plant (HLI) on imperfect-but-usable metrics (bottom).

5 · Would this be useful to you?

Exeter people are already involved

  • Julian Jamison (GPI / Exeter) — field specialist & Pivotal Questions advisor
  • Ben Balmford (LEEP, Exeter) — environmental-economics field specialist
  • We’ve published an evaluation of “How Effective Is (More) Money?…” — co-authored by Jamison & Oliver Hauser (Exeter). Package →
  • And the idea has roots in conversations on this campus

↓ where Exeter economics strengths might fit

Alignment with Exeter’s strengths

Exeter strength … → … maps onto Unjournal work
Behavioural & experimental — Hauser, Fonseca, Balafoutas → belief elicitation; the Pivotal-Questions forecasts
Environmental (LEEP) — Bateman, Groom, Day → natural-capital valuation; our climate & animal-welfare evaluations
Health, wellbeing & development — Jamison, Medina-Lara → cost-effectiveness, WELLBYs, the RCTs we prioritise
Econometrics & methods — Clarke → meta-science; calibrating our ratings
Open research & reproducibility — Kripfganz → public disciplinary judgement alongside repositories & compliance
Development — Roy, Dyer, R. Banerjee → the field RCTs we prioritise
AI, technological change & labour — Hauser (IDSAI), Digital Economy → AI’s societal & labour-market impact — a fast-growing Unjournal priority

Ways to engage and adopt this

  • Join our team or evaluator pool
  • Suggest research (and pivotal questions)
  • Use our outputs & data
  • Bring students in
  • Recognise better signals

Evaluate — paid

Staff · postdocs · advanced PhDs

  • Paid (~$450 avg) for work you partly do already
  • Faster & more visible than a report that vanishes into a journal
  • Named or anonymous; counts as service; citable (DOI)

A referee report you’re proud of becomes a public, citable output.

Submit or suggest research

Authors

  • Submit a working paper → credible public evaluations + ratings
  • The journal path stays open — get feedback & a public signal before it resolves
  • Suggest others’ high-impact work (anonymously if you like)

Why request public evaluation?

A public commitment — and a signal. “I’m willing to have this evaluated openly.” Feedback now, a public signal now — journal path still open.

From our author-reluctance model: selecting into evaluation already shifts beliefs (p0 → p_D — a separating signal); a favourable public evaluation then clears the gatekeeper threshold c.  interactive version →  ·  ↓ embedded
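
The two-step logic can be illustrated with a plain Bayes update. This is a generic sketch, not the reluctance model itself: the function name, the likelihoods, the prior, and the threshold below are all hypothetical numbers chosen only to show how a belief could move from p_0 to p_D and then past c.

```python
def bayes_update(prior, p_signal_if_strong, p_signal_if_weak):
    """Posterior probability that the work is strong after seeing a signal."""
    numerator = prior * p_signal_if_strong
    denominator = numerator + (1 - prior) * p_signal_if_weak
    return numerator / denominator


# All numbers are hypothetical, for illustration only.
p_0 = 0.40  # prior belief that the paper is strong
c = 0.70    # gatekeeper threshold

# Step 1: authors of strong work are assumed more likely to request
# public evaluation, so the request itself is informative.
p_D = bayes_update(p_0, p_signal_if_strong=0.60, p_signal_if_weak=0.30)

# Step 2: a favourable evaluation is assumed more likely for strong work.
p_after_eval = bayes_update(p_D, p_signal_if_strong=0.80, p_signal_if_weak=0.20)

print(round(p_D, 3), round(p_after_eval, 3), p_after_eval > c)
```

With these made-up inputs the request alone raises the belief but leaves it below c, and the favourable evaluation carries it over. Different likelihoods can easily reverse either step, which is what the interactive version lets you explore.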

Try the model live

Interactive: adjust the prior, the selection effect, and the evaluation’s informativeness.  open in a new tab → (if the embed doesn’t load)

Evaluation unlocks credibility — wherever you’re from

A Harvard paper? Already trusted. A strong paper from Exeter or Ball State? Unjournal evaluation opens the chest — public, structured evidence that travels independently of institutional prestige.

When is requesting worth it?

  • Most valuable when your work is strong but under-credited — or sitting just ‘below the bar’
  • If a commitment to open evaluation becomes a positive signal, you’ll want in early
  • May be less of a clear win when the work already clears the bar and it’s a sensitive career moment
  • Timing concerns? Talk to us — we can embargo or schedule

Full “model” (v. preliminary, ~Fable-generated with human feedback): unjournal-reluctance-note.netlify.app

Students & early-career researchers

  • See what economists, funders & practitioners actually care about: a methodological conversation that sharpens your own work and professional understanding
  • Get involved with real peer review; get feedback on your evaluation from us, and often from the authors
  • Visibility within a network of funders, grantmakers & impact-minded researchers
  • Potential RA / fellowship opportunities: evaluation, meta-research analysis, Pivotal-Questions project support

Use our outputs

  • Evaluation packages & prioritisation outputs → a vetted evidence base to build on, teach, cite, discuss, and engage with
  • Pivotal Questions / workshops → framing for agendas, grants, collaborations, REF Impact?
  • Public evaluations → possible REF / grant / esteem evidence?
  • The ratings dataset → meta-analysis; open to field-experiment collaborations on the evaluation process

Visibility to research users

  • Funders & nonprofits read these evaluations
  • Some use them in grantmaking and methodology
  • A route to feedback, uptake, and sometimes collaboration

A way to put careful work in front of people who actually use evidence.

Recognise better signals

A strong public evaluation is evidence of quality and usefulness — not just venue: multidimensional ratings (with uncertainty) + expert reports & discussion + an author response + a citable DOI.

For research leaders & managers: encouraging engagement signals a commitment to rigour, transparency & innovation — and opens the research-impact channel (our funder & practitioner network, incl. Pivotal Questions).

6 · Looking ahead

AI makes evaluation more important

  • A flood of plausible AI-generated papers that may or may not be correct or useful
  • More need for efficient, transparent evaluation — and connection to real stakeholders & impact
  • Scalable code / data checks
  • Current ~consensus: keep a human in the loop for the final decisions and vetting

Not “does it fit a top-5 template” — but “is it true, and does it matter?”

How does AI evaluation compare to humans?

One exploratory pilot · ~45 papers

  • “Frontier” (Jan. 2026) LLM vs. our human ratings: modest rank agreement (r ≈ 0.3)
  • Human–human agreement still exceeds human–LLM
  • On written critiques: LLMs catch of human concerns, but ~half their flags aren’t substantive
  • Not yet a substitute? But it’s an open question.
  • We’re further exploring AI prioritisation, research reasoning, and alignment in this domain

Preliminary methods & results: llm-uj-research-eval.netlify.app/methods
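
For readers unfamiliar with rank agreement, the sketch below computes a Spearman rank correlation between two sets of ratings using only the standard library. Whether the pilot used Spearman or another statistic is an assumption here, and the two rating lists are invented for illustration. They are not pilot data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


# Invented ratings for six papers, for illustration only.
human = [72, 55, 80, 40, 65, 90]
llm = [70, 75, 60, 50, 85, 80]
print(round(spearman(human, llm), 2))
```

A value near 0.3 means the two raters order papers in loosely similar ways while disagreeing on many individual comparisons.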

Questions for you

  1. What would make an evaluation count as evidence of quality — reliable, meaningful, valued?
  2. Where would faster public evaluation be most useful?
  3. What would make public evaluation feel safe and valuable for authors?
  4. How could this invigorate teaching & research training?
  5. How could it help build agendas, attract funding, and demonstrate value (e.g. REF)?
  6. Which Exeter strengths connect most naturally?

Thank you

What does open (Unjournal) evaluation provide?

Now: faster, useful feedback + a credible public signal, and useful inputs to practitioners and funders.

Soon: it starts to carry career value.

Eventually: it can replace much or all of what we ask the journal stamp to do.

Which of these would actually help your work?

David Reinstein · contact@unjournal.org · unjournal.org · unjournal.pubpub.org

Pacing / structure issues

  • Spent too long on “why journal peer review is broken” / its limitations — crowded out time for “what we do”
  • The lengthy diagnosis generated a lot of defensive reactions (“journals provide more than just a stamp”) — audience pushed back on that framing
  • ACTION: Trim or move the journal-critique section; lead earlier with what UJ actually does

Framing to fix

  • Remove or soften the “just a stamp” language — it reads as dismissive and triggered audience friction
  • Funders and practitioners noted they don’t necessarily know what the highest-priority research areas are — address this more explicitly (UJ helps surface that)
  • Researchers noted they want independence from funder direction — acknowledge this tension honestly
  • Audience asked process questions that were answered in later slides — had to jump around
  • ACTION: Move “what is an evaluation and what are the ratings” earlier in the deck (currently too far back)

Audience questions / suggestions worth acting on

  • “Who are the evaluators?” — came up prominently; needs a stronger early answer
  • Ben Zarankin: consider whether evaluators should specialize in the evaluator role more explicitly — worth exploring / referencing in the deck

For the next iteration — DONE, ported to generic-talk.qmd (19 June 2026)

These lessons have now been incorporated into generic-talk.qmd (the maintained, reusable deck — live at https://uj-talk.netlify.app):

This Exeter deck is now frozen; the generic deck is canonical going forward.
