The Optimal Budget Generator: A Causal Inference Protocol for Maximizing Median Health and Wealth Through Public Goods Funding

Generating Integrated Public Budget Recommendations Using Diminishing Returns Modeling and Cost-Effectiveness Analysis

Author: Mike P. Sinn

Abstract

20-40% of public goods funding is misallocated relative to outcome-maximizing benchmarks, representing trillions annually in foregone welfare gains. Budget processes respond to lobbying intensity and historical precedent rather than causal evidence of effectiveness.

The Optimal Budget Generator (OBG) applies causal inference, diminishing returns modeling, and cost-effectiveness analysis to determine optimal public goods funding levels that maximize two welfare metrics: real after-tax median income growth and median healthy life years. For each spending category, OBG estimates an Optimal Spending Level (OSL) identifying where marginal returns equal opportunity cost.

The Budget Impact Score (BIS) measures confidence in each OSL estimate based on study quality, statistical precision, and temporal recency of the underlying causal evidence. The result is a gap analysis showing which categories are over- or underfunded relative to evidence-based benchmarks, enabling systematic reallocation from low-return to high-return public investments. At system scale, the model’s Optimal-Governance Path reaches 56.7x the Earth baseline after 20 years, raises average income to $1.16M versus $20.5K on the status-quo path, reaches $10.7 quadrillion in total output, and recovers roughly $101T/year in suppressed value (The Political Dysfunction Tax).

Keywords

budget optimization, optimal budget generator, evidence-based policy, meta-analysis, cost-effectiveness, diminishing returns, cross-country analysis, public finance, welfare economics, spending targets

Abstract

This specification describes the Optimal Budget Generator (OBG) framework, a systematic approach to generating integrated budget recommendations that maximize welfare as measured by two metrics: real after-tax median income growth and median healthy life years.

Three ways to figure out optimal spending combine to show the gap between what you spend and what you should spend. The gap is filled with lobbyists.

JEL Classification: H50, H61, D61, I18, C18

Unlike marginal-return frameworks that ask “where should we invest the next dollar?”, OBG asks “what should the complete budget allocation be?” Each category has a target level - too little means underinvestment, too much means diminishing returns. But unlike the Recommended Daily Allowance for nutrients (where you can meet all targets simultaneously), budget allocation is zero-sum: spending more on one category means less for others. OBG generates integrated recommendations that balance these tradeoffs.

The framework combines two evidence sources: (1) diminishing returns modeling from cross-country dose-response studies, and (2) cost-effectiveness threshold analysis from health economics. The Budget Impact Score (BIS) measures our confidence in each category’s OSL estimate based on the quality and quantity of causal evidence from the econometric literature.

The result is a gap analysis showing which categories are underfunded relative to evidence-based optimal levels, enabling systematic reallocation from overinvestment to underinvestment. Applied to the US federal budget, the framework identifies pragmatic clinical trials as the most severely underinvested category (current spending is 99% below the estimated optimum, a gap equal to 9,900% of current spending, with a 637:1 benefit-cost ratio), followed by vaccinations, basic research, and early childhood education.

Scale note. If budget allocation is the core bottleneck, the upside from fixing it is civilization-scale. Under the project’s best-case governance ceiling, the recoverable upside is $101T per year. The 20-year Optimal-Governance Path reaches 56.7x the Earth baseline, raises average income to $1.16M versus $20.5K on the status-quo path, and reaches $10.7 quadrillion in total output. This specification focuses on the allocation layer of that plan; the full derivation lives in The Political Dysfunction Tax [46].

59.1 System Overview

59.1.1 What Policymakers See

A dashboard showing spending gaps by category, with clear recommendations:

59.2 Illustrative Example: US Federal Budget Gap Analysis

The following table demonstrates how OBG output would appear. OSL estimates for fully derived categories (pragmatic trials, vaccinations) come from the worked examples in Sections 6-7. Remaining OSL estimates use cross-country benchmarking.

| Category | Current | OSL | Gap | Evidence | Income Effect | Health Effect | Action |
|---|---|---|---|---|---|---|---|
| Pragmatic clinical trials | $0.5B | $50B | +$49.5B | A (RCTs) | ++ | +++ | Scale 100x |
| Vaccinations | $8B | $35B | +$27B | A (RCTs) | + | +++ | Increase |
| Basic research | $45B | $90B | +$45B | B (spillovers) | ++ | ++ | Increase |
| Early childhood (0-5) | $50B | $70B | +$20B | A (RCTs) | +++ | + | Increase |
| Military (discretionary) | $850B | $459B | -$391B | C (benchmarks) | | | Decrease |
| Agricultural subsidies | $25B | $0B | -$25B | A (welfare analysis) | | | Eliminate |

Positive gaps indicate underinvestment; negative gaps indicate overinvestment. Income Effect: impact on real after-tax median income growth. Health Effect: impact on median healthy life years. Scale: +++ strong positive, ++ moderate positive, + weak positive, − negative.

59.2.1 What Budget Analysts See

  • OSL estimates with confidence intervals and methodology notes
  • Cross-country spending data showing spending-outcome relationships
  • Diminishing returns curves identifying optimal spending levels
  • Evidence quality scores (BIS) for each category
  • Sensitivity analysis showing how OSL changes with different assumptions
  • Priority rankings by gap size weighted by evidence confidence

59.2.2 Where This Fits

+--------------------------------------------------------------+
|                    OPTIMOCRACY FRAMEWORK                     |
+--------------------------------------------------------------+
|                                                              |
|  +---------------------+    +-----------------------------+  |
|  |  Budget Generator   |    |  Policy Generator           |  |
|  |  (OBG/BIS Framework)|    |  (OPG/PIS Framework)        |  |
|  |                     |    |                             |  |
|  |  Answers:           |    |  Answers:                   |  |
|  |  "How should we     |    |  "What policies should      |  |
|  |  allocate the       |    |  we adopt/change?"          |  |
|  |  budget?"           |    |                             |  |
|  |                     |    |                             |  |
|  |  Primary output:    |    |  Primary output:            |  |
|  |  Integrated budget  |    |  Enact/Replace/Repeal       |  |
|  |  recommendations    |    |  recommendations            |  |
|  +---------------------+    +-----------------------------+  |
|                                                              |
|  Both feed into: Constitutional Layer                        |
|  (outcome-optimizing rules)                                  |
+--------------------------------------------------------------+

The OBG/BIS framework answers: “Given what we know about returns to spending, what are the optimal allocation levels?”

Budget generator and policy generator both feed into constitutional rules. It’s checks and balances, but the checks can do maths.

The OPG framework (see Optimal Policy Generator Specification [137]) answers: “Which policy reforms beyond budget allocation would most improve welfare?”

59.2.3 Implementation Mechanism

This specification focuses on generating evidence-based budget recommendations. Political implementation mechanisms are discussed separately in Incentive Alignment Bonds [138].

59.3 Introduction

59.3.1 Why Budget Allocation Fails Today

Budget allocation is fundamentally a problem of social choice under uncertainty [139]. The challenge is not simply technical but institutional: current budget processes systematically diverge from welfare-optimal allocations due to political economy dynamics [140, 141].

Current budgets: lobbying and ‘we’ve always done it this way.’ Result: money goes to things that don’t work instead of things that do. Tradition is expensive.

Current budget allocation follows a process dominated by:

  1. Lobbying intensity: Categories with organized beneficiaries (weapons manufacturers, agricultural lobbies) receive disproportionate funding regardless of evidence
  2. Historical inertia: This year’s budget is last year’s budget plus a percentage, not a fresh optimization
  3. Visible vs. invisible beneficiaries: Programs with identifiable beneficiaries (veterans) outcompete programs with diffuse beneficiaries (basic research)
  4. Political salience: Crises drive spending regardless of cost-effectiveness (terrorism vs. air pollution)
  5. Zero-sum framing: Budget debates treat all categories as competing rather than asking which ones are at optimal levels

The result: systematic overinvestment in low-return categories and underinvestment in high-return categories. Historical examples demonstrate the scale of missed opportunities: the smallpox eradication campaign returned an estimated 450:1 ROI [92], yet similar high-return public health investments remain chronically underfunded.

59.3.2 The RDA Analogy: Optimal Levels, Not Just Marginal Returns

Nutrition science doesn’t just say “eat more vitamins.” It specifies Recommended Daily Allowances - target intake levels where:

  • Below RDA: Deficiency symptoms, reduced function
  • At RDA: Optimal health benefits
  • Above RDA: Diminishing returns, potential toxicity

Budget allocation should work the same way. For each spending category:

  • Below OSL: Foregone welfare gains (underinvestment)
  • At OSL: Optimal welfare return per dollar
  • Above OSL: Diminishing or negative returns (overinvestment)

Infinite spending on any category doesn’t make sense, even for one with high returns. Early childhood education has excellent returns, but spending $10 trillion on it wouldn’t produce 10x the benefits of spending $1 trillion. There’s an optimal level.

Nutritionists tell you how much vitamin C you need. OSL tells governments how much education funding they need. One prevents scurvy, the other prevents stupid.

59.3.3 What This Framework Provides

Five pieces: evidence-based targets, gap analysis, priority ranking, uncertainty assessment, and wishful thinking. Wait, scratch that last one.

  1. Target spending levels for each budget category based on evidence
  2. Gap analysis showing where current spending diverges from optimal
  3. Evidence grading so policymakers know which OSL estimates are reliable
  4. Priority ranking for reallocation decisions
  5. Uncertainty quantification around each estimate

59.3.4 Outcome Metrics: What We’re Optimizing

All OBG recommendations ultimately aim to maximize two welfare metrics:

  1. Real after-tax median income growth (pp/year): Year-over-year percentage change in inflation-adjusted, post-tax median household income. Sources: Census Bureau, BLS.

  2. Median healthy life years (years): Expected years of life in good health at the population median. Sources: WHO Global Health Observatory, national health surveys.

The welfare function combines these with equal weight by default:

\[ W = 0.5 \cdot \text{IncomeGrowth} + 0.5 \cdot \text{HealthyYears} \]

Why these two metrics? Most policy effects eventually show up in one or both. Economic policies (taxes, regulations, trade) primarily affect income growth. Health policies (healthcare access, public health, safety) primarily affect healthy life years. Education and infrastructure affect both. See Two-Metric Welfare Function for the complete framework.

Every spending category’s OSL is ultimately justified by its expected impact on these two metrics. The gap analysis and priority rankings reflect which reallocations would most improve the combined welfare function.

59.5 Theoretical Framework

This section formalizes the OBG framework as a social planner’s optimization problem, establishing the theoretical foundations for optimal spending levels and evidence-weighted allocation.

A social planner is someone who plans society. They take evidence, weigh it (not literally), and decide how much money to spend on things. It’s like meal planning but for countries.

59.5.1 The Social Planner’s Problem

Consider a benevolent social planner allocating a fixed budget \(B\) across \(n\) spending categories. Each category generates welfare measured using the two-metric framework: real after-tax median income growth and median healthy life years.

Why these specific metrics? They are universal instrumental goods: virtually everyone wants higher purchasing power and longer healthy life, regardless of other values. They are hard to game (improving them requires actually helping typical citizens), measured by independent statistical agencies, and capture most policy effects. GDP can rise while median income stagnates; this framework correctly identifies such outcomes as low-welfare.

Let \(s_i\) denote spending on category \(i\), with \(\sum_{i=1}^{n} s_i = B\). Each category produces effects on both welfare metrics:

  • \(\beta_i^{inc}(s_i)\): Effect on real after-tax median income growth (pp/year)
  • \(\beta_i^{hlth}(s_i)\): Effect on median healthy life years (years)

Total welfare from category \(i\) follows the two-metric welfare function:

\[ W_i(s_i) = \alpha \cdot \beta_i^{inc}(s_i) + (1-\alpha) \cdot \beta_i^{hlth}(s_i) \]

where \(\alpha = 0.5\) by default (equal weight to economic and health welfare). All welfare calculations in this framework flow through these two metrics.

Assumption 1 (Diminishing Returns). For each category \(i\), both effect functions \(\beta_i^{inc}\) and \(\beta_i^{hlth}\) are twice continuously differentiable with positive first derivatives and negative second derivatives for all \(s > 0\).

The social planner maximizes aggregate welfare:

\[ \max_{\{s_i\}_{i=1}^{n}} \sum_{i=1}^{n} W_i(s_i) \quad \text{subject to} \quad \sum_{i=1}^{n} s_i = B, \quad s_i \geq 0 \ \forall i \]

Proposition 1 (Equimarginal Principle). At the optimal allocation \(\{s_i^*\}\), marginal welfare is equalized across all categories with positive spending:

\[ W_i'(s_i^*) = \lambda^* \quad \forall i \text{ with } s_i^* > 0 \]

where \(\lambda^*\) is the shadow price of the budget constraint.

Proof. The Lagrangian is \(\mathcal{L} = \sum_i W_i(s_i) - \lambda(\sum_i s_i - B)\). First-order conditions yield \(W_i'(s_i^*) = \lambda\) for interior solutions. By strict concavity of \(W_i\), the second-order conditions are satisfied. \(\square\)
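
To make the equimarginal condition concrete, here is a minimal numerical sketch. It assumes a hypothetical welfare form \(W_i(s) = a_i \ln(1 + s/k_i)\), which satisfies Assumption 1; the parameters `a` and `k`, the function name `allocate`, and the example numbers are illustrative assumptions, not part of the specification. Because \(W_i'(s) = a_i/(k_i + s)\), spending at shadow price \(\lambda\) is \(\max(0, a_i/\lambda - k_i)\), and bisection finds the \(\lambda\) at which total spending exhausts the budget.

```python
def allocate(a, k, budget, iterations=200):
    """Equimarginal allocation for the assumed form W_i(s) = a_i * ln(1 + s / k_i).

    Requires a_i > 0, k_i > 0 and budget > 0.
    Returns (spending per category, shadow price lambda).
    """
    def spend(lam):
        # Invert W_i'(s) = a_i / (k_i + s) = lam, clipping at zero spending.
        return [max(0.0, ai / lam - ki) for ai, ki in zip(a, k)]

    # Total spending falls as lam rises. At lam = max(a_i / k_i) nothing is
    # funded; near lam = 0 spending is arbitrarily large.
    lo = 1e-12
    hi = max(ai / ki for ai, ki in zip(a, k))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(spend(mid)) > budget:
            lo = mid  # over budget: raise the shadow price
        else:
            hi = mid  # within budget: lower the shadow price
    lam = 0.5 * (lo + hi)
    return spend(lam), lam


# Hypothetical three-category example (all numbers are made up).
spending, shadow_price = allocate(a=[4.0, 2.0, 1.0], k=[1.0, 1.0, 1.0], budget=11.0)
marginal_returns = [ai / (ki + si) for ai, ki, si in zip([4.0, 2.0, 1.0], [1.0, 1.0, 1.0], spending)]
```

In this toy case the allocation comes out at roughly (7, 3, 1) with a shadow price of 0.5, so every funded category has the same marginal welfare. A category whose first-dollar return \(a_i/k_i\) falls below \(\lambda\) receives nothing, which is the corner case the proposition sets aside by restricting attention to \(s_i^* > 0\).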

59.5.2 Optimal Spending Levels Under Uncertainty

In practice, the welfare functions \(W_i(\cdot)\) are not known with certainty. Let \(\hat{W}_i(s)\) denote the planner’s estimate of welfare, with associated uncertainty \(\sigma_i^2(s)\).

Definition 1 (Optimal Spending Level). The Optimal Spending Level for category \(i\) is:

\[ \text{OSL}_i \equiv \arg\max_{s_i \geq 0} \; \mathbb{E}[\hat{W}_i(s_i)] - r \cdot s_i - \frac{\rho}{2} \text{Var}[\hat{W}_i(s_i)] \]

where \(r\) is the social discount rate (opportunity cost of public funds), \(\rho \geq 0\) is the planner’s risk aversion parameter, and \(\text{Var}[\hat{W}_i(s_i)] = \sigma_i^2(s_i)\).

For risk-neutral planners (\(\rho = 0\)), OSL reduces to the spending level that maximizes expected welfare net of opportunity cost. For risk-averse planners, OSL also accounts for estimation uncertainty.

Proposition 2 (OSL Characterization). Under Assumption 1, with estimated marginal welfare \(\hat{W}_i'(s)\) and estimation variance \(\sigma_i^2(s)\), an interior OSL satisfies:

\[ \mathbb{E}[\hat{W}_i'(\text{OSL}_i)] = r + \frac{\rho}{2} \cdot \frac{\partial \sigma_i^2}{\partial s}\bigg|_{s=\text{OSL}_i} \]

Proof. Differentiating the objective in Definition 1 with respect to \(s_i\) and setting the derivative to zero yields the result. The term \(r\) represents the marginal value of funds in alternative uses; the second term adjusts for risk. \(\square\)

59.5.3 Budget Impact Score as Precision Weighting

The Budget Impact Score formalizes the precision of OSL estimates, enabling evidence-weighted reallocation decisions.

Definition 2 (Budget Impact Score). For category \(i\) with \(n_i\) effect estimates \(\{\hat{\beta}_{ij}\}_{j=1}^{n_i}\), the Budget Impact Score is:

\[ \text{BIS}_i = \min\left(1, \frac{1}{K} \sum_{j=1}^{n_i} w_j^Q \cdot w_j^P \cdot w_j^R \right) \]

where:

  • \(w_j^Q \in (0,1]\) = quality weight based on identification strategy (RCT = 1, cross-sectional = 0.25)
  • \(w_j^P = 1/\text{SE}(\hat{\beta}_{ij})^2\) = precision weight (inverse variance)
  • \(w_j^R = e^{-\delta(t_{now} - t_j)}\) = recency weight with decay rate \(\delta\)
  • \(K\) = calibration constant

Proposition 3 (BIS as Inverse Variance). Under standard meta-analytic assumptions, and whenever the cap at 1 does not bind, BIS is proportional to the precision of the pooled effect estimate:

\[ \text{BIS}_i \propto \frac{1}{\text{Var}(\hat{\beta}_i^{pooled})} \]

where \(\hat{\beta}_i^{pooled}\) is the quality-weighted pooled estimate of spending effects.

Three ingredients that tell you how much to trust a number: how good it is, how exact it is, and how old it is. Like checking the expiration date on milk, but for statistics.

59.5.4 Gap Analysis and Welfare Gains

Definition 3 (Spending Gap). The spending gap for category \(i\) is:

\[ \text{Gap}_i = \text{OSL}_i - s_i^{current} \]

Proposition 4 (Welfare Gains from Gap Closure). For small gaps, the welfare gain from moving spending from current level to OSL is approximately:

\[ \Delta W_i \approx W_i'(s_i^{current}) \cdot \text{Gap}_i - \frac{1}{2} |W_i''(\bar{s})| \cdot \text{Gap}_i^2 \]

where \(\bar{s}\) is between \(s_i^{current}\) and \(\text{OSL}_i\).

Proof. Taylor expansion of \(W_i(\text{OSL}_i) - W_i(s_i^{current})\) around \(s_i^{current}\). \(\square\)

Corollary 1 (Priority Ranking). Categories should be prioritized for reallocation in order of:

\[ \text{Priority}_i = |\text{Gap}_i| \times \text{BIS}_i \times |W_i'(s_i^{current})| \]

This ranks categories by expected welfare gain adjusted for estimation confidence.

Note: In the simplified implementation (Section 10.2), we normalize by setting \(|W_i'(s_i^{current})| = 1\) for all categories, reducing the priority formula to \(\text{Priority}_i = |\text{Gap}_i| \times \text{BIS}_i\). This assumes equal marginal welfare weights across categories as a first approximation. Future iterations could incorporate category-specific marginal welfare estimates.
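
The simplified ranking can be sketched in a few lines. The dollar figures below come from the illustrative US gap table earlier in this specification; the BIS values are assumptions chosen to be consistent with the priority scores the document reports for these categories, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Category:
    name: str
    current_usd_b: float  # current spending, $ billions
    osl_usd_b: float      # estimated Optimal Spending Level, $ billions
    bis: float            # Budget Impact Score in [0, 1] (assumed values)

    @property
    def gap(self) -> float:
        # Gap_i = OSL_i - s_i^current; positive means underinvestment.
        return self.osl_usd_b - self.current_usd_b

    @property
    def priority(self) -> float:
        # Simplified form: |W'| normalized to 1 for every category.
        return abs(self.gap) * self.bis


def rank(categories):
    """Sort categories from highest to lowest priority score."""
    return sorted(categories, key=lambda c: c.priority, reverse=True)


EXAMPLE_CATEGORIES = [
    Category("Pragmatic clinical trials", 0.5, 50.0, 0.90),
    Category("Vaccinations", 8.0, 35.0, 0.95),
    Category("Basic research", 45.0, 90.0, 0.70),
    Category("Early childhood (0-5)", 50.0, 70.0, 0.85),
    Category("Military (discretionary)", 850.0, 459.0, 0.50),
]

for c in rank(EXAMPLE_CATEGORIES):
    direction = "increase" if c.gap > 0 else "decrease"
    print(f"{c.name}: gap {c.gap:+.1f}B, priority {c.priority:.1f} ({direction})")
```

Because the score uses the absolute gap, an overfunded category with a large gap can outrank every underfunded one, so the sign of the gap has to be reported alongside the score.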

59.5.5 Welfare Bounds Under Model Uncertainty

When the functional form of \(W_i(\cdot)\) is uncertain, we can establish bounds on welfare gains.

Proposition 5 (Welfare Bounds). Let \(\underline{W}_i\) and \(\overline{W}_i\) denote lower and upper bounds on the welfare function consistent with available evidence. Then:

\[ \underline{\Delta W} = \sum_{i: \text{Gap}_i > 0} \underline{W}_i'(s_i) \cdot \text{Gap}_i \leq \Delta W \leq \sum_{i: \text{Gap}_i > 0} \overline{W}_i'(s_i) \cdot \text{Gap}_i = \overline{\Delta W} \]

The OBG framework reports both point estimates and these bounds via sensitivity analysis.

59.5.6 Summary of Theoretical Results

| Result | Implication for OBG |
|---|---|
| Proposition 1 | Optimal allocation equalizes marginal returns |
| Proposition 2 | OSL accounts for both expected returns and uncertainty |
| Proposition 3 | BIS captures estimation precision |
| Proposition 4 | Gap closure yields quantifiable welfare gains |
| Corollary 1 | Priority ranking optimizes reallocation sequence |
| Proposition 5 | Welfare bounds enable robust recommendations |

59.6 Core Methodology

59.6.1 Spending Category Data Structure

Boxes connected by lines. The boxes represent different kinds of information. The lines mean they’re related. It’s a family tree for spreadsheets.

The OBG framework uses a structured representation of budget categories:

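-- Column types are illustrative assumptions; the specification names only the columns.

-- Spending categories
CREATE TABLE spending_categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_category_id INTEGER REFERENCES spending_categories(id),
    spending_type TEXT, -- 'program', 'transfer', 'investment', 'regulatory'
    outcome_categories TEXT, -- which welfare outcomes this affects
    current_spending_usd NUMERIC,
    fiscal_year INTEGER,
    data_source TEXT,
    last_updated DATE
);

-- Cross-country spending data
CREATE TABLE reference_spending (
    category_id INTEGER REFERENCES spending_categories(id),
    country_code TEXT,
    year INTEGER,
    spending_usd NUMERIC,
    spending_per_capita NUMERIC,
    spending_pct_gdp NUMERIC,
    population BIGINT,
    gdp NUMERIC,
    data_source TEXT
);

-- Optimal spending level estimates
CREATE TABLE osl_estimates (
    category_id INTEGER REFERENCES spending_categories(id),
    estimation_method TEXT,
    osl_usd NUMERIC,
    osl_per_capita NUMERIC,
    osl_pct_gdp NUMERIC,
    confidence_interval_low NUMERIC,
    confidence_interval_high NUMERIC,
    evidence_grade TEXT,
    bis_score NUMERIC,
    methodology_notes TEXT,
    last_updated DATE
);

-- Gap analysis
CREATE TABLE spending_gaps (
    category_id INTEGER REFERENCES spending_categories(id),
    current_spending_usd NUMERIC,
    osl_usd NUMERIC,
    gap_usd NUMERIC,
    gap_pct NUMERIC,
    priority_score NUMERIC, -- |gap| * BIS confidence
    recommended_action TEXT
);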

59.6.2 Two Methods for OSL Estimation

| Method | Use Case | Data Required | Strengths | Limitations |
|---|---|---|---|---|
| Diminishing returns modeling | Categories with cross-country spending-outcome data | Effect estimates at multiple spending levels | Theoretically grounded, finds optimal “knee” | Requires sufficient country variation |
| Cost-effectiveness threshold | Health/life-saving interventions | Cost per QALY/DALY, willingness-to-pay | Links to standard health economics [26] | Limited to monetizable outcomes |

Each method is detailed below.

59.7 Diminishing Returns Modeling

59.7.1 The Core Concept

The fiscal multiplier literature establishes that spending effects vary systematically with scale [144, 145]. At low spending levels, each additional dollar produces substantial welfare gains. At high spending levels, marginal returns diminish. The OSL is where marginal return equals opportunity cost.

The first dollar you spend helps a lot. The millionth dollar helps less. The graph tells you when to stop spending money on one thing and start spending it on another thing.

\[ \text{OSL}: \frac{\partial \text{Outcome}}{\partial \text{Spending}} = r \]

Where \(r\) is the discount rate or opportunity cost of capital (typically 3-7%).

59.7.2 Finding the “Knee” of the Curve

Empirically, we look for the point where the outcome-spending relationship flattens:

Outcome
   ^
   |                    ___________
   |                 __/
   |               _/
   |             _/
   |           _/   <- OSL is around here
   |         _/
   |       _/
   |     _/
   |   _/
   | _/
   |/
   +-----------------------------------> Spending
         Low            High

59.7.3 Estimation Methods

1. Nonlinear regression on cross-country data

Fit diminishing returns functions:

\[ \text{Outcome} = \alpha + \beta \cdot \log(\text{Spending}) + \epsilon \]

Or with saturation:

\[ \text{Outcome} = \alpha + \beta \cdot \frac{\text{Spending}}{\text{Spending} + \gamma} \]

Where \(\gamma\) is the half-saturation constant.

2. Piecewise linear estimation

Estimate separate slopes for different spending ranges to identify where returns diminish.

3. Meta-regression of effect estimates

If multiple studies estimate effects at different spending levels, meta-regression can identify how effects vary with baseline spending. The credibility of such estimates depends critically on identification strategy [146].
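
As a rough sketch of the first method, the log model can be fitted by ordinary least squares and the marginal-return condition solved in closed form for both functional forms. Setting \(\beta / s = r\) gives \(s^* = \beta / r\) for the log model, and setting \(\beta\gamma/(s+\gamma)^2 = r\) gives \(s^* = \sqrt{\beta\gamma/r} - \gamma\) for the saturation model. The function names are hypothetical, and \(r\) is assumed to be expressed in outcome units per unit of spending so that it is comparable to the slope.

```python
import math


def fit_log_model(spending, outcome):
    """OLS fit of outcome = alpha + beta * ln(spending); returns (alpha, beta)."""
    x = [math.log(s) for s in spending]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(outcome) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, outcome))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def osl_log_model(beta, r):
    """Spending at which the log model's slope beta / s equals r."""
    return beta / r


def osl_saturation_model(beta, gamma, r):
    """Spending at which the saturation model's slope
    beta * gamma / (s + gamma)**2 equals r; zero if the first unit
    of spending already returns less than r."""
    return max(0.0, math.sqrt(beta * gamma / r) - gamma)


# Synthetic data generated exactly from outcome = 2 + 3 * ln(spending).
spending = [1.0, math.e, math.e ** 2]
outcome = [2.0, 5.0, 8.0]
alpha_hat, beta_hat = fit_log_model(spending, outcome)
```

With real cross-country data the fit would carry sampling error, and the saturation model would need a nonlinear solver to estimate \(\gamma\); only the final step from fitted parameters to OSL is shown here.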

59.7.4 Worked Example: K-12 Education Spending

Primary metric affected: Real after-tax median income growth (via higher wages from improved skills).

The study cited as [147] exploited court-ordered school finance reforms to estimate causal effects of K-12 spending. Key finding: a 10% increase in per-pupil spending increases adult earnings by 7% for students from low-income families.

Does this effect diminish at higher spending levels?

Evidence from cross-state variation suggests:

| Baseline spending (per pupil) | Effect of 10% increase | Implied marginal return |
|---|---|---|
| $8,000 | +8% earnings | $0.80 per $1 |
| $12,000 | +5% earnings | $0.50 per $1 |
| $16,000 | +3% earnings | $0.30 per $1 |
| $20,000 | +1% earnings | $0.10 per $1 |

OBG estimation: At $16,000/pupil, the marginal return has fallen to roughly $0.30 per $1, which this example treats as the point where marginal return equals opportunity cost. This suggests:

  • Current US average: ~$15,000/pupil
  • OSL: ~$16,000-$18,000/pupil (modest underinvestment)
  • Gap: ~$50B nationally

Evidence grade: B (strong causal identification, moderate extrapolation uncertainty)
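
One way to read the marginal-return table is to interpolate between its rows and ask where the return crosses a chosen threshold. The sketch below assumes linear interpolation between adjacent rows; that is an assumption made for illustration, since the specification does not state how the $16,000-$18,000 range was obtained. The function name and the thresholds tried are hypothetical.

```python
# (baseline spending per pupil in USD, implied marginal return per $1), from the table.
K12_POINTS = [(8_000, 0.80), (12_000, 0.50), (16_000, 0.30), (20_000, 0.10)]


def spending_at_marginal_return(points, threshold):
    """Linearly interpolate the spending level where marginal return hits threshold.

    points must be sorted by spending with marginal returns decreasing.
    Returns None when the threshold lies outside the tabulated range.
    """
    for (s0, m0), (s1, m1) in zip(points, points[1:]):
        if m0 >= threshold >= m1:
            if m0 == m1:
                return float(s0)
            share = (m0 - threshold) / (m0 - m1)
            return s0 + share * (s1 - s0)
    return None


osl_at_030 = spending_at_marginal_return(K12_POINTS, 0.30)
osl_at_020 = spending_at_marginal_return(K12_POINTS, 0.20)
```

Under this reading, a threshold of $0.30 per $1 lands at about $16,000 per pupil and a threshold of $0.20 lands at about $18,000, which brackets the OSL range quoted above.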

59.8 Worked Example: Pragmatic Clinical Trials

59.8.1 The Highest-Return Public Investment

Metrics affected: Both real after-tax median income growth (via reduced healthcare costs and improved productivity) and median healthy life years (via better treatments). This dual impact contributes to the exceptionally high returns.

Some ways of spending government money work better than others. Also, cheap trials work as well as expensive trials, but cost less. This required two charts to explain.

Pragmatic clinical trials represent perhaps the single highest-return category of public investment identified in the literature. While vaccinations return 13:1 and early childhood education returns 4:1, pragmatic trials demonstrate benefit-cost ratios of 637:1 [148].

The UK’s RECOVERY trial demonstrated this dramatically during COVID-19: it cost approximately $500 per patient versus $41K per patient for traditional Phase 3 trials, an 82x cost reduction [149]. This single trial identified dexamethasone as a life-saving treatment, preventing an estimated 1 million deaths globally.

59.8.2 OSL Estimation

Pragmatic trials represent an innovation frontier where no country has achieved optimal investment. We estimate OSL from cost-effectiveness analysis:

  1. Unmet medical need: Approximately 2.88 billion DALYs/year from conditions lacking adequate treatment
  2. Cost per patient: Pragmatic trials cost $929 per patient (ADAPTABLE trial) vs. $41K for traditional Phase 3 trials
  3. Scale-up potential: Current global clinical trial spending is approximately $60B/year, but only ~$500M goes to pragmatic/embedded designs

| Data Point | Value | Source |
|---|---|---|
| Current pragmatic trial spending (US) | ~$500M | NIH Common Fund |
| Traditional trial spending (global) | $60B | Industry + NIH |
| Cost per patient (pragmatic) | $929 | ADAPTABLE trial |
| Cost per patient (traditional Phase 3) | $41K | Industry average |
| Cost reduction factor | 44.1x | Calculated |

59.8.3 Diminishing Returns Analysis

Unlike most spending categories, pragmatic trials show increasing returns at current spending levels due to:

  1. Network effects: Each additional participant improves statistical power for all trials
  2. Infrastructure leverage: Platform trials amortize fixed costs across multiple interventions
  3. Learning effects: Evidence accumulation improves trial design efficiency

The “knee” of the diminishing returns curve is estimated at $50-100B annually (vs. current ~$500M), suggesting we are operating far below optimal.

We spend 500 million on pragmatic trials. The graph says we should spend 50 to 100 billion. We are so far to the left of where we should be that we’re practically off the chart.

59.8.4 Cost-Effectiveness Calculation

Using standard health economics methodology:

| Component | Value | Calculation |
|---|---|---|
| Cost per pragmatic trial participant | $929 | ADAPTABLE benchmark |
| QALYs gained per participant | 0.05-0.2 | Evidence generation value |
| Cost per QALY | $4,600-$18,600 | Well below $50K threshold |
| Scale-up population | 50M patients/year | 10% of treatable conditions |
| OSL estimate | $50B/year | Conservative |
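
The arithmetic behind the table can be checked directly. The sketch below uses only the table's inputs; the constant and function names are hypothetical.

```python
COST_PER_PARTICIPANT_USD = 929.0        # ADAPTABLE benchmark (from the table)
QALYS_PER_PARTICIPANT = (0.05, 0.2)     # low and high values (from the table)
WTP_USD_PER_QALY = 50_000.0             # threshold cited in the table
SCALE_UP_PATIENTS_PER_YEAR = 50_000_000


def cost_per_qaly(cost_usd, qalys):
    """Cost-effectiveness ratio in USD per QALY gained."""
    return cost_usd / qalys


def scale_up_cost(patients, cost_per_patient_usd):
    """Annual cost of enrolling the whole scale-up population."""
    return patients * cost_per_patient_usd


low_qaly, high_qaly = QALYS_PER_PARTICIPANT
best_case = cost_per_qaly(COST_PER_PARTICIPANT_USD, high_qaly)   # about $4,645 per QALY
worst_case = cost_per_qaly(COST_PER_PARTICIPANT_USD, low_qaly)   # about $18,580 per QALY
annual_cost = scale_up_cost(SCALE_UP_PATIENTS_PER_YEAR, COST_PER_PARTICIPANT_USD)  # about $46.45B
passes_threshold = worst_case < WTP_USD_PER_QALY
```

The table rounds these to $4,600-$18,600 per QALY and to an OSL of $50B/year; even the less favorable end of the QALY range stays below the $50K threshold.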

59.8.5 Gap Analysis

| Metric | Value |
|---|---|
| Current spending (pragmatic trials) | ~$500M |
| OSL | $50B |
| Gap | +$49.5B (99x underinvestment) |
| Gap % of current | +9,900% |
| Opportunity cost | 637:1 foregone returns |

Evidence grade: A (RCT evidence from RECOVERY, ADAPTABLE; strong theoretical foundation)

59.8.6 Why This Category Dominates

Pragmatic clinical trials have the highest priority score of any underfunded category analyzed:

\[ \text{Priority} = |\text{Gap}| \times \text{BIS} = \$49.5B \times 0.90 = 44.6 \]

Among categories requiring increased investment, this is the highest priority score, exceeding basic research (31.5), vaccinations (25.7), and early childhood (17.0). Military spending has a larger absolute priority score (195.5) due to its massive gap, but represents overinvestment requiring reduction.

59.9 Cost-Effectiveness Threshold Analysis

59.9.1 The Standard Health Economics Approach

Cost-effectiveness analysis has become the standard framework for health resource allocation decisions [143]. The QALY (Quality-Adjusted Life Year) metric enables comparison across diverse health interventions by monetizing health outcomes at a consistent threshold [150].

For health interventions, cost-effectiveness analysis provides OSL estimates:

\[ \text{OSL} = \sum_{\text{interventions}} \text{Scale}_i \times \text{Cost}_i \quad \text{where } \frac{\text{Cost}_i}{\text{QALY}_i} < \text{WTP} \]

Where:

  • \(\text{Scale}_i\) = target population for intervention \(i\)
  • \(\text{Cost}_i\) = per-person cost of intervention \(i\)
  • \(\text{QALY}_i\) = QALYs gained per person from intervention \(i\)
  • \(\text{WTP}\) = willingness-to-pay threshold (typically $50K-$150K per QALY)

59.9.2 Building Up from Intervention-Level Data

Four boxes with arrows between them. The boxes show how you take lots of small numbers and turn them into one big number. Addition with extra steps.

For each health intervention with cost-effectiveness data:

  1. Identify target population who would benefit
  2. Calculate scale-up cost to reach entire target population
  3. Include only interventions below the cost-effectiveness threshold
  4. Sum to get category OSL

59.9.3 Worked Example: Vaccinations

Primary metric affected: Median healthy life years (via disease prevention and mortality reduction).

Vaccinations represent one of the highest-return public health investments, with estimated returns of 44:1 for routine childhood immunization [9, 151]. The economic benefits include avoided medical costs, productivity gains, and reduced mortality [8].

Cost-effectiveness estimates from CEA Registry and CDC vaccination cost studies. QALY estimates reflect average health gains across target populations; costs include vaccine acquisition, administration, and program overhead.

| Intervention | Target pop. | Cost/person | QALY/person | Cost/QALY | Source | Include? |
|---|---|---|---|---|---|---|
| Childhood routine | 4M births | $500 | 0.1 | $5,000 | CDC VFC | Yes |
| HPV vaccination | 4M teens | $300 | 0.05 | $6,000 | CEA Registry | Yes |
| Flu (elderly) | 50M elderly | $40 | 0.01 | $4,000 | CDC | Yes |
| Shingles | 40M eligible | $200 | 0.02 | $10,000 | CEA Registry | Yes |
| COVID boosters | 100M adults | $30 | 0.005 | $6,000 | CDC | Yes |

All interventions fall well below the conventional $50,000-$150,000 per QALY cost-effectiveness threshold, indicating strong economic justification for full scale-up.

OBG calculation:

  • Childhood routine: 4M × $500 = $2.0B
  • HPV: 4M × $300 = $1.2B
  • Flu (elderly): 50M × $40 = $2.0B
  • Shingles: 40M × $200 = $8.0B
  • COVID boosters: 100M × $30 = $3.0B
  • Total OSL: ~$16B (vs. current ~$8B)

Gap: +$8B (underinvestment)

Evidence grade: A (RCT evidence for most vaccines, well-established cost-effectiveness)
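The worked example can be reproduced with a minimal Python sketch. The `Intervention` class and `category_osl` function are hypothetical names introduced here for illustration, and the figures are the illustrative values from the table above, not new data.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    target_pop: float        # people reached at full scale-up (Scale_i)
    cost_per_person: float   # USD per person (Cost_i)
    qaly_per_person: float   # QALYs gained per person (QALY_i)

    @property
    def cost_per_qaly(self) -> float:
        return self.cost_per_person / self.qaly_per_person


def category_osl(interventions, wtp):
    """Sum Scale_i * Cost_i over interventions with Cost_i / QALY_i < WTP."""
    return sum(
        i.target_pop * i.cost_per_person
        for i in interventions
        if i.cost_per_qaly < wtp
    )


# Illustrative values copied from the vaccination table.
VACCINES = [
    Intervention("Childhood routine", 4e6, 500, 0.1),
    Intervention("HPV vaccination", 4e6, 300, 0.05),
    Intervention("Flu (elderly)", 50e6, 40, 0.01),
    Intervention("Shingles", 40e6, 200, 0.02),
    Intervention("COVID boosters", 100e6, 30, 0.005),
]

CURRENT_SPENDING = 8e9  # approximate current spending from the text

osl = category_osl(VACCINES, wtp=50_000)
gap = osl - CURRENT_SPENDING
print(f"OSL: ${osl / 1e9:.1f}B, gap: ${gap / 1e9:.1f}B")
```

At the lower end of the WTP range all five interventions pass the filter and the sum is $16.2B, which the text rounds to ~$16B. Passing a much stricter threshold (for example `wtp=5_500`) would keep only the childhood and flu programs, which shows how the threshold drives the category total.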

59.10 Budget Impact Score (BIS)

The Budget Impact Score measures confidence in each category’s OSL estimate based on the quality and quantity of causal evidence. The scoring methodology draws on the established evidence hierarchy from the econometrics literature [146, 152].

A pyramid of trustworthiness. Randomized trials sit at the top wearing a crown. Someone’s opinion sits at the bottom, wondering what it did wrong.

59.10.1 BIS Calculation

For each spending category \(i\):

Step 1: Gather effect estimates

Collect all available causal effect estimates \(\{\beta_{i,1}, \beta_{i,2}, ..., \beta_{i,n_i}\}\) from the econometric literature.

Step 2: Compute quality weights

| Identification Method | Quality Weight (\(w^Q\)) |
|---|---|
| Randomized controlled trial | 1.00 |
| Natural experiment (difference-in-differences, regression discontinuity) | 0.85 |
| Instrumental variables | 0.70 |
| Panel with fixed effects | 0.55 |
| Cross-sectional regression | 0.25 |

Step 3: Compute precision weights

\[ w^P_j = \frac{1}{\text{SE}(\beta_j)^2} \]

Step 4: Compute recency weights

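\[ w^R_j = e^{-0.03(t_{\text{now}} - t_j)} \]

Where \(t_j\) is the publication year of estimate \(j\), \(t_{\text{now}}\) is the current year, and 0.03 per year is the default decay rate.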

Step 5: Compute confidence score

\[ \text{BIS}_i = \min\left(1, \frac{\sum_j w^Q_j \cdot w^P_j \cdot w^R_j}{K}\right) \]

Where \(K\) is a calibration constant.
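Steps 2-5 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `budget_impact_score`, the tuple layout of each estimate, and the value `k=10` in the example are hypothetical, since the calibration constant \(K\) is not specified here.

```python
import math

# Quality weights from the Step 2 table.
QUALITY_WEIGHT = {
    "rct": 1.00,
    "natural_experiment": 0.85,
    "instrumental_variables": 0.70,
    "panel_fixed_effects": 0.55,
    "cross_sectional": 0.25,
}


def budget_impact_score(estimates, k, now, decay=0.03):
    """BIS = min(1, sum_j(w_Q * w_P * w_R) / K).

    `estimates` is a list of (method, standard_error, year) tuples.
    `k` is the calibration constant; its value is an assumption here.
    """
    total = 0.0
    for method, se, year in estimates:
        if se <= 0:
            raise ValueError("standard error must be positive")
        w_q = QUALITY_WEIGHT[method]               # Step 2: quality
        w_p = 1.0 / se ** 2                        # Step 3: precision
        w_r = math.exp(-decay * (now - year))      # Step 4: recency
        total += w_q * w_p * w_r
    return min(1.0, total / k)                     # Step 5: cap at 1


# Hypothetical evidence base: one recent RCT, one older cross-sectional study.
example = [("rct", 0.5, 2020), ("cross_sectional", 1.0, 2010)]
print(round(budget_impact_score(example, k=10, now=2025), 3))
```

Because precision weights are unbounded (\(1/\text{SE}^2\)), a single very precise study can saturate the score, so the choice of \(K\) largely determines how many studies are needed to reach the cap.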

59.10.2 Evidence Grading from BIS

| BIS Range | Grade | Interpretation | OSL Confidence |
|---|---|---|---|
| 0.80 - 1.00 | A | Strong causal evidence | High - proceed with reallocation |
| 0.60 - 0.79 | B | Good evidence | Moderate - proceed with monitoring |
| 0.40 - 0.59 | C | Mixed evidence | Low - pilot before scaling |
| 0.20 - 0.39 | D | Weak evidence | Very low - research priority |
| 0.00 - 0.19 | F | Insufficient evidence | Unknown - cannot estimate OSL |
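The grading table maps to a simple lookup. The sketch below assumes each band's lower bound is inclusive, so that values between the printed ranges (such as 0.795) fall into the lower grade; `evidence_grade` is a hypothetical helper name.

```python
def evidence_grade(bis: float) -> str:
    """Map a 0-1 BIS value to the A-F grade bands in the table."""
    if not 0.0 <= bis <= 1.0:
        raise ValueError("BIS must lie in [0, 1]")
    # Lower bounds are treated as inclusive thresholds (an assumption).
    for floor, grade in ((0.80, "A"), (0.60, "B"), (0.40, "C"), (0.20, "D")):
        if bis >= floor:
            return grade
    return "F"


for score in (0.95, 0.70, 0.50, 0.19):
    print(score, evidence_grade(score))
```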

59.11 Gap Analysis and Priority Ranking

59.11.1 Computing Gaps

For each category \(i\):

\[ \text{Gap}_i = \text{OSL}_i - \text{Current}_i \]

  • Gap > 0: Underinvestment (increase spending)
  • Gap = 0: At optimal (maintain)
  • Gap < 0: Overinvestment (decrease spending)

59.11.2 Priority Score

Prioritize reallocation by gap size weighted by confidence:

\[ \text{Priority}_i = |\text{Gap}_i| \times \text{BIS}_i \]

Categories with large gaps AND high confidence should be addressed first.

59.11.3 Illustrative Example: Priority Ranking

The following uses the same illustrative data from the dashboard example above. OSL estimates for pragmatic trials, vaccinations, and K-12 education are derived in Sections 5-7. Other OSL values use cross-country benchmarking. BIS scores summarize available causal evidence quality.

| Category | Current | OSL | Gap | BIS | Inc | Hlth | Priority | Action |
|---|---|---|---|---|---|---|---|---|
| Pragmatic trials | $0.5B | $50B | +$49.5B | 0.90 | ++ | +++ | 44.6 | Scale 100x |
| Basic research | $45B | $90B | +$45B | 0.70 | ++ | ++ | 31.5 | Increase |
| Vaccinations | $8B | $35B | +$27B | 0.95 | + | +++ | 25.7 | Increase |
| Early childhood | $50B | $70B | +$20B | 0.85 | +++ | + | 17.0 | Increase |
| Military | $850B | $459B | -$391B | 0.50 | | | 195.5 | Decrease |
| Ag subsidies | $25B | $0B | -$25B | 0.90 | | | 22.5 | Eliminate |

Inc = effect on real after-tax median income growth. Hlth = effect on median healthy life years. Scale: +++ strong, ++ moderate, + weak, − negative.

Reallocation plan: Cut military discretionary (-$391B) and agricultural subsidies (-$25B) to fund pragmatic clinical trials (+$49.5B), basic research (+$45B), vaccinations (+$27B), and early childhood (+$20B). Pragmatic trials have the highest priority score among positive-gap categories due to extreme underinvestment combined with strong evidence, and they improve both welfare metrics.
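The gap and priority columns follow mechanically from the Current, OSL, and BIS columns. A minimal sketch, using the illustrative table values and a hypothetical `rank_categories` helper:

```python
# name: (current $B, OSL $B, BIS), copied from the illustrative table.
CATEGORIES = {
    "Pragmatic trials": (0.5, 50, 0.90),
    "Basic research": (45, 90, 0.70),
    "Vaccinations": (8, 35, 0.95),
    "Early childhood": (50, 70, 0.85),
    "Military": (850, 459, 0.50),
    "Ag subsidies": (25, 0, 0.90),
}


def rank_categories(categories):
    """Return (name, gap, priority, status) rows sorted by priority."""
    rows = []
    for name, (current, osl, bis) in categories.items():
        gap = osl - current                      # Gap_i = OSL_i - Current_i
        priority = abs(gap) * bis                # Priority_i = |Gap_i| * BIS_i
        if gap > 0:
            status = "underinvestment"
        elif gap < 0:
            status = "overinvestment"
        else:
            status = "at optimal"
        rows.append((name, gap, priority, status))
    return sorted(rows, key=lambda row: row[2], reverse=True)


for name, gap, priority, status in rank_categories(CATEGORIES):
    print(f"{name:<18} gap={gap:+8.1f}B  priority={priority:6.1f}  {status}")
```

Because the score uses \(|\text{Gap}_i|\), the military row ranks first overall despite its lower BIS; pragmatic trials rank first among the positive-gap categories, as stated above.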

59.12 Multi-Unit Reporting

59.12.1 The Problem with Abstract Scores

Composite scores (like 0-1 BIS values) are hard to interpret on their own. Policymakers and citizens understand dollars, lives, and years, not abstract indices.

59.12.2 Reporting at Multiple Levels

| Level | Units | Use Case | Example |
|---|---|---|---|
| 0. Core metrics | pp/year income growth, healthy life years | Primary welfare outcomes | “+0.1 pp income growth, +0.05 healthy years” |
| 1. Natural | Domain-specific | Interpretation within domain | “Education: $2,100/student gap” |
| 2. Monetized | $ equivalent | Cross-domain comparison | “Expected welfare gain: $4.00 per $1” |
| 3. Health | QALYs/DALYs | Health-weighted comparison | “12,000 QALYs per $1B invested” |
| 4. Composite | 0-1 score | Ranking when monetization uncertain | “BIS = 0.85” |

Level 0 (Core Metrics) reports expected changes to the two welfare metrics directly. All other levels are derived from or convertible to these core outcomes. QALYs (Level 3) translate directly to median healthy life years. Monetized values (Level 2) combine income effects with health effects valued at standard rates.

59.12.3 Conversion Factors

| Conversion | Value | Source | Notes |
|---|---|---|---|
| Value of Statistical Life (VSL) | ~$10M | EPA, DOT | US regulatory standard |
| Value per QALY | $50K-$150K | ICER, WHO | Context-dependent |
| QALY → $ | $100K/QALY | Mid-range estimate | For cross-domain |
| Life-year → QALY | ~0.8-1.0 | Age/health adjusted | Quality weighting |

59.12.4 Worked Example: Multi-Unit Output

Category: Early Childhood Education

| Unit Level | Value | Interpretation |
|---|---|---|
| Natural | +$20B gap | Current: $50B, OSL: $70B |
| Per-child | +$833/child gap | 24M children |
| Monetized | ROI 4:1 | NPV return [153] |
| Health (QALYs) | +8K QALYs/year | Per $1B additional |
| Composite (BIS) | 0.85 | High-quality RCT evidence |

Recommendation: Moderate underinvestment with strong evidence. Closing the gap would yield ~$80B in NPV returns.
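The unit conversions in this example are simple arithmetic, sketched below. The function name `multi_unit_report` and its dictionary keys are hypothetical; the inputs are the early-childhood figures above and the $100K/QALY mid-range conversion factor.

```python
def multi_unit_report(current, osl, population, roi, qalys_per_billion,
                      usd_per_qaly=100_000):
    """Express one category's gap in natural, monetized, and health units."""
    gap = osl - current
    return {
        "gap_usd": gap,                                   # natural units
        "gap_per_person_usd": gap / population,           # per-child gap
        "npv_return_usd": gap * roi,                      # monetized at ROI
        # Level 3 -> Level 2 conversion for each additional $1B spent:
        "qaly_value_per_billion_usd": qalys_per_billion * usd_per_qaly,
    }


report = multi_unit_report(
    current=50e9, osl=70e9, population=24e6, roi=4, qalys_per_billion=8_000,
)
for key, value in report.items():
    print(f"{key}: {value:,.0f}")
```

This reproduces the ~$833 per-child gap and the ~$80B NPV figure. The sketch reports QALY value per additional $1B rather than for the whole gap, because multiplying by the full gap would assume constant returns.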

59.13 Quality Requirements and Validation

59.13.1 Minimum Thresholds for OBG Estimation

| Criterion | Minimum | Rationale |
|---|---|---|
| Reference countries | 5+ | Avoid outlier bias |
| Dose-response studies | 3+ | Identify diminishing returns |
| Causal effect estimates | 2+ | Cross-validate |
| Data recency | Within 10 years | Relevance |
| BIS for reallocation | > 0.40 | Sufficient confidence |
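These thresholds can be enforced as a gate before any OSL estimate is published. The sketch below is one possible encoding; the `EvidenceBase` fields and `failed_criteria` function are hypothetical names, and "within 10 years" is read as the age of the newest data point.

```python
from dataclasses import dataclass


@dataclass
class EvidenceBase:
    reference_countries: int
    dose_response_studies: int
    causal_estimates: int
    newest_data_age_years: float
    bis: float


def failed_criteria(evidence: EvidenceBase) -> list:
    """Return the names of any minimum thresholds the evidence fails."""
    checks = {
        "reference countries (5+)": evidence.reference_countries >= 5,
        "dose-response studies (3+)": evidence.dose_response_studies >= 3,
        "causal effect estimates (2+)": evidence.causal_estimates >= 2,
        "data recency (within 10 years)": evidence.newest_data_age_years <= 10,
        "BIS for reallocation (> 0.40)": evidence.bis > 0.40,
    }
    return [name for name, passed in checks.items() if not passed]


strong = EvidenceBase(6, 3, 2, 4, 0.55)
weak = EvidenceBase(4, 3, 1, 12, 0.40)
print(failed_criteria(strong))
print(failed_criteria(weak))
```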

59.13.2 Robustness Checks

For each OSL estimate, report:

  1. Leave-one-country-out: Does excluding any single country change OSL by >20%?
  2. Method comparison: Do diminishing returns and cost-effectiveness methods agree?
  3. Time stability: Has OSL changed substantially over past 5 years?
  4. Sensitivity to assumptions: How does OSL change with ±20% parameter variation?
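The leave-one-country-out check (item 1) can be sketched generically. The estimator is passed in as a function because the actual OSL fit is defined elsewhere in the specification; the mean and median used below are stand-ins chosen only to show the mechanics, and the country data are hypothetical.

```python
from statistics import mean, median


def leave_one_out_flags(spending_by_country, estimator, tolerance=0.20):
    """Return {country: relative change} where dropping that country
    moves the estimate by more than `tolerance` (default 20%)."""
    baseline = estimator(list(spending_by_country.values()))
    flagged = {}
    for country in spending_by_country:
        rest = [v for c, v in spending_by_country.items() if c != country]
        change = estimator(rest) / baseline - 1
        if abs(change) > tolerance:
            flagged[country] = change
    return flagged


# Hypothetical spending levels (% of GDP) with one outlier, country "E".
data = {"A": 4.0, "B": 5.0, "C": 5.5, "D": 6.0, "E": 20.0}

print(leave_one_out_flags(data, mean))     # outlier moves the mean
print(leave_one_out_flags(data, median))   # median is unaffected
```

With these inputs the mean-based stand-in is flagged for country E only, while the median-based stand-in passes, which illustrates why an outlier-robust estimator makes this check easier to satisfy.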

Four ways to check if your answer is wrong. Look for weird numbers. Make sure you did the same thing every time. Check if it changes when time passes. Wiggle the inputs and see if it explodes.

59.14 Interpreting Results

59.14.2 What the Algorithm Cannot Tell You

| Factor | OBG Captures | OBG Does Not Capture |
|---|---|---|
| Evidence-optimal spending level | Yes | |
| Confidence in estimates | Yes | |
| Direction of reallocation | Yes | |
| Political feasibility | | No |
| Implementation capacity | | No |
| Transition costs | | No |
| Distributional effects | | No |
| Novel interventions | | No |

OBG provides evidence-based targets. Political judgment is still required for implementation strategy.

59.15 Pilot Program Prioritization

59.15.1 Value of Information for Uncertain Categories

Categories with low BIS but potentially high returns warrant research investment:

\[ \text{VOI}_i = \text{Potential Gap}_i \times (1 - \text{BIS}_i) \times P(\text{high return}) \]

High-VOI categories should receive pilot funding to generate better evidence.
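A short sketch of the VOI ranking follows. The candidate categories and their numbers are hypothetical placeholders, as is the function name `value_of_information`.

```python
def value_of_information(potential_gap, bis, p_high_return):
    """VOI_i = Potential Gap_i * (1 - BIS_i) * P(high return)."""
    if not (0.0 <= bis <= 1.0 and 0.0 <= p_high_return <= 1.0):
        raise ValueError("bis and p_high_return must lie in [0, 1]")
    return potential_gap * (1.0 - bis) * p_high_return


# Hypothetical candidates: (potential gap in $B, BIS, P(high return)).
candidates = {
    "Category X": (10.0, 0.30, 0.50),
    "Category Y": (40.0, 0.85, 0.60),
    "Category Z": (5.0, 0.10, 0.20),
}

ranked = sorted(
    candidates,
    key=lambda name: value_of_information(*candidates[name]),
    reverse=True,
)
print(ranked)
```

Note how Category Y, despite the largest potential gap, does not dominate: its high BIS means additional research adds relatively little information.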

59.15.3 Learning Feedback Loop

A circle with four boxes in it. The boxes say: spend money, see what happened, learn from it, spend money differently. Then you go around the circle again. It’s like learning from your mistakes, but on purpose.

After each budget cycle:

  1. Measure outcomes: Statistical agencies report welfare changes
  2. Update estimates: New data refines OSL estimates
  3. Recalculate priorities: Gaps and BIS scores updated
  4. Reallocate: Next cycle reflects improved evidence

59.16 Data Sources

59.16.1 Cross-Country Databases

International organizations maintain standardized cross-country spending and outcome data essential for diminishing returns analysis. The OECD provides the most comprehensive harmonized data for high-income countries [98].

| Database | Coverage | URL | Use Case |
|---|---|---|---|
| OECD iLibrary | 38 OECD members | oecd-ilibrary.org [154] | Education, health, social spending |
| World Bank WDI | 217 countries | data.worldbank.org [155] | Broad spending and outcomes |
| SIPRI | Global | sipri.org [156] | Military spending |
| WHO GHED | 194 countries | who.int/data/gho [157] | Health expenditure |
| UNESCO UIS | Global | uis.unesco.org [158] | Education spending |

59.16.2 Cost-Effectiveness Databases

| Database | Coverage | URL | Use Case |
|---|---|---|---|
| CEA Registry | 8,000+ analyses | cearegistry.org [159] | Health cost-effectiveness |
| Disease Control Priorities | LMICs | dcp-3.org [160] | Global health priorities |
| Cochrane Library | 8,000+ reviews | cochranelibrary.com [161] | Health intervention effects |
| Copenhagen Consensus | Development | copenhagenconsensus.com [162] | Development priorities |

These databases enable systematic ranking of interventions by cost-effectiveness. For example, deworming programs consistently rank among the most cost-effective health interventions, with costs as low as $30-50 per DALY averted [19].

59.16.3 US Budget Data

| Source | Coverage | URL | Use Case |
|---|---|---|---|
| OMB Historical Tables | 1789-present | whitehouse.gov/omb [163] | Federal spending |
| CBO Budget Analyses | Federal | cbo.gov [164] | Fiscal impact scoring [142] |
| USASpending | Federal awards | usaspending.gov [165] | Program-level detail |
| Census of Governments | State & local | census.gov [166] | Subnational spending |

59.17 Limitations

59.17.1 Diminishing Returns Uncertainty

  • Functional form: True relationship may not match assumed function
  • Extrapolation: Estimating returns outside observed spending range
  • Interaction effects: Returns may depend on other spending categories

Mitigation: Report confidence intervals, use multiple functional forms, acknowledge extrapolation limits.

59.17.2 Implementation Capacity

Higher spending may not translate to outcomes if implementation capacity is lacking.

Money goes through two filters before it becomes results. The filters are called ‘make sure you can actually do this’ and ‘do it slowly so you don’t mess up.’ Filters for money that don’t involve coffee.

Mitigation: Pair spending increases with implementation assessment; phase in gradually.

59.18 Benchmarking Framework

This section defines how OBG should be benchmarked against historical and prospective budget outcomes.

59.18.1 Retrospective Validation

Question: Did jurisdictions that moved toward OSL achieve better outcomes than those that diverged?

Three steps to check if you were right. Step 1: go back in time (mathematically). Step 2: pretend you did what you should have done. Step 3: compare it to what actually happened. Like replaying a football match in your head where you win.

Method:

  1. Compute OSL for past periods using only data available at that time (to avoid lookahead bias)
  2. Identify jurisdictions that moved toward/away from OSL
  3. Compare subsequent outcomes using difference-in-differences or synthetic control methods [167]

Example: US State Education Spending 2000-2015

A retrospective analysis can examine whether states that moved toward education OSL (estimated from high-performing states like Massachusetts and Minnesota) subsequently showed improved test scores and graduation rates relative to states that diverged.
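The simplest form of the comparison in step 3 is a two-group, two-period difference-in-differences. The sketch below uses hypothetical graduation rates for illustration only; a real analysis would add controls, clustered standard errors, and a check of the parallel-trends assumption on which the estimate depends.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post,
                              control_pre, control_post):
    """(change in treated group) minus (change in control group)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical graduation rates (%), before and after the spending change.
moved_toward_osl_pre = [78.0, 80.0, 82.0]
moved_toward_osl_post = [83.0, 85.0, 87.0]
diverged_pre = [79.0, 81.0]
diverged_post = [80.0, 82.0]

effect = difference_in_differences(
    moved_toward_osl_pre, moved_toward_osl_post, diverged_pre, diverged_post,
)
print(f"DiD estimate: {effect:+.1f} percentage points")
```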

Challenges:

  • Confounding from simultaneous policy changes
  • Limited variation in spending changes within countries
  • Outcome measurement lags (education effects take years to materialize)

59.18.2 Prospective Validation

Question: Do OBG-guided reallocations improve outcomes going forward?

How to prove you’re not making things up. Write down your prediction before it happens. Tell everyone. Wait. Check if you were right. It’s the scientific method for not lying to yourself.

Method:

  1. Pre-register OBG predictions publicly before budget decisions
  2. Monitor jurisdictions that adopt OBG guidance vs. those that don’t
  3. Compare outcome trajectories using appropriate causal identification

Implementation: Publish annual OSL estimates for US federal budget categories, creating a public record for prospective benchmarking. If jurisdictions that adopt OBG guidance systematically outperform those that do not, the framework is doing its job.

59.18.3 Success Metrics

| Metric | Definition | Target | Interpretation |
|---|---|---|---|
| Gap reduction | Did spending move toward OSL? | > 50% of gap closed in 10 years | Tests political feasibility |
| Outcome improvement | Did welfare metrics improve more in OBG-following jurisdictions? | > 10% relative improvement | Tests welfare prediction accuracy |
| Prediction accuracy | Did estimated returns match actual returns? | Correlation r > 0.5 | Tests underlying model |
| Cross-method consistency | Do diminishing returns and cost-effectiveness methods converge? | Agreement within 30% | Tests methodological robustness |
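The four targets can be evaluated mechanically once the inputs are observed. The sketch below makes several labeled assumptions: "agreement within 30%" is read as relative difference against the mean of the two estimates, "relative improvement" as the ratio of outcome gains, and all example numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def success_metrics(gap_start, gap_end, adopter_gain, other_gain,
                    predicted, actual, osl_dr, osl_cea):
    """Return pass/fail for each target in the success-metrics table."""
    share_closed = 1 - abs(gap_end) / abs(gap_start)
    relative_gain = adopter_gain / other_gain - 1
    disagreement = abs(osl_dr - osl_cea) / ((osl_dr + osl_cea) / 2)
    return {
        "gap_reduction": share_closed > 0.50,
        "outcome_improvement": relative_gain > 0.10,
        "prediction_accuracy": pearson_r(predicted, actual) > 0.5,
        "cross_method_consistency": disagreement <= 0.30,
    }


result = success_metrics(
    gap_start=27.0, gap_end=10.0,            # $B gap at start and after 10 years
    adopter_gain=2.3, other_gain=2.0,        # outcome gains by group
    predicted=[4.0, 3.0, 2.0, 1.0], actual=[3.5, 3.2, 1.0, 1.5],
    osl_dr=100.0, osl_cea=120.0,             # OSL from the two methods
)
print(result)
```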

59.18.4 Evidence Base and Benchmarking

OBG rests on the underlying studies cited throughout (e.g., [147] for education, [151] for vaccinations) plus a continuous benchmarking program:

  1. Data collection: Longitudinal spending and outcome data across jurisdictions
  2. Historical OSL estimation: Compute past OSL using only contemporaneously available data
  3. Causal analysis: Identify spending → outcome effects with strong quasi-experimental designs
  4. Publication: Publish pre-registered benchmark studies and annual scorecards

59.19 Sensitivity Analysis

59.19.1 Parameter Sensitivity

| Parameter | Default | Test Range | Impact on OSL |
|---|---|---|---|
| Country data set | All OECD | OECD + G20, High-income only | ±15% |
| Discount rate | 5% | 3-7% | ±20% |
| BIS confidence threshold | 0.40 | 0.30-0.60 | Category inclusion |
| Recency decay rate | 0.03/year | 0.01-0.05 | Estimate weights |
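To make the recency decay range concrete, the sketch below evaluates the exponential recency weight across the tested decay rates. The helper names `recency_weight` and `half_life_years` are hypothetical.

```python
import math


def recency_weight(age_years, decay=0.03):
    """w_R = exp(-decay * age), with age = t_now - t_j in years."""
    return math.exp(-decay * age_years)


def half_life_years(decay):
    """Study age at which the recency weight falls to 0.5."""
    return math.log(2) / decay


for decay in (0.01, 0.03, 0.05):
    weights = [round(recency_weight(age, decay), 2) for age in (0, 10, 20)]
    print(f"decay={decay:.2f}  weights at 0/10/20 years={weights}  "
          f"half-life={half_life_years(decay):.1f} years")
```

At the default rate a study loses half its weight after about 23 years; across the test range the half-life spans roughly 14 to 69 years, so this parameter mainly changes how much older studies count.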

59.19.2 Scenario Analysis

  • Optimistic scenario: All uncertain categories have high returns
  • Pessimistic scenario: Uncertain categories have low/zero returns
  • Base case: Use point estimates

Report OSL range across scenarios for policy guidance.

Three answers to the same question. The pessimistic one assumes everything will go wrong. The optimistic one assumes everything will go right. The base case assumes you’ll be disappointed but not surprised.

59.20 Conclusion

The Optimal Budget Generator framework provides a systematic, evidence-based approach to budget allocation. Unlike marginal-return frameworks that can justify infinite spending on high-return categories, OBG recognizes that every category has an optimal level, like the Recommended Dietary Allowance for nutrients.

The framework answers three questions:

  1. What is the target? OBG provides evidence-based spending levels for each category
  2. How far are we? Gap analysis shows where current spending diverges from optimal
  3. How confident are we? BIS scores evidence quality so policymakers know which OSL estimates are reliable

Even with imperfect evidence, systematically moving from severe misallocation (in the illustrative figures, military roughly 85% above OSL and vaccinations roughly 77% below OSL) toward evidence-based targets should produce substantially larger welfare gains than current lobbying-driven allocation achieves.

Acknowledgments

The author thanks seminar participants and anonymous reviewers for helpful comments and suggestions. All errors remain the author’s own.

59.21 References

59.22 Appendix A: Analysis Workflow

59.22.1 Complete OBG Analysis Pipeline

+-------------------------------------------------------------+
|                    OBG ANALYSIS WORKFLOW                      |
+-------------------------------------------------------------+

Phase 1: DATA COLLECTION
-------------------------
1. Budget data ingestion
   +-- Pull current spending by category (OMB, USASpending)
   +-- Normalize categories to standard taxonomy
   +-- Identify subcategories for detailed analysis
   +-- Flag data quality issues

2. Cross-country spending data
   +-- Pull spending data from OECD, World Bank
   +-- Include all comparable countries
   +-- Normalize to per-capita and % GDP
   +-- Prepare for regression analysis

3. Effect estimate data
   +-- Search systematic reviews and meta-analyses
   +-- Extract effect sizes with standard errors
   +-- Code study quality (RCT, natural experiment, etc.)
   +-- Build literature database by category

Phase 2: OSL ESTIMATION
-----------------------
4. Diminishing returns modeling
   +-- Fit nonlinear spending-outcome functions
   +-- Identify "knee" of curve
   +-- Calculate marginal returns at current spending
   +-- Estimate optimal level

5. Cost-effectiveness analysis (health/life-saving)
   +-- Identify interventions below CE threshold
   +-- Calculate scale-up costs
   +-- Sum to category OSL
   +-- Document assumptions

6. Method reconciliation
   +-- Compare OSL estimates across methods
   +-- Weight by method reliability
   +-- Produce consensus OSL estimate
   +-- Flag discrepancies

Phase 3: EVIDENCE QUALITY
-------------------------
7. BIS calculation
   +-- Compute quality weights per study
   +-- Compute precision weights
   +-- Compute recency weights
   +-- Aggregate to category BIS

8. Evidence grading
   +-- Assign A-F grade based on BIS
   +-- Document key evidence
   +-- Identify research gaps
   +-- Flag high-uncertainty categories

Phase 4: GAP ANALYSIS
---------------------
9. Compute gaps
    +-- Gap = OSL - Current
    +-- Calculate % gap
    +-- Classify as under/over/optimal
    +-- Apply BIS weighting

10. Priority ranking
    +-- Priority = |Gap| × BIS
    +-- Rank categories
    +-- Identify reallocation pairs
    +-- Estimate welfare gains

Phase 5: OUTPUT GENERATION
--------------------------
11. Multi-unit reporting
    +-- Natural units ($/capita, % GDP)
    +-- Monetized (ROI, opportunity cost)
    +-- Health units (QALYs where applicable)
    +-- Composite (BIS, evidence grade)

12. Sensitivity analysis
    +-- Vary key parameters
    +-- Test country data subsets
    +-- Report OSL ranges
    +-- Identify robust conclusions

13. Documentation
    +-- Generate category reports
    +-- Create methodology audit trail
    +-- Version control estimates
    +-- Publish to dashboard/API

59.23 Appendix B: Glossary

59.23.1 Core Concepts

  • Optimal Budget Generator (OBG): The framework/methodology for generating integrated budget recommendations based on evidence of spending-outcome relationships. OBG accounts for the zero-sum nature of budget allocation and produces Optimal Spending Level (OSL) estimates for each category.

You put evidence and comparison data into the machine. The machine tells you two things: how much you should spend, and how sure you should be. Then you notice you’re spending the wrong amount.

  • Optimal Spending Level (OSL): The evidence-based target spending level for each category, produced by the OBG framework. \(\text{OSL}_i\) represents the optimal spending level for category \(i\). Spending below OSL indicates underinvestment; spending above OSL indicates overinvestment, where marginal returns have fallen below opportunity cost.

  • Budget Impact Score (BIS): A 0-1 score measuring confidence in each category’s OSL estimate based on the quality and quantity of causal evidence. Higher BIS indicates more reliable OSL recommendations.

  • Spending Gap: The evidence-based target (OSL) minus current spending for each category. Positive gaps indicate underinvestment; negative gaps indicate overinvestment.

  • Diminishing Returns: The economic principle that marginal returns to spending decrease as spending increases. The optimal level is where marginal return equals opportunity cost.

59.23.2 Estimation Methods

  • Cost-Effectiveness Threshold: The maximum acceptable cost per QALY (or other health outcome) for including an intervention in target calculations. Typically $50K-$150K per QALY.

  • Dose-Response Curve: The relationship between spending level (dose) and outcome (response). Used to identify diminishing returns and estimate optimal spending levels.

59.23.3 Evidence Quality

  • Quality Weight (\(w^Q\)): Weight assigned to a study based on identification strategy. RCTs receive 1.0; cross-sectional studies receive 0.25.

  • Precision Weight (\(w^P\)): Weight assigned based on standard error. More precise estimates receive higher weight.

  • Recency Weight (\(w^R\)): Weight assigned based on publication date. More recent studies receive higher weight via exponential decay.

  • Evidence Grade: Letter grade (A-F) summarizing confidence in each category’s target estimate. A = strong evidence; F = insufficient evidence.

59.23.4 Output Concepts

  • Priority Score: Product of gap magnitude and BIS. Used to rank categories for reallocation priority.

  • Value of Information (VOI): Expected benefit of additional research on uncertain categories. High-VOI categories warrant pilot funding.

  • Multi-Unit Reporting: Presenting results in natural units, monetized equivalents, health units, and composite scores for interpretability.

59.24 Appendix C: Illustrative Comparison to US Budget

This appendix applies the OBG methodology to the US discretionary budget as an illustrative exercise. “Current” figures reflect approximate FY2024 budget authority. OSL estimates for pragmatic trials, vaccinations, and K-12 education are derived from the worked examples in Sections 5-7. Other OSL values integrate cross-country benchmarking and published cost-effectiveness evidence. BIS scores summarize the strength of the causal evidence.

59.24.1 Illustrative US Discretionary Budget vs. OSL Targets

| Category | Current ($B) | OSL ($B) | Gap ($B) | Gap % | BIS | Inc | Hlth | Priority |
|---|---|---|---|---|---|---|---|---|
| Military (discretionary) | 850 | 459 | -391 | -46% | 0.50 | | | 195 |
| Non-military discretionary | 915 | 1,350 | +435 | +48% | 0.65 | ++ | ++ | 283 |
| - Pragmatic clinical trials | 0.5 | 50 | +49.5 | +9,900% | 0.90 | ++ | +++ | 44.6 |
| - Education | 80 | 120 | +40 | +50% | 0.75 | +++ | + | 30 |
| - Health (research) | 50 | 100 | +50 | +100% | 0.80 | + | +++ | 40 |
| - Vaccinations | 8 | 35 | +27 | +338% | 0.95 | + | +++ | 26 |
| - Basic research | 45 | 90 | +45 | +100% | 0.70 | ++ | ++ | 32 |
| - Infrastructure | 100 | 150 | +50 | +50% | 0.60 | ++ | + | 30 |
| - Early childhood | 50 | 70 | +20 | +40% | 0.85 | +++ | + | 17 |
| Agricultural subsidies | 25 | 0 | -25 | -100% | 0.90 | | | 23 |

Inc = effect on real after-tax median income growth. Hlth = effect on median healthy life years. Scale: +++ strong positive, ++ moderate, + weak, − negative.

Key findings:

  1. Extreme underinvestment in pragmatic trials: Current spending is 1% of OSL (a gap equal to 9,900% of current spending) with a 637:1 benefit-cost ratio (BCR); this is the single largest misallocation in the federal budget (see Section 6 for full derivation)
  2. Overinvestment in military spending: Military spending is ~$391B above OSL estimates based on cross-country benchmarking (46% of current spending, or roughly 85% above OSL)
  3. Underinvestment in research and prevention: Vaccinations, basic research, and health research sit far below evidence-optimal levels
  4. Negative-return spending: Agricultural subsidies produce negative welfare effects per the cost-effectiveness literature
  5. Reallocation potential: The direction of reallocation runs from military and subsidies toward research, health, and education

Corresponding Author: Mike P. Sinn, Decentralized Institutes of Health ([email protected])

Conflicts of Interest: The author declares no conflicts of interest.

Funding: This work received no external funding.

Data Availability: All data sources referenced in this paper are publicly available: OECD iLibrary (education, health spending), World Bank WDI (cross-country indicators), SIPRI Military Expenditure Database (defense spending), and CDC vaccination cost data. URLs are provided in the Data Sources section. A complete replication package including analysis code, data extraction scripts, and worked example calculations will be deposited in a public repository (GitHub/Zenodo) upon publication.

Ethics Statement: This is a methodological specification. No human subjects research was conducted.