Bandit Scoring for Task Prioritization

A practical heuristic for allocating creative attention across competing projects, inspired by multi-armed bandit theory. This is not a formal reinforcement learning implementation—it's a pragmatic decision framework that balances exploitation of proven ideas with exploration of new concepts.

2025-12-23 - 10 min read

What is the Bandit Score?

The Bandit Score answers one question: “Which task should I work on next when I have multiple options?”

It balances two competing goals:

  • Exploitation: Focus on work with proven high value
  • Exploration: Try new approaches that might be hidden gems

The ε-Greedy Strategy

A useful starting point:

  • 90% of the time: Pick the highest Bandit Score (exploit known winners)
  • 10% of the time: Deliberately work on something with the “Exploration” flag (explore new territory)

Adjust based on your workload and risk tolerance. This helps prevent teams from over-optimizing for short-term wins while giving strategic bets a better chance of getting tested.
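
The 90/10 split above can be sketched in a few lines of Python. This is a minimal illustration, not a formal bandit algorithm: the function name `pick_next_task` and the dictionary keys `name`, `score`, and `is_exploration` are assumptions made for this sketch, and `epsilon` is the share of picks reserved for exploration.

```python
import random

def pick_next_task(tasks, epsilon=0.10, rng=random):
    """Explore with probability epsilon, otherwise exploit.

    `tasks` is a list of dicts with hypothetical keys
    "name", "score", and "is_exploration".
    """
    exploration_pool = [t for t in tasks if t["is_exploration"]]
    # Explore: pick a random task carrying the Exploration flag, if any exist.
    if exploration_pool and rng.random() < epsilon:
        return rng.choice(exploration_pool)
    # Exploit: pick the task with the highest Bandit Score.
    return max(tasks, key=lambda t: t["score"])

tasks = [
    {"name": "Client commission inquiry", "score": 18, "is_exploration": False},
    {"name": "Framework whitepaper", "score": 6, "is_exploration": True},
]
next_task = pick_next_task(tasks)
```

Raising `epsilon` to 0.20 gives strategic bets more airtime; setting it to 0 always picks the top score.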

[Figure: Bandit scoring visualization showing the balance between exploration and exploitation]
Fig 1. How bandit indices balance recent success with uncertainty

When to Score Tasks

On Task Creation

Set initial estimates based on best available information. It’s okay to guess—scores evolve.

After Task Completion

Update retrospectively based on actual outcomes. This calibrates the system over time.

During Weekly Review

Adjust when new information emerges:

  • Client showed unexpected enthusiasm → increase Engagement
  • Revenue path became clearer → increase Revenue Potential
  • Tried it and learned something → mark as no longer Exploration

The Five Scoring Signals

1. Revenue Potential (0-3 scale)

Question: Could this task generate direct or indirect revenue?

Score | Definition | Examples
0 = None | Pure R&D, no revenue path visible | Internal process docs, personal learning, speculative research
1 = Low | Indirect revenue (brand building → future sales) | Instagram Stories, Substack essays, brand development
2 = Medium | Direct but uncertain revenue path | Pilot commission (unconfirmed), experimental offering, untested price point
3 = High | Confirmed or highly probable revenue | Paid client deposit, qualified lead with budget, proven offering

Examples from practice:

  • Client commission inquiry → 3 (confirmed interest, potential £3,500)
  • Multi-Clock Whitepaper → 0 (no direct revenue, IP development)
  • DM outreach to supercar owners → 1 (indirect, may generate future leads)

2. Portfolio Value (0-3 scale)

Question: Does this build long-term capability, brand equity, or showcase value?

Score | Definition | Examples
0 = None | One-off task, no reuse or showcase value | Administrative cleanup, one-time email response
1 = Low | Minor reusable asset or small improvement | Template creation, minor process tweak, single social post
2 = Medium | Significant capability or quality showcase work | MIR system implementation, pilot commission, marketing campaign
3 = High | Foundational IP or flagship work that defines brand | Core methodology documentation, first gallery exhibition, signature technique

Examples:

  • Pilot Commission → 2 (first Survivor, proves concept, showcase work)
  • Multi-Clock Whitepaper → 3 (foundational IP, publishable research)
  • Email template updates → 1 (useful but minor asset)

3. Engagement Signal (0-3 scale)

Question: Has this received external validation or interest?

Score | Definition | Examples
0 = None | No external feedback yet | Brand new idea, not yet shared, internal-only work
1 = Low | Polite interest, minimal engagement | 1-2 likes, “interesting concept” comment, single polite inquiry
2 = Medium | Clear market interest | Multiple DMs/inquiries, decent engagement (20+ likes), shared by others
3 = High | Strong demand or viral signal | Multiple qualified leads, viral post (100+ engagements), waitlist forming

Important: This signal updates over time. Start at 0, increase as feedback accumulates.

4. Exploration Bonus (+2 or 0)

Question: Have we done this type of work before?

Value | Definition | When to Apply
+2 | Exploration: Never tried this before | New market, new format, new medium, new client type
0 | Exploitation: Familiar territory | Similar to past work, proven format, repeat client type

Key Rule: The first time you try something = Exploration. Second time onwards = Exploitation (even if details differ).

Examples:

  • First PAID commission → +2 (new territory, even if pilot existed)
  • First long-form whitepaper → +2 (new format)
  • Second commission → 0 (no longer exploring “paid commission” space)

Why This Matters: The +2 bonus nudges you to try new things by making them competitive with established high-performers.
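
The first-time rule can be tracked with a simple set of work types already attempted. The sketch below is only an illustration: the helper names `exploration_bonus` and `record_attempt`, and the idea of labelling each task with a free-text work type, are assumptions and not part of the system described here.

```python
def exploration_bonus(work_type, tried_types):
    """Return +2 the first time a work type is attempted, 0 afterwards."""
    return 0 if work_type in tried_types else 2

def record_attempt(work_type, tried_types):
    """Mark a work type as tried, so later tasks of that type score 0."""
    tried_types.add(work_type)

# Hypothetical history: a pilot commission exists, but no paid commission yet.
tried = {"pilot commission", "dm outreach"}

first = exploration_bonus("paid commission", tried)   # 2 (new territory)
record_attempt("paid commission", tried)
second = exploration_bonus("paid commission", tried)  # 0 (now familiar)
```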

5. Urgency Multiplier (×1 or ×2)

Question: Is there a hard deadline within 7 days?

Multiplier | Condition | Examples
×2 | Deadline within 7 days | Client response needed, event date, publication deadline
×1 | No deadline or more than 7 days out | Flexible timing, internal deadlines, “whenever” tasks

Applied Last: The multiplier is applied to the entire base score (the sum of the four signals above), so ×2 doubles it and ×1 leaves it unchanged.

Note: Deadlines create artificial urgency. Use sparingly—not everything needs one.
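
As a sketch, the 7-day rule can be derived from a due date. The function name `urgency_multiplier`, the use of `None` for "no deadline", and the choice to treat overdue deadlines as still urgent are assumptions made for this example; the rule itself only covers deadlines within 7 days.

```python
from datetime import date

def urgency_multiplier(due_date, today):
    """Return 2 if a hard deadline falls within 7 days of `today`, else 1."""
    if due_date is None:
        # No deadline at all: never urgent.
        return 1
    days_left = (due_date - today).days
    # Assumption: an overdue deadline (negative days_left) still counts as urgent.
    return 2 if days_left <= 7 else 1

today = date(2025, 12, 23)
within_week = urgency_multiplier(date(2025, 12, 30), today)  # 2
later = urgency_multiplier(date(2025, 12, 31), today)        # 1
```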

Calculating the Final Score

Formula

Bandit Score = (Revenue + Portfolio + Engagement + Exploration) × Urgency
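
A minimal Python version of the formula, assuming the three 0-3 signals are passed as integers and the Exploration flag and urgency as booleans. The function name and parameter names are chosen for this sketch.

```python
def bandit_score(revenue, portfolio, engagement, is_exploration, urgent):
    """(Revenue + Portfolio + Engagement + Exploration) x Urgency."""
    for name, value in (("revenue", revenue),
                        ("portfolio", portfolio),
                        ("engagement", engagement)):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer from 0 to 3")
    exploration_bonus = 2 if is_exploration else 0
    urgency = 2 if urgent else 1
    return (revenue + portfolio + engagement + exploration_bonus) * urgency

highest = bandit_score(3, 3, 3, True, True)     # (3 + 3 + 3 + 2) * 2 = 22
lowest = bandit_score(0, 0, 0, False, False)    # 0
```

The two calls show where the 0-22 range comes from: every signal at its maximum with the ×2 multiplier gives 22.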

Score Ranges & Interpretation

Range | Interpretation | Action
16-22 | Critical priority | Work on immediately (HF today)
12-15 | High priority | Schedule in next 2-3 days (HF this week)
8-11 | Medium priority | Good LF refresh candidate
4-7 | Low priority | Dormant or far-future LF
0-3 | Minimal signal | Consider archiving unless strategic
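
The range table maps directly onto a lookup function. This is a sketch under the assumption that scores stay within 0-22 (the default weights); the function name `priority_band` is hypothetical.

```python
def priority_band(score):
    """Map a Bandit Score (0-22) to its interpretation from the range table."""
    if not 0 <= score <= 22:
        raise ValueError("score must be between 0 and 22")
    if score >= 16:
        return "Critical priority"
    if score >= 12:
        return "High priority"
    if score >= 8:
        return "Medium priority"
    if score >= 4:
        return "Low priority"
    return "Minimal signal"

band = priority_band(18)  # "Critical priority"
```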

Worked Examples

Example 1: Client Commission Inquiry

Context: Client inquiry received, wants commission for their supercar, asks about pricing.

Scoring:

  • Revenue Potential: 3 (confirmed interest, £3,500 if closes)
  • Portfolio Value: 2 (showcase work, builds client portfolio)
  • Engagement Signal: 2 (referenced specific details = engaged)
  • Exploration Bonus: +2 (first paid commission, new client type)
  • Urgency Multiplier: ×2 (respond same-day to maintain momentum)

Calculation: (3 + 2 + 2 + 2) × 2 = 18

Decision: CRITICAL PRIORITY (18 falls in the 16-22 band) → Draft response immediately, create follow-up tasks in MIR.

Example 2: Framework Whitepaper

Context: Research project documenting the operational framework.

Scoring:

  • Revenue Potential: 0 (no direct revenue, IP development)
  • Portfolio Value: 3 (foundational IP, publishable)
  • Engagement Signal: 1 (internal validation, not yet public)
  • Exploration Bonus: +2 (first long-form whitepaper format)
  • Urgency Multiplier: ×1 (no hard deadline)

Calculation: (0 + 3 + 1 + 2) × 1 = 6

Decision: LOW PRIORITY (6 falls in the 4-7 band) → Treat as LF work rather than HF. Schedule 2-3 hour blocks weekly.

Example 3: Behind-the-Scenes Content

Context: Post behind-the-scenes footage from a shoot.

Scoring:

  • Revenue Potential: 0 (brand awareness, no direct revenue)
  • Portfolio Value: 1 (content library, minor asset)
  • Engagement Signal: 2 (proven format, consistent 30-50 views)
  • Exploration Bonus: 0 (familiar format, done many times)
  • Urgency Multiplier: ×1 (flexible timing)

Calculation: (0 + 1 + 2 + 0) × 1 = 3

Decision: MINIMAL SIGNAL (3 falls in the 0-3 band) → Fill-in work when you have 15-20 min gaps. Don’t prioritize over higher-scoring tasks.

Example 4: Outreach Target List

Context: Build spreadsheet of potential clients to DM.

Scoring:

  • Revenue Potential: 1 (indirect, may generate future leads)
  • Portfolio Value: 1 (reusable list, but not showcase work)
  • Engagement Signal: 0 (not yet executed, no validation)
  • Exploration Bonus: 0 (DM outreach is familiar tactic)
  • Urgency Multiplier: ×2 (part of 30-day sprint, due soon)

Calculation: (1 + 1 + 0 + 0) × 2 = 4

Decision: LOW PRIORITY (4 falls in the 4-7 band) → Must do because of the deadline, but intrinsically low-signal. Schedule a focused block, get it done, move on.

[Figure: Example bandit score calculation]
Fig 2. Worked example showing how signals combine into final score

Using Bandit Scores in Practice

Daily HF Work: “HF Priority Queue”

Filter: Clock = HF, Status ≠ Done
Sort: Bandit Score DESC, then Next Trigger ASC
Decision Rule: Pick highest score, unless mid-burst (finish-to-switch)
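
The filter and sort for this view can be sketched as follows. The dictionary keys (`clock`, `status`, `score`, `next_trigger`) are assumptions standing in for the database properties, and Next Trigger is assumed to be a date. The finish-to-switch exception is a judgment call and is not modelled here.

```python
from datetime import date

def hf_priority_queue(tasks):
    """Open HF tasks, highest Bandit Score first, earliest Next Trigger breaking ties."""
    open_hf = [t for t in tasks if t["clock"] == "HF" and t["status"] != "Done"]
    return sorted(open_hf, key=lambda t: (-t["score"], t["next_trigger"]))

tasks = [
    {"name": "Outreach list", "clock": "HF", "status": "Todo",
     "score": 4, "next_trigger": date(2025, 12, 24)},
    {"name": "Client reply", "clock": "HF", "status": "Todo",
     "score": 18, "next_trigger": date(2025, 12, 26)},
    {"name": "Follow-up quote", "clock": "HF", "status": "Todo",
     "score": 18, "next_trigger": date(2025, 12, 24)},
    {"name": "Sent invoice", "clock": "HF", "status": "Done",
     "score": 20, "next_trigger": date(2025, 12, 23)},
    {"name": "Whitepaper", "clock": "LF", "status": "Todo",
     "score": 6, "next_trigger": date(2025, 12, 30)},
]
queue = [t["name"] for t in hf_priority_queue(tasks)]
# ["Follow-up quote", "Client reply", "Outreach list"]
```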

Weekly LF Refresh: “LF Candidates”

Filter: Clock = LF
Sort: Bandit Score DESC
Decision Rule:

  • Select 1× highest Bandit Score (exploit)
  • Select 1× random “Is Exploration” (explore)
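
A sketch of the two-pick rule, using hypothetical dictionary keys for the task properties. One assumption goes beyond the rule as written: the exploit pick is excluded from the exploration draw so the two selections are always different tasks.

```python
import random

def lf_refresh_picks(tasks, rng=random):
    """Return (exploit_pick, explore_pick) from the LF pool; either may be None."""
    lf = [t for t in tasks if t["clock"] == "LF"]
    if not lf:
        return None, None
    # Exploit: the single highest Bandit Score among LF tasks.
    exploit = max(lf, key=lambda t: t["score"])
    # Explore: a random "Is Exploration" task (assumption: not the exploit pick).
    pool = [t for t in lf if t["is_exploration"] and t is not exploit]
    explore = rng.choice(pool) if pool else None
    return exploit, explore

tasks = [
    {"name": "Marketing campaign", "clock": "LF", "score": 9, "is_exploration": False},
    {"name": "Whitepaper", "clock": "LF", "score": 6, "is_exploration": True},
    {"name": "Client reply", "clock": "HF", "score": 18, "is_exploration": True},
]
exploit, explore = lf_refresh_picks(tasks)
```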

Fortnightly Dormant Review

Filter: Clock = Dormant, Age Points above 15
Decision Rule: Promote if Bandit Score above 5, otherwise archive or extend dormancy
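
The dormant review rule, sketched with the thresholds stated above (Age Points above 15, Bandit Score above 5). The keys and the function name are assumptions for illustration; choosing between archiving and extending dormancy stays a manual decision.

```python
def dormant_review(tasks, age_threshold=15, score_threshold=5):
    """Split aged Dormant tasks into (promote, archive_or_extend) name lists."""
    promote, archive_or_extend = [], []
    for t in tasks:
        # Only Dormant tasks whose Age Points exceed the threshold are reviewed.
        if t["clock"] != "Dormant" or t["age_points"] <= age_threshold:
            continue
        if t["score"] > score_threshold:
            promote.append(t["name"])
        else:
            archive_or_extend.append(t["name"])
    return promote, archive_or_extend

tasks = [
    {"name": "Gallery pitch", "clock": "Dormant", "age_points": 20, "score": 7},
    {"name": "Old template", "clock": "Dormant", "age_points": 18, "score": 2},
    {"name": "Fresh idea", "clock": "Dormant", "age_points": 5, "score": 9},
]
promote, archive_or_extend = dormant_review(tasks)
```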

Exploration Budget Management

Target

10-20% of active WIP should have “Is Exploration” ✓

Weekly Check

Count tasks:

  • Total active (HF + LF): Example = 15
  • Exploration tasks (✓): Example = 3
  • Exploration %: 3/15 = 20% ✅

If Under 10% Exploration

Action: Force-promote a Dormant exploration item to LF or HF, even if Bandit Score is medium-low.

Why: Prevents getting stuck in local maxima. Innovation requires trying new things, even when current work is “good enough.”
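
The weekly check is a simple ratio. In this sketch the keys `clock` and `is_exploration` and both function names are assumptions; the 10% floor comes from the target above.

```python
def exploration_share(tasks):
    """Fraction of active (HF + LF) tasks flagged "Is Exploration"."""
    active = [t for t in tasks if t["clock"] in ("HF", "LF")]
    if not active:
        return 0.0
    flagged = sum(1 for t in active if t["is_exploration"])
    return flagged / len(active)

def needs_forced_exploration(tasks, minimum=0.10):
    """True when the exploration share has dropped below the 10% floor."""
    return exploration_share(tasks) < minimum

# The example from the weekly check: 15 active tasks, 3 flagged.
tasks = (
    [{"clock": "HF", "is_exploration": False}] * 7
    + [{"clock": "LF", "is_exploration": False}] * 5
    + [{"clock": "LF", "is_exploration": True}] * 3
    + [{"clock": "Dormant", "is_exploration": True}] * 4  # not counted as active
)
share = exploration_share(tasks)  # 3 / 15 = 0.2
```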

Common Pitfalls & How to Avoid Them

Pitfall 1: Everything Gets High Scores

Problem: If you score too generously, everything looks equally important.

Fix:

  • Use 0 liberally—most tasks have no revenue path
  • Reserve 3s for truly exceptional cases
  • Calibrate by comparing: “Is Task A really more valuable than Task B?”

Pitfall 2: Ignoring Exploration Bonus

Problem: Only working on proven winners, missing new opportunities.

Fix:

  • Track exploration % weekly
  • If under 10%, force yourself to try something new
  • Remember: Today’s exploit was yesterday’s exploration

Pitfall 3: Overusing Urgency Multiplier

Problem: Giving everything ×2 defeats the purpose.

Fix:

  • Reserve ×2 for true time-sensitive deadlines (within 7 days)
  • Most internal deadlines are flexible—be honest
  • If everything is urgent, nothing is urgent

Pitfall 4: Never Updating Scores

Problem: Scores get stale, lose predictive value.

Fix:

  • Update after task completion (retrospective calibration)
  • Adjust when new information emerges
  • Review scoring quality during weekly reviews

Integration with Multi-Clock Mechanics

Aging & Bandit Score

  • Aging increases urgency (promotes old LF/Dormant work)
  • Bandit Score prioritizes within a Clock (which HF task to work on today?)
  • They work together: Aging says “don’t forget me,” Bandit says “work on me next”

WIP Limits & Exploration

  • WIP caps prevent overload
  • Exploration % ensures we don’t just exploit
  • Together: Disciplined focus + strategic experimentation

Kill Criteria

If a task meets all three of the following:

  • Bandit Score = 0 (no signals)
  • Age Points exceed Threshold × 2
  • 2+ reviews without progress

Consider archiving. Low score + high age + no momentum = probably not viable.
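
The kill criteria can be expressed as a single predicate. This sketch assumes all three conditions must hold together, which is how the "low score + high age + no momentum" summary reads; the parameter names are hypothetical.

```python
def should_consider_archiving(score, age_points, age_threshold, stalled_reviews):
    """True when a task has no signal, is well past its age threshold, and has stalled."""
    no_signal = score == 0
    very_old = age_points > age_threshold * 2
    no_momentum = stalled_reviews >= 2
    return no_signal and very_old and no_momentum

flag = should_consider_archiving(score=0, age_points=35, age_threshold=15, stalled_reviews=2)
```

The result is a prompt for review, not an automatic deletion: a task can still be kept if it is strategic.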

Scoring Workflow Checklist

When Creating a Task

  • Assign Revenue Potential (0-3)
  • Assign Portfolio Value (0-3)
  • Assign Engagement Signal (0, initially)
  • Check “Is Exploration” if never tried before
  • Add Due Date if a hard deadline exists (the ×2 multiplier applies once it is within 7 days)
  • System auto-calculates Bandit Score

During Weekly Review

  • Update Engagement Signals based on new feedback
  • Adjust Revenue/Portfolio if new info emerged
  • Remove “Is Exploration” if we’ve now tried it once
  • Check overall exploration % (target 10-20%)

After Task Completion

  • Retrospective scoring: Was Revenue/Portfolio accurate?
  • Document learnings in Notes field
  • Use findings to calibrate future similar tasks

Advanced: Tuning Signal Weights

For use after 4-6 weeks of scoring data

Retrospective Analysis

After completing 20+ scored tasks, analyze:

  1. Did high-Bandit tasks actually deliver high value?
  2. Which signal was most predictive? (Revenue? Portfolio? Engagement?)
  3. Did we explore enough (10-20%)?

Potential Adjustments

If Revenue consistently dominates, consider:

  • Weighting Portfolio Value ×1.5 (encourages long-term thinking)
  • Increasing Exploration Bonus to +3 (forces more experimentation)
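
To try these adjustments before changing the live formula, the weights can be made parameters. This is a sketch with hypothetical parameter names; the defaults reproduce the standard formula.

```python
def weighted_bandit_score(revenue, portfolio, engagement, is_exploration, urgent,
                          portfolio_weight=1.0, exploration_bonus=2):
    """Bandit Score with a tunable Portfolio weight and Exploration bonus."""
    bonus = exploration_bonus if is_exploration else 0
    base = revenue + portfolio_weight * portfolio + engagement + bonus
    return base * (2 if urgent else 1)

# Framework Whitepaper example: 6 with the default weights.
default = weighted_bandit_score(0, 3, 1, True, False)
# Same task with Portfolio x1.5 and a +3 bonus: 0 + 4.5 + 1 + 3 = 8.5
tuned = weighted_bandit_score(0, 3, 1, True, False,
                              portfolio_weight=1.5, exploration_bonus=3)
# New ceiling with both adjustments: (3 + 4.5 + 3 + 3) * 2 = 27
ceiling = weighted_bandit_score(3, 3, 3, True, True,
                                portfolio_weight=1.5, exploration_bonus=3)
```

Note that both adjustments together raise the maximum score from 22 to 27, so the score ranges would need to be rescaled to match.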

Implementation Notes

Notion Formula (Reference)

The Bandit Score can be implemented as a Notion formula field:

((prop("Revenue Potential") + prop("Portfolio Value") + prop("Engagement Signal") +
  if(prop("Is Exploration"), 2, 0)) *
  if(not empty(prop("Due Date")) and dateBetween(prop("Due Date"), now(), "days") <= 7, 2, 1))

View Configuration

Today View Filter:

Clock = HF AND 
Status ≠ Done AND 
(Bandit Score >= 6 OR Promotion Flag = "Promote" OR Days Until Deadline within 7)

Sort: Bandit Score DESC, Next Trigger ASC

Summary

The Bandit Score helps you decide “What should I work on next?” by scoring tasks 0-22 based on Revenue potential, Portfolio value, Engagement signals, Exploration bonus, and Urgency.

Pick the highest score 90% of the time (exploit proven winners), but deliberately try something new 10% of the time (explore hidden gems). Update scores as you learn.

It’s not perfect, but it can help reduce decision paralysis while keeping you from getting stuck only doing familiar work.


For questions or implementation support, see the MIR Database Schema Reference or consult the Multi-Clock Work whitepaper.