
Bandit Scoring for Task Prioritization

A practical heuristic for allocating creative attention across competing projects, inspired by multi-armed bandit theory. This is not a formal reinforcement learning implementation—it's a pragmatic decision framework that balances exploitation of proven ideas with exploration of new concepts.

2025-12-23 - 10 min read

What is the Bandit Score?

The Bandit Score answers one question: “Which task should I work on next when I have multiple options?”

It balances two competing goals:

Exploitation: Focus on work with proven high value
Exploration: Try new approaches that might be hidden gems

The ε-Greedy Strategy

A useful starting point:

90% of the time: Pick the highest Bandit Score (exploit known winners)
10% of the time: Deliberately work on something with the “Exploration” flag (explore new territory)

Adjust the 90/10 split to suit your workload and risk tolerance. Reserving a fixed share for exploration keeps teams from over-optimizing for short-term wins and gives strategic bets a better chance of being tested.
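The 90/10 rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the MIR system: the `Task` class, the `pick_next` function, and the `epsilon` parameter are hypothetical names, and the sketch assumes the exploration pick is drawn at random from tasks carrying the Exploration flag.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    bandit_score: float
    is_exploration: bool = False


def pick_next(tasks, epsilon=0.10, rng=random):
    """Pick a task: explore with probability epsilon, otherwise exploit."""
    if not tasks:
        raise ValueError("no tasks to choose from")
    explore_pool = [t for t in tasks if t.is_exploration]
    # Explore only when the roll succeeds and at least one task is flagged.
    if explore_pool and rng.random() < epsilon:
        return rng.choice(explore_pool)
    # Otherwise exploit: take the highest Bandit Score.
    return max(tasks, key=lambda t: t.bandit_score)


tasks = [
    Task("Client commission reply", 18),
    Task("Framework whitepaper", 6, is_exploration=True),
    Task("Behind-the-scenes post", 3),
]
print(pick_next(tasks).name)
```

Raising `epsilon` shifts more picks toward flagged tasks; setting it to 0 turns the rule into pure exploitation.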

Fig 1. How bandit indices balance recent success with uncertainty. (Image: bandit scoring visualization showing the balance between exploration and exploitation.)

When to Score Tasks

On Task Creation

Set initial estimates based on best available information. It’s okay to guess—scores evolve.

After Task Completion

Update retrospectively based on actual outcomes. This calibrates the system over time.

During Weekly Review

Adjust when new information emerges:

Client showed unexpected enthusiasm → increase Engagement
Revenue path became clearer → increase Revenue Potential
Tried it and learned something → mark as no longer Exploration

The Five Scoring Signals

1. Revenue Potential (0-3 scale)

Question: Could this task generate direct or indirect revenue?

Score	Definition	Examples
0 = None	Pure R&D, no revenue path visible	Internal process docs, personal learning, speculative research
1 = Low	Indirect revenue (brand building → future sales)	Instagram Stories, Substack essays, brand development
2 = Medium	Direct but uncertain revenue path	Pilot commission (unconfirmed), experimental offering, untested price point
3 = High	Confirmed or highly probable revenue	Paid client deposit, qualified lead with budget, proven offering

Examples from practice:

Client commission inquiry → 3 (confirmed interest, potential £3,500)
Multi-Clock Whitepaper → 0 (no direct revenue, IP development)
DM outreach to supercar owners → 1 (indirect, may generate future leads)

2. Portfolio Value (0-3 scale)

Question: Does this build long-term capability, brand equity, or showcase value?

Score	Definition	Examples
0 = None	One-off task, no reuse or showcase value	Administrative cleanup, one-time email response
1 = Low	Minor reusable asset or small improvement	Template creation, minor process tweak, single social post
2 = Medium	Significant capability or quality showcase work	MIR system implementation, pilot commission, marketing campaign
3 = High	Foundational IP or flagship work that defines brand	Core methodology documentation, first gallery exhibition, signature technique

Examples:

Pilot Commission → 2 (first Survivor, proves concept, showcase work)
Multi-Clock Whitepaper → 3 (foundational IP, publishable research)
Email template updates → 1 (useful but minor asset)

3. Engagement Signal (0-3 scale)

Question: Has this received external validation or interest?

Score	Definition	Examples
0 = None	No external feedback yet	Brand new idea, not yet shared, internal-only work
1 = Low	Polite interest, minimal engagement	1-2 likes, “interesting concept” comment, single polite inquiry
2 = Medium	Clear market interest	Multiple DMs/inquiries, decent engagement (20+ likes), shared by others
3 = High	Strong demand or viral signal	Multiple qualified leads, viral post (100+ engagements), waitlist forming

Important: This signal updates over time. Start at 0, increase as feedback accumulates.

4. Exploration Bonus (+2 or 0)

Question: Have we done this type of work before?

Value	Definition	When to Apply
+2	Exploration: Never tried this before	New market, new format, new medium, new client type
0	Exploitation: Familiar territory	Similar to past work, proven format, repeat client type

Key Rule: The first time you try something = Exploration. Second time onwards = Exploitation (even if details differ).

Examples:

First PAID commission → +2 (new territory, even if pilot existed)
First long-form whitepaper → +2 (new format)
Second commission → 0 (no longer exploring “paid commission” space)

Why This Matters: The +2 bonus nudges you to try new things by making them competitive with established high-performers.
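The key rule (first attempt = Exploration, every later attempt = Exploitation) can be tracked with a simple set of work types already tried. A minimal sketch in Python: `exploration_bonus` and the work-type labels are hypothetical, and deciding what counts as the "same type" of work remains a judgment call.

```python
def exploration_bonus(work_type, tried_before):
    """Return +2 the first time a type of work is attempted, else 0."""
    return 0 if work_type in tried_before else 2


tried = set()

first = exploration_bonus("paid commission", tried)   # never tried: bonus applies
tried.add("paid commission")                          # record the attempt
second = exploration_bonus("paid commission", tried)  # repeat: no bonus

print(first, second)
```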

5. Urgency Multiplier (×1 or ×2)

Question: Is there a hard deadline within 7 days?

Multiplier	Condition	Examples
×2	Deadline within 7 days	Client response needed, event date, publication deadline
×1	No deadline or more than 7 days out	Flexible timing, internal deadlines, “whenever” tasks

Applied Last: The multiplier applies to the entire base score (the sum of the other four signals), so ×2 doubles it and ×1 leaves it unchanged.

Note: Deadlines create artificial urgency. Use sparingly—not everything needs one.
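As a rough Python sketch, the 7-day rule is a date subtraction. The function name `urgency_multiplier` is hypothetical, and the treatment of overdue tasks is an assumption, since the rule above only covers upcoming deadlines.

```python
from datetime import date


def urgency_multiplier(due, today):
    """Return 2 for a hard deadline within 7 days of today, else 1."""
    if due is None:
        return 1  # no deadline set
    days_left = (due - today).days
    # Assumption: overdue deadlines (negative days_left) still count as urgent.
    return 2 if days_left <= 7 else 1


today = date(2025, 1, 1)
print(urgency_multiplier(date(2025, 1, 8), today))  # deadline exactly 7 days out
print(urgency_multiplier(None, today))              # no deadline
```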

Calculating the Final Score

Formula

Bandit Score = (Revenue + Portfolio + Engagement + Exploration) × Urgency
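For anyone scoring outside Notion, the formula translates directly into Python. A minimal sketch: the function name and argument names are illustrative, and the range check simply enforces the 0-3 scales defined above.

```python
def bandit_score(revenue, portfolio, engagement, is_exploration, urgent):
    """(Revenue + Portfolio + Engagement + Exploration) x Urgency."""
    for name, value in (("revenue", revenue),
                        ("portfolio", portfolio),
                        ("engagement", engagement)):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2 or 3, got {value!r}")
    # The four additive signals form the base score.
    base = revenue + portfolio + engagement + (2 if is_exploration else 0)
    # Urgency is applied last, to the whole base.
    return base * (2 if urgent else 1)


# Highest possible score: every signal maxed, exploration flagged, urgent.
print(bandit_score(3, 3, 3, is_exploration=True, urgent=True))
```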

Score Ranges & Interpretation

Range	Interpretation	Action
16-22	Critical priority	Work on immediately (HF today)
12-15	High priority	Schedule in next 2-3 days (HF this week)
8-11	Medium priority	Good LF refresh candidate
4-7	Low priority	Dormant or far-future LF
0-3	Minimal signal	Consider archiving unless strategic
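The range table works as a simple threshold lookup. A sketch in Python, with `BANDS` and `interpret` as hypothetical names; the labels are copied from the table above.

```python
# (lowest score in band, interpretation), highest band first.
BANDS = [
    (16, "Critical priority"),
    (12, "High priority"),
    (8, "Medium priority"),
    (4, "Low priority"),
    (0, "Minimal signal"),
]


def interpret(score):
    """Map a Bandit Score (0-22) to its interpretation band."""
    if not 0 <= score <= 22:
        raise ValueError(f"score must be between 0 and 22, got {score!r}")
    for floor, label in BANDS:
        if score >= floor:
            return label


for s in (18, 12, 6, 3):
    print(s, interpret(s))
```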

Worked Examples

Example 1: Client Commission Inquiry

Context: Client inquiry received, wants commission for their supercar, asks about pricing.

Scoring:

Revenue Potential: 3 (confirmed interest, £3,500 if closes)
Portfolio Value: 2 (showcase work, builds client portfolio)
Engagement Signal: 2 (referenced specific details = engaged)
Exploration Bonus: +2 (first paid commission, new client type)
Urgency Multiplier: ×2 (respond same-day to maintain momentum)

Calculation: (3 + 2 + 2 + 2) × 2 = 18

Decision: CRITICAL PRIORITY (16-22 band) → Draft response immediately, create follow-up tasks in MIR.

Example 2: Framework Whitepaper

Context: Research project documenting the operational framework.

Scoring:

Revenue Potential: 0 (no direct revenue, IP development)
Portfolio Value: 3 (foundational IP, publishable)
Engagement Signal: 1 (internal validation, not yet public)
Exploration Bonus: +2 (first long-form whitepaper format)
Urgency Multiplier: ×1 (no hard deadline)

Calculation: (0 + 3 + 1 + 2) × 1 = 6

Decision: LOW PRIORITY by score (6 falls in the 4-7 band) → Keep as LF work rather than HF. Its Portfolio Value of 3 still justifies scheduling 2-3 hour blocks weekly.

Example 3: Behind-the-Scenes Content

Context: Post behind-the-scenes footage from a shoot.

Scoring:

Revenue Potential: 0 (brand awareness, no direct revenue)
Portfolio Value: 1 (content library, minor asset)
Engagement Signal: 2 (proven format, consistent 30-50 views)
Exploration Bonus: 0 (familiar format, done many times)
Urgency Multiplier: ×1 (flexible timing)

Calculation: (0 + 1 + 2 + 0) × 1 = 3

Decision: MINIMAL SIGNAL (0-3 band) → Fill-in work when you have 15-20 min gaps. Don’t prioritize over higher-scoring tasks.

Example 4: Outreach Target List

Context: Build spreadsheet of potential clients to DM.

Scoring:

Revenue Potential: 1 (indirect, may generate future leads)
Portfolio Value: 1 (reusable list, but not showcase work)
Engagement Signal: 0 (not yet executed, no validation)
Exploration Bonus: 0 (DM outreach is familiar tactic)
Urgency Multiplier: ×2 (part of 30-day sprint, due soon)

Calculation: (1 + 1 + 0 + 0) × 2 = 4

Decision: LOW PRIORITY (4-7 band) → Must do because of the deadline, but intrinsically low-signal (base score of 2 before the multiplier). Schedule a focused block, get it done, move on.

Fig 2. Worked example showing how signals combine into final score. (Image: example bandit score calculation.)

Using Bandit Scores in Practice

Daily HF Work: “HF Priority Queue”

Filter: Clock = HF, Status ≠ Done
Sort: Bandit Score DESC, then Next Trigger ASC
Decision Rule: Pick highest score, unless mid-burst (finish-to-switch)
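Outside a database view, the same filter and sort can be reproduced in Python. A sketch under assumptions: the `Task` fields mirror the property names used here, status values other than "Done" are made up for illustration, and the mid-burst exception is left to the person reading the queue.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Task:
    name: str
    clock: str
    status: str
    bandit_score: int
    next_trigger: date


def hf_priority_queue(tasks):
    """Filter: Clock = HF, Status != Done. Sort: score DESC, trigger ASC."""
    active = [t for t in tasks if t.clock == "HF" and t.status != "Done"]
    # Negating the score gives descending order; dates sort ascending.
    return sorted(active, key=lambda t: (-t.bandit_score, t.next_trigger))


tasks = [
    Task("Reply to client", "HF", "In progress", 18, date(2025, 1, 3)),
    Task("Outreach list", "HF", "Not started", 4, date(2025, 1, 2)),
    Task("Whitepaper", "LF", "Not started", 6, date(2025, 1, 1)),
    Task("Old invoice", "HF", "Done", 20, date(2025, 1, 1)),
    Task("Follow-up call", "HF", "Not started", 18, date(2025, 1, 2)),
]
queue = hf_priority_queue(tasks)
print([t.name for t in queue])
```

The first item in the returned list is the default pick for the day.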

Weekly LF Refresh: “LF Candidates”

Filter: Clock = LF
Sort: Bandit Score DESC
Decision Rule:

Select 1× highest Bandit Score (exploit)
Select 1× random “Is Exploration” (explore)
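The weekly selection (one exploit pick plus one random exploration pick) might look like this in Python. `Candidate` and `lf_refresh` are hypothetical names, and excluding the exploit pick from the exploration draw is an assumption made so that the same task is not selected twice.

```python
import random
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    bandit_score: int
    is_exploration: bool


def lf_refresh(candidates, rng=random):
    """Return the highest-scoring LF task plus one random exploration task."""
    if not candidates:
        return []
    exploit = max(candidates, key=lambda c: c.bandit_score)
    picks = [exploit]
    # Assumption: do not draw the exploit pick a second time.
    pool = [c for c in candidates if c.is_exploration and c is not exploit]
    if pool:
        picks.append(rng.choice(pool))
    return picks


lf_tasks = [
    Candidate("Email template updates", 9, False),
    Candidate("First long-form whitepaper", 6, True),
    Candidate("Process tweak", 7, False),
]
print([c.name for c in lf_refresh(lf_tasks)])
```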

Fortnightly Dormant Review

Filter: Clock = Dormant, Age Points above 15
Decision Rule: Promote if Bandit Score above 5, otherwise archive or extend dormancy
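The dormant review rule is small enough to express as a single function. A sketch in Python: the function name and the returned labels are illustrative, and the two thresholds are the ones quoted above.

```python
def dormant_review(bandit_score, age_points, age_cutoff=15, promote_above=5):
    """Return the review outcome for one Dormant task."""
    if age_points <= age_cutoff:
        return "not due for review"      # filtered out of the review view
    if bandit_score > promote_above:
        return "promote"
    return "archive or extend dormancy"  # left to the reviewer's judgment


print(dormant_review(bandit_score=6, age_points=20))
print(dormant_review(bandit_score=3, age_points=20))
```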

Exploration Budget Management

Target

10-20% of active WIP should have “Is Exploration” ✓

Weekly Check

Count tasks:

Total active (HF + LF): Example = 15
Exploration tasks (✓): Example = 3
Exploration %: 3/15 = 20% ✅

If Under 10% Exploration

Action: Force-promote a Dormant exploration item to LF or HF, even if Bandit Score is medium-low.

Why: Prevents getting stuck in local maxima. Innovation requires trying new things, even when current work is “good enough.”
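The weekly check reduces to one division and two comparisons. A minimal Python sketch: `exploration_check` is a hypothetical helper, and the "over" label for shares above 20% is an assumption, since the guidance here only covers the under-10% case.

```python
def exploration_check(total_active, exploration_count, low=0.10, high=0.20):
    """Return (share, status) for the exploration budget."""
    if total_active <= 0:
        raise ValueError("total_active must be positive")
    share = exploration_count / total_active
    if share < low:
        status = "under"      # force-promote a Dormant exploration item
    elif share > high:
        status = "over"       # assumption: above target, not covered here
    else:
        status = "on target"
    return share, status


share, status = exploration_check(total_active=15, exploration_count=3)
print(f"{share:.0%} {status}")
```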

Common Pitfalls & How to Avoid Them

Pitfall 1: Everything Gets High Scores

Problem: If you score too generously, everything looks equally important.

Fix:

Use 0 liberally—most tasks have no revenue path
Reserve 3s for truly exceptional cases
Calibrate by comparing: “Is Task A really more valuable than Task B?”

Pitfall 2: Ignoring Exploration Bonus

Problem: Only working on proven winners, missing new opportunities.

Fix:

Track exploration % weekly
If under 10%, force yourself to try something new
Remember: Today’s exploit was yesterday’s exploration

Pitfall 3: Overusing Urgency Multiplier

Problem: Giving everything ×2 defeats the purpose.

Fix:

Reserve ×2 for true time-sensitive deadlines (within 7 days)
Most internal deadlines are flexible—be honest
If everything is urgent, nothing is urgent

Pitfall 4: Never Updating Scores

Problem: Scores get stale, lose predictive value.

Fix:

Update after task completion (retrospective calibration)
Adjust when new information emerges
Review scoring quality during weekly reviews

Integration with Multi-Clock Mechanics

Aging & Bandit Score

Aging increases urgency (promotes old LF/Dormant work)
Bandit Score prioritizes within a Clock (which HF task to work on today?)
They work together: Aging says “don’t forget me,” Bandit says “work on me next”

WIP Limits & Exploration

WIP caps prevent overload
Exploration % ensures we don’t just exploit
Together: Disciplined focus + strategic experimentation

Kill Criteria

Consider archiving a task when all of the following hold:

Bandit Score = 0 (no signals)
Age Points exceed Threshold × 2
2+ reviews have passed without progress

Low score + high age + no momentum = probably not viable.
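Read as a conjunction, the kill criteria become a single boolean check. A sketch in Python, assuming all three conditions must hold together; the parameter names are illustrative, and `age_threshold` stands for whatever aging threshold the task's Clock uses.

```python
def should_consider_archiving(bandit_score, age_points, age_threshold,
                              stalled_reviews):
    """True when a task has no signals, is well past its age threshold,
    and has gone two or more reviews without progress."""
    return (
        bandit_score == 0
        and age_points > 2 * age_threshold
        and stalled_reviews >= 2
    )


print(should_consider_archiving(0, age_points=35, age_threshold=15,
                                stalled_reviews=2))
```

The function only flags candidates; the archive decision itself stays with the reviewer.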

Scoring Workflow Checklist

When Creating a Task

Assign Revenue Potential (0-3)
Assign Portfolio Value (0-3)
Assign Engagement Signal (0, initially)
Check “Is Exploration” if never tried before
Add Due Date if a hard deadline exists (the ×2 multiplier applies once it is within 7 days)
System auto-calculates Bandit Score

During Weekly Review

Update Engagement Signals based on new feedback
Adjust Revenue/Portfolio if new info emerged
Remove “Is Exploration” if we’ve now tried it once
Check overall exploration % (target 10-20%)

After Task Completion

Retrospective scoring: Was Revenue/Portfolio accurate?
Document learnings in Notes field
Use findings to calibrate future similar tasks

Advanced: Tuning Signal Weights

For use after 4-6 weeks of scoring data

Retrospective Analysis

After completing 20+ scored tasks, analyze:

Did high-Bandit tasks actually deliver high value?
Which signal was most predictive? (Revenue? Portfolio? Engagement?)
Did we explore enough (10-20%)?

Potential Adjustments

If Revenue consistently dominates, consider:

Weighting Portfolio Value ×1.5 (encourages long-term thinking)
Increasing Exploration Bonus to +3 (forces more experimentation)
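Before changing weights, it helps to see what they do to the scale. The Python sketch below parameterizes the two adjustments; `weighted_score` and `max_score` are hypothetical helpers, and the default arguments reproduce the standard formula.

```python
def weighted_score(revenue, portfolio, engagement, is_exploration, urgent,
                   portfolio_weight=1.0, exploration_bonus=2):
    """Bandit Score with a tunable Portfolio weight and Exploration bonus."""
    base = (revenue
            + portfolio_weight * portfolio
            + engagement
            + (exploration_bonus if is_exploration else 0))
    return base * (2 if urgent else 1)


def max_score(portfolio_weight=1.0, exploration_bonus=2):
    """Highest score reachable under the given tuning."""
    return weighted_score(3, 3, 3, True, True,
                          portfolio_weight, exploration_bonus)


print(max_score())        # standard formula
print(max_score(1.5, 3))  # both adjustments applied
```

With both adjustments the maximum rises from 22 to 27, so the score ranges and any fixed thresholds (such as the Today view cutoff) would need rescaling at the same time.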

Implementation Notes

Notion Formula (Reference)

The Bandit Score can be implemented as a Notion formula field:

((prop("Revenue Potential") + prop("Portfolio Value") + prop("Engagement Signal") +
  if(prop("Is Exploration"), 2, 0)) *
  if(empty(prop("Due Date")), 1,
    if(dateBetween(prop("Due Date"), now(), "days") <= 7, 2, 1)))

View Configuration

Today View Filter:

Clock = HF AND 
Status ≠ Done AND 
(Bandit Score >= 6 OR Promotion Flag = "Promote" OR Days Until Deadline within 7)

Sort: Bandit Score DESC, Next Trigger ASC
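The Today view filter combines two required conditions with three alternatives. A Python sketch of that logic: the function and parameter names are illustrative, and treating a missing deadline as "not within 7 days" is an assumption.

```python
def in_today_view(clock, status, bandit_score,
                  promotion_flag=None, days_until_deadline=None):
    """Clock = HF AND Status != Done AND (score >= 6 OR promote OR deadline soon)."""
    if clock != "HF" or status == "Done":
        return False
    # Assumption: a task without a deadline never qualifies on urgency alone.
    deadline_soon = days_until_deadline is not None and days_until_deadline <= 7
    return bandit_score >= 6 or promotion_flag == "Promote" or deadline_soon


# A low-scoring HF task still appears when its deadline is close.
print(in_today_view("HF", "Not started", bandit_score=4, days_until_deadline=3))
```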

Summary

The Bandit Score helps you decide “What should I work on next?” by scoring tasks 0-22 based on Revenue potential, Portfolio value, Engagement signals, Exploration bonus, and Urgency.

Pick the highest score 90% of the time (exploit proven winners), but deliberately try something new 10% of the time (explore hidden gems). Update scores as you learn.

It’s not perfect, but it can help reduce decision paralysis while keeping you from getting stuck only doing familiar work.

Related Frameworks

Multi-Clock Work - Task scheduling across time horizons
Governance Protocols - AI-delegated work patterns
Campaign Framework - Marketing orchestration

For questions or implementation support, see the MIR Database Schema Reference or consult the Multi-Clock Work whitepaper.
