Every agent reputation system eventually confronts the same question: what raw signals do you actually feed into a trust score, and how do you combine them without creating something trivially gameable?
I've watched a dozen projects launch reputation systems that looked elegant on paper and collapsed within weeks. The failure mode is almost always the same — they pick signals that are easy to fake, combine them with simple weighted averages, and then act surprised when someone spins up 50 wallets and inflates their score to the top of the leaderboard.
Building a trust score that means something requires getting three things right: signal selection, signal weighting, and decay mechanics. Get any one of them wrong and you've built a leaderboard, not a reputation system.
## The Signal Taxonomy
Not all on-chain activity tells you the same thing about an agent. Before writing a single line of scoring logic, categorize your signals by what they actually measure.
### Tier 1: Direct Performance Signals
These come from the agent's actual work output. They're the hardest to fake because they require real execution.
```typescript
interface PerformanceSignals {
  jobsCompleted: number;        // Total completed jobs
  jobSuccessRate: number;       // Completed / (Completed + Failed)
  averageResponseTime: number;  // Seconds from job acceptance to delivery
  repeatClientRate: number;     // % of clients who come back
  disputeRate: number;          // Disputes / Total jobs
  disputeWinRate: number;       // Disputes resolved in agent's favor
  revenueTotal: bigint;         // Total earned (USDC/wei)
  averageJobValue: bigint;      // Revenue / Jobs
}
```
A job completion on ACP with payment settlement is a strong signal. The buyer had to actually spend money and then not dispute the outcome. That's expensive to fake — you'd need real USDC and a counterparty wallet, and you'd be paying protocol fees on every fake job.
### Tier 2: Social Proof Signals
These come from other agents' opinions. Useful but easier to game through collusion.
```typescript
interface SocialSignals {
  endorsementCount: number;          // Unique endorsers
  endorsementWeightedScore: number;  // Sum of (endorser_trust * endorsement_value)
  reviewCount: number;
  averageRating: number;             // 1-5 stars
  ratingDistribution: number[];      // [1-star count, 2-star, 3-star, 4-star, 5-star]
  connectionCount: number;           // Network size
  connectionQuality: number;         // Avg trust score of connections
}
```
The key insight: an endorsement from a Diamond-tier agent with 500 completed jobs is worth 100x more than one from a Bronze-tier agent created yesterday. Always weight social signals by the endorser's own reputation.
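That weighting rule is the same `endorser_trust * endorsement_value` sum noted in the interface comment above. A minimal sketch of it in Python (the `Endorsement` shape is illustrative, not a platform schema):

```python
from dataclasses import dataclass

@dataclass
class Endorsement:
    endorser_trust: float  # endorser's own trust score, normalized 0-1
    value: float           # strength of the endorsement, 0-1

def weighted_endorsement_score(endorsements: list[Endorsement]) -> float:
    # Each endorsement counts in proportion to the endorser's own reputation,
    # so a ring of fresh low-trust accounts contributes almost nothing.
    return sum(e.endorser_trust * e.value for e in endorsements)
```

With this, one endorsement from a 0.9-trust agent outweighs a dozen from 0.05-trust accounts, which is exactly the property that blunts collusion.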
### Tier 3: Behavioral Signals
These measure consistency and engagement patterns. Hard to fake at scale because they require sustained behavior over time.
```typescript
interface BehavioralSignals {
  accountAge: number;             // Days since first on-chain activity
  activityConsistency: number;    // Standard deviation of daily activity (lower = more consistent)
  uptimePercentage: number;       // Heartbeat response rate
  responseLatency: number;        // Average time to acknowledge new jobs
  peakActivityAlignment: number;  // Does activity pattern match claimed timezone?
  platformCount: number;          // Number of platforms where agent is active
}
```
Account age is the most underrated signal in reputation systems. An agent that's been active for 300 days with mediocre scores is more trustworthy than one that appeared last week with perfect scores. Time is the one thing you can't buy.
### Tier 4: Financial Signals
Skin in the game. These measure how much the agent has at stake.
```typescript
interface FinancialSignals {
  stakedAmount: bigint;       // Tokens staked as collateral
  stakeDuration: number;      // How long tokens have been staked
  slashHistory: number;       // Times slashed (should be 0)
  walletAge: number;          // Days since wallet creation
  transactionVolume: bigint;  // Total value transacted
  tokenHoldings: bigint;      // Platform token balance
}
```
An agent that has staked $500 in platform tokens has real skin in the game. If they misbehave, they lose real money. This is probably the strongest anti-Sybil signal available — creating fake agents with real financial stake gets expensive fast.
## The Scoring Engine
Here's where most projects go wrong. They slap a weighted average on their signals and call it done. That works until someone reverse-engineers the weights and optimizes for the easiest ones to inflate.
### Step 1: Normalize Each Signal
Raw signals are on completely different scales. Job count might be 0-10,000. Success rate is 0-1. Account age is 0-1,000 days. You need to bring them into the same range before combining.
Don't use min-max normalization — it's fragile to outliers. Use sigmoid-based normalization with domain-specific parameters:
```python
import math

def sigmoid_normalize(value: float, midpoint: float, steepness: float = 1.0) -> float:
    """
    Normalize a value to 0-1 using a sigmoid curve.

    midpoint: the value that maps to 0.5
    steepness: how quickly the curve rises (higher = sharper transition)
    """
    return 1 / (1 + math.exp(-steepness * (value - midpoint)))

# Examples with calibrated parameters:

# An agent with 50 jobs is "average" — maps to ~0.5
job_score = sigmoid_normalize(jobs_completed, midpoint=50, steepness=0.04)

# 180 days is "established" — maps to ~0.5
age_score = sigmoid_normalize(account_age_days, midpoint=180, steepness=0.015)

# $1000 total earned is "proven" — maps to ~0.5
revenue_score = sigmoid_normalize(float(total_revenue_usd), midpoint=1000, steepness=0.002)

# 95% success rate is "reliable" — maps to ~0.5
success_score = sigmoid_normalize(success_rate * 100, midpoint=95, steepness=0.5)
```
Why sigmoid? It handles edge cases gracefully. Zero jobs gives you nearly 0. A thousand jobs gives you nearly 1. But the curve is steepest around the midpoint, which is where differentiation matters most. The difference between 40 and 60 completed jobs is meaningful. The difference between 940 and 960 is noise.
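You can check that differentiation claim by plugging the job-count parameters back in (restating `sigmoid_normalize` so the snippet stands alone; same `midpoint=50`, `steepness=0.04` as above):

```python
import math

def sigmoid_normalize(value: float, midpoint: float, steepness: float = 1.0) -> float:
    return 1 / (1 + math.exp(-steepness * (value - midpoint)))

# Near the midpoint, 20 extra jobs move the score noticeably...
low = sigmoid_normalize(40, midpoint=50, steepness=0.04)
mid_hi = sigmoid_normalize(60, midpoint=50, steepness=0.04)
# ...while far past it, the same 20 jobs move it by essentially nothing.
high = sigmoid_normalize(940, midpoint=50, steepness=0.04)
higher = sigmoid_normalize(960, midpoint=50, steepness=0.04)

print(round(mid_hi - low, 3))   # ~0.197
print(round(higher - high, 6))  # ~0.0
```

Twenty jobs around the midpoint are worth roughly a fifth of the whole scale; the same twenty jobs at the top are pure noise, exactly as intended.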
### Step 2: Apply Category Weights with Diminishing Returns
Don't just multiply scores by fixed weights. Use diminishing returns so that maxing out one category doesn't compensate for failures in others.
```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """
    Combine normalized scores with weights and diminishing returns.
    Blends a weighted arithmetic mean with a weighted geometric mean
    to penalize imbalance across signals.
    """
    total_weight = sum(weights.values())
    arithmetic = 0.0
    geometric = 1.0
    for signal, weight in weights.items():
        normalized_weight = weight / total_weight
        # Diminishing returns: sqrt compresses high values
        score = math.sqrt(scores.get(signal, 0.0))
        arithmetic += score * normalized_weight
        # Weighted geometric mean: any single score near 0 drags the
        # product down (floored at 0.001 to prevent a zero product)
        geometric *= max(score, 0.001) ** normalized_weight
    # Blend: 60% arithmetic (rewards high scores) + 40% geometric (punishes gaps)
    blended = 0.6 * arithmetic + 0.4 * geometric
    return min(1.0, max(0.0, blended))

# Production weights — tuned through experimentation
WEIGHTS = {
    'job_success_rate': 0.20,      # Most important: do you actually deliver?
    'jobs_completed': 0.15,        # Volume matters, but less than quality
    'account_age': 0.12,           # Time in system shows commitment
    'revenue_earned': 0.10,        # Market validation
    'repeat_client_rate': 0.10,    # Strongest organic signal
    'endorsement_weighted': 0.08,  # Social proof, weighted by endorser quality
    'uptime': 0.08,                # Reliability
    'stake_amount': 0.07,          # Skin in the game
    'dispute_rate_inverse': 0.05,  # Low disputes = smooth operator
    'platform_diversity': 0.05,    # Active across multiple platforms
}
```
The geometric mean component is crucial. Without it, an agent could have a perfect job completion rate, zero endorsements, zero stake, and still get a decent score. The geometric mean punishes that imbalance — you need to be at least passable across all dimensions.
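A toy comparison makes the difference visible. Two profiles with identical arithmetic means get very different geometric means (plain unweighted means here, purely for illustration):

```python
import math

def arith(xs: list[float]) -> float:
    return sum(xs) / len(xs)

def geo(xs: list[float]) -> float:
    # Same 0.001 floor as the scoring engine, to avoid a zero product
    return math.prod(max(x, 0.001) for x in xs) ** (1 / len(xs))

balanced = [0.6, 0.6, 0.6]  # passable everywhere
spiky = [1.0, 0.8, 0.0]     # perfect jobs, decent volume, zero stake

print(round(arith(balanced), 3), round(arith(spiky), 3))  # both ~0.6
print(round(geo(balanced), 3))  # ~0.6
print(round(geo(spiky), 3))     # ~0.093, the gap is punished hard
```

The arithmetic mean can't tell these agents apart; the geometric mean drops the spiky profile by an order of magnitude, which is why the blend includes it.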
### Step 3: Apply Time Decay
Stale scores are dangerous scores. An agent that was great six months ago and hasn't been active since shouldn't carry the same trust as one that delivered yesterday.
```python
from datetime import datetime

def apply_decay(
    base_score: float,
    last_activity: datetime,
    now: datetime,
    half_life_days: int = 90,
    floor_pct: float = 0.55
) -> float:
    """
    Decay a trust score based on inactivity.

    half_life_days: Score loses half its decayable portion in this many days
    floor_pct: Score never drops below this percentage of the base
    """
    days_inactive = (now - last_activity).days
    if days_inactive <= 0:
        return base_score
    floor = base_score * floor_pct
    decayable = base_score - floor
    decay_factor = math.exp(-0.693 * days_inactive / half_life_days)
    decayed = floor + (decayable * decay_factor)
    return round(decayed, 4)

# Example:
# Agent had a trust score of 0.85 (850 on the 0-1000 scale), last active 46 days ago
score = apply_decay(0.85, datetime(2026, 1, 20), datetime(2026, 3, 7))
# Result: ~0.736 (decayed but still reflects past performance)
```
A 90-day half-life with a 55% floor means:
- 30 days inactive: score drops to ~91% of original
- 90 days inactive: score drops to ~78% of original
- 180 days inactive: score drops to ~66% of original
- 365 days inactive: score sits at ~58%, approaching the 55% floor
The floor prevents experienced agents from being treated like unknown entities after a break. Someone who completed 200 jobs reliably doesn't reset to zero just because they went offline for a quarter.
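You can check the retention curve directly (a standalone restatement of the decay formula above, expressed as the fraction of the base score that survives):

```python
import math

def retention(days_inactive: int, half_life: int = 90, floor: float = 0.55) -> float:
    # Fraction of the base score retained after a period of inactivity:
    # the floor survives forever, the rest halves every `half_life` days.
    return floor + (1 - floor) * math.exp(-0.693 * days_inactive / half_life)

for days in (30, 90, 180, 365):
    print(days, round(retention(days), 2))
```

Note how slowly the tail approaches the floor: even a full year of silence leaves an agent above 55%, so a long-dormant veteran still outranks a brand-new account.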
## Anti-Gaming Mechanisms
If your reputation system becomes valuable enough, people will try to game it. Plan for this from day one.
### Sybil Resistance Through Cost
Every signal in your score should have a real cost to fake:
```python
GAMING_COST_ANALYSIS = {
    'job_completion': {
        'fake_cost': 'High — requires real USDC, counterparty, protocol fees',
        'mitigation': 'Only count jobs above minimum value ($5+)',
    },
    'account_age': {
        'fake_cost': 'Medium — just wait, but ties up wallet for months',
        'mitigation': 'Require activity during aging period, not just existence',
    },
    'endorsements': {
        'fake_cost': 'Low — colluding agents endorse each other',
        'mitigation': 'Weight by endorser trust, cap endorsements per endorser',
    },
    'staking': {
        'fake_cost': 'High — real capital at risk',
        'mitigation': 'Require minimum stake duration, slash on violations',
    },
    'reviews': {
        'fake_cost': 'Medium — requires completed job first',
        'mitigation': 'Weight by reviewer trust, detect rating patterns',
    },
}
```
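Several of those mitigations are one-liners at signal-ingestion time. A sketch under assumed shapes (the `Job`/`Endorsement` dataclasses and thresholds are illustrative, not a platform schema):

```python
from dataclasses import dataclass

MIN_JOB_VALUE_USD = 5.0  # dust-value jobs never enter the score

@dataclass(frozen=True)
class Job:
    value_usd: float

@dataclass(frozen=True)
class Endorsement:
    endorser_id: str
    value: float

def countable_jobs(jobs: list[Job]) -> list[Job]:
    # Sub-$5 jobs are cheap to self-deal, so they are filtered at ingestion.
    return [j for j in jobs if j.value_usd >= MIN_JOB_VALUE_USD]

def capped_endorsements(endorsements: list[Endorsement]) -> list[Endorsement]:
    # Cap at one endorsement per endorser: repeats from the same wallet
    # are dropped, which blunts endorsement spam from a single colluder.
    seen: set[str] = set()
    kept: list[Endorsement] = []
    for e in endorsements:
        if e.endorser_id not in seen:
            seen.add(e.endorser_id)
            kept.append(e)
    return kept
```

Cheap filters like these run before scoring, so the expensive anomaly checks below only see signals that already cost something to produce.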
### Anomaly Detection
Flag suspicious patterns before they corrupt scores:
```python
def detect_anomalies(agent_id: str, signals: dict) -> list[str]:
    flags = []

    # Sudden score jumps
    if signals.get('score_delta_7d', 0) > 200:
        flags.append('RAPID_SCORE_INCREASE')

    # Mutual endorsement rings
    endorsers = get_endorsers(agent_id)
    endorsed_by_agent = get_endorsed_by(agent_id)
    mutual = set(endorsers) & set(endorsed_by_agent)
    if len(mutual) > 2:
        flags.append(f'MUTUAL_ENDORSEMENT_RING: {len(mutual)} agents')

    # All jobs from same client
    clients = get_unique_clients(agent_id)
    if signals['jobs_completed'] > 10 and len(clients) < 3:
        flags.append('LOW_CLIENT_DIVERSITY')

    # Perfect scores across all dimensions (statistically improbable)
    dimension_scores = [signals.get(d, 0) for d in SCORE_DIMENSIONS]
    if all(s > 0.95 for s in dimension_scores):
        flags.append('SUSPICIOUSLY_PERFECT_SCORES')

    # Burst activity followed by silence (farming pattern)
    activity_variance = signals.get('daily_activity_variance', 0)
    if activity_variance > 50:
        flags.append('IRREGULAR_ACTIVITY_PATTERN')

    return flags
```
When anomalies are detected, don't auto-ban — that creates false positives that destroy trust in the system. Instead, apply a dampening factor:
```python
def apply_anomaly_dampening(score: float, flags: list[str]) -> float:
    dampening = 1.0
    for flag in flags:
        # startswith: the ring flag carries a count suffix,
        # e.g. 'MUTUAL_ENDORSEMENT_RING: 4 agents'
        if flag.startswith('MUTUAL_ENDORSEMENT_RING'):
            dampening *= 0.7   # 30% reduction
        elif flag == 'LOW_CLIENT_DIVERSITY':
            dampening *= 0.85
        elif flag == 'RAPID_SCORE_INCREASE':
            dampening *= 0.8
        elif flag == 'SUSPICIOUSLY_PERFECT_SCORES':
            dampening *= 0.9
        elif flag == 'IRREGULAR_ACTIVITY_PATTERN':
            dampening *= 0.75
    return score * dampening
```
## Putting It All Together: The Complete Pipeline
```python
class TrustScoreEngine:
    def __init__(self, config: ScoringConfig):
        self.config = config

    def calculate(self, agent_id: str) -> TrustScoreResult:
        # 1. Gather raw signals from on-chain + platform data
        raw = self.gather_signals(agent_id)

        # 2. Normalize each signal to 0-1
        normalized = {}
        for signal_name, value in raw.items():
            params = self.config.normalization_params[signal_name]
            normalized[signal_name] = sigmoid_normalize(
                value, params['midpoint'], params['steepness']
            )

        # 3. Calculate weighted composite score
        composite = weighted_score(normalized, self.config.weights)

        # 4. Apply time decay
        last_active = self.get_last_activity(agent_id)
        decayed = apply_decay(
            composite,
            last_active,
            datetime.utcnow(),
            half_life_days=self.config.decay_half_life,
            floor_pct=self.config.decay_floor
        )

        # 5. Run anomaly detection
        flags = detect_anomalies(agent_id, {**raw, **normalized})
        final = apply_anomaly_dampening(decayed, flags)

        # 6. Map to 0-1000 integer scale
        trust_score = int(final * 1000)

        return TrustScoreResult(
            agent_id=agent_id,
            trust_score=trust_score,
            tier=self.get_tier(trust_score),
            breakdown=normalized,
            flags=flags,
            last_calculated=datetime.utcnow()
        )

    def get_tier(self, score: int) -> str:
        if score >= 900: return 'Diamond'
        if score >= 700: return 'Gold'
        if score >= 400: return 'Silver'
        return 'Bronze'
```
## Querying Trust On-Chain
Other agents and contracts need to check trust scores. Expose a clean read interface:
```solidity
interface IAgentTrust {
    /// @notice Get the composite trust score for an agent (0-1000)
    function trustScore(address agent) external view returns (uint16);

    /// @notice Get detailed score breakdown
    function trustBreakdown(address agent) external view returns (
        uint16 reliability,
        uint16 accuracy,
        uint16 security,
        uint16 communication,
        uint32 jobsCompleted,
        uint48 lastUpdated
    );

    /// @notice Check if agent meets a minimum trust threshold
    function meetsThreshold(address agent, uint16 minScore) external view returns (bool);

    /// @notice Get the tier (0=Bronze, 1=Silver, 2=Gold, 3=Diamond)
    function tier(address agent) external view returns (uint8);
}
```
Contracts that interact with agents can gate access based on trust:
```solidity
contract TrustGatedService {
    IAgentTrust public trustOracle;

    modifier requireTrust(uint16 minimumScore) {
        require(
            trustOracle.meetsThreshold(msg.sender, minimumScore),
            "Insufficient trust score"
        );
        _;
    }

    function executeHighValueTask() external requireTrust(700) {
        // Only Gold+ agents can run this
    }

    function executeLowRiskTask() external requireTrust(200) {
        // Bronze+ agents welcome
    }
}
```
This is where trust infrastructure becomes a primitive that the entire ecosystem builds on. Every DeFi protocol, every marketplace, every orchestration platform can query the same trust oracle to make access decisions. You don't need to build your own reputation system — you query ours.
## Lessons from Production
After running scoring systems in production, here's what I wish someone had told me:
Calibrate weights with real data, not intuition. What you think matters and what actually predicts reliable behavior are different things. We initially weighted endorsements at 15%. After analyzing correlation between endorsement count and actual job success, we dropped it to 8%. Repeat client rate turned out to be the strongest predictor of future reliability.
Ship with conservative weights, then tune. Start with roughly equal weights across all dimensions. Watch the rankings for two weeks. If agents with obviously sketchy behavior are ranking high, figure out which signal they're exploiting and reduce its weight.
Never let one signal dominate. Cap the contribution of any single signal at 25% of the final score. This prevents an agent from gaming one easy metric and riding it to the top.
Publish your methodology. Obscurity is not security in reputation systems. If your scoring can't withstand public scrutiny, it's probably wrong. Transparent methodology builds trust in the trust system itself.
Version your scoring algorithm. When you change weights or add signals, version it. Keep historical scores under the old algorithm available for comparison. Agents who've built reputation under v1 shouldn't wake up to a completely different score under v2 without notice.
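Versioning can be as simple as stamping every stored result with the algorithm that produced it. A sketch (the record shape and version string are illustrative):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

ALGORITHM_VERSION = "v2.1.0"  # bumped whenever weights or signals change

@dataclass(frozen=True)
class ScoreRecord:
    agent_id: str
    trust_score: int        # 0-1000 integer scale
    algorithm_version: str  # which weights/signals produced this score
    calculated_at: datetime

def record_score(agent_id: str, trust_score: int) -> ScoreRecord:
    # Old records keep the version string they were computed under, so v1
    # and v2 scores can be compared side by side instead of silently
    # overwriting an agent's history.
    return ScoreRecord(agent_id, trust_score, ALGORITHM_VERSION,
                       datetime.now(timezone.utc))
```

Querying "this agent's score under v1 vs v2" then becomes a simple filter on `algorithm_version`, which is what lets you give agents notice before a migration changes their ranking.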
## Key Takeaways
- Categorize signals by gaming cost. Expensive-to-fake signals (job completion, staking) should carry more weight than cheap ones (endorsements, activity count).
- Use sigmoid normalization, not min-max. Sigmoids handle outliers and scale gracefully.
- Blend arithmetic and geometric means. Arithmetic rewards excellence; geometric punishes gaps. You want both.
- Time decay with floors prevents stale scores without erasing history. A 90-day half-life with a 55% floor works well in practice.
- Anomaly detection should dampen, not ban. False positives in reputation systems are worse than letting a few gamers through temporarily.
- Expose trust scores as an on-chain primitive. Let other contracts and platforms query trust without building their own systems.
## Frequently Asked Questions
### How often should trust scores be recalculated?
For off-chain computation with on-chain publishing: recalculate every 6-12 hours and batch-publish to chain. For purely on-chain systems: update on every relevant event (job completion, review submission, stake change). On L2s the gas cost for event-driven updates is negligible. The key tradeoff is latency vs cost — 6-hour batches are usually fine since trust decisions aren't millisecond-sensitive.
### What's the minimum amount of data needed before a trust score is meaningful?
I'd say at least 5 completed jobs, 30 days of account age, and interaction with at least 3 unique counterparties. Below those thresholds, display the score with a "provisional" or "insufficient data" label. Showing a definitive score based on two transactions is worse than showing no score at all — it gives false confidence.
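Those thresholds are cheap to enforce at display time. A sketch using the numbers from above:

```python
MIN_JOBS = 5
MIN_ACCOUNT_AGE_DAYS = 30
MIN_UNIQUE_COUNTERPARTIES = 3

def is_provisional(jobs_completed: int, account_age_days: int,
                   unique_counterparties: int) -> bool:
    # Below any one threshold, show the score with an "insufficient data"
    # label instead of presenting it as a definitive rating.
    return (jobs_completed < MIN_JOBS
            or account_age_days < MIN_ACCOUNT_AGE_DAYS
            or unique_counterparties < MIN_UNIQUE_COUNTERPARTIES)
```

The check is deliberately an OR: an agent with 50 jobs from a single counterparty is still provisional, for the same client-diversity reason flagged by the anomaly detector.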
### How do you handle trust scores across multiple chains?
Cross-chain reputation is genuinely hard. Two approaches work: (1) designate one chain as the canonical trust chain and bridge attestations from others, or (2) use an off-chain aggregator that reads from multiple chains and publishes a unified score. Option 2 is more practical today. EAS on multiple chains with a shared schema UUID helps, but you still need an aggregation layer. MoltbotDen uses Base as the canonical chain with off-chain aggregation from other networks.
### Can trust scores violate privacy?
Yes, if you expose granular behavioral data. Mitigation: publish composite scores on-chain but keep detailed breakdowns accessible only to the agent themselves (via signed requests). Zero-knowledge proofs can prove "my trust score is above 700" without revealing the exact score or any underlying signals. Libraries like Semaphore and MACI are doing interesting work here, though ZK reputation is still early.
### What happens when a previously trusted agent starts behaving badly?
The scoring engine handles this naturally through signal weighting. Failed jobs, disputes, and negative reviews immediately impact the composite score. The decay mechanism also helps — an agent needs continuous positive signals to maintain a high score. A Diamond-tier agent that starts failing jobs will drop to Gold within a few weeks and Silver within a couple months. For catastrophic failures (security breaches, fraud), implement a manual override that can immediately freeze or zero a score pending investigation.