Experiment 0: Token Sentiment Tracker
Status: 🔴 FAILED & ARCHIVED
Started: 2026-01-25
Failed: 2026-02-08
Duration: 14 days
Tags: sentiment-analysis twitter-api failed expensive-lesson
Overview
Original Goal: Build a real-time sentiment tracker that analyzes Twitter and Reddit posts about crypto tokens to generate sentiment scores and predict short-term price movements.
Why I thought this would work:
Social sentiment often precedes price movements
Twitter/Reddit are major crypto discussion platforms
LLMs are good at sentiment analysis
Real-time data = potential edge
Spoiler: It didn't work. At all.
What I Built
Architecture (Before it crashed and burned)
Twitter API v2 ──┐
                 ├─→ Message Queue (Redis) → Sentiment Analysis (GPT-3.5)
Reddit API ──────┘                              ↓
                                         Aggregation Layer
                                                ↓
                                         PostgreSQL Store
                                                ↓
                                         Dashboard (Streamlit)
Features Implemented
✅ Twitter stream listener for crypto keywords ($BTC, $ETH, etc.)
✅ Reddit scraper for r/cryptocurrency, r/bitcoin
✅ GPT-3.5 sentiment classification (positive/negative/neutral)
✅ Real-time dashboard with live sentiment scores
✅ Historical sentiment vs price correlation charts
What Failed ❌
The Rate Limit Massacre
Day 1-2: Everything works great! Getting ~500 tweets/min, processing smoothly.
Day 3: Twitter API v2 rate limits hit. Need to upgrade to paid tier.
Cost: $100/month for basic access
Thought: "Fine, this will be worth it"
Day 4: Rate limits hit again at scale. Need enterprise tier.
Cost: $5,000/month minimum
Thought: "Wait, what?"
Day 5-7: Tried to optimize, reduce keywords, cache aggressively. Didn't help.
Day 8: Total API costs for 48 hours: $147
Twitter API: $100 (upgrade fee)
Reddit API: $0 (scraping, no API used)
GPT-3.5 API: $47 (sentiment analysis on 23,400 posts)
Conclusion: This approach is economically IMPOSSIBLE at scale.
The Hallucination Problem
Even when it was working, GPT-3.5 was hallucinating like crazy:
Example 1:
Tweet: "Just bought more Bitcoin! 🚀"
GPT-3.5: "Negative sentiment - user expressing regret about Bitcoin purchase"
Example 2:
Tweet: "Ethereum is trash"
GPT-3.5: "Positive sentiment - user showing enthusiasm for Ethereum"
Result: Sentiment scores were basically random. Correlation with price: 0.03 (statistically zero).
Performance Issues
Latency: 1.2s average to process one tweet (GPT API call)
Backlog: Queue grew to 40,000+ messages in 24hrs
Crashes: Redis ran out of memory twice
Data quality: 60% of tweets were spam/bots
Key Learnings
What I Learned ✅
Twitter API v2 is a trap for real-time sentiment
Rate limits are designed for researchers, not production apps
Enterprise tier pricing is insane ($5k+/month minimum)
Free tier is basically unusable (50 requests/15min)
Social sentiment ≠ predictive signal
Most crypto Twitter is noise (bots, spam, shilling)
Reddit quality is higher but volume is lower
By the time something trends, it's too late
Correlation with price: basically zero
GPT-3.5 is terrible at sentiment on short text
Fine-tuned model needed (expensive to train)
Better to use specialized sentiment models
Or just use keyword matching (probably as good)
Real-time processing at scale is HARD
Need proper infrastructure (Kafka, not Redis)
Need horizontal scaling (expensive)
Need monitoring and alerting (time-consuming)
Cost estimates were way off
Thought: "Maybe $50/month"
Reality: "$2,940/month minimum" (enterprise Twitter + GPT-4 + infra)
What I'd Do Differently 🔄
If I tried this again (I won't):
Use on-chain data instead
Whale wallet movements
DEX trading volumes
Smart contract interactions
All publicly available, no rate limits
Focus on quality over quantity
Track 10 influential accounts, not 10,000 random ones
Use webhook notifications, not streaming
Process async, not real-time
Use specialized models
FinBERT for financial sentiment
Crypto-specific sentiment models (exist on HuggingFace)
Way cheaper than GPT API calls
Start with historical data
Prove correlation exists BEFORE building real-time
Back-test the strategy
Validate assumptions
The Numbers (Brutal Honesty)
Metric
Value
Days Active
14
Total Cost
$147
Tweets Processed
23,400
Cost per Tweet
$0.0063
API Requests
31,200
Successful Predictions
0
Working Code Remaining
~400 lines (deleted rest)
Hours Wasted
~60
Lessons Learned
Priceless
Code That Survived
Most code was deleted, but here's the only part worth saving - the async sentiment analyzer:
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def analyze_sentiment(text: str) -> str:
    """
    Analyze sentiment of a tweet/post.
    Returns: "positive", "negative", or "neutral"
    
    Note: This didn't work well. Use FinBERT instead.
    """
    try:
        response = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": "You are a crypto sentiment analyzer. Classify the sentiment as: positive, negative, or neutral. Respond with ONLY the sentiment, nothing else."
                },
                {
                    "role": "user", 
                    "content": f"Sentiment of: {text}"
                }
            ],
            max_tokens=10,
            temperature=0
        )
        
        sentiment = response.choices[0].message.content.strip().lower()
        
        # Validate response
        if sentiment not in ["positive", "negative", "neutral"]:
            return "neutral"  # Default fallback
            
        return sentiment
        
    except Exception as e:
        print(f"Sentiment analysis failed: {e}")
        return "neutral"

# Usage
# sentiment = await analyze_sentiment("Bitcoin to the moon! 🚀")
Alternatives That Might Actually Work
Based on what I learned, here are better approaches:
1. On-Chain Sentiment
Instead of social media, track:
Large wallet movements (> $1M transfers)
DEX volumes and liquidity changes
Smart contract deployments
Gas price spikes
Why better: No rate limits, no noise, actual money moving
2. Influencer Tracking
Instead of 10,000 random accounts, track 20 influential ones:
Set up webhook alerts
Process async, not real-time
Focus on quality signals
Why better: 100x less data, 10x better signal
3. Historical Analysis First
Before building real-time:
Download historical tweets (academic dataset)
Test if sentiment actually predicts anything
Validate assumptions with data
Why better: Prove the idea works before spending money
Final Thoughts
This was my first failed experiment in the lab. Feels bad, but that's the point of this lab - to try things, fail fast, and learn.
What I'm doing instead:
Pivoting to on-chain data analysis (Experiment 001)
Building a multi-LLM debate engine instead
Focusing on areas where API costs don't scale linearly
Biggest lesson: Validate your cost model BEFORE building. I should have calculated:
(tweets_per_day × api_cost_per_tweet × 30_days) = total_cost
Would have realized immediately this was unsustainable.
Archive Status
❌ Code deleted (kept only sentiment function above)
❌ Infrastructure shut down
❌ API keys revoked
✅ Lessons documented
✅ Moving on to next experiment
Time to fail: 14 days
Time to document failure: 1 day
Value of documentation: ∞
Next Experiment: Experiment 001: Multi-LLM Debate Engine
zyel@lab:~$ rm -rf experiment_000/
zyel@lab:~$ echo "fail fast, learn faster"
fail fast, learn faster
zyel@lab:~$ █