Bluesky Post Explainer
AI Agent that Explains Social Media Posts with Web Context and Source Citations
Project Overview
Making sense of social media posts through AI-powered context retrieval
Bluesky Post Explainer is an AI agent that takes any Bluesky social media post and generates clear, sourced explanations by searching the web for relevant context, reranking results using ML, and synthesizing explanatory bullets with citations.
The system streams results progressively via Server-Sent Events, giving users real-time feedback as each pipeline stage completes — from post extraction to final explanation.
Problem Statement
Social media posts are often cryptic, reference niche topics, or assume context the reader doesn't have. Manually researching every unfamiliar post is time-consuming. This agent automates the entire process: extract → search → rank → explain.
Key Features
🔍 Smart Context Retrieval
LLM rewrites posts into focused search queries, then Tavily retrieves relevant web context
🎯 ML Reranking
Cross-encoder (ms-marco-MiniLM-L-6-v2) reranks results by semantic relevance to the post
⚡ Real-time Streaming
SSE delivers progressive updates at each pipeline stage — no waiting for full completion
🤖 Multi-LLM Routing
LiteLLM routes to Claude, GPT-4o, or Gemini based on availability and cost
System Architecture
Hexagonal (Ports & Adapters) with clean domain separation
Architecture Overview
┌─────────────────────────────────────────────────────────────────┐
│ ENTRYPOINTS │
│ React Frontend ──SSE──▶ FastAPI + SSE │ CLI (argparse) │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ DOMAIN │
│ ExplainPostUseCase │ Ports (Abstract) │ Entities (Models) │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ ADAPTERS │
│ BlueskyExtractor │ TavilySearcher │ CrossEncoderReranker │
│ QueryRewriter │ LiteLLMRouter │ ExplanationGenerator │
└─────────────────────────────────────────────────────────────────┘
The hexagonal architecture ensures the domain logic (use cases, entities) has zero external dependencies. All I/O goes through abstract ports, making the system highly testable and easy to swap components (e.g., replace Tavily with another search engine without touching business logic).
Pipeline Flow
1. Post Extraction
AT Protocol fetches the Bluesky post content (text, author, images)
2. Query Rewriting
LLM transforms the post into focused search queries for better retrieval
3. Web Search
Tavily API retrieves relevant web pages and articles
4. Cross-Encoder Reranking
ms-marco-MiniLM-L-6-v2 scores and reranks results by relevance
5. Explanation Generation
Multi-LLM generates 3-5 explanatory bullets with source citations
💡 Streaming Architecture
Each pipeline stage emits an SSE event as it completes. The React frontend renders progressively — users see the post content immediately, then sources appear, then explanation bullets stream in one by one.
Technical Deep Dive
Key implementation details and engineering decisions
Cross-Encoder Reranking
Initial search results from Tavily are good but noisy. The cross-encoder provides a second-pass relevance scoring that dramatically improves explanation quality.
from sentence_transformers import CrossEncoder
class CrossEncoderReranker:
def __init__(self):
self.model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
def rerank(self, query: str, results: list[SearchResult]) -> list[SearchResult]:
pairs = [(query, r.content) for r in results]
scores = self.model.predict(pairs)
scored_results = list(zip(results, scores))
scored_results.sort(key=lambda x: x[1], reverse=True)
return [r for r, _ in scored_results[:TOP_K]]
Why Cross-Encoder over Bi-Encoder?
Cross-encoders process query-document pairs jointly, achieving higher accuracy than bi-encoders (which encode separately). The trade-off is speed — but since we only rerank ~20 search results, latency is negligible (~50ms).
Multi-LLM Routing with LiteLLM
The system uses LiteLLM to abstract away provider differences, enabling seamless routing between Claude, GPT-4o, and Gemini with a unified interface.
import litellm
class LiteLLMRouter:
"""Unified interface for multiple LLM providers."""
def __init__(self, model_list: list[dict]):
self.router = litellm.Router(model_list=model_list)
async def generate(self, messages: list[dict], **kwargs) -> str:
response = await self.router.acompletion(
model="gpt-4o", # Routes based on availability
messages=messages,
**kwargs
)
return response.choices[0].message.content
Claude
Best for nuanced explanations
GPT-4o
Fast + multimodal support
Gemini
Cost-effective fallback
Server-Sent Events (SSE) Streaming
The FastAPI backend streams events to the React frontend as each pipeline stage completes:
from fastapi.responses import StreamingResponse
@app.post("/api/explain")
async def explain_post(request: ExplainRequest):
async def event_stream():
# Stage 1: Extract post
post = await extractor.extract(request.url)
yield f"event: post\ndata: {post.json()}\n\n"
# Stage 2: Search for context
sources = await searcher.search(post.text)
yield f"event: sources\ndata: {json.dumps(sources)}\n\n"
# Stage 3: Rerank results
ranked = reranker.rerank(post.text, sources)
# Stage 4: Generate explanation bullets
async for bullet in generator.stream_bullets(post, ranked):
yield f"event: bullet\ndata: {bullet.json()}\n\n"
yield f"event: done\ndata: {{}}\n\n"
return StreamingResponse(event_stream(), media_type="text/event-stream")
Query Rewriting Strategy
Raw social media text makes poor search queries. The query rewriter transforms posts into focused, search-optimized queries:
❌ Raw post as query:
"wow the new paper from deepmind is insane, AGI is closer than we think 🤯"
✅ Rewritten queries:
1. "DeepMind latest research paper 2024"
2. "DeepMind AGI progress recent publication"
The system also supports a local CPU-based rewriter (Qwen2.5-0.5B) for environments without cloud API access, trading quality for zero-cost operation.
Project Structure
Clean separation following hexagonal architecture principles
src/
├── domain/ # Core logic — zero external dependencies
│ ├── entities.py # Pure dataclasses (PostContent, SearchResult, etc.)
│ ├── exceptions.py # Domain exception hierarchy
│ ├── ports.py # Abstract interfaces (PostExtractorPort, SearchPort, etc.)
│ └── use_cases.py # ExplainPostUseCase — orchestrates the pipeline
├── adapters/ # Concrete implementations of ports
│ ├── bluesky_extractor.py # AT Protocol post extraction
│ ├── tavily_searcher.py # Tavily web search
│ ├── query_rewriter.py # LLM-based search query rewriting (cloud)
│ ├── local_query_rewriter.py# CPU-based query rewriting (Qwen2.5-0.5B)
│ ├── crossencoder_reranker.py # ML reranking (sentence-transformers)
│ ├── litellm_router.py # Multi-LLM routing via LiteLLM
│ └── explanation_generator.py # Bullet generation with citations
└── entrypoints/ # Interface layer
├── api.py # FastAPI + Server-Sent Events streaming
└── cli.py # CLI interface
tests/ # Unit tests (pytest)
frontend/ # React (Vite + TypeScript)
eval/ # Evaluation harness (test cases, judge, reports)
Quick Start
Get the agent running locally in minutes
Prerequisites
- Python 3.11+
- Node.js 18+
- API keys: Tavily (search), Anthropic or OpenAI (LLM)
Setup & Run
# Clone and setup backend
git clone https://github.com/raulprocha/bluesky-post-explainer.git
cd bluesky-post-explainer
python3 -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
# Setup frontend
cd frontend && npm install && cd ..
# Terminal 1: Backend (FastAPI)
uvicorn src.entrypoints.api:app --host 0.0.0.0 --port 8000
# Terminal 2: Frontend (React/Vite)
cd frontend && npm run dev
Open the frontend URL, paste your API keys and a Bluesky post URL, and watch the explanation stream in real-time.
Key Design Decisions
Engineering trade-offs and rationale
Why Hexagonal Architecture?
Enables swapping any component (search engine, LLM provider, reranker) without touching domain logic. Critical for a system that depends on multiple third-party APIs that may change or fail.
Why SSE over WebSockets?
The data flow is unidirectional (server → client). SSE is simpler, works natively with HTTP/2, auto-reconnects, and doesn't require a persistent bidirectional connection.
Why Cross-Encoder Reranking?
Search APIs return keyword-relevant results, but not always semantically relevant to the post's intent. The cross-encoder reranks by actual meaning overlap, significantly improving explanation quality.
Why Multi-LLM Routing?
No single LLM provider guarantees 100% uptime. Routing provides resilience, cost optimization (use cheaper models when quality is sufficient), and flexibility to adopt better models as they release.
Why Local Query Rewriter Option?
Qwen2.5-0.5B runs on CPU with no API cost. Useful for development, testing, and environments with limited API budgets. Quality is lower but acceptable for most posts.