Building Real-Time AI Memory with Redis
We needed sub-millisecond coordination between AI agents. Redis made it possible. Here's how we built a dual-layer AI memory system using Redis for real-time state and git-based patterns for long-term knowledge.
The Problem: Stateless AI
Every AI conversation starts from scratch. Your AI assistant doesn't remember:
- The architecture decisions from yesterday
- The bugs you fixed last month
- What other agents on your team are working on
This isn't just inconvenient—it's expensive. You waste tokens re-explaining context. You lose team knowledge. You can't coordinate multiple AI agents.
We needed a memory system for AI.
Why Redis?
We evaluated several options for real-time AI memory:
| Option | Latency | Simplicity | Pub/Sub | Verdict |
|---|---|---|---|---|
| PostgreSQL | ~10ms | Medium | LISTEN/NOTIFY | Too slow for real-time |
| MongoDB | ~5ms | Medium | Change streams | Possible, but complex |
| SQLite | <1ms | High | No | No coordination |
| Redis | <1ms | High | Yes | Perfect fit |
Redis won because:
- Sub-millisecond latency — AI decisions need to be fast
- Simple key-value model — Memory contexts map naturally to keys
- Pub/sub built-in — Agents can notify each other instantly
- Battle-tested — We didn't want to debug infrastructure
The Architecture
We built a dual-layer memory system:
Layer 1: Git-based patterns — Long-term knowledge that persists across sessions. Bug fixes, security decisions, architectural patterns. Version-controlled, zero infrastructure for individuals.
Layer 2: Redis — Real-time coordination for active sessions. What agents are working on, shared context, instant notifications. This is where Redis shines.
Redis Use Cases in Our System
1. Session Context Storage
When a user starts a session, we store their context in Redis:
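import json
from datetime import datetime, timezone

import redis

# Connection settings are illustrative; adjust for your deployment
redis_client = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Store session context as a single hash
redis_client.hset(
    f"session:{session_id}",
    mapping={
        "user_id": user_id,
        "project": project_path,
        "started_at": datetime.now(timezone.utc).isoformat(),
        "context": json.dumps(initial_context),
    },
)
# Set TTL for automatic cleanup
redis_client.expire(f"session:{session_id}", 3600)  # 1 hour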
Why Redis? We need instant access (<1ms) and automatic expiration. Sessions are ephemeral—they shouldn't clutter persistent storage.
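Reading the session back is the other half of this pattern. The following is a minimal sketch, assuming a redis-py style client created with `decode_responses=True`. The `load_session` helper and its sliding-expiration behavior are illustrative assumptions, not part of the framework's documented API.

```python
import json
from typing import Any, Optional

SESSION_TTL_SECONDS = 3600  # Assumed to match the 1 hour TTL used when storing


def load_session(client: Any, session_id: str) -> Optional[dict]:
    """Return the stored session as a dict, or None if it expired or never existed.

    `client` is assumed to be a redis-py style client created with
    decode_responses=True, so hash fields come back as str rather than bytes.
    """
    key = f"session:{session_id}"
    raw = client.hgetall(key)  # Empty dict when the key is missing or expired
    if not raw:
        return None
    session = dict(raw)
    # "context" was stored as a JSON string, so decode it back into Python data
    session["context"] = json.loads(session.get("context", "null"))
    # Sliding expiration (an assumption): every read extends the session lifetime
    client.expire(key, SESSION_TTL_SECONDS)
    return session
```

Refreshing the TTL on read keeps active sessions alive while idle ones still expire. If a fixed one-hour lifetime is preferred, drop the `expire` call.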
2. Multi-Agent Coordination
When multiple AI agents work together, they need shared state:
# Agent claims a task
claimed = redis_client.set(
    f"task:{task_id}:owner",
    agent_id,
    nx=True,  # Only if not exists
    ex=300,   # 5 minute lock
)
# claimed is True for the winning agent and None if the task is already owned

# Agent shares findings
redis_client.lpush(
    f"task:{task_id}:findings",
    json.dumps(finding),
)

# Other agents check findings
findings = redis_client.lrange(f"task:{task_id}:findings", 0, -1)
This enables patterns like:
- Task claiming — Prevent duplicate work
- Result sharing — Agents build on each other's findings
- Conflict resolution — Detect when agents disagree
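The original does not show how disagreement is detected, so here is one possible sketch that works on the raw list returned by `lrange`. The `subject` and `verdict` fields are hypothetical and stand in for whatever schema your findings use.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Set


def detect_conflicts(raw_findings: Iterable[str]) -> Dict[str, Set[str]]:
    """Group findings by subject and return subjects with more than one verdict.

    Each item is a JSON document as returned by LRANGE. The "subject" and
    "verdict" fields are hypothetical; adapt them to your finding schema.
    """
    verdicts: Dict[str, Set[str]] = defaultdict(set)
    for raw in raw_findings:
        finding = json.loads(raw)
        verdicts[finding["subject"]].add(finding["verdict"])
    # Keep only the subjects where agents reported different verdicts
    return {subject: seen for subject, seen in verdicts.items() if len(seen) > 1}
```

An orchestrator could call `detect_conflicts(findings)` after each new finding and escalate any subject that appears in the result.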
3. Real-Time Notifications (Pub/Sub)
Agents need to react to events instantly:
# Publisher: Agent completes analysis
redis_client.publish(
    "agent:events",
    json.dumps({
        "event": "analysis_complete",
        "agent_id": agent_id,
        "task_id": task_id,
        "findings_count": len(findings),
    }),
)

# Subscriber: Orchestrator listens
pubsub = redis_client.pubsub()
pubsub.subscribe("agent:events")
for message in pubsub.listen():
    # Skip subscribe/unsubscribe confirmations
    if message["type"] == "message":
        event = json.loads(message["data"])
        handle_agent_event(event)
This enables:
- Instant coordination — No polling, no delays
- Event-driven architecture — Agents react to changes
- Loose coupling — Agents don't need to know about each other
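The subscriber loop calls `handle_agent_event`, which the article does not define. One way to keep agents loosely coupled is a small handler registry keyed by event name. This is a sketch under that assumption; `on_event` and `dispatch_message` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]
_handlers: Dict[str, List[Handler]] = {}


def on_event(event_name: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event name."""
    def register(handler: Handler) -> Handler:
        _handlers.setdefault(event_name, []).append(handler)
        return handler
    return register


def handle_agent_event(event: Dict[str, Any]) -> int:
    """Call every handler registered for event["event"]; return how many ran."""
    handlers = _handlers.get(event.get("event", ""), [])
    for handler in handlers:
        handler(event)
    return len(handlers)


def dispatch_message(message: Dict[str, Any]) -> int:
    """Process one item from pubsub.listen(), ignoring subscribe confirmations."""
    if message.get("type") != "message":
        return 0
    return handle_agent_event(json.loads(message["data"]))
```

Keep in mind that Redis Pub/Sub delivers each message at most once: a subscriber that is disconnected when an event is published never sees it. Events that must not be lost need a durable structure in addition to, or instead of, Pub/Sub.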
4. Short-Term Memory (Conversation Context)
AI needs to remember recent conversation context:
# Store recent messages (sliding window)
redis_client.lpush(f"memory:{user_id}:messages", json.dumps(message))
redis_client.ltrim(f"memory:{user_id}:messages", 0, 99)  # Keep last 100

# Retrieve for context injection. LPUSH keeps the newest message first,
# so reverse the slice to get the 10 most recent in chronological order.
recent = redis_client.lrange(f"memory:{user_id}:messages", 0, 9)
context = [json.loads(m) for m in reversed(recent)]
This provides:
- Fast retrieval — Context ready in <1ms
- Automatic pruning — Old messages fall off
- Per-user isolation — Each user has their own memory
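Because the push and the trim are two separate commands, another client can briefly observe more than 100 entries between them. A hedged sketch of one way to tighten this is to send both commands in a single redis-py transaction pipeline. The `remember` and `recall` helpers and the one-day idle TTL are assumptions made for illustration.

```python
import json
from typing import Any, Dict, List

MAX_MESSAGES = 100          # Window size used in the sliding-window example
MEMORY_TTL_SECONDS = 86400  # Assumed idle timeout of one day


def remember(client: Any, user_id: str, message: Dict[str, Any]) -> None:
    """Append a message and trim the window in one MULTI/EXEC round trip."""
    key = f"memory:{user_id}:messages"
    pipe = client.pipeline(transaction=True)
    pipe.lpush(key, json.dumps(message))
    pipe.ltrim(key, 0, MAX_MESSAGES - 1)
    pipe.expire(key, MEMORY_TTL_SECONDS)  # Idle users eventually free their memory
    pipe.execute()


def recall(client: Any, user_id: str, count: int = 10) -> List[Dict[str, Any]]:
    """Return the `count` most recent messages, oldest first."""
    raw = client.lrange(f"memory:{user_id}:messages", 0, count - 1)
    # LPUSH keeps the newest message at index 0, so reverse for chronological order
    return [json.loads(item) for item in reversed(raw)]
```

Batching the three commands also cuts the network round trips per stored message from three to one.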
Performance Results
We benchmarked our Redis integration:
| Operation | Latency | Throughput |
|---|---|---|
| Session context read | 0.3ms | 10,000/sec |
| Agent coordination write | 0.4ms | 8,000/sec |
| Pub/sub message | 0.1ms | 50,000/sec |
| Memory context retrieval | 0.5ms | 6,000/sec |
Key insight: Redis is fast enough that memory lookups don't impact AI response latency. The LLM call (100-2000ms) dominates; Redis adds <1ms overhead.
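The benchmark harness behind these numbers is not shown. As a rough sketch of how comparable measurements could be taken, the helper below times any zero-argument callable. `measure_latency` is a hypothetical name, and its results depend heavily on network distance, payload size, and concurrency.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(operation: Callable[[], object], iterations: int = 1000) -> Dict[str, float]:
    """Time `operation` repeatedly and summarize latency in milliseconds."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    mean_ms = statistics.mean(samples)
    return {
        "mean_ms": mean_ms,
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[min(len(samples) - 1, int(len(samples) * 0.99))],
        # Sequential rate on one connection; pipelining or more clients raise it
        "ops_per_sec": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }
```

For example, `measure_latency(lambda: redis_client.hgetall("session:demo"))` would time session reads against a running server, where `session:demo` is a placeholder key. Reporting a tail percentile alongside the mean matters because occasional slow calls are what users notice.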
Lessons Learned
1. Use Appropriate TTLs
Memory should expire. We learned this the hard way:
# Bad: No expiration
redis_client.set(f"session:{session_id}", data)  # Leaks memory!

# Good: Always set TTL
redis_client.set(f"session:{session_id}", data, ex=3600)
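To find keys that were written without a TTL, a periodic audit can help. This is a minimal sketch assuming a redis-py style client whose `ttl` call returns the raw Redis codes; `find_keys_without_ttl` is a hypothetical helper.

```python
from typing import Any, List


def find_keys_without_ttl(client: Any, pattern: str = "session:*") -> List[str]:
    """Return keys matching `pattern` that have no expiration set."""
    leaked = []
    # SCAN iterates incrementally, unlike KEYS, which blocks the server on large datasets
    for key in client.scan_iter(match=pattern, count=500):
        # TTL returns -1 for a key with no expiry and -2 for a key that no longer exists
        if client.ttl(key) == -1:
            leaked.append(key)
    return leaked
```

Running this against a staging instance after a test session is a cheap way to catch a write path that forgot its `ex=` argument.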
2. Namespace Your Keys
With multiple concerns in one Redis instance, namespacing prevents collisions:
# Bad: Flat keys
redis_client.set("context", data)

# Good: Namespaced
redis_client.set(f"empathy:session:{session_id}:context", data)
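Namespacing is easier to enforce when keys are built in one place instead of by hand-written f-strings. The `make_key` helper below is a hypothetical sketch, and rejecting parts that contain a colon is an assumed policy, chosen so that one part cannot spill into another namespace level.

```python
from typing import Union

KEY_PREFIX = "empathy"  # Application namespace, matching the namespaced example


def make_key(*parts: Union[str, int]) -> str:
    """Join non-empty parts with ':' under the application prefix."""
    cleaned = [str(part) for part in parts]
    for part in cleaned:
        # An empty part or an embedded ':' would blur the namespace boundaries
        if not part or ":" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join([KEY_PREFIX, *cleaned])
```

With this in place, `make_key("session", session_id, "context")` replaces the inline f-string, and a typo in the prefix can only happen in one location.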
3. Use Hash Types for Structured Data
Instead of multiple keys, use hashes:
# Bad: Multiple keys
redis_client.set(f"session:{session_id}:user", user_id)
redis_client.set(f"session:{session_id}:project", project)
redis_client.set(f"session:{session_id}:started", timestamp)

# Good: Single hash
redis_client.hset(f"session:{session_id}", mapping={
    "user": user_id,
    "project": project,
    "started": timestamp,
})
4. Graceful Degradation
Redis should be optional for basic functionality:
import logging

import redis

logger = logging.getLogger(__name__)


class MemorySystem:
    def __init__(self, redis_url=None):
        self.redis = None
        if redis_url:
            try:
                client = redis.from_url(redis_url)
                client.ping()  # from_url connects lazily; ping verifies the server
                self.redis = client  # Assign only after the ping succeeds
            except redis.ConnectionError:
                logger.warning("Redis unavailable, using local-only mode")

    def get_context(self, session_id):
        if self.redis:
            return self._get_from_redis(session_id)
        return self._get_from_local(session_id)
This way, students can use the framework without Redis, while teams get full coordination features.
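The local-only path is not shown in the article. As one possible shape for it, the sketch below keeps contexts in an in-process dictionary with per-key expiry, so local mode behaves roughly like the Redis TTLs. `LocalContextStore` is a hypothetical class, not the framework's implementation.

```python
import time
from typing import Any, Dict, Optional, Tuple


class LocalContextStore:
    """In-process stand-in for Redis: a dict with per-key expiry."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, session_id: str, context: Any) -> None:
        # Store the deadline alongside the value, using a monotonic clock
        self._items[session_id] = (time.monotonic() + self._ttl, context)

    def get(self, session_id: str) -> Optional[Any]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        deadline, context = entry
        if time.monotonic() >= deadline:
            # Expire lazily on read, mirroring what a Redis TTL would have done
            del self._items[session_id]
            return None
        return context
```

A store like this is private to one process, so it covers single-user sessions but none of the multi-agent coordination features.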
What's Next
We're exploring additional Redis capabilities:
- Redis Stack + Vector Search — Semantic memory retrieval
- Redis Streams — Durable event logs for audit trails
- Redis Cluster — Scaling for enterprise deployments
- RediSearch — Full-text search over conversation history
Try It Yourself
# Install
pip install empathy-framework
# Start memory server (auto-starts Redis)
empathy-memory serve
# Check status
empathy-memory status
The full source is available: github.com/Smart-AI-Memory/empathy
Conclusion
Redis is the perfect fit for real-time AI memory:
- Fast enough that it doesn't slow down AI interactions
- Simple enough that integration is straightforward
- Powerful enough for multi-agent coordination
If you're building AI systems that need to remember and coordinate, consider Redis as your memory layer.
Built by Smart AI Memory — AI collaboration with persistent memory, powered by Redis.