AI assistants forget everything after the conversation. Every new session starts from zero -- no context, no experience, no learning curve. For a one-time chat, that's acceptable. For a system that co-runs an agency, it's untenable.
That's why we built our own memory system. After over 550 sessions, 1,100 stored learnings, and 180 documented decisions, we're sharing what worked -- and what didn't.
The Problem: Forgetful AI
Imagine your most important employee forgets everything every morning. Every meeting, every decision, every experience -- gone. You'd have to start from scratch every day.
That's exactly how most AI systems work. However brilliant an answer was, the insight has vanished by the next conversation -- which makes deploying them in real business processes problematic.
Our Approach: Two Types of Memory
The human brain distinguishes between episodic memory (experiences, mistakes, specific situations) and semantic memory (facts, concepts, general knowledge). We've applied the same principle to our AI system.
Episodic Memory
- Mistakes and their causes
- Specific incidents and how they were resolved
- Decisions and their context
- Patterns that appeared in certain situations
Semantic Memory
- Architecture knowledge (how the system is built)
- Infrastructure facts (which server does what)
- Technology assessments (which tool suits what purpose)
- Business rules and processes
Auto-classification happens at storage time: a reported bug is automatically classified as episodic, an architecture insight as semantic. At retrieval time, results are filtered -- a Research Agent gets semantic facts, a Critic Agent gets episodic errors.
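The storage-time routing and retrieval-time filtering can be sketched roughly like this. The cue-word heuristic is an illustrative stand-in -- the actual classifier isn't disclosed in this article -- and the agent names mirror the examples above:

```python
# Hypothetical sketch: a simple cue-word heuristic stands in for the
# real (undisclosed) classifier that routes new learnings at storage time.
EPISODIC_CUES = {"bug", "error", "incident", "failed", "fixed", "decided"}

def classify_memory(text: str) -> str:
    """Route a new learning to episodic or semantic storage."""
    words = set(text.lower().split())
    return "episodic" if words & EPISODIC_CUES else "semantic"

# Retrieval-time filter: each agent type sees only one memory type.
AGENT_VIEW = {"research": "semantic", "critic": "episodic"}
```

A reported bug ("Deploy failed because the cert expired") lands in episodic memory; an architecture fact ("PostgreSQL uses MVCC for concurrency control") lands in semantic memory.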
The Architecture
The system is built on PostgreSQL with three search layers:
- Vector Search -- 512-dimensional embeddings for semantic similarity. Finds related concepts even with different phrasing.
- Trigram Search -- Fuzzy matching for imprecise queries. Finds "the SSL thing" even when stored as "Certbot renewal."
- Full-Text Search -- Classic keyword search for German and English content.
All three layers are combined using Reciprocal Rank Fusion. The result: search that handles both precise queries and vague recollections reliably.
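Reciprocal Rank Fusion itself is small enough to show in full. This is the standard formulation with the conventional smoothing constant k=60 (an assumption -- the article doesn't state which constant is used); each search layer contributes a ranked list of IDs:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists via Reciprocal Rank Fusion.

    Each ranking is a list of document IDs, best first. A document's
    fused score is the sum of 1 / (k + rank) over every list it appears
    in, so items ranked well by several layers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector = ["a", "b", "c"]      # vector search results
trigram = ["b", "a", "d"]     # trigram search results
fulltext = ["b", "c", "a"]    # full-text search results
fused = rrf_merge([vector, trigram, fulltext])  # -> ["b", "a", "c", "d"]
```

Because the score only depends on rank positions, RRF needs no tuning to combine layers whose raw scores live on completely different scales.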
22 Tables for Structured Knowledge
The memory isn't one large text collection but a structured system:
- Sessions -- When work happened, on which project, what was the result
- Decisions -- What decisions were made, with what reasoning, what alternatives
- Learnings -- What was learned, in which category, how often it was retrieved
- Knowledge Graph -- Entities (projects, servers, people, tools) with observations and relationships
- Skills -- Which capabilities were developed, how often successfully applied
- Syntheses -- AI-generated summaries from learning clusters
The Knowledge Graph
Beyond linear memory, we maintain a Knowledge Graph with over 150 entities, 1,300 observations, and 180 relationships. Each entity has a type (project, server, person, tool) and any number of observations with timestamps and confidence scores.
This enables questions like: "Which servers does Project X use?" or "When was Tool Y last updated?" -- without that information existing explicitly in any document.
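A query like "Which servers does Project X use?" reduces to filtering relationship triples. This in-memory sketch is illustrative (the real graph lives in PostgreSQL tables, and the entity names are made up):

```python
# Hypothetical triples; in production these are rows in the graph tables.
relations = [
    ("project-x", "runs_on", "server-1"),
    ("project-x", "runs_on", "server-2"),
    ("tool-y", "installed_on", "server-1"),
]

def related(subject: str, predicate: str) -> list[str]:
    """Answer questions like 'which servers does Project X use?'"""
    return [obj for subj, pred, obj in relations
            if subj == subject and pred == predicate]

related("project-x", "runs_on")  # -> ["server-1", "server-2"]
```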
Five Features That Make the Difference
1. Admission Control
Not every piece of information deserves a place in memory. Our admission control system evaluates every new learning with five factors:
- Novelty -- Does this insight already exist in similar form?
- Specificity -- Is the information concrete enough to be useful?
- Source Reliability -- Does it come from a trustworthy source?
- Consistency -- Does it contradict existing knowledge?
- Relevance -- Does it fit the current project context?
Information scoring below 0.3 gets rejected. That sounds strict, but it prevents the gradual quality degradation that uncontrolled storage inevitably brings.
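The gate itself is a small scoring function. The five factor names come from the list above; the equal weighting is an assumption, since the real weights aren't published:

```python
# Factor names from the admission-control list; equal weights are assumed.
FACTORS = ("novelty", "specificity", "source_reliability",
           "consistency", "relevance")

def admit(scores: dict[str, float], threshold: float = 0.3) -> bool:
    """Reject a candidate learning whose mean factor score is too low."""
    mean = sum(scores[f] for f in FACTORS) / len(FACTORS)
    return mean >= threshold
```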
2. Importance-Adaptive Decay
Not all memories are equally important. Our system calculates an Importance Score from five factors: retrieval frequency, recency, links to other learnings, user feedback, and propagated importance (similar to PageRank).
The key point: important memories decay up to six times slower than unimportant ones. A fundamental architectural decision stays relevant for months. A debugging workaround loses significance after weeks.
3. Lifecycle States
Every learning passes through three states:
- Active -- Retrieved and ranked normally
- Ephemeral -- Low importance, demoted in search results
- Archived -- Removed from standard searches but still findable when needed
Transitions happen automatically based on the Importance Score. A learning can also be reactivated when it's retrieved again.
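The transition logic reduces to a small state function. The thresholds here are illustrative -- the article doesn't publish them -- but the reactivation-on-retrieval rule is as described:

```python
def lifecycle_state(importance: float, retrieved_recently: bool) -> str:
    """Map an Importance Score to a lifecycle state.

    Thresholds (0.5, 0.2) are assumed for illustration. A recent
    retrieval reactivates a learning regardless of its score.
    """
    if retrieved_recently or importance >= 0.5:
        return "active"
    if importance >= 0.2:
        return "ephemeral"
    return "archived"
```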
4. Bi-temporal Relationships
The Knowledge Graph stores not just current facts but past ones too. Every relationship has four timestamps:
- When the relationship became valid in reality
- When it became invalid
- When it was recorded in the system
- When it was marked as outdated in the system
This enables questions like: "What did we know about Server X on March 15th?" -- not just the current state, but the knowledge state at any point in time.
5. Causal Relationships
Beyond simple relations (A uses B, A belongs to B), the graph supports eight causal relationship types: caused, prevented, triggered, blocked, enabled, and more. Each causal relationship has an evidence field.
This enables chains like: "Decision A led to Problem B, which was prevented by Measure C." These causal chains are traversed automatically.
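Traversing such a chain is a breadth-first walk over typed edges. The edge types are from the list above; the graph contents are made up to mirror the A/B/C example:

```python
from collections import deque

# Hypothetical causal edges mirroring the example in the text.
causal_edges = {
    "decision-a": [("caused", "problem-b")],
    "measure-c": [("prevented", "problem-b")],
}

def causal_chain(start: str) -> list[tuple[str, str, str]]:
    """Walk outgoing causal edges breadth-first from a start node."""
    seen, chain, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for rel, target in causal_edges.get(node, []):
            chain.append((node, rel, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return chain
```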
What We've Learned
Store Less, Retrieve Better
Our system was initially write-heavy: lots was stored, little was retrieved. The most important insight was that retrieval quality matters more than storage volume. Admission control and intelligent ranking delivered more than any new feature.
Contradiction Detection Is Essential
Over months, contradictory information accumulates. "Server X uses PostgreSQL 14" and three months later "Server X was migrated to PostgreSQL 16" -- both statements are correct, but only the second is current. Automatic contradiction detection and bi-temporal data management solve this problem.
Memory Limits Prevent Drift
Unlimited memory access sounds optimal but causes the agent to lose focus. Fixed limits (maximum three results per query) force the system to return only the most relevant information.
The Numbers
After three months of operation:
- Over 1,100 stored learnings (episodic and semantic)
- 181 documented decisions
- 156 entities in the Knowledge Graph with 1,300 observations
- 180 relationships between entities
- 555 tracked sessions
- 393 automated tests
Conclusion
Building an AI memory system is easier than maintaining one. The real challenge isn't storage but quality control: what gets stored, how long it stays relevant, how quickly it's found.
The combination of episodic and semantic memory, strict admission control, and adaptive decay has transformed our system from a simple knowledge base into a learning memory that becomes more useful every day.
