AI Memory: What We Learned About AI Memory Systems After 550 Sessions
AI & Automation · March 28, 2026 · 11 min read · by Matthias Meyer


Our AI system never forgets. Over 1,100 learnings, 180 decisions, and a knowledge graph with 150 entities — how episodic and semantic memory works for AI.

AI assistants forget everything once the conversation ends. Every new session starts from zero -- no context, no experience, no learning curve. For a one-time chat, that's acceptable. For a system that co-runs an agency, it's untenable.

That's why we built our own memory system. After over 550 sessions, 1,100 stored learnings, and 180 documented decisions, we're sharing what worked -- and what didn't.

The Problem: Forgetful AI

Imagine your most important employee forgets everything every morning. Every meeting, every decision, every experience -- gone. You'd have to start from scratch every day.

That's exactly how most AI systems work. No matter how brilliant the answer was -- in the next conversation, the insight has vanished. This makes deployment in real business processes problematic.

Our Approach: Two Types of Memory

The human brain distinguishes between episodic memory (experiences, mistakes, specific situations) and semantic memory (facts, concepts, general knowledge). We've applied the same principle to our AI system.

Episodic Memory

  • Mistakes and their causes
  • Specific incidents and how they were resolved
  • Decisions and their context
  • Patterns that appeared in certain situations

Semantic Memory

  • Architecture knowledge (how the system is built)
  • Infrastructure facts (which server does what)
  • Technology assessments (which tool suits what purpose)
  • Business rules and processes

Auto-classification happens at storage time: a reported bug is automatically classified as episodic, an architecture insight as semantic. At retrieval time, results are filtered -- a Research Agent gets semantic facts, a Critic Agent gets episodic errors.
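A storage-time classifier can be as simple as a keyword heuristic. The sketch below is illustrative only -- the marker lists and the tie-breaking rule are assumptions, not the system's actual logic:

```python
# Minimal sketch of storage-time auto-classification.
# Marker lists are hypothetical; a real system would use an LLM or embeddings.
EPISODIC_MARKERS = {"bug", "error", "incident", "failed", "fixed", "decided"}
SEMANTIC_MARKERS = {"architecture", "server", "uses", "runs", "rule", "process"}

def classify(text: str) -> str:
    """Tag a learning as 'episodic' or 'semantic' at storage time."""
    words = set(text.lower().split())
    episodic = len(words & EPISODIC_MARKERS)
    semantic = len(words & SEMANTIC_MARKERS)
    # Ties default to semantic (general knowledge is the safer bucket).
    return "episodic" if episodic > semantic else "semantic"
```

The same tag then drives retrieval filtering: a Critic Agent queries only `episodic` rows, a Research Agent only `semantic` ones.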

The Architecture

The system is built on PostgreSQL with three search layers:

  1. Vector Search -- 512-dimensional embeddings for semantic similarity. Finds related concepts even with different phrasing.
  2. Trigram Search -- Fuzzy matching for imprecise queries. Finds "the SSL thing" even when stored as "Certbot renewal."
  3. Full-Text Search -- Classic keyword search for German and English content.

All three layers are combined using Reciprocal Rank Fusion. The result: a search that reliably handles both precise queries and vague recollections.
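Reciprocal Rank Fusion itself is compact: each layer contributes 1/(k + rank) per document, and the summed scores are re-ranked. A minimal sketch -- k = 60 is the conventional constant from the RRF literature; the article doesn't state which value the system uses:

```python
# Reciprocal Rank Fusion: merge ranked result lists from the three layers.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears high in two or three lists beats one that tops only a single list, which is exactly why fusion tolerates a weak layer on any given query.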

22 Tables for Structured Knowledge

The memory isn't one large text collection but a structured system of 22 tables. The most important:

  • Sessions -- When work happened, on which project, what was the result
  • Decisions -- What decisions were made, with what reasoning, what alternatives
  • Learnings -- What was learned, in which category, how often it was retrieved
  • Knowledge Graph -- Entities (projects, servers, people, tools) with observations and relationships
  • Skills -- Which capabilities were developed, how often successfully applied
  • Syntheses -- AI-generated summaries from learning clusters
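As a rough illustration, one row of the Learnings table might map to a record like this -- the field names are hypothetical, inferred from the description above:

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical shape of one "learnings" row; field names are assumptions.
@dataclass
class Learning:
    content: str
    category: str              # "episodic" or "semantic"
    created_at: datetime
    retrieval_count: int = 0   # how often it was retrieved (feeds importance)
    importance: float = 0.5    # drives decay speed and lifecycle state
```

Retrieval count and importance living on the row itself is what later makes adaptive decay and lifecycle transitions cheap to compute.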

The Knowledge Graph

Beyond linear memory, we maintain a Knowledge Graph with over 150 entities, 1,300 observations, and 180 relationships. Each entity has a type (project, server, person, tool) and any number of observations with timestamps and confidence scores.

This enables questions like: "Which servers does Project X use?" or "When was Tool Y last updated?" -- without that information existing explicitly in any document.
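A question like "Which servers does Project X use?" reduces to an edge lookup over typed relationships. A minimal in-memory sketch with invented entity names:

```python
# Tiny in-memory stand-in for the graph; names and relations are made up.
relations = [
    ("project-x", "uses", "server-alpha"),
    ("project-x", "uses", "server-beta"),
    ("tool-y", "runs_on", "server-alpha"),
]

def servers_used_by(project: str) -> list[str]:
    """Answer 'which servers does this project use?' via an edge scan."""
    return [dst for src, rel, dst in relations
            if src == project and rel == "uses"]
```

In the real system this is a SQL query over relationship tables rather than a list scan, but the shape of the answer is the same.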

Five Features That Make the Difference

1. Admission Control

Not every piece of information deserves a place in memory. Our admission control system evaluates every new learning with five factors:

  • Novelty -- Does this insight already exist in similar form?
  • Specificity -- Is the information concrete enough to be useful?
  • Source Reliability -- Does it come from a trustworthy source?
  • Consistency -- Does it contradict existing knowledge?
  • Relevance -- Does it fit the current project context?

Information scoring below 0.3 gets rejected. Sounds strict, but it prevents the gradual quality degradation that's inevitable with uncontrolled storage.

2. Importance-Adaptive Decay

Not all memories are equally important. Our system calculates an Importance Score from five factors: retrieval frequency, recency, links to other learnings, user feedback, and propagated importance (similar to PageRank).

The key point: important memories decay up to six times slower than unimportant ones. A fundamental architectural decision stays relevant for months. A debugging workaround loses significance after weeks.
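One way to express "up to six times slower" is to let importance stretch an exponential half-life. The 14-day base half-life below is an assumption for illustration:

```python
# Importance-adaptive decay sketch. The 14-day base half-life is an
# assumption; only the "up to six times slower" factor is from the article.
def decayed_weight(age_days: float, importance: float,
                   base_half_life: float = 14.0) -> float:
    """Exponential decay whose half-life grows with importance in [0, 1].

    importance = 0 -> base half-life; importance = 1 -> six times longer.
    """
    half_life = base_half_life * (1.0 + 5.0 * importance)
    return 0.5 ** (age_days / half_life)
```

An architectural decision with importance near 1.0 keeps most of its weight for months, while a throwaway workaround at importance 0 halves every two weeks.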

3. Lifecycle States

Every learning passes through three states:

  • Active -- Retrieved and ranked normally
  • Ephemeral -- Low importance, demoted in search results
  • Archived -- Removed from standard searches but still findable when needed

Transitions happen automatically based on the Importance Score. A learning can also be reactivated when it's retrieved again.
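The transitions reduce to threshold checks on the Importance Score. The cutoffs below are illustrative -- the article doesn't publish them:

```python
# Lifecycle state sketch. The 0.5 / 0.2 cutoffs are hypothetical.
def lifecycle_state(importance: float) -> str:
    """Map the current Importance Score to a lifecycle state."""
    if importance >= 0.5:
        return "active"
    if importance >= 0.2:
        return "ephemeral"
    return "archived"
```

Reactivation falls out for free: a fresh retrieval raises the score, and the next evaluation moves the learning back up the ladder.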

4. Bi-temporal Relationships

The Knowledge Graph stores not just current facts but past ones too. Every relationship has four timestamps:

  • When the relationship became valid in reality
  • When it became invalid
  • When it was recorded in the system
  • When it was marked as outdated in the system

This enables questions like: "What did we know about Server X on March 15th?" -- not just the current state, but the knowledge state at any point in time.
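Answering "what did we know on March 15th?" means filtering on the two *system* timestamps, not the real-world ones. A sketch with hypothetical field names:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Bi-temporal relationship sketch; field names are assumptions.
@dataclass
class Relationship:
    subject: str
    predicate: str
    obj: str
    valid_from: date              # when it became true in reality
    valid_to: Optional[date]      # when it stopped being true
    recorded_at: date             # when the system learned it
    retracted_at: Optional[date]  # when the system marked it outdated

def known_at(rels: list[Relationship], as_of: date) -> list[Relationship]:
    """The knowledge state on `as_of`: recorded by then, not yet retracted."""
    return [r for r in rels
            if r.recorded_at <= as_of
            and (r.retracted_at is None or r.retracted_at > as_of)]
```

Swapping the filter to `valid_from`/`valid_to` instead answers the other axis: what was true in reality on that date, regardless of when the system found out.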

5. Causal Relationships

Beyond simple relations (A uses B, A belongs to B), the graph supports eight causal relationship types: caused, prevented, triggered, blocked, enabled, and more. Each causal relationship has an evidence field.

This enables chains like: "Decision A led to Problem B, which was prevented by Measure C." These causal chains are traversed automatically.
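Traversing such a chain is a forward walk over typed edges. A sketch with invented node names mirroring the example above:

```python
from collections import deque

# Hypothetical causal edges, each tagged with one of the causal types.
causal_edges = [
    ("decision-a", "caused", "problem-b"),
    ("measure-c", "prevented", "problem-b"),
    ("problem-b", "triggered", "incident-d"),
]

def downstream(node: str, edges=causal_edges) -> list[tuple[str, str]]:
    """Follow causal edges forward from `node` via breadth-first search."""
    seen, out, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for src, rel, dst in edges:
            if src == current and dst not in seen:
                seen.add(dst)
                out.append((rel, dst))
                queue.append(dst)
    return out
```

The evidence field isn't modeled here; in the real graph each edge would carry a pointer to the session or learning that justifies the causal claim.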

What We've Learned

Store Less, Retrieve Better

Our system was initially write-heavy: lots was stored, little was retrieved. The most important insight was that retrieval quality matters more than storage volume. Admission control and intelligent ranking delivered more than any new feature.

Contradiction Detection Is Essential

Over months, contradictory information accumulates. "Server X uses PostgreSQL 14" and three months later "Server X was migrated to PostgreSQL 16" -- both statements are correct, but only the second is current. Automatic contradiction detection and bi-temporal data management solve this problem.
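In miniature, supersession looks like this: a contradicting fact closes the old entry's validity instead of overwriting it. Field names are hypothetical:

```python
from datetime import date

# Supersession sketch: contradictions close the old fact, never delete it.
def record_fact(store: dict, key: tuple, value: str, today: date) -> dict:
    history = store.setdefault(key, [])
    if history and history[-1]["value"] != value:
        history[-1]["valid_to"] = today  # close the now-outdated fact
    if not history or history[-1].get("valid_to"):
        history.append({"value": value, "valid_from": today, "valid_to": None})
    return store
```

Both PostgreSQL versions stay queryable; only the entry with an open `valid_to` counts as current.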

Memory Limits Prevent Drift

Unlimited memory access sounds optimal but causes the agent to lose focus. Fixed limits (maximum three results per query) force the system to return only the most relevant information.

The Numbers

After three months of operation:

  • Over 1,100 stored learnings (episodic and semantic)
  • 181 documented decisions
  • 156 entities in the Knowledge Graph with 1,300 observations
  • 180 relationships between entities
  • 555 tracked sessions
  • 393 automated tests

Conclusion

Building an AI memory system is easier than maintaining one. The real challenge isn't storage but quality control: what gets stored, how long it stays relevant, how quickly it's found.

The combination of episodic and semantic memory, strict admission control, and adaptive decay has transformed our system from a simple knowledge base into a learning memory that becomes more useful every day.

Matthias Meyer

Founder & AI Architect

Full-stack developer with 10+ years of experience in web design and AI systems. Builds AI-ready websites and AI automations for SMBs and agencies.

Tags: ai-memory, knowledge-graph, episodic-memory, semantic-memory