LLM Integration · Agentic AI

LLMs that actually do work. Not just talk.

Function calling and tool use with Claude and GPT. Your AI agent books appointments, sends mail, writes invoices, queries your database, calls external APIs. With guardrails, approval workflows for sensitive actions and a complete audit trail. This is what sets us apart from a chatbot shop.

What you get

Six building blocks for a production-grade LLM agent

An agent in a demo is easy. An agent that works correctly over months and doesn't suddenly send wrong invoices is engineering. We build the full stack, including the safety net.

Tool inventory with permission map: We walk through which actions the agent is allowed to take. Reading is usually okay, writing with approval, deleting never without a human. Per tool we define scope, inputs, rate limits and sandbox behaviour.
Function schemas (OpenAI plus Anthropic): Clean JSON schemas for every tool. Clear descriptions so the agent picks the right tool at the right time. Schema tests catch hallucinated or missing parameters before a call is executed.
Guardrails and approval workflows: What the agent can do alone, what needs approval. Example: booking an appointment without asking, sending an invoice with one-click approval, transferring money never without human sign-off. Configurable per tool.
Multi-tool orchestration: Complex tasks need multiple tools in sequence. We build the orchestration logic plus error handling: when tool 3 fails, the agent knows how to undo tools 1 and 2.
Audit trail and logging: Every agent action is logged with timestamp, input, output, confidence. You can trace why the agent took action Y in situation X. Important for compliance.
Fallback to human: When the agent is unsure, hits complexity or runs into an unusual combination of circumstances, it escalates to you or a defined employee, handing over all data and its reasoning so far.
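A minimal sketch of how a tool inventory with a permission map might look. The tool names, rate limits and the deny-by-default rule are assumptions for illustration, not a fixed part of any setup.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    AUTONOMOUS = "autonomous"    # agent may act alone
    NEEDS_APPROVAL = "approval"  # a human must confirm first
    NEVER = "never"              # the agent does not execute this itself


@dataclass(frozen=True)
class ToolPermission:
    name: str
    policy: Policy
    max_calls_per_hour: int


# Hypothetical inventory; names and limits are illustrative only.
PERMISSIONS = {
    p.name: p
    for p in [
        ToolPermission("read_calendar", Policy.AUTONOMOUS, 600),
        ToolPermission("book_appointment", Policy.AUTONOMOUS, 60),
        ToolPermission("send_invoice", Policy.NEEDS_APPROVAL, 20),
        ToolPermission("transfer_money", Policy.NEVER, 0),
    ]
}


def decide(tool_name: str, calls_this_hour: int) -> Policy:
    """Return the policy for a requested tool call.

    Unknown tools and calls over the rate limit are denied, so the
    default is to refuse rather than to allow (an assumed design choice).
    """
    permission = PERMISSIONS.get(tool_name)
    if permission is None:
        return Policy.NEVER
    if calls_this_hour >= permission.max_calls_per_hour:
        return Policy.NEVER
    return permission.policy
```

With this sketch, decide("send_invoice", 3) yields the approval policy, while a tool that is not in the inventory is refused outright.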

What it's used for

Five concrete agentic AI setups

Travel agency with complex multi-step bookings

Agent books flight plus hotel plus rental car plus transfer, sends confirmation to the customer, creates the booking in the CRM, schedules the follow-up call. All in one conversation with the customer, human steps in only for special cases.

Law firm with client onboarding

Agent registers new client in the accounting system, creates the initial data processing agreement from template, sends the document for signature, schedules the kick-off meeting. Sensitive actions (invoice) need approval from the responsible lawyer.

Online shop with order cancellation

Customer says "cancel my order". Agent cancels in the shop, initiates refund via Stripe or Klarna, notifies the warehouse, writes confirmation mail. All in 30 seconds, no human needed.

Medical practice software with appointment pipeline

Agent registers new patient, checks insurance status via insurance API, books matching slot in the practice calendar, sends SMS reminder 24h before. Escalates to reception for private patients.

Trade business with material ordering

Site manager says "order material X for site Y". Agent checks stock, looks up the part at three suppliers, compares price and delivery time, places the order, schedules delivery in the site plan. Owner approval for orders above 1,000 EUR.
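The supplier comparison and the approval threshold could be sketched roughly as follows. The selection rule (cheapest first, delivery time as tie-breaker) and all names are assumptions; in a real setup the rule comes out of the action inventory.

```python
from dataclasses import dataclass
from decimal import Decimal

# Threshold taken from the example above: approval for orders above 1,000 EUR.
APPROVAL_THRESHOLD_EUR = Decimal("1000")


@dataclass(frozen=True)
class Offer:
    supplier: str
    total_eur: Decimal
    delivery_days: int


def pick_offer(offers: list[Offer]) -> Offer:
    """Cheapest offer wins; delivery time breaks ties (an assumed rule)."""
    return min(offers, key=lambda o: (o.total_eur, o.delivery_days))


def needs_owner_approval(offer: Offer) -> bool:
    """Orders strictly above the threshold go to the owner first."""
    return offer.total_eur > APPROVAL_THRESHOLD_EUR
```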

How it works

Four phases from action inventory to production

01
Action inventory with risk assessment
One week. We walk through all actions the agent could take with you and one to two key employees. Per action: value if right, harm if wrong, reversibility. From there we derive the approval logic.
02
Function schemas and guardrails
We build the function schemas for all approved actions. Each schema gets tests against typical mis-calls (wrong parameters, missing fields, hallucinations). The guardrails configuration defines what the agent does alone, what it does only with approval and what it never does.
03
Sandbox test with real data
We run the agent in a sandbox with test data. You give us 20-50 scenarios (including edge cases), we test each. Success criterion: 90%+ correct actions, 0% harmful actions without approval.
04
Production rollout with audit trail
Agent goes live. Audit trail runs from day one. First 14 days we stay close (daily log review), then we move to weekly monitoring. On drift or anomalies we tune guardrails.
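The sandbox success criterion from phase 03 can be expressed as a small check. This is a sketch under assumptions: the result fields are hypothetical, and how a scenario is judged "correct" or "harmful" is decided per project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    scenario: str
    correct: bool             # agent took the expected action
    harmful_unapproved: bool  # a harmful action ran without approval


def passes_sandbox(results: list[ScenarioResult], min_correct: float = 0.90) -> bool:
    """Apply both criteria: correct-action rate and zero harmful actions."""
    if not results:
        return False  # no evidence, no pass
    correct_rate = sum(r.correct for r in results) / len(results)
    any_harmful = any(r.harmful_unapproved for r in results)
    return correct_rate >= min_correct and not any_harmful
```

Note that a single harmful action without approval fails the run, regardless of how high the correct-action rate is.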

FAQ

Common questions about agentic AI

What's the difference from a normal chatbot?

A chatbot answers with text. An agent executes actions. The chatbot tells you the office hours, the agent actually books you an appointment. Function calling is the technical foundation: the LLM gets tool schemas and can call those tools instead of just generating text. The latest models from Anthropic (Claude) and OpenAI (GPT) handle this well.
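To make "tool schema" concrete, here is a sketch of one tool described once and converted into the two vendor shapes. The tool (book_appointment) and its fields are hypothetical. The wrapper shapes reflect our understanding of the Anthropic Messages API and the OpenAI Chat Completions API and should be checked against the current vendor documentation.

```python
# Vendor-neutral description of one tool, using JSON Schema for the inputs.
TOOL = {
    "name": "book_appointment",
    "description": "Book a free slot in the practice calendar for a patient.",
    "parameters": {
        "type": "object",
        "properties": {
            "patient_id": {"type": "string"},
            "start": {"type": "string", "description": "ISO 8601 start time"},
            "duration_minutes": {"type": "integer"},
        },
        "required": ["patient_id", "start"],
    },
}


def to_anthropic(tool: dict) -> dict:
    """Anthropic-style tool entry: the schema sits under 'input_schema'."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


def to_openai(tool: dict) -> dict:
    """OpenAI-style tool entry: wrapped in a 'function' object."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
        },
    }


def missing_required(tool: dict, arguments: dict) -> list[str]:
    """List required fields the model left out of a proposed call."""
    return [f for f in tool["parameters"]["required"] if f not in arguments]
```

A check like missing_required is the simplest form of a schema test: a call that lacks a required field is rejected before anything is executed.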

Which models do function calling cleanly?

For production we recommend the latest top-tier models from Anthropic (Claude) and OpenAI (GPT). They deliver reliable multi-tool orchestration. Local models can handle simple agent workflows but with higher hallucination rates on complex multi-tool setups. We choose based on data sensitivity and complexity.

What if the agent takes a wrong action?

Three safety layers: 1) guardrails determine which actions are allowed at all, 2) approval workflows require human sign-off for sensitive actions, 3) the audit trail documents every action so it can be traced and, where possible, reversed. Plus: we build reversible operations wherever possible (e.g. soft-delete instead of hard-delete).
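Reversibility in a multi-step task can be sketched as a list of steps that each carry their own undo. This is an illustrative pattern, not a description of a specific implementation; the step names and the log format are assumptions.

```python
from typing import Callable

# One step: (name, action, compensating action that undoes it).
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo completed steps in reverse order.

    Returns a log of what happened, which could feed the audit trail.
    """
    log: list[str] = []
    done: list[Step] = []
    for step in steps:
        name, do, _undo = step
        try:
            do()
        except Exception as exc:
            log.append(f"failed:{name}:{exc}")
            # Compensate in reverse order so later steps are undone first.
            for prev_name, _do, undo in reversed(done):
                undo()
                log.append(f"undone:{prev_name}")
            return log
        done.append(step)
        log.append(f"done:{name}")
    return log
```

In a travel booking, for example, a failed rental car step would trigger the undo of the hotel booking first and the flight booking second.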

How does the approval workflow work in practice?

Example: the agent wants to send an invoice to a customer. Instead of sending it directly, it sends a Slack, Telegram or email message with the invoice details to the responsible employee. They answer *yes* or *no* or *change X*. Only then does the agent execute the action. Approval latency per action: seconds to minutes depending on setup.
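How the approver's reply might be interpreted, as a minimal sketch. The reply format and the rule that anything unclear counts as "no" are assumptions; delivery via Slack, Telegram or email is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    execute: bool
    change_request: Optional[str] = None


def parse_reply(reply: str) -> Decision:
    """Interpret the approver's answer; anything unclear counts as 'no'."""
    text = reply.strip()
    lowered = text.lower()
    if lowered == "yes":
        return Decision(execute=True)
    if lowered.startswith("change "):
        # Keep the requested change so the agent can revise and ask again.
        return Decision(execute=False, change_request=text[len("change "):].strip())
    return Decision(execute=False)
```

Only a Decision with execute set to True lets the action run; a change request sends the agent back to revise the draft and ask again.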

Who's liable when the agent does something wrong?

From a GDPR and liability perspective the agent is a tool and you are the responsible party. That's why guardrails plus approval workflows are so important: they show you took reasonable measures. We document the setup in an auditable form. For specific industries (tax, medical, law) we bring in a specialist lawyer before production; that's part of the package.

What about GDPR with external tool calls?

When the agent calls OpenAI or Anthropic, their data processing agreements plus EU hosting options apply. For API calls to your own systems, data stays in your infrastructure. For third-party APIs (e.g. an external booking system), data goes to that provider. We do a GDPR mapping per tool before production and disclose the data processing agreements involved.

Next step

Free 30-minute intro call.

We look at which workflows you want to automate, which actions are sensitive and whether agentic AI is the right lever for your case. Honest assessment instead of sales pitch.

Other category · Consulting

Potential Analysis

We analyse your business for AI leverage: where does automation pay off, where doesn't it? Written report with recommendations. From €990.

Continue with · Integration

CRM and ERP Connection

Integration with HubSpot, Salesforce, Pipedrive, Shopify, SAP, DATEV or Lexware. Via MCP or REST API, with audit trail. Setup from €2,500.