agentic chat bot for the matrix protocol
Find a file
2026-05-01 18:56:49 -04:00
docs Update docs for Manifold rebrand and feature changes 2026-05-01 18:56:49 -04:00
src Rebrand to Manifold, drop music/instagram, add docker compose 2026-05-01 15:57:47 -04:00
.dockerignore Rebrand to Manifold, drop music/instagram, add docker compose 2026-05-01 15:57:47 -04:00
.gitignore initial commit 2026-02-28 23:18:26 -05:00
biome.json initial commit 2026-02-28 23:18:26 -05:00
bun.lock initial commit 2026-02-28 23:18:26 -05:00
config.example.json Update docs for Manifold rebrand and feature changes 2026-05-01 18:56:49 -04:00
custom.example.txt initial commit 2026-02-28 23:18:26 -05:00
docker-compose.yml Rebrand to Manifold, drop music/instagram, add docker compose 2026-05-01 15:57:47 -04:00
Dockerfile Rebrand to Manifold, drop music/instagram, add docker compose 2026-05-01 15:57:47 -04:00
package.json initial commit 2026-02-28 23:18:26 -05:00
README.md Update docs for Manifold rebrand and feature changes 2026-05-01 18:56:49 -04:00
tsconfig.json initial commit 2026-02-28 23:18:26 -05:00

Manifold

A Matrix chat bot with LLM integration, moderation tools, and extensible tool execution. Built in TypeScript for the Bun runtime.

One interface, many capabilities.


Features

LLM Chat

Responds to mentions and replies in Matrix rooms using any OpenAI-compatible API. Supports multi-model configurations — separate models for chat, analysis, interest extraction, and summarisation. Maintains per-room conversation history and a configurable context window.

Tool Execution

The LLM can invoke tools automatically based on user prompts:

  • Web search via a self-hosted 4get instance
  • Exa search for semantic/neural web search (with optional --deep flag for multi-query expansion)
  • Exa fetch for fetching and summarising URL contents
  • Profile metadata — fetch display name, pronouns, and timezone on demand
  • Vision — image analysis and description
  • Crypto price lookup — single, multi-coin, or market overview, triggered inline from conversation

Automatically embeds rich metadata for links posted in rooms. Supported platforms: Twitter/X, YouTube, TikTok, Reddit, Twitch clips, Redgifs, and Fediverse (Mastodon, Pleroma, Akkoma, Misskey, Firefish). Media (images, video) is downloaded, processed with Sharp, and re-uploaded to the Matrix content repository with blurhash placeholders.

Moderation

  • Mute/unmute users by setting power level to -1, with auto-restore on expiry
  • Timed mutes with human-readable duration syntax (10s, 5m, 3h, 7d)
  • Room lockdown (!lock / !unlock) for emergency situations
  • Per-room antispam: rapid message detection, duplicate detection, media flood detection, mass-redaction detection
  • Whitelist-based room access control with configurable join enforcement

Harm Reduction

Drug information via the TripSit API: substance info (!drug) and interaction checking (!combo). Non-judgemental, accuracy-focused.

User Analytics

LLM-based extraction of per-user interests from conversation history, stored in PostgreSQL. Room sentiment and mood tracking on a configurable cooldown.

SauceNAO integration for finding image sources (!sauce).

Crypto Prices

CoinGecko price lookups with currency conversion and detailed market data (!crypto).


Requirements

  • Bun >= 1.x (or Docker — see below)
  • PostgreSQL 14+
  • A Matrix homeserver account with an access token
  • An OpenAI-compatible LLM API endpoint (e.g. NanoGPT, OpenRouter, any OpenAI-compatible provider)

Optional integrations (configured per-section in config.json):

  • SauceNAO API key (reverse image search)
  • 4get instance (web search)
  • Exa API key (semantic search + URL fetch)
  • Vision-capable LLM model (image analysis)
  • HTTP proxies (round-robin rotation with circuit breaking)

Setup

A self-contained docker-compose.yml brings up Postgres + the bot:

cp config.example.json config.json
# edit config.json — leave database fields as defaults; compose overrides them
MATRIX_ACCESS_TOKEN=... LLM_API_KEY=... docker compose up -d --build

The bot service mounts ./config.json and ./sys-deepseek.txt read-only and uses DATABASE_HOST=postgres automatically. Postgres data persists in a named volume (pgdata).

Bare metal

git clone <repo-url> manifold
cd manifold
bun install
cp config.example.json config.json
# edit config.json with your credentials
bun run start

The database schema is initialised automatically on first start. No migration tool required.

Environment Overrides

Sensitive and deployment-specific values can be set via environment variables instead of config.json:

Variable Config field
MATRIX_ACCESS_TOKEN matrix.accessToken
LLM_API_KEY llm.apiKey
DATABASE_HOST database.host
DATABASE_PORT database.port
DATABASE_NAME database.database
DATABASE_USER database.user
DATABASE_PASSWORD database.password

Hot Reload

Send SIGHUP to reload config.json and your system prompt without restarting:

kill -HUP <pid>

Configuration

All config lives in config.json (Zod-validated on startup). Copy config.example.json as your starting point.

database

PostgreSQL connection. Standard host, port, database, user, password, maxConnections.

matrix

Field Description
homeserverUrl Your homeserver base URL
userId Full Matrix ID of the bot account (@bot:server)
accessToken Matrix access token
deviceId Optional device ID

llm

Field Description
apiUrl OpenAI-compatible completions endpoint
apiKey API key
model Default chat model
preset temperature, top_p, max_tokens, frequency_penalty, presence_penalty, optional thinking
systemPromptFile Path to system prompt file (e.g. sys-deepseek.txt)
systemPrompt Inline system prompt (used if no file is set)
analysisModel Model for room mood/sentiment analysis
extractionModel Model for interest extraction
summaryModel Model for summarisation
interestAnalysis.model Model for interest tracking
interestAnalysis.cooldown Cooldown between analyses in ms (default: 7200000)
interestAnalysis.maxTokens Max tokens for interest analysis (default: 6000)
maxRetries Max LLM request retries (default: 2)
retryDelay Delay between retries in ms (default: 1000)

bot

Field Description
displayName Bot display name in Matrix
respondToMentions Respond when the bot is @-mentioned (default: true)
respondToReplies Respond to direct replies to bot messages (default: true)
interactiveMode Global default for interactive mode (per-room override via !interactive)
analyticsOnlyMode Only run analytics, no LLM responses (default: false)
commandPrefix Command trigger character (default: !)
adminUsers List of Matrix IDs with full admin access
autoAcceptInvites Auto-join rooms when invited (default: false)
allowedInviters If autoAcceptInvites is true, only accept from these IDs
maxContextMessages Max messages passed to LLM as history (default: 10)
maxBotContextLength Max length of bot messages in context (default: 500)
messageHistoryRetention Messages to retain in memory per room (default: 100)
ignoreBots Matrix IDs to ignore entirely
enableRoomMemory Persist room memory across restarts (default: true)
interestAnalysisInterval Interest analysis cooldown in ms (default: 7200000)
interestAnalysisMessageThreshold Messages before triggering analysis (default: 50)
notifyNonWhitelisted Notify when a non-whitelisted user joins (default: false)
linkPreview.enabled Enable/disable link preview module (default: true)
linkPreview.timeout HTTP timeout for preview fetches in ms (default: 5000)
linkPreview.maxDescriptionLength Truncate descriptions at this length (default: 300)
linkPreview.maxVideoSize Max video download size in bytes (default: 52428800)
linkPreview.platforms Toggle individual platform previewers (twitter, youtube, mastodon, misskey, generic)

exa

Field Description
apiKey Exa API key for semantic/neural web search
maxResults Max results per search (default: 10)
deepSearch.enabled Enable deep search mode (default: false); when enabled, users opt in per-query by appending --deep to their message
deepSearch.maxQueries Max additional query variants the extraction model may generate (default: 3)

fourget

Field Description
apiUrl URL of your self-hosted 4get instance
apiKey Optional API key
maxResults Max web results (default: 5)
imageResults Max image results (default: 1)

vision

Field Description
visionModel Model to use for image analysis
apiKey Optional separate API key (falls back to llm.apiKey)
imageGenModel Model for image generation
imageSize Generated image dimensions (default: 1024x1024)
thumbnailSize Thumbnail resize dimension in px (default: 512)

saucenao

Field Description
apiKey SauceNAO API key for reverse image search

antispam

Field Description
enabled Global antispam toggle (default: false)
messageThreshold Messages per window before mute (default: 5)
messageWindow Rapid message detection window in ms (default: 2000)
muteDuration Auto-mute duration in ms (default: 300000)
redactionThreshold Duplicate messages before redaction (default: 5)
redactionWindow Duplicate detection window in ms (default: 10000)
lockdownThreshold Spammers per window before room lockdown (default: 5)
mediaWindow Media flood detection window in ms (default: 2000)
silentMode Suppress antispam notification messages (default: true)

Per-room overrides are available via !antispam set <field> <value>.

analytics

Field Description
enableSentimentAnalysis Enable sentiment tracking (default: true)
enableRoomMoodTracking Enable room mood analysis (default: true)
enablePsychProfile Enable psychological profiling (default: false)
enableStylometry Enable writing style analysis (default: false)
enableSarcasmDetection Enable sarcasm detection (default: false)
enableGenderEstimation Enable gender estimation (default: false)
enablePersonalityEstimation Enable personality estimation (default: false)
minMessageSampleSize Min messages required for analysis (default: 15)
fetchHistoryOnDemand Fetch history when sample is too small (default: true)
maxHistoryFetch Max messages to fetch for analysis (default: 500)
quickAnalysisModel Optional separate model for quick analyses

rateLimit

Field Description
enabled Enable per-user rate limiting (default: true)
maxRequests Max requests per window (default: 15)
windowMs Rate limit window in ms (default: 86400000 / 24h)

commandTimeout

Field Description
userCooldownMs Per-user command cooldown in ms (default: 5000)
roomThreshold Commands per room before timeout (default: 10)
roomWindowMs Room threshold window in ms (default: 60000)
roomTimeoutMs Room timeout duration in ms (default: 30000)

contextOptimization

Field Description
enabled Enable context window optimization (default: true)
minMessages Minimum messages to keep in context (default: 5)
maxMessages Maximum messages in context (default: 20)
summarizeOldContext Summarise older context (default: false)

proxy

Field Description
enabled Enable proxy rotation (default: false)
strategy Rotation strategy: round-robin, random, or sticky (default: round-robin)
failureWindow Circuit breaker failure window in ms (default: 300000)
maxFailures Failures before circuit opens (default: 3)
exclusions Domains to bypass proxy for
urlBypass URL patterns to bypass proxy for
previewerBypass Per-previewer domain bypass lists
proxies Array of { protocol, host, port, username?, password? } entries

logging

Field Description
level Log level: ERROR, WARN, INFO, DEBUG, TRACE (default: INFO)
colors Coloured log output (default: true)

mediaCacheRetentionDays

Days to retain cached media before cleanup (default: 90).


Commands

Default prefix is !. Configurable via bot.commandPrefix.

Command Access Description
!ping All Check bot latency
!help All List available commands
!drug <substance> All Substance info via TripSit
!combo <sub1> [sub2...] All Drug interaction check
!crypto <symbol> [currency] [--full] All Cryptocurrency price
!search <query> All Web search
!sauce All Reverse image search (reply to an image)
!mute <user> [duration] [reason] Moderator Mute a user
!unmute <user> Moderator Unmute a user
!lock Moderator Enable room lockdown
!unlock Moderator Disable room lockdown
!whitelist <subcommand> Admin Manage room whitelist
!antispam <subcommand> Admin Configure antispam per room
!interactive <on|off> Admin Toggle LLM interactive mode
!status Admin Show bot status and model config
!cwd <room_id> Admin Set working room for admin commands
!loglevel <LEVEL> Admin Set runtime log level
!rooms Admin List all rooms the bot is in
!leave [reason] Admin Leave the current working room
!testpreview <url> Admin Test link preview generation
!clearmediacache [platform|all] confirm Admin Clear media cache
!mediacachestats [platform] Admin Show media cache statistics
!clearpreviewcache <url> Admin Clear preview metadata cache for a URL

Custom Persona

The bot's personality, tone, and domain knowledge are configured in your system prompt file (path set via llm.systemPromptFilesys-deepseek.txt is the conventional name).

The system prompt sent to the LLM is composed of two layers:

┌─────────────────────────────────────────┐
│  BASE PROMPT (hardcoded)                  │  Matrix protocol format, mention syntax,
│                                           │  reply structure, tool injection, security
├─────────────────────────────────────────┤
│  CUSTOM PROMPT (your file)                │  Identity, tone, domain focus, restrictions
└─────────────────────────────────────────┘

You only need to write the identity layer. The operational context is always injected automatically and kept in sync with the code.

See docs/system-prompt-architecture.md for full rationale.


Architecture

src/
├── Bot/               # Matrix client lifecycle, message processing pipeline,
│                      #   mute management, room roster
├── Commands/          # Command registry and handlers
├── Database/          # PostgreSQL connection, schema, per-feature query modules
├── Interests/         # LLM-based interest extraction and room mood analysis
├── Llm/               # OpenAI-compatible LLM client, base prompt
├── Antispam/          # Rapid message, duplicate, media flood detection
├── Tools/             # LLM tool definitions: search, fetch, vision, crypto,
│                      #   profile metadata, query resolution
├── CryptoPrice/       # CoinGecko client, formatter, market overview
├── LinkPreview/       # Link embed generator (platform fetchers, media upload, sender)
│   └── platforms/     #   twitter, youtube, reddit, fediverse, misc
├── Proxy/             # Round-robin HTTP proxy rotation with circuit breaking
├── Tripsit/           # TripSit drug info client
├── cache.ts           # In-memory TTL cache
├── config.ts          # Zod config schema and loader
├── formatter.ts       # Matrix message formatting utilities
├── historyFetcher.ts  # Room message history for LLM context
├── index.ts           # Entry point: init, connect, register commands, graceful shutdown
├── logger.ts          # Structured logger with Matrix SDK noise suppression
├── queue.ts           # Per-room serialised message processing queue
├── resolver.ts        # User ID and room alias resolution
├── utils.ts           # Duration parsing, mention extraction
└── whitelist.ts       # Whitelist enforcement logic

Message Processing Pipeline

Incoming event
    → Deduplication (processed_events table)
    → Antispam check
    → Command routing (!prefix)
    → Link preview detection
    → LLM response (if interactive + mention/reply)
        → Tool execution (search, fetch, profile, vision, crypto)
        → Response formatting + send

Database Schema

Tables: rooms, room_members, processed_events, room_user_interests, global_user_interests, user_notes, reactions, message_edits, redactions, rate_limits, rate_limit_exemptions, global_rate_exemptions, room_whitelist_config, whitelist_entries, room_interactive_mode, room_antispam_config, command_timeouts, room_command_timeouts, active_mutes, media_url_cache, room_interest_analysis_tracking, room_profile_refresh_tracking, profile_fetch_suppressions, room_analysis_tracking, preview_metadata_cache, image_whitelist.


Development

bun run start          # run bot
bun run dev            # run with --watch (auto-restart on file changes)
bun run check          # typecheck + lint (run after every change)
bun run typecheck      # tsc --noEmit only
bun run lint           # biome check --fix --unsafe
bun run format         # biome format --fix