agentic chat bot for the matrix protocol

Find a file

ky fc01862d10 Update docs for Manifold rebrand and feature changes		2026-05-01 18:56:49 -04:00
docs	Update docs for Manifold rebrand and feature changes	2026-05-01 18:56:49 -04:00
src	Rebrand to Manifold, drop music/instagram, add docker compose	2026-05-01 15:57:47 -04:00
.dockerignore	Rebrand to Manifold, drop music/instagram, add docker compose	2026-05-01 15:57:47 -04:00
.gitignore	initial commit	2026-02-28 23:18:26 -05:00
biome.json	initial commit	2026-02-28 23:18:26 -05:00
bun.lock	initial commit	2026-02-28 23:18:26 -05:00
config.example.json	Update docs for Manifold rebrand and feature changes	2026-05-01 18:56:49 -04:00
custom.example.txt	initial commit	2026-02-28 23:18:26 -05:00
docker-compose.yml	Rebrand to Manifold, drop music/instagram, add docker compose	2026-05-01 15:57:47 -04:00
Dockerfile	Rebrand to Manifold, drop music/instagram, add docker compose	2026-05-01 15:57:47 -04:00
package.json	initial commit	2026-02-28 23:18:26 -05:00
README.md	Update docs for Manifold rebrand and feature changes	2026-05-01 18:56:49 -04:00
tsconfig.json	initial commit	2026-02-28 23:18:26 -05:00

README.md

Manifold

A Matrix chat bot with LLM integration, moderation tools, and extensible tool execution. Built in TypeScript for the Bun runtime.

One interface, many capabilities.

Features

LLM Chat

Responds to mentions and replies in Matrix rooms using any OpenAI-compatible API. Supports multi-model configurations — separate models for chat, analysis, interest extraction, and summarisation. Maintains per-room conversation history and a configurable context window.

Tool Execution

The LLM can invoke tools automatically based on user prompts:

Web search via a self-hosted 4get instance
Exa search for semantic/neural web search (with optional --deep flag for multi-query expansion)
Exa fetch for fetching and summarising URL contents
Profile metadata — fetch display name, pronouns, and timezone on demand
Vision — image analysis and description
Crypto price lookup — single, multi-coin, or market overview, triggered inline from conversation

Link Previews

Automatically embeds rich metadata for links posted in rooms. Supported platforms: Twitter/X, YouTube, TikTok, Reddit, Twitch clips, Redgifs, and Fediverse (Mastodon, Pleroma, Akkoma, Misskey, Firefish). Media (images, video) is downloaded, processed with Sharp, and re-uploaded to the Matrix content repository with blurhash placeholders.

Moderation

Mute/unmute users by setting power level to -1, with auto-restore on expiry
Timed mutes with human-readable duration syntax (10s, 5m, 3h, 7d)
Room lockdown (!lock / !unlock) for emergency situations
Per-room antispam: rapid message detection, duplicate detection, media flood detection, mass-redaction detection
Whitelist-based room access control with configurable join enforcement

Harm Reduction

Drug information via the TripSit API: substance info (!drug) and interaction checking (!combo). Non-judgemental, accuracy-focused.

User Analytics

LLM-based extraction of per-user interests from conversation history, stored in PostgreSQL. Room sentiment and mood tracking on a configurable cooldown.

Reverse Image Search

SauceNAO integration for finding image sources (!sauce).

Crypto Prices

CoinGecko price lookups with currency conversion and detailed market data (!crypto).

Requirements

Bun >= 1.x (or Docker — see below)
PostgreSQL 14+
A Matrix homeserver account with an access token
An OpenAI-compatible LLM API endpoint (e.g. NanoGPT, OpenRouter, any OpenAI-compatible provider)

Optional integrations (configured per-section in config.json):

SauceNAO API key (reverse image search)
4get instance (web search)
Exa API key (semantic search + URL fetch)
Vision-capable LLM model (image analysis)
HTTP proxies (round-robin rotation with circuit breaking)

Setup

Docker (recommended for testing)

A self-contained docker-compose.yml brings up Postgres + the bot:

cp config.example.json config.json
# edit config.json — leave database fields as defaults; compose overrides them
MATRIX_ACCESS_TOKEN=... LLM_API_KEY=... docker compose up -d --build

The bot service mounts ./config.json and ./sys-deepseek.txt read-only and uses DATABASE_HOST=postgres automatically. Postgres data persists in a named volume (pgdata).

Bare metal

git clone <repo-url> manifold
cd manifold
bun install
cp config.example.json config.json
# edit config.json with your credentials
bun run start

The database schema is initialised automatically on first start. No migration tool required.

Environment Overrides

Sensitive and deployment-specific values can be set via environment variables instead of config.json:

Variable	Config field
`MATRIX_ACCESS_TOKEN`	`matrix.accessToken`
`LLM_API_KEY`	`llm.apiKey`
`DATABASE_HOST`	`database.host`
`DATABASE_PORT`	`database.port`
`DATABASE_NAME`	`database.database`
`DATABASE_USER`	`database.user`
`DATABASE_PASSWORD`	`database.password`

Hot Reload

Send SIGHUP to reload config.json and your system prompt without restarting:

kill -HUP <pid>

Configuration

All config lives in config.json (Zod-validated on startup). Copy config.example.json as your starting point.

`database`

PostgreSQL connection. Standard host, port, database, user, password, maxConnections.

`matrix`

Field	Description
`homeserverUrl`	Your homeserver base URL
`userId`	Full Matrix ID of the bot account (`@bot:server`)
`accessToken`	Matrix access token
`deviceId`	Optional device ID

`llm`

Field	Description
`apiUrl`	OpenAI-compatible completions endpoint
`apiKey`	API key
`model`	Default chat model
`preset`	`temperature`, `top_p`, `max_tokens`, `frequency_penalty`, `presence_penalty`, optional `thinking`
`systemPromptFile`	Path to system prompt file (e.g. `sys-deepseek.txt`)
`systemPrompt`	Inline system prompt (used if no file is set)
`analysisModel`	Model for room mood/sentiment analysis
`extractionModel`	Model for interest extraction
`summaryModel`	Model for summarisation
`interestAnalysis.model`	Model for interest tracking
`interestAnalysis.cooldown`	Cooldown between analyses in ms (default: `7200000`)
`interestAnalysis.maxTokens`	Max tokens for interest analysis (default: `6000`)
`maxRetries`	Max LLM request retries (default: `2`)
`retryDelay`	Delay between retries in ms (default: `1000`)

`bot`

Field	Description
`displayName`	Bot display name in Matrix
`respondToMentions`	Respond when the bot is @-mentioned (default: `true`)
`respondToReplies`	Respond to direct replies to bot messages (default: `true`)
`interactiveMode`	Global default for interactive mode (per-room override via `!interactive`)
`analyticsOnlyMode`	Only run analytics, no LLM responses (default: `false`)
`commandPrefix`	Command trigger character (default: `!`)
`adminUsers`	List of Matrix IDs with full admin access
`autoAcceptInvites`	Auto-join rooms when invited (default: `false`)
`allowedInviters`	If `autoAcceptInvites` is true, only accept from these IDs
`maxContextMessages`	Max messages passed to LLM as history (default: `10`)
`maxBotContextLength`	Max length of bot messages in context (default: `500`)
`messageHistoryRetention`	Messages to retain in memory per room (default: `100`)
`ignoreBots`	Matrix IDs to ignore entirely
`enableRoomMemory`	Persist room memory across restarts (default: `true`)
`interestAnalysisInterval`	Interest analysis cooldown in ms (default: `7200000`)
`interestAnalysisMessageThreshold`	Messages before triggering analysis (default: `50`)
`notifyNonWhitelisted`	Notify when a non-whitelisted user joins (default: `false`)
`linkPreview.enabled`	Enable/disable link preview module (default: `true`)
`linkPreview.timeout`	HTTP timeout for preview fetches in ms (default: `5000`)
`linkPreview.maxDescriptionLength`	Truncate descriptions at this length (default: `300`)
`linkPreview.maxVideoSize`	Max video download size in bytes (default: `52428800`)
`linkPreview.platforms`	Toggle individual platform previewers (`twitter`, `youtube`, `mastodon`, `misskey`, `generic`)

`exa`

Field	Description
`apiKey`	Exa API key for semantic/neural web search
`maxResults`	Max results per search (default: `10`)
`deepSearch.enabled`	Enable deep search mode (default: `false`); when enabled, users opt in per-query by appending `--deep` to their message
`deepSearch.maxQueries`	Max additional query variants the extraction model may generate (default: `3`)

`fourget`

Field	Description
`apiUrl`	URL of your self-hosted 4get instance
`apiKey`	Optional API key
`maxResults`	Max web results (default: `5`)
`imageResults`	Max image results (default: `1`)

`vision`

Field	Description
`visionModel`	Model to use for image analysis
`apiKey`	Optional separate API key (falls back to `llm.apiKey`)
`imageGenModel`	Model for image generation
`imageSize`	Generated image dimensions (default: `1024x1024`)
`thumbnailSize`	Thumbnail resize dimension in px (default: `512`)

`saucenao`

Field	Description
`apiKey`	SauceNAO API key for reverse image search

`antispam`

Field	Description
`enabled`	Global antispam toggle (default: `false`)
`messageThreshold`	Messages per window before mute (default: `5`)
`messageWindow`	Rapid message detection window in ms (default: `2000`)
`muteDuration`	Auto-mute duration in ms (default: `300000`)
`redactionThreshold`	Duplicate messages before redaction (default: `5`)
`redactionWindow`	Duplicate detection window in ms (default: `10000`)
`lockdownThreshold`	Spammers per window before room lockdown (default: `5`)
`mediaWindow`	Media flood detection window in ms (default: `2000`)
`silentMode`	Suppress antispam notification messages (default: `true`)

Per-room overrides are available via !antispam set <field> <value>.

`analytics`

Field	Description
`enableSentimentAnalysis`	Enable sentiment tracking (default: `true`)
`enableRoomMoodTracking`	Enable room mood analysis (default: `true`)
`enablePsychProfile`	Enable psychological profiling (default: `false`)
`enableStylometry`	Enable writing style analysis (default: `false`)
`enableSarcasmDetection`	Enable sarcasm detection (default: `false`)
`enableGenderEstimation`	Enable gender estimation (default: `false`)
`enablePersonalityEstimation`	Enable personality estimation (default: `false`)
`minMessageSampleSize`	Min messages required for analysis (default: `15`)
`fetchHistoryOnDemand`	Fetch history when sample is too small (default: `true`)
`maxHistoryFetch`	Max messages to fetch for analysis (default: `500`)
`quickAnalysisModel`	Optional separate model for quick analyses

`rateLimit`

Field	Description
`enabled`	Enable per-user rate limiting (default: `true`)
`maxRequests`	Max requests per window (default: `15`)
`windowMs`	Rate limit window in ms (default: `86400000` / 24h)

`commandTimeout`

Field	Description
`userCooldownMs`	Per-user command cooldown in ms (default: `5000`)
`roomThreshold`	Commands per room before timeout (default: `10`)
`roomWindowMs`	Room threshold window in ms (default: `60000`)
`roomTimeoutMs`	Room timeout duration in ms (default: `30000`)

`contextOptimization`

Field	Description
`enabled`	Enable context window optimization (default: `true`)
`minMessages`	Minimum messages to keep in context (default: `5`)
`maxMessages`	Maximum messages in context (default: `20`)
`summarizeOldContext`	Summarise older context (default: `false`)

`proxy`

Field	Description
`enabled`	Enable proxy rotation (default: `false`)
`strategy`	Rotation strategy: `round-robin`, `random`, or `sticky` (default: `round-robin`)
`failureWindow`	Circuit breaker failure window in ms (default: `300000`)
`maxFailures`	Failures before circuit opens (default: `3`)
`exclusions`	Domains to bypass proxy for
`urlBypass`	URL patterns to bypass proxy for
`previewerBypass`	Per-previewer domain bypass lists
`proxies`	Array of `{ protocol, host, port, username?, password? }` entries

`logging`

Field	Description
`level`	Log level: `ERROR`, `WARN`, `INFO`, `DEBUG`, `TRACE` (default: `INFO`)
`colors`	Coloured log output (default: `true`)

`mediaCacheRetentionDays`

Days to retain cached media before cleanup (default: 90).

Commands

Default prefix is !. Configurable via bot.commandPrefix.

Command	Access	Description
`!ping`	All	Check bot latency
`!help`	All	List available commands
`!drug <substance>`	All	Substance info via TripSit
`!combo <sub1> [sub2...]`	All	Drug interaction check
`!crypto <symbol> [currency] [--full]`	All	Cryptocurrency price
`!search <query>`	All	Web search
`!sauce`	All	Reverse image search (reply to an image)
`!mute <user> [duration] [reason]`	Moderator	Mute a user
`!unmute <user>`	Moderator	Unmute a user
`!lock`	Moderator	Enable room lockdown
`!unlock`	Moderator	Disable room lockdown
`!whitelist <subcommand>`	Admin	Manage room whitelist
`!antispam <subcommand>`	Admin	Configure antispam per room
`!interactive <on\|off>`	Admin	Toggle LLM interactive mode
`!status`	Admin	Show bot status and model config
`!cwd <room_id>`	Admin	Set working room for admin commands
`!loglevel <LEVEL>`	Admin	Set runtime log level
`!rooms`	Admin	List all rooms the bot is in
`!leave [reason]`	Admin	Leave the current working room
`!testpreview <url>`	Admin	Test link preview generation
`!clearmediacache [platform\|all] confirm`	Admin	Clear media cache
`!mediacachestats [platform]`	Admin	Show media cache statistics
`!clearpreviewcache <url>`	Admin	Clear preview metadata cache for a URL

Custom Persona

The bot's personality, tone, and domain knowledge are configured in your system prompt file (path set via llm.systemPromptFile — sys-deepseek.txt is the conventional name).

The system prompt sent to the LLM is composed of two layers:

┌─────────────────────────────────────────┐
│  BASE PROMPT (hardcoded)                  │  Matrix protocol format, mention syntax,
│                                           │  reply structure, tool injection, security
├─────────────────────────────────────────┤
│  CUSTOM PROMPT (your file)                │  Identity, tone, domain focus, restrictions
└─────────────────────────────────────────┘

You only need to write the identity layer. The operational context is always injected automatically and kept in sync with the code.

See docs/system-prompt-architecture.md for full rationale.

Architecture

src/
├── Bot/               # Matrix client lifecycle, message processing pipeline,
│                      #   mute management, room roster
├── Commands/          # Command registry and handlers
├── Database/          # PostgreSQL connection, schema, per-feature query modules
├── Interests/         # LLM-based interest extraction and room mood analysis
├── Llm/               # OpenAI-compatible LLM client, base prompt
├── Antispam/          # Rapid message, duplicate, media flood detection
├── Tools/             # LLM tool definitions: search, fetch, vision, crypto,
│                      #   profile metadata, query resolution
├── CryptoPrice/       # CoinGecko client, formatter, market overview
├── LinkPreview/       # Link embed generator (platform fetchers, media upload, sender)
│   └── platforms/     #   twitter, youtube, reddit, fediverse, misc
├── Proxy/             # Round-robin HTTP proxy rotation with circuit breaking
├── Tripsit/           # TripSit drug info client
├── cache.ts           # In-memory TTL cache
├── config.ts          # Zod config schema and loader
├── formatter.ts       # Matrix message formatting utilities
├── historyFetcher.ts  # Room message history for LLM context
├── index.ts           # Entry point: init, connect, register commands, graceful shutdown
├── logger.ts          # Structured logger with Matrix SDK noise suppression
├── queue.ts           # Per-room serialised message processing queue
├── resolver.ts        # User ID and room alias resolution
├── utils.ts           # Duration parsing, mention extraction
└── whitelist.ts       # Whitelist enforcement logic

Message Processing Pipeline

Incoming event
    → Deduplication (processed_events table)
    → Antispam check
    → Command routing (!prefix)
    → Link preview detection
    → LLM response (if interactive + mention/reply)
        → Tool execution (search, fetch, profile, vision, crypto)
        → Response formatting + send

Database Schema

Tables: rooms, room_members, processed_events, room_user_interests, global_user_interests, user_notes, reactions, message_edits, redactions, rate_limits, rate_limit_exemptions, global_rate_exemptions, room_whitelist_config, whitelist_entries, room_interactive_mode, room_antispam_config, command_timeouts, room_command_timeouts, active_mutes, media_url_cache, room_interest_analysis_tracking, room_profile_refresh_tracking, profile_fetch_suppressions, room_analysis_tracking, preview_metadata_cache, image_whitelist.

Development

bun run start          # run bot
bun run dev            # run with --watch (auto-restart on file changes)
bun run check          # typecheck + lint (run after every change)
bun run typecheck      # tsc --noEmit only
bun run lint           # biome check --fix --unsafe
bun run format         # biome format --fix