10nmusicbot/spec.md
2025-12-13 05:46:49 +00:00


Discord Music Bot with Web DJ Interface

Project Overview

A self-hosted Discord bot that plays local MP3 files in a voice channel on a continuous loop, paired with a web-based DJ interface that lets users view and control playback.


Implementation Status

Current Status: Phase 1 (MVP) - COMPLETE

Last Updated: 2025-12-12

Phase 1 (MVP) - COMPLETE

All Phase 1 features have been implemented and are fully functional:

Feature | Status | Implementation Notes
Bot auto-join | Complete | Bot automatically joins configured voice channel on startup
Continuous playback | Complete | Loops through all MP3s in music directory, reloads library when empty
Now playing display | Complete | Web UI shows track title, artist, duration with real-time progress bar
Skip track | Complete | Working skip button with WebSocket notification
Pause/Resume | Complete | Full pause/resume functionality with state persistence
Track list | Complete | Full library display with track metadata, duration, and file size
MP3 upload | Complete | Drag-and-drop upload with validation, progress, and batch support
Volume control | Complete | Real-time volume adjustment (0-100%) with visual slider
Shuffle mode | Complete | Queue shuffling with toggle state
Real-time updates | Complete | WebSocket connection for live playback state changes

Technology Stack Chosen:

  • Bot Service: Node.js + discord.js + @discordjs/voice
  • API Backend: Node.js + Express + ws (WebSocket) + multer + music-metadata
  • Web Frontend: React 18 + Vite + Tailwind CSS
  • Deployment: Docker Compose with 3 services

Architecture Decisions:

  • Bot-API Communication: HTTP Internal API (Option A from spec)
  • Music metadata: Extracted on upload and library scan using music-metadata
  • File upload: Multer with configurable size limits and duplicate handling
  • Real-time updates: WebSocket server with 2-second polling of bot state
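
The 2-second polling decision can be sketched as a comparison step: each poll compares the new bot state with the previous one and broadcasts only what changed. This is a minimal sketch, not the project's actual code. The helper names (`diffPlaybackState`, `startPolling`) and the callbacks `fetchBotState` and `broadcast` are assumptions; the state shape follows the PlaybackState model and the event names follow the WebSocket Events section of this spec.

```javascript
// Hypothetical helper: compare two polled bot states and list the
// WebSocket events the API would need to broadcast.
function diffPlaybackState(prev, next) {
  const idOf = (s) => (s && s.currentTrack ? s.currentTrack.id : null);
  const events = [];
  if (!prev || idOf(prev) !== idOf(next)) {
    events.push({ event: 'trackChange', data: { track: next.currentTrack } });
  }
  if (!prev || prev.state !== next.state || prev.position !== next.position) {
    events.push({
      event: 'playbackUpdate',
      data: { state: next.state, position: next.position },
    });
  }
  if (!prev || prev.volume !== next.volume) {
    events.push({ event: 'volumeChange', data: { volume: next.volume } });
  }
  return events;
}

// Poll on an interval (2000 ms by default). `fetchBotState` and `broadcast`
// are assumed callbacks supplied by the API service.
function startPolling(fetchBotState, broadcast, intervalMs = 2000) {
  let last = null;
  return setInterval(async () => {
    const current = await fetchBotState();
    for (const evt of diffPlaybackState(last, current)) broadcast(evt);
    last = current;
  }, intervalMs);
}
```

Keeping the comparison separate from the timer makes the change detection easy to unit test without a running bot.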

Phase 2 - NOT STARTED

The following Phase 2 features are planned but not yet implemented:

Feature | Status | Notes
Queue reordering | Not started | Requires drag-and-drop UI and queue management in bot
Search | Not started | Filter/search tracks in library
Track progress seek | Not started | Seekable progress bar (discord.js limitation may apply)
Previous track | Not started | History tracking and previous button

Challenges for Phase 2:

  • Seek functionality may be limited by Discord.js voice implementation
  • Queue reordering requires persistent queue state

Phase 3 - NOT STARTED

Feature | Status | Notes
Multi-channel support | Not started | Multiple bots or channel switching
User authentication | Not started | Password protection for DJ interface
Discord slash commands | Not started | Control bot via Discord commands
Listening history | Not started | Requires database for logging
Favorites/playlists | Not started | Custom playlist management

Goals

  • Always-on music: Bot joins a designated voice channel and plays music 24/7
  • Local library: All audio files are stored locally on the server (no streaming services)
  • Web control: Browser-based interface for viewing now-playing and controlling the queue
  • Simple deployment: Dockerized for easy VPS hosting

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Docker Host                          │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                   Docker Compose                       │  │
│  │                                                        │  │
│  │  ┌─────────────────┐      ┌─────────────────────────┐ │  │
│  │  │   Bot Service   │◄────►│    Web Backend (API)    │ │  │
│  │  │   (Python/JS)   │      │     (Node/Python)       │ │  │
│  │  └────────┬────────┘      └───────────┬─────────────┘ │  │
│  │           │                           │               │  │
│  │           ▼                           ▼               │  │
│  │  ┌─────────────────┐      ┌─────────────────────────┐ │  │
│  │  │    Discord      │      │    Web Frontend         │ │  │
│  │  │    Voice API    │      │    (React/Vue/Static)   │ │  │
│  │  └─────────────────┘      └─────────────────────────┘ │  │
│  │                                                        │  │
│  │  ┌─────────────────────────────────────────────────┐  │  │
│  │  │              Shared Volume: /music               │  │  │
│  │  │                 (MP3 files)                      │  │  │
│  │  └─────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

Components

1. Discord Bot Service

Responsibilities:

  • Connect to Discord and join a configured voice channel
  • Play MP3 files from local storage
  • Loop through playlist continuously
  • Accept commands from the web backend via internal API or message queue
  • Report current playback state

Technology Options:

  • Python: discord.py with PyNaCl for voice
  • Node.js: discord.js with @discordjs/voice

2. Web Backend (API Server)

Responsibilities:

  • Serve REST API for frontend
  • Provide WebSocket connection for real-time updates
  • Communicate with bot service to control playback
  • Scan and manage the music library

Technology Options:

  • Node.js: Express or Fastify
  • Python: FastAPI or Flask

3. Web Frontend

Responsibilities:

  • Display now-playing information (track name, progress, album art if available)
  • Show upcoming tracks in queue
  • Provide DJ controls (skip, pause, reorder, shuffle)
  • Playlist management

Technology Options:

  • React, Vue, Svelte, or plain HTML/JS
  • Tailwind CSS for styling

4. Shared Storage

  • Docker volume mounted to both bot and backend
  • Contains all MP3 files
  • Optionally: SQLite database for metadata/state

Features

MVP (Phase 1)

Feature | Description
Bot auto-join | Bot joins configured voice channel on startup
Continuous playback | Loops through all MP3s in the music directory
Now playing display | Web UI shows current track name and artist (from ID3 tags)
Skip track | Web UI button to skip to next track
Pause/Resume | Web UI controls for pausing and resuming playback
Track list | Web UI displays full playlist
MP3 upload | Upload MP3 files via web interface

Phase 2

Feature | Description
Queue reordering | Drag-and-drop to reorder upcoming tracks
Shuffle mode | Toggle shuffle on/off
Search | Filter/search tracks in the library
Track progress | Show playback progress bar with seek capability
Album art | Display embedded album art from MP3 metadata
Volume control | Adjust bot playback volume from web UI

Phase 3 (Nice-to-Have)

Feature | Description
Multi-channel support | Configure multiple channels with separate playlists
User authentication | Password-protect the DJ interface
Discord slash commands | Control bot via Discord commands (e.g., /skip, /nowplaying)
Listening history | Log of recently played tracks
Favorites/playlists | Create and manage multiple playlists

API Specification

REST Endpoints

Library

Method | Endpoint | Description
GET | /api/tracks | List all tracks in library
GET | /api/tracks/:id | Get track metadata
GET | /api/tracks/:id/art | Get album art image
POST | /api/tracks/upload | Upload new MP3 file(s)
DELETE | /api/tracks/:id | Delete a track from library
POST | /api/tracks/scan | Rescan music directory for new files

Playback

Method | Endpoint | Description
GET | /api/playback | Get current playback state
POST | /api/playback/play | Resume playback
POST | /api/playback/pause | Pause playback
POST | /api/playback/skip | Skip to next track
POST | /api/playback/previous | Go to previous track
POST | /api/playback/seek | Seek to position (body: { position: seconds })
POST | /api/playback/volume | Set volume (body: { volume: 0-100 })
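
As an illustration of the `{ volume: 0-100 }` body contract, the sketch below validates a request body before it would be forwarded to the bot. The function name `parseVolume` and the error wording are assumptions, and the choice to reject non-numeric values (rather than coerce strings) is one possible policy, not something this spec mandates.

```javascript
// Hypothetical validator for the body of POST /api/playback/volume.
// Accepts only finite numbers in the inclusive range 0-100.
function parseVolume(body) {
  const volume = body ? body.volume : undefined;
  const valid =
    typeof volume === 'number' &&
    Number.isFinite(volume) &&
    volume >= 0 &&
    volume <= 100;
  if (!valid) {
    return { ok: false, error: 'volume must be a number between 0 and 100' };
  }
  // Round so the stored value matches the integer percentage shown in the UI.
  return { ok: true, volume: Math.round(volume) };
}
```

A route handler could respond with HTTP 400 and the `error` string when `ok` is false.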

Queue

Method | Endpoint | Description
GET | /api/queue | Get current queue
POST | /api/queue/add | Add track to queue (body: { trackId })
POST | /api/queue/remove | Remove track from queue (body: { index })
POST | /api/queue/reorder | Reorder queue (body: { from, to })
POST | /api/queue/shuffle | Shuffle the queue
POST | /api/queue/clear | Clear queue (except now playing)
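
One plausible reading of the `{ from, to }` reorder body is "remove the track at index `from`, then reinsert it at index `to`". The sketch below implements that reading; the function name `reorderQueue` and the zero-based indexing are assumptions, since the spec does not pin down the semantics.

```javascript
// Hypothetical implementation of the { from, to } reorder semantics:
// remove the track at index `from` and reinsert it at index `to`.
function reorderQueue(tracks, from, to) {
  const inRange = (i) => Number.isInteger(i) && i >= 0 && i < tracks.length;
  if (!inRange(from) || !inRange(to)) {
    throw new RangeError('from and to must be valid queue indexes');
  }
  const next = tracks.slice(); // do not mutate the caller's array
  const [moved] = next.splice(from, 1);
  next.splice(to, 0, moved);
  return next;
}
```

For example, moving index 0 to index 2 in `['a', 'b', 'c', 'd']` yields `['b', 'c', 'a', 'd']`.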

Bot Status

Method | Endpoint | Description
GET | /api/status | Bot connection status, channel info

WebSocket Events

Server → Client:

{ "event": "trackChange", "data": { "track": {...}, "queue": [...] } }
{ "event": "playbackUpdate", "data": { "state": "playing", "position": 45.2 } }
{ "event": "queueUpdate", "data": { "queue": [...] } }
{ "event": "volumeChange", "data": { "volume": 75 } }
{ "event": "libraryUpdate", "data": { "action": "added", "tracks": [...] } }
{ "event": "libraryUpdate", "data": { "action": "removed", "trackIds": [...] } }
{ "event": "uploadProgress", "data": { "filename": "song.mp3", "progress": 45 } }

Client → Server:

{ "event": "subscribe" }
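
On the client side, the server events above could be folded into UI state with a small reducer. This is a sketch only: the function name `applyServerEvent` and the shape of the UI state object (`currentTrack`, `queue`, `playback`, `position`, `volume`) are assumptions; the event names and payload fields come from the messages listed above.

```javascript
// Hypothetical client-side reducer: fold one server message into UI state.
function applyServerEvent(state, message) {
  const { event, data } = message;
  switch (event) {
    case 'trackChange':
      // A new track starts from the beginning.
      return { ...state, currentTrack: data.track, queue: data.queue, position: 0 };
    case 'playbackUpdate':
      return { ...state, playback: data.state, position: data.position };
    case 'queueUpdate':
      return { ...state, queue: data.queue };
    case 'volumeChange':
      return { ...state, volume: data.volume };
    default:
      return state; // events this sketch does not model are ignored
  }
}
```

In a React frontend, a function of this shape could be passed to `useReducer`, with each parsed WebSocket message dispatched as the action.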

File Upload Specification

Upload Endpoint Details

POST /api/tracks/upload

Accepts multipart/form-data with one or more MP3 files.

Request:

Content-Type: multipart/form-data

file: <binary MP3 data>
-- or for multiple files --
files[]: <binary MP3 data>
files[]: <binary MP3 data>

Response (Success):

{
  "success": true,
  "uploaded": [
    {
      "id": "abc123",
      "filename": "song.mp3",
      "title": "Song Title",
      "artist": "Artist Name",
      "duration": 215
    }
  ],
  "errors": []
}

Response (Partial/Failure):

{
  "success": false,
  "uploaded": [],
  "errors": [
    {
      "filename": "badfile.txt",
      "error": "Invalid file type. Only MP3 files are accepted."
    },
    {
      "filename": "toobig.mp3",
      "error": "File exceeds maximum size of 50MB."
    }
  ]
}

Upload Validation Rules

Rule | Constraint
File type | Must be audio/mpeg MIME type or .mp3 extension
File size | Maximum 50MB per file (configurable)
Filename | Sanitized to remove special characters
Duplicates | Option to skip, overwrite, or rename duplicates
Batch limit | Maximum 20 files per upload request
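
The type, size, and batch rules above can be expressed as one check that produces the `errors` array used in the upload response. This is a sketch under assumptions: the function name `validateBatch`, the `LIMITS` object, and the `{ filename, mimetype, size }` descriptor shape are illustrative, and the batch-limit error wording is invented here. The other two error strings are copied from the example response.

```javascript
const path = require('node:path');

// Assumed defaults mirroring the validation rules (50MB, 20 files).
const LIMITS = { maxBytes: 50 * 1024 * 1024, maxFiles: 20 };

// `files` is an assumed array of { filename, mimetype, size } descriptors.
function validateBatch(files, limits = LIMITS) {
  if (files.length > limits.maxFiles) {
    const error = `Maximum ${limits.maxFiles} files per upload request.`;
    return { accepted: [], errors: [{ filename: null, error }] };
  }
  const accepted = [];
  const errors = [];
  const maxMb = limits.maxBytes / (1024 * 1024);
  for (const file of files) {
    // Rule: audio/mpeg MIME type OR .mp3 extension.
    const isMp3 =
      file.mimetype === 'audio/mpeg' ||
      path.extname(file.filename).toLowerCase() === '.mp3';
    if (!isMp3) {
      errors.push({
        filename: file.filename,
        error: 'Invalid file type. Only MP3 files are accepted.',
      });
    } else if (file.size > limits.maxBytes) {
      errors.push({
        filename: file.filename,
        error: `File exceeds maximum size of ${maxMb}MB.`,
      });
    } else {
      accepted.push(file);
    }
  }
  return { accepted, errors };
}
```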

Upload Processing Pipeline

┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Receive    │────►│   Validate   │────►│   Extract    │────►│    Save      │
│    File      │     │  Type/Size   │     │   Metadata   │     │   to Disk    │
└──────────────┘     └──────────────┘     └──────────────┘     └──────────────┘
                                                                       │
                                                                       ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────────────────────────┐
│   Notify     │◄────│    Index     │◄────│   Generate ID & Store in DB      │
│    Bot       │     │   Library    │     │                                  │
└──────────────┘     └──────────────┘     └──────────────────────────────────┘

Metadata Extraction

On upload, the following ID3 tags are extracted (if present):

  • Title (falls back to filename)
  • Artist
  • Album
  • Duration
  • Album art (stored separately for serving)
  • Track number
  • Year
  • Genre

Recommended library: music-metadata (Node.js) or mutagen (Python)
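
The title fallback rule can be sketched independently of the tag parser. In the sketch below, `tags` is an assumed plain object holding whatever values the metadata library returned, the function name `buildTrackMetadata` is hypothetical, and the `'Unknown Artist'` default is an assumption this spec does not state.

```javascript
const path = require('node:path');

// Hypothetical helper: build display metadata from parsed tags,
// falling back to the filename (without extension) for the title.
function buildTrackMetadata(filename, tags = {}) {
  const fallbackTitle = path.basename(filename, path.extname(filename));
  return {
    title: tags.title || fallbackTitle,
    artist: tags.artist || 'Unknown Artist', // assumed default
    album: tags.album || '',
    duration: Math.round(tags.duration || 0), // whole seconds
    hasArt: Array.isArray(tags.picture) && tags.picture.length > 0,
  };
}
```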

Frontend Upload Component

The web UI should include:

Feature | Description
Drag & drop zone | Drop MP3 files directly onto the page
File browser | Traditional file picker button
Progress indicator | Upload progress bar for each file
Batch upload | Support multiple files at once
Validation feedback | Show errors inline before/during upload
Success confirmation | Show newly added tracks with option to queue

Upload Configuration

# Additional environment variables
MAX_UPLOAD_SIZE_MB=50
MAX_BATCH_SIZE=20
DUPLICATE_HANDLING=rename  # skip | overwrite | rename
ALLOWED_EXTENSIONS=.mp3    # Could expand to .mp3,.flac,.wav
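
A sketch of how the API service might read these variables, falling back to the values shown above when a variable is unset or malformed. The function name `loadUploadConfig` and the returned field names are assumptions.

```javascript
// Hypothetical loader: read upload settings from an env-like object,
// using the defaults from the configuration listing when values are
// missing or invalid.
function loadUploadConfig(env = process.env) {
  const toPositiveInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
  };
  const modes = ['skip', 'overwrite', 'rename'];
  const mode = (env.DUPLICATE_HANDLING || 'rename').trim().toLowerCase();
  return {
    maxUploadBytes: toPositiveInt(env.MAX_UPLOAD_SIZE_MB, 50) * 1024 * 1024,
    maxBatchSize: toPositiveInt(env.MAX_BATCH_SIZE, 20),
    duplicateHandling: modes.includes(mode) ? mode : 'rename',
    allowedExtensions: (env.ALLOWED_EXTENSIONS || '.mp3')
      .split(',')
      .map((ext) => ext.trim().toLowerCase())
      .filter(Boolean),
  };
}
```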

Data Models

Track

interface Track {
  id: string;           // Unique identifier (hash of filepath or UUID)
  filename: string;     // Original filename
  filepath: string;     // Path relative to music directory
  title: string;        // From ID3 tag or filename
  artist: string;       // From ID3 tag
  album: string;        // From ID3 tag
  duration: number;     // Duration in seconds
  hasArt: boolean;      // Whether album art is available
}
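
The `id` comment mentions a hash of the file path as one option. A minimal sketch of that option follows; the function name `trackIdFromPath`, the choice of SHA-1, and the 12-character truncation are all assumptions made for illustration.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical ID scheme: a short, stable hash of the path relative to
// the music directory. The same path always yields the same ID.
function trackIdFromPath(relativePath) {
  // Normalize separators so Windows-style and POSIX-style paths agree.
  const normalized = relativePath.replace(/\\/g, '/');
  return createHash('sha1').update(normalized).digest('hex').slice(0, 12);
}
```

A path-derived ID survives library rescans without a database, at the cost of changing whenever a file is renamed or moved; a stored UUID has the opposite trade-off.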

PlaybackState

interface PlaybackState {
  currentTrack: Track | null;
  state: 'playing' | 'paused' | 'stopped';
  position: number;     // Current position in seconds
  volume: number;       // 0-100
}

Queue

interface Queue {
  tracks: Track[];      // Ordered list of upcoming tracks
  history: Track[];     // Recently played (optional)
  loopMode: 'all' | 'one' | 'off';
  shuffled: boolean;
}
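
To make the three `loopMode` values concrete, the sketch below decides what plays when the current track finishes. It is one possible interpretation, not the bot's actual logic: the function name `advanceQueue` is hypothetical, and treating `'all'` as "append the finished track to the end of the queue" is an assumption.

```javascript
// Hypothetical: decide what plays after `current` finishes.
// Returns the next track (or null) and the updated queue.
function advanceQueue(queue, current) {
  if (queue.loopMode === 'one' && current) {
    return { next: current, queue }; // repeat the same track
  }
  const tracks = queue.tracks.slice();
  // In 'all' mode the finished track goes to the back of the queue,
  // so playback never runs dry while at least one track exists.
  if (queue.loopMode === 'all' && current) tracks.push(current);
  const next = tracks.shift() || null;
  const history = current ? [...queue.history, current] : queue.history;
  return { next, queue: { ...queue, tracks, history } };
}
```

With `loopMode: 'off'` and an empty queue, `next` is `null`, which corresponds to the `'stopped'` playback state.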

Configuration

Environment Variables

# Discord
DISCORD_BOT_TOKEN=your_bot_token_here
DISCORD_GUILD_ID=your_server_id
DISCORD_CHANNEL_ID=voice_channel_id

# Paths
MUSIC_DIRECTORY=/music

# Web Server
WEB_PORT=3000
API_PORT=3001

# Optional
ADMIN_PASSWORD=optional_password
LOG_LEVEL=info

Docker Compose Example

version: '3.8'

services:
  bot:
    build: ./bot
    restart: unless-stopped
    environment:
      - DISCORD_BOT_TOKEN=${DISCORD_BOT_TOKEN}
      - DISCORD_GUILD_ID=${DISCORD_GUILD_ID}
      - DISCORD_CHANNEL_ID=${DISCORD_CHANNEL_ID}
      - API_URL=http://api:3001
    volumes:
      - ./music:/music:ro
    depends_on:
      - api

  api:
    build: ./api
    restart: unless-stopped
    ports:
      - "3001:3001"
    environment:
      - MUSIC_DIRECTORY=/music
      - BOT_INTERNAL_URL=http://bot:8080
      - MAX_UPLOAD_SIZE_MB=50
      - MAX_BATCH_SIZE=20
    volumes:
      - ./music:/music          # Read-write for uploads
      - ./data:/data

  web:
    build: ./web
    restart: unless-stopped
    ports:
      - "3000:80"
    depends_on:
      - api

Directory Structure

discord-dj-bot/
├── docker-compose.yml
├── .env
├── .env.example
├── README.md
│
├── bot/
│   ├── Dockerfile
│   ├── package.json (or requirements.txt)
│   └── src/
│       ├── index.js (or main.py)
│       ├── discord/
│       │   ├── client.js
│       │   └── voice.js
│       └── api/
│           └── internal.js
│
├── api/
│   ├── Dockerfile
│   ├── package.json (or requirements.txt)
│   └── src/
│       ├── index.js
│       ├── routes/
│       │   ├── tracks.js
│       │   ├── playback.js
│       │   ├── queue.js
│       │   └── upload.js
│       ├── services/
│       │   ├── library.js
│       │   ├── bot-bridge.js
│       │   └── metadata.js
│       ├── middleware/
│       │   └── upload.js      # Multer/busboy config
│       └── websocket/
│           └── handler.js
│
├── web/
│   ├── Dockerfile
│   ├── package.json
│   ├── public/
│   └── src/
│       ├── App.jsx
│       ├── components/
│       │   ├── NowPlaying.jsx
│       │   ├── Queue.jsx
│       │   ├── Controls.jsx
│       │   ├── TrackList.jsx
│       │   └── UploadZone.jsx
│       └── hooks/
│           ├── useWebSocket.js
│           └── useUpload.js
│
├── music/              # Mount your MP3s here
│   ├── song1.mp3
│   ├── song2.mp3
│   └── ...
│
└── data/               # Persistent data (optional)
    └── library.db

Bot-API Communication

The bot and API need to communicate bidirectionally. Three options were considered:

Option A: HTTP Internal API (chosen for Phase 1)

  • Bot exposes a simple HTTP server on internal network
  • API calls bot endpoints to send commands
  • Bot polls API or uses WebSocket for queue updates
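
The HTTP approach in the bullets above could look roughly like the following on the bot side, using only Node's built-in `http` module. Everything named here is an assumption for illustration: the endpoint paths (`/state`, `/skip`, `/pause`, `/resume`), the `player` object and its methods, and the helper names `routeCommand` and `createBotServer`.

```javascript
const http = require('node:http');

// Hypothetical command table keyed by "METHOD path".
const COMMANDS = {
  'POST /skip': (player) => player.skip(),
  'POST /pause': (player) => player.pause(),
  'POST /resume': (player) => player.resume(),
};

// Pure routing step: map a request to a status code and JSON body.
// `player` is an assumed object exposing skip(), pause(), resume(),
// and getState().
function routeCommand(player, method, url) {
  if (method === 'GET' && url === '/state') {
    return { status: 200, body: player.getState() };
  }
  const command = COMMANDS[`${method} ${url}`];
  if (!command) return { status: 404, body: { error: 'unknown command' } };
  command(player);
  return { status: 200, body: { ok: true } };
}

// Wrap the router in an HTTP server; the caller decides when to listen.
function createBotServer(player) {
  return http.createServer((req, res) => {
    const { status, body } = routeCommand(player, req.method, req.url);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

Calling `createBotServer(player).listen(8080)` would match the `BOT_INTERNAL_URL=http://bot:8080` value in the Docker Compose example. Because the bot service publishes no ports there, the server stays reachable only from the Compose network.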

Option B: Redis Pub/Sub

  • Both services connect to a shared Redis instance
  • Commands and state updates flow through Redis channels

Option C: Shared SQLite + File Watching

  • State stored in SQLite database
  • Bot watches for changes and reacts
  • Simple but less real-time

Security Considerations

  1. Bot Token: Never commit to version control; use environment variables
  2. Web Interface: Consider adding basic auth for public-facing deployments
  3. Internal Network: Bot and API should communicate over Docker internal network only
  4. File Access: Mount music directory as read-only for bot; API needs write access for uploads

Upload Security

  1. File Type Validation: Validate MIME type AND file extension; optionally verify MP3 magic bytes
  2. File Size Limits: Enforce a maximum file size to prevent disk exhaustion attacks (50MB by default, configurable via MAX_UPLOAD_SIZE_MB)
  3. Filename Sanitization: Strip path traversal characters (../, ./), null bytes, and special characters
  4. Rate Limiting: Limit uploads per IP/session to prevent abuse
  5. Disk Space Monitoring: Check available disk space before accepting uploads
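
Items 1 and 3 above can be sketched with two small helpers. These are illustrative only: the function names `sanitizeFilename` and `looksLikeMp3`, the allowed character set, the 100-character cap, and the `'upload'` fallback name are assumptions. The magic-byte check relies on two common cases: an MP3 file normally begins with either an ID3v2 tag (the ASCII bytes "ID3") or an MPEG audio frame, whose header starts with 11 set bits.

```javascript
const path = require('node:path');

// Hypothetical sanitizer: drop directories and null bytes, then replace
// anything outside a conservative character set.
function sanitizeFilename(original) {
  // Keep only the last path segment, whichever separator the client used.
  const base = original.replace(/\0/g, '').split(/[\\/]/).pop();
  const rawExt = path.extname(base);
  const ext = rawExt.toLowerCase().replace(/[^.a-z0-9]/g, '');
  const stem = path
    .basename(base, rawExt)
    .replace(/[^A-Za-z0-9 _-]/g, '_')
    .replace(/^[ _.-]+|[ _.-]+$/g, '') // trim leading/trailing separators
    .slice(0, 100);
  return (stem || 'upload') + ext;
}

// Hypothetical magic-byte check on the first bytes of the file.
function looksLikeMp3(buffer) {
  if (buffer.length < 3) return false;
  if (buffer.toString('latin1', 0, 3) === 'ID3') return true; // ID3v2 tag
  // MPEG frame sync: 0xFF followed by a byte whose top three bits are set.
  return buffer[0] === 0xff && (buffer[1] & 0xe0) === 0xe0;
}
```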

Node.js:

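// Using multer with limits
const multer = require('multer');

const upload = multer({
  dest: '/tmp/uploads', // temporary upload directory
  limits: {
    fileSize: 50 * 1024 * 1024, // 50MB per file
    files: 20                   // maximum files per request
  },
  // Reject anything that is not reported as MP3 by the client
  fileFilter: (req, file, cb) => {
    if (file.mimetype !== 'audio/mpeg') {
      return cb(new Error('Only MP3 files allowed'));
    }
    cb(null, true);
  }
});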

Python:

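# Using FastAPI with validation
from fastapi import HTTPException, UploadFile

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50MB


async def validate_upload(file: UploadFile) -> None:
    """Reject uploads that are not MP3 or that exceed the size limit."""
    if file.content_type != "audio/mpeg":
        raise HTTPException(400, "Only MP3 files allowed")
    # file.size may be None if the size is unknown
    if file.size is not None and file.size > MAX_UPLOAD_BYTES:
        raise HTTPException(400, "File too large")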

Discord Setup Guide

Before deploying the bot, you'll need to create a Discord Application and Bot account. This is free and takes about 5-10 minutes.

Step 1: Create a Discord Application

  1. Go to the Discord Developer Portal
  2. Log in with your Discord account
  3. Click "New Application" (top right)
  4. Enter a name for your application (e.g., "DJ Bot")
  5. Accept the terms of service and click "Create"

You'll land on the "General Information" page. You can optionally add a description and app icon here.

Step 2: Create the Bot User

  1. In the left sidebar, click "Bot"
  2. Click "Add Bot" and confirm
  3. (Optional) Customize the bot's username and avatar—this is what users see in Discord

Important Bot Settings:

Setting | Value | Why
Public Bot | Off (recommended) | Prevents others from adding your bot to their servers
Requires OAuth2 Code Grant | Off | Not needed for this use case
Presence Intent | Off | Not needed
Server Members Intent | Off | Not needed
Message Content Intent | Off | Not needed (unless adding text commands later)

Step 3: Get Your Bot Token

  1. On the "Bot" page, find the "Token" section
  2. Click "Reset Token" (or "Copy" if visible)
  3. Copy and save this token securely; you won't be able to see it again

⚠️ CRITICAL: Never share your bot token or commit it to version control. Anyone with this token has full control of your bot. If leaked, immediately reset it in the Developer Portal.

This token goes in your .env file as DISCORD_BOT_TOKEN.

Step 4: Configure OAuth2 Permissions

  1. In the left sidebar, click "OAuth2" → "URL Generator"
  2. Under "Scopes", select:
    • bot
  3. Under "Bot Permissions", select:
    • Connect — Join voice channels
    • Speak — Play audio in voice channels
    • View Channels — See available channels

Minimal permissions integer: 3146752 (Connect + Speak + View Channels)

The generated URL at the bottom will look like:

https://discord.com/api/oauth2/authorize?client_id=YOUR_CLIENT_ID&permissions=3146752&scope=bot
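
The permissions integer is a bitwise OR of individual permission flags, so it can be checked by hand. In Discord's permission bitfield, View Channels is bit 10, Connect is bit 20, and Speak is bit 21. Connect and Speak alone give 3145728; adding View Channels gives 3146752. The sketch below reproduces that arithmetic; the helper names `permissionsInteger` and `inviteUrl` are hypothetical.

```javascript
// Discord permission flags used in Step 4 (BigInt keeps the math exact).
const PERMISSIONS = {
  VIEW_CHANNEL: 1n << 10n, // 1024
  CONNECT: 1n << 20n,      // 1048576
  SPEAK: 1n << 21n,        // 2097152
};

// Combine named permissions into the integer used in the invite URL.
function permissionsInteger(names) {
  return names.reduce((bits, name) => {
    if (!(name in PERMISSIONS)) throw new Error(`Unknown permission: ${name}`);
    return bits | PERMISSIONS[name];
  }, 0n);
}

// Build the invite URL in the same format as the generated example.
function inviteUrl(clientId, names) {
  const bits = permissionsInteger(names);
  return `https://discord.com/api/oauth2/authorize?client_id=${clientId}&permissions=${bits}&scope=bot`;
}
```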

Step 5: Invite the Bot to Your Server

  1. Copy the generated OAuth2 URL from Step 4
  2. Paste it in your browser
  3. Select the server you want to add the bot to (you need "Manage Server" permission)
  4. Click "Authorize"
  5. Complete the CAPTCHA

The bot should now appear in your server's member list (it will be offline until you run it).

Step 6: Get Server and Channel IDs

You'll need the Guild (server) ID and Voice Channel ID for your configuration.

Enable Developer Mode first:

  1. Open Discord Settings (gear icon)
  2. Go to "App Settings" → "Advanced"
  3. Enable "Developer Mode"

Get the Guild ID:

  1. Right-click on your server icon in the server list
  2. Click "Copy Server ID"

This goes in your .env file as DISCORD_GUILD_ID.

Get the Voice Channel ID:

  1. Right-click on the voice channel where the bot should play music
  2. Click "Copy Channel ID"

This goes in your .env file as DISCORD_CHANNEL_ID.

Summary: What You Need

After completing setup, you should have these three values:

Value | Example | Environment Variable
Bot Token | MTIzNDU2Nzg5MDEyMzQ1Njc4OQ.Gh7K2j.xxxxx... | DISCORD_BOT_TOKEN
Guild ID | 123456789012345678 | DISCORD_GUILD_ID
Channel ID | 987654321098765432 | DISCORD_CHANNEL_ID
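
A startup check for these three values can catch copy-paste mistakes before the bot tries to connect. The sketch below is a heuristic, not an official validation rule: the function name `checkDiscordEnv` is hypothetical, and the assumption that IDs are 17 to 20 decimal digits is based on the examples above rather than on a documented guarantee.

```javascript
// Hypothetical startup check: return a list of human-readable problems
// with the Discord settings, or an empty list if they look plausible.
function checkDiscordEnv(env = process.env) {
  const problems = [];
  if (!env.DISCORD_BOT_TOKEN) problems.push('DISCORD_BOT_TOKEN is missing');
  for (const key of ['DISCORD_GUILD_ID', 'DISCORD_CHANNEL_ID']) {
    const value = env[key];
    if (!value) {
      problems.push(`${key} is missing`);
    } else if (!/^\d{17,20}$/.test(value)) {
      // Assumed format: a long string of decimal digits.
      problems.push(`${key} does not look like a Discord ID`);
    }
  }
  return problems;
}
```

The bot could log each problem and exit early when the list is non-empty, which is easier to diagnose than a failed voice connection.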

Troubleshooting

Issue | Solution
Bot won't connect | Verify token is correct and hasn't been reset
Bot can't join voice | Check bot has Connect permission in that channel
Bot joins but no audio | Check bot has Speak permission; verify audio encoding setup
"Bot is not in this guild" | Confirm Guild ID is correct; re-invite bot if needed
Can't see channel | Bot needs View Channels permission

Discord API Rate Limits

Be aware of Discord's rate limits when developing:

Action | Limit
Global requests | 50 requests/second
Gateway connects | 1 per 5 seconds
Voice state updates | ~5 per minute recommended

The bot libraries (discord.py, discord.js) handle most rate limiting automatically.


Deployment Checklist

Discord Setup (see detailed guide above)

  • Create Discord application at https://discord.com/developers
  • Create bot user and configure settings
  • Copy and securely store bot token
  • Generate OAuth2 invite URL with correct permissions
  • Invite bot to your server
  • Copy Guild ID and Voice Channel ID

Server Setup

  • Provision VPS (recommended: 1GB+ RAM, Ubuntu 22.04+)
  • Install Docker and Docker Compose
  • Configure firewall (allow ports 3000, 3001 or your chosen ports)
  • (Optional) Set up domain name and DNS

Application Deployment

  • Clone repository to VPS
  • Copy .env.example to .env
  • Fill in Discord credentials (token, guild ID, channel ID)
  • Configure upload limits and other settings
  • Upload initial MP3 files to ./music directory
  • Run docker-compose build
  • Run docker-compose up -d
  • Verify bot comes online in Discord
  • Test web interface loads correctly
  • Test playback controls work
  • Test file upload works

Production Hardening (Optional)

  • Set up reverse proxy (nginx/Caddy) with SSL
  • Configure ADMIN_PASSWORD for web interface
  • Set up log rotation
  • Configure automatic container restarts
  • Set up monitoring/alerting (optional)

Future Considerations

  • Horizontal scaling: If needed, the web frontend can scale independently
  • Mobile app: API is designed to support any client
  • Multiple bots: Could extend to support multiple Discord servers
  • Transcoding: Convert non-MP3 files on the fly
  • External storage: Support S3 or other cloud storage for music files