Show HN: Dikaletus – meeting recording and transcription using Mistral AI

Dikaletus is an open-source meeting agent script that automates recording, transcription, and summarization using FFmpeg, PulseAudio, and the Mistral AI API. It features a TUI, context biasing, speaker diarization, and generates structured Markdown meeting notes.

Key points

  • Leverages Mistral AI's speech-to-text and text generation models for automated meeting processing.
  • Supports live recording and existing audio/video file input with a terminal user interface.
  • Includes advanced transcription settings like context biasing, speaker diarization, and timestamp granularity.
  • Outputs recording, raw transcription, and structured meeting notes in timestamped directories.

Why it matters

This matters because it leverages Mistral AI's speech-to-text and text generation models to automate meeting processing.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

MimosaDev/dikaletus: A meeting agent script to record, transcribe, and summarise meetings using FFmpeg, PulseAudio and the Mistral AI API. - Codeberg.org

Language: R (100%)

Latest commit: ac226fd318 by phillc

Add comprehensive test suite and fix duplicate function bug

Changes:

  • Fix duplicate workflow_help function definition in tui/app.R that was incorrectly placed inside validate_config function
  • Fix line length lint issue in meeting_agent.R (line 388)
  • Remove explicit return() in read_context_bias for implicit return
  • Add 78 new tests across 9 test files covering:
      • Configuration loading (test_load_config.R)
      • Context bias parsing (test_context_bias.R)
      • User input handling (test_read_input.R)
      • Config validation (test_validate_config.R)
      • Output directory selection (test_select_output_dir.R)
      • Workflow functions (test_workflow_functions.R)
      • API integration with mocks (test_api_integration.R)
      • Terminal detection (test_isatty.R)
      • Help display (test_workflow_help.R)

Total test count: 159 tests (all passing)

Test coverage: ~73% of functions (19/26 functions tested)

2026-05-09 10:52:48 +02:00

help: Update default template filename to meeting_blueprint.md (2026-05-09 01:19:58 +02:00)

tests: Add comprehensive test suite and fix duplicate function bug (2026-05-09 10:52:48 +02:00)

tui: Add comprehensive test suite and fix duplicate function bug (2026-05-09 10:52:48 +02:00)

.gitignore: Add Help section to main menu with usage information (2026-05-09 00:57:24 +02:00)

LICENSE: Add README and GPLv3 LICENSE (2026-05-07 21:23:33 +02:00)

meeting_agent.R: Add comprehensive test suite and fix duplicate function bug (2026-05-09 10:52:48 +02:00)

meeting_blueprint.md: Add comprehensive test suite and fix duplicate function bug (2026-05-09 10:52:48 +02:00)

README.md: Update default template filename to meeting_blueprint.md (2026-05-09 01:19:58 +02:00)

Meeting Agent

A script to record, transcribe, and generate structured meeting notes using FFmpeg, PulseAudio and the Mistral AI API.

Overview

The Meeting Agent automates capturing meeting audio, transcribing it, and turning the transcript into structured meeting notes. It records audio from both the microphone and the speaker output, transcribes the recording using Mistral's speech-to-text API, and writes the notes in Markdown format.

Features

Audio Recording: Captures audio from both microphone and speaker outputs using PulseAudio

Transcription: Uses Mistral's voxtral-mini-latest model to transcribe audio

Meeting Notes: Generates structured meeting notes using Mistral's mistral-medium-latest model

Configuration: Saves API key and settings between runs

Flexible Input: Record live audio or process existing audio/video files

Terminal UI: Interactive TUI mode for easy configuration and workflow selection

Breeze Dark Theme: TUI styled with dark background, light text, blue accents, matching KDE Breeze Dark

Context Biasing: Improve transcription accuracy by providing domain-specific terminology

Requirements

System Requirements

Linux with PulseAudio (for audio recording)

Bash shell (for TUI input handling)

FFmpeg (for audio recording and processing)

Standard terminal emulator with 256-color support

R Runtime

R (version 4.0 or later)

Required R Packages

httr - HTTP requests for Mistral API

jsonlite - JSON configuration handling

rmarkdown - Markdown processing

processx - Process management for audio recording

argparse - Command line argument parsing

cli - Terminal UI styling and formatting (for TUI mode)

Installation

  1. Clone or Download

Clone this repository or download the script files.

  2. Install Required R Packages

Rscript -e "install.packages(c('httr', 'jsonlite', 'rmarkdown', 'processx', 'argparse', 'cli'), repos = 'https://cloud.r-project.org')"

  3. Install System Dependencies

Ubuntu/Debian:

sudo apt-get install pulseaudio ffmpeg

Fedora/RHEL:

sudo dnf install pulseaudio ffmpeg

Arch Linux:

sudo pacman -S pulseaudio ffmpeg
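
Before the first run it can help to confirm that the external tools are on the PATH. The following is a minimal sketch and not part of the project: missing_tools is a hypothetical helper, and checking for pactl (PulseAudio's command-line client) is an assumption about what a working PulseAudio installation provides.

```shell
# Hypothetical helper: print each named tool that is NOT found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything missing among the tools the README lists.
# pactl is an assumption; adjust the list to your distribution.
missing=$(missing_tools Rscript ffmpeg pactl)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

An empty result means every listed tool was found; anything printed should be installed with the package manager commands above.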

Usage

Terminal User Interface (Recommended)

Launch the interactive TUI for guided setup and workflow selection:

Rscript meeting_agent.R

The TUI provides:

Main menu with workflow options (Record Audio, Use Existing File, Settings, Exit)

Settings menu for API key, template path, and output directory configuration

Real-time feedback with styled alerts (info, success, warning, danger)

Breeze Dark compatible color scheme

Command Line Mode

First Run (API key required)

Rscript meeting_agent.R --api-key YOUR_API_KEY --record

Subsequent Runs (uses saved config)

Rscript meeting_agent.R --record

Use Existing Audio/Video File

Rscript meeting_agent.R --file /path/to/audio.mp4

Custom Template and Output Directory

Rscript meeting_agent.R --template-path /path/to/meeting_blueprint.md --output-dir /path/to/output --record

Command Line Options

--api-key: Mistral API key (required on first run)

--tui: Launch interactive Terminal User Interface

--no-tui: Force command-line mode (disables TUI even if stdin is a terminal)

--record: Record audio from microphone and speakers

--file: Use specified audio/video file instead of recording

--template-path: Path to Markdown template file

--output-dir: Base path for output directories

--context: Path to context bias text file for improved transcription accuracy

--no-diarize: Disable speaker diarization (enabled by default)

--timestamp-granularities: Timestamp granularity for the transcription: segment (default), word, or none

TUI Configuration

In TUI mode, use the Settings menu to configure:

API Key: Your Mistral AI API key (required for all API calls)

Template Path: Path to your meeting notes template (Markdown format)

Output Directory: Base directory for output files (timestamped subdirectories are created per workflow)

Context Bias File: Path to a text file containing domain-specific terms to improve transcription accuracy

Configuration is saved to config.json and loaded automatically on subsequent runs.

Context Bias File

The context bias feature improves transcription accuracy by providing the Mistral transcription API with domain-specific terminology that is likely to appear in your meetings.

Creating a Context Bias File

Create a plain text file containing words and phrases that are specific to your domain. Each entry should be on its own line or separated by commas. Multi-word entries should use underscores to join words (e.g., project_manager instead of project manager).

Example context bias file (bias.txt):

meeting,minutes,action_item,project_manager,scrum,sprint,retrospective,standup,dikaletus
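
To check a bias file before using it, the format rules above can be sketched as a small normalizer. This is an illustration only, not the script's own parser (the project handles this in R): normalize_bias is a hypothetical helper that splits on commas and newlines, trims whitespace, replaces spaces inside an entry with underscores, and drops empty entries.

```shell
# Hypothetical helper: normalize a context bias file into one
# comma-separated line of underscore-joined terms.
normalize_bias() {
  tr ',' '\n' < "$1" |
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e 's/[[:space:]][[:space:]]*/_/g' |
    grep -v '^$' |
    paste -sd ',' -
}

# Usage (not executed here):
#   normalize_bias bias.txt
```

Running it on a draft file shows how entries such as "action item" would look once written in the underscore form the README asks for.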

Setting the Context Bias File

In TUI Mode:

Select "Settings" from the main menu

Select "Set Context Bias File"

Enter the path to your context bias file

The file path will be saved and used for all subsequent transcriptions

In Command Line Mode:

Rscript meeting_agent.R --context /path/to/bias.txt --record

What Context Bias Does

When a context bias file is configured, its contents are sent to the Mistral /audio/transcriptions API endpoint as the context_bias parameter. This helps the transcription model recognize and accurately transcribe domain-specific terms, proper nouns, technical jargon, and other terminology that might be uncommon in general speech but important for your use case. The model uses this information to bias its vocabulary selection during transcription.
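
For readers who want to see roughly what such a request looks like outside the script, here is a hedged sketch that only builds and prints curl arguments without sending anything. The endpoint path, the context_bias parameter name, and the voxtral-mini-latest model come from this README. The base URL https://api.mistral.ai/v1, the Bearer authorization header, the multipart field names model and file, and sending the bias terms as one comma-separated form field are assumptions. transcription_args is a hypothetical helper.

```shell
# Hypothetical helper: print the curl arguments for a transcription
# request, one per line. Nothing is sent.
transcription_args() {
  api_key=$1
  audio_file=$2
  bias_terms=$3   # comma-separated terms, may be empty

  set -- -sS https://api.mistral.ai/v1/audio/transcriptions \
    -H "Authorization: Bearer $api_key" \
    -F "model=voxtral-mini-latest" \
    -F "file=@$audio_file"

  # Only add the bias field when terms were supplied (assumed encoding).
  if [ -n "$bias_terms" ]; then
    set -- "$@" --form-string "context_bias=$bias_terms"
  fi

  printf '%s\n' "$@"
}

# Usage (not executed here):
#   transcription_args "$MISTRAL_API_KEY" recording.wav "scrum,sprint,dikaletus"
```

The sketch uses --form-string for the bias field so that curl treats the value literally, even if a term happens to start with @ or <.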

Notes

The context bias file is optional. Without it, transcriptions will still work using the standard model vocabulary.

The file must exist at the specified path; if not found, a warning will be displayed but transcription will continue without context biasing.

Context bias phrases are joined with underscores internally, so multi-word entries should use underscores (e.g., action_item).

Diarization and Timestamp Granularities

The Mistral transcription API supports speaker diarization and configurable timestamp granularities to enhance your transcriptions.

Diarization

Speaker diarization identifies different speakers in the audio and attributes speech segments to each speaker in the output.

Default: Enabled (diarize = TRUE)

Purpose: Useful for meetings with multiple participants, interviews, or any multi-speaker audio

Output: When enabled, transcription output includes speaker labels

Timestamp Granularities

Controls the granularity of timestamps in the transcription output.

Default: segment - provides timestamps for each speech segment

Options:

segment - timestamps for each detected speech segment

word - timestamps for each word (most granular)

none - no timestamps (plain text output)

Important Note on Output Format

When timestamp granularity is set to anything other than none (i.e., segment or word), the transcription API returns output in JSON format instead of plain text. This JSON includes the timestamps and, if diarization is enabled, the speaker information. The script handles this by writing the JSON directly to the transcription file. If you need plain text output, set timestamp granularities to none.
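
If you post-process the transcription file yourself, the rule above reduces to a small lookup. The sketch below is illustrative only: transcript_format is a hypothetical helper that maps a granularity value to the format described in this section and rejects unknown values.

```shell
# Hypothetical helper: report which format the transcription file will
# contain for a given timestamp granularity.
transcript_format() {
  case $1 in
    none)         echo text ;;
    segment|word) echo json ;;
    *)
      echo "unknown granularity: $1" >&2
      return 1
      ;;
  esac
}

# Usage (not executed here):
#   transcript_format word
```

A downstream tool could call this first to decide whether to parse transcription.txt as JSON or read it as plain text.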

Setting Diarization and Timestamp Granularities

In TUI Mode:

Select "Settings" from the main menu

Select "Set Diarization" to enable/disable speaker diarization

Select "Set Timestamp Granularities" to choose between segment, word, or none

Settings are saved and applied to all subsequent transcriptions

In Command Line Mode:

# Enable diarization (default)
Rscript meeting_agent.R --record

# Disable diarization
Rscript meeting_agent.R --no-diarize --record

# Set timestamp granularities
Rscript meeting_agent.R --timestamp-granularities word --record

# Disable timestamps (plain text output)
Rscript meeting_agent.R --timestamp-granularities none --record

Workflows

Record Audio Workflow

Select "Record Audio" from main menu

Audio capture is set up using PulseAudio

Recording starts automatically

Press Ctrl+C when finished recording

Audio is transcribed using Mistral API

Meeting notes are generated and saved
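
The recording step can be approximated by hand with FFmpeg's PulseAudio input. This is a sketch under assumptions, not the command the script actually runs: build_record_cmd is a hypothetical helper, the amix filter is one common way to mix two inputs, and the source names depend on your system (a sink's monitor source is typically named after the sink with a .monitor suffix).

```shell
# Hypothetical helper: print (but do not run) an ffmpeg command that
# mixes a microphone source and a speaker monitor source into one file.
build_record_cmd() {
  mic_source=$1       # e.g. "default"
  monitor_source=$2   # e.g. "<sink name>.monitor"
  out_file=$3
  printf 'ffmpeg -f pulse -i %s -f pulse -i %s -filter_complex amix=inputs=2:duration=longest %s\n' \
    "$mic_source" "$monitor_source" "$out_file"
}

# Usage (not executed here); list available sources with:
#   pactl list short sources
#   build_record_cmd default some_sink.monitor recording.wav
```

Printing the command first makes it easy to check the source names before recording; once started, ffmpeg is stopped with Ctrl+C, as in the workflow above.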

Use Existing File Workflow

Select "Use Existing File" from main menu

Enter path to your audio/video file

File is transcribed using Mistral API

Meeting notes are generated and saved

Output

Each workflow run creates a new timestamped directory (format: YYYY-MM-DD_HH-MM-SS) under the configured output directory or current working directory. Output files are also named with the date and timestamp for easy identification:

recording.wav: Audio recording (if recorded live)

transcription.txt: Raw transcription

meeting_notes.md: Structured meeting notes in Markdown format
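
The directory naming scheme is straightforward to reproduce, for example when archiving notes from another tool alongside these runs. A minimal sketch, assuming the YYYY-MM-DD_HH-MM-SS format stated above. make_run_dir is a hypothetical helper, not a function from the project.

```shell
# Hypothetical helper: create a timestamped run directory under a base
# path (default: current directory) and print its path.
make_run_dir() {
  base=${1:-.}
  run_dir="$base/$(date +%Y-%m-%d_%H-%M-%S)"
  mkdir -p "$run_dir" && printf '%s\n' "$run_dir"
}

# Usage (not executed here):
#   out=$(make_run_dir /path/to/output)
#   cp notes.md "$out/meeting_notes.md"
```

Because the timestamp has one-second resolution, two runs started within the same second would share a directory in this sketch.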

Configuration

On first run, the script creates a config.json file to store your API key and settings. This file is automatically loaded on subsequent runs.
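
One plausible way to combine command-line options with saved settings is a simple precedence chain: a value given on the command line wins, then the saved value, then a built-in default. The README does not state the exact precedence the script uses, so treat the order below as an assumption. resolve_setting is a hypothetical helper.

```shell
# Hypothetical helper: pick the first non-empty value among
# command-line value, saved value, and default.
resolve_setting() {
  cli_value=$1
  saved_value=$2
  default_value=$3
  printf '%s\n' "${cli_value:-${saved_value:-$default_value}}"
}

# Usage (not executed here): template path with the documented default.
#   resolve_setting "$cli_template" "$saved_template" meeting_blueprint.md
```

With this order, a one-off option overrides the saved configuration for a single run without needing to edit config.json.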

Template Format

The meeting notes template should be a Markdown file with section headers. The AI will fill in content following the template structure. A default template meeting_blueprint.md is expected in the current directory.

License

This project is open-source and available for use under the GPLv3 License.