tellme/README.md
Felipe M. 54c3b65d4a
Complete Phase 4: Comprehensive preferences, localization, and UX polish
- Rename application from MenuWhisper to Tell me with new domain com.fmartingr.tellme
- Implement comprehensive preferences window with 6 tabs (General, Models, Text Insertion, Interface, Advanced, Permissions)
- Add full English/Spanish localization for all UI elements
- Create functional onboarding flow with model download capability
- Implement preview dialog for transcription editing
- Add settings export/import functionality
- Fix HUD content display issues and add comprehensive permission checking
- Enhance build scripts and app bundle creation for proper localization support
2025-09-19 13:55:46 +02:00

Tell me

A macOS menu bar application that provides offline speech-to-text transcription using Whisper-family models and automatically inserts the transcribed text into the currently focused application.

Overview

Tell me is a privacy-focused, offline-first speech recognition tool for macOS. It runs entirely on-device on Apple Silicon machines and needs an internet connection only for the initial model downloads.

Key Features

  • 100% Offline Operation: Audio and text never leave your device
  • Apple Silicon Optimized: Built specifically for M1/M2/M3 processors with Metal acceleration
  • Global Hotkey Support: Default ⌘⇧V (configurable)
  • Smart Text Insertion: Clipboard paste with typing fallback
  • Secure Input Detection: Respects password fields and secure contexts
  • Multiple Models: Support for various Whisper model sizes and variants
  • Multilingual: Spanish and English interface and recognition
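
The text insertion and secure input features can be sketched as an ordered list of strategies behind a single guard. This is a minimal illustration, not the project's actual code: `TextInserter`, `InsertionStrategy`, and `InsertionResult` are hypothetical names, and the closures stand in for real system calls. On macOS the guard might be backed by Carbon's `IsSecureEventInputEnabled()`, but that is an assumption.

```swift
// All names below are hypothetical; they are not taken from the Tell me source.
enum InsertionResult: Equatable {
    case inserted(method: String)
    case blockedBySecureInput
    case failed
}

struct InsertionStrategy {
    let name: String
    let insert: (String) -> Bool  // returns true when the text was inserted
}

struct TextInserter {
    // Stand-in for a system check such as Carbon's IsSecureEventInputEnabled() (assumption).
    let isSecureInputActive: () -> Bool
    // Strategies are tried in order, e.g. clipboard paste first, then typing.
    let strategies: [InsertionStrategy]

    func insert(_ text: String) -> InsertionResult {
        // Never inject text while a password field or other secure context is active.
        if isSecureInputActive() {
            return .blockedBySecureInput
        }
        for strategy in strategies {
            if strategy.insert(text) {
                return .inserted(method: strategy.name)
            }
        }
        return .failed
    }
}

// Usage: the paste strategy is simulated as failing, so typing is used instead.
let inserter = TextInserter(
    isSecureInputActive: { false },
    strategies: [
        InsertionStrategy(name: "paste") { _ in false },
        InsertionStrategy(name: "typing") { _ in true },
    ]
)
print(inserter.insert("hello world"))
```

Keeping the secure input check in one place means no strategy can bypass it, and new insertion methods can be added without touching the guard.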

Requirements

  • macOS: 13.0 (Ventura) or later
  • Hardware: Apple Silicon (M1, M2, or M3 processor) - Intel Macs are not supported
  • Xcode: 15.0+ for building from source
  • Permissions: Microphone, Accessibility, and Input Monitoring access
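
As a rough illustration of the OS and hardware requirements above, a startup check could look like the sketch below. The helper names `meetsRequirements` and `builtForArm64` are hypothetical, and the permission checks are left out because they depend on macOS-only frameworks.

```swift
import Foundation

// Hypothetical helper: true when the README's stated requirements are met
// (macOS 13.0 or later, running on Apple Silicon).
func meetsRequirements(os: OperatingSystemVersion, isAppleSilicon: Bool) -> Bool {
    let minimumMajorVersion = 13  // macOS 13.0 (Ventura)
    return isAppleSilicon && os.majorVersion >= minimumMajorVersion
}

// Compile-time architecture check. Native Apple Silicon builds are arm64;
// an x86_64 build running under Rosetta would report false here.
func builtForArm64() -> Bool {
    #if arch(arm64)
    return true
    #else
    return false
    #endif
}

let currentOS = ProcessInfo.processInfo.operatingSystemVersion
let supported = meetsRequirements(os: currentOS, isAppleSilicon: builtForArm64())
print(supported ? "Requirements met" : "Unsupported system")
```

Separating the pure comparison from the system queries keeps the rule easy to test with fixed version values.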

Build Requirements

Development Environment

  • macOS 13+ with Xcode 15.0+
  • Swift 5.9+
  • Swift Package Manager (included with Xcode)

System Dependencies

  • AVFoundation framework (audio capture)
  • Carbon framework (global hotkeys)
  • AppKit/SwiftUI (UI components)

Third-party Dependencies

  • whisper.cpp (C/C++ library for speech recognition with Metal support)

Installation

Note: This project is still in development. Pre-built binaries will be distributed as signed and notarized .dmg files once it is complete.

Building from Source

  1. Clone the repository:

    git clone <repository-url>
    cd tellme
    
  2. Build a release binary with Swift Package Manager:

    swift build -c release
    
  3. For development, open Package.swift in Xcode.

Architecture

The application is structured with modular components:

  • App: SwiftUI interface with AppKit bridges
  • Core/Audio: AVAudioEngine capture and processing
  • Core/STT: Speech-to-text engines (whisper.cpp, future Core ML)
  • Core/Models: Model management and downloads
  • Core/Injection: Text insertion with secure input handling
  • Core/Permissions: System permission management
  • Core/Settings: User preferences and configuration
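
One way to read the Core/STT entry is that the rest of the app talks to a protocol, so the whisper.cpp backend could later be swapped for a Core ML one. The sketch below assumes that design; `SpeechToTextEngine`, `EchoEngine`, and `TranscriptionPipeline` are hypothetical names, and the stand-in engine does no real recognition.

```swift
// Hypothetical protocol: any speech-to-text backend turns PCM samples into text.
protocol SpeechToTextEngine {
    var name: String { get }
    func transcribe(samples: [Float], language: String) throws -> String
}

enum TranscriptionError: Error, Equatable {
    case noAudio
}

// Stand-in backend for illustration only; a real one would wrap whisper.cpp.
struct EchoEngine: SpeechToTextEngine {
    let name = "echo"

    func transcribe(samples: [Float], language: String) throws -> String {
        guard !samples.isEmpty else { throw TranscriptionError.noAudio }
        return "[\(language)] \(samples.count) samples"
    }
}

// The pipeline depends only on the protocol, so backends can be swapped freely.
struct TranscriptionPipeline {
    let engine: any SpeechToTextEngine

    // Returns nil when the engine throws (for example, on empty audio).
    func run(samples: [Float], language: String = "en") -> String? {
        try? engine.transcribe(samples: samples, language: language)
    }
}

let pipeline = TranscriptionPipeline(engine: EchoEngine())
print(pipeline.run(samples: [0.0, 0.1, -0.1]) ?? "no transcription")
```

A stub engine like this also makes it possible to exercise the audio and insertion modules without loading a model.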

Privacy & Security

  • No Telemetry: Zero data collection or remote analytics
  • Local Processing: All audio processing happens on-device
  • Secure Input Respect: Automatically detects and respects secure input contexts
  • Permission-Based: Requires explicit user consent for system access

Development Status

This project is currently in active development following a phased approach:

  • Phase 0: Project scaffolding
  • Phase 1: Hotkey + HUD + Audio capture
  • Phase 2: STT integration with whisper.cpp
  • Phase 3: Text insertion system
  • Phase 4: Preferences and UX polish ⬅️ Most recently completed
  • Phase 5: Distribution and packaging

See TODO.md for detailed development progress and TECHSPEC.md for complete technical specifications.

License

MIT License - see LICENSE for details.

Contributing

This project follows a structured development approach with clear phases and acceptance criteria. Please read TECHSPEC.md and TODO.md before contributing.