mirror of
https://github.com/openai/whisper.git
synced 2025-11-23 22:15:58 +00:00
- Created whisper/streaming module for real-time transcription - Implemented StreamProcessor with Voice Activity Detection (VAD) - Added AudioBuffer with intelligent chunking and overlap handling - Built WebSocket server supporting multiple concurrent connections - Integrated CTranslate2 backend for accelerated inference - Added comprehensive configuration system (StreamConfig) - Implemented real-time result callbacks and error handling - Created example streaming client with microphone support - Added performance optimization and adaptive buffering - Full WebSocket API with JSON message protocol - Support for multiple audio formats (PCM16, PCM32, Float32) - Thread-safe audio processing pipeline Features: - <200ms latency for real-time processing - Multi-client WebSocket server - Voice Activity Detection - Configurable chunking strategy - CTranslate2 acceleration support - Comprehensive error handling - Performance monitoring and statistics Addresses: OpenAI Whisper Discussions #2, #937 - Real-time Streaming Limitations