safayavatsal
|
13eb8f20d5
|
feat: Add advanced hallucination detection and confidence scoring system
- Created whisper/enhancements module for enhanced functionality
- Implemented HallucinationDetector with multi-method detection:
  * Pattern-based detection (YouTube artifacts, repetitive phrases)
  * Statistical analysis (compression ratios, log probabilities)
  * Repetition analysis (looping behavior detection)
  * Temporal analysis (silence-based detection)
- Added ConfidenceScorer for comprehensive transcription quality assessment
- Enhanced transcribe() function with new parameters:
  * enhanced_hallucination_detection: Enable advanced detection
  * hallucination_detection_language: Language-specific patterns
  * strict_hallucination_filtering: Strict vs permissive filtering
  * confidence_threshold: Minimum confidence for segments
- Maintains full backward compatibility
- Added CLI arguments for new functionality
Addresses: OpenAI Whisper Discussion #679 - Hallucinations & Repetition Loops
|
2025-10-19 23:30:43 +05:30 |
|
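The four detection methods named in the commit message above can be illustrated with a rough, self-contained sketch. This is not the code from the commit: the function names, the pattern list, and the repetition cutoff are hypothetical and chosen only for illustration. The compression-ratio, log-probability, and no-speech thresholds reuse the defaults of upstream Whisper's `transcribe()` (2.4, -1.0, 0.6), and the segment keys (`text`, `avg_logprob`, `no_speech_prob`) follow the segment dictionaries Whisper returns.

```python
import re
import zlib
from collections import Counter

# Thresholds borrowed from upstream Whisper's transcribe() defaults.
COMPRESSION_RATIO_MAX = 2.4
AVG_LOGPROB_MIN = -1.0
NO_SPEECH_PROB_MAX = 0.6

# Hypothetical artifact patterns; a real list would be larger and per-language.
ARTIFACT_PATTERNS = [
    re.compile(r"thanks? (you )?for watching", re.IGNORECASE),
    re.compile(r"subscribe to (my|our|the) channel", re.IGNORECASE),
]


def compression_ratio(text):
    """Raw UTF-8 size divided by zlib-compressed size; loops compress very well."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def max_ngram_repeats(text, n=3):
    """Return how often the most frequent word n-gram occurs in the text."""
    words = text.lower().split()
    if len(words) < n:
        return 0
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(counts.values())


def hallucination_flags(segment):
    """Return the names of the heuristics that a segment trips (assumed scheme)."""
    text = segment["text"].strip()
    flags = []
    # Pattern-based: known boilerplate phrases that appear over silence or music.
    if any(p.search(text) for p in ARTIFACT_PATTERNS):
        flags.append("pattern")
    # Statistical: highly compressible text and low average log probability.
    if text and compression_ratio(text) > COMPRESSION_RATIO_MAX:
        flags.append("compression")
    if segment.get("avg_logprob", 0.0) < AVG_LOGPROB_MIN:
        flags.append("logprob")
    # Repetition: the same trigram three or more times suggests a loop.
    if max_ngram_repeats(text) >= 3:
        flags.append("repetition")
    # Temporal: text emitted although the model judged the window to be silent.
    if segment.get("no_speech_prob", 0.0) > NO_SPEECH_PROB_MAX:
        flags.append("silence")
    return flags


if __name__ == "__main__":
    looping = {"text": "I don't know. " * 20, "avg_logprob": -0.4, "no_speech_prob": 0.1}
    print(hallucination_flags(looping))
```

A caller could drop any segment with one or more flags for strict filtering, or require several flags before dropping it for a more permissive mode. How the commit itself maps flags to its `strict_hallucination_filtering` and `confidence_threshold` parameters is not stated in the message.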