graphai.core.voice.transcribe module
- class graphai.core.voice.transcribe.WhisperTranscriptionModel
Bases: object
- load_model_whisper()
Lazy-loads a Whisper model into memory.
- get_last_usage()
- unload_model(unload_period=43200.0)
- get_silence_thresholds(strict_silence=False)
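A minimal lifecycle sketch for the methods above. Only the method names and the `unload_period` default come from this page; the no-argument constructor and the meaning of `get_last_usage()` are assumptions:

```python
from graphai.core.voice.transcribe import WhisperTranscriptionModel

model = WhisperTranscriptionModel()  # assumption: no-arg constructor

model.load_model_whisper()     # lazy-loads the Whisper model on first use
print(model.get_last_usage())  # assumption: reports when the model was last used

# Unload the model if it has been idle longer than unload_period
# (default 43200.0 seconds, i.e. 12 hours).
model.unload_model(unload_period=43200.0)
```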
- detect_audio_segment_lang_whisper(input_filename_with_path)
Detects the language of an audio file using a 30-second sample.
- Parameters:
input_filename_with_path – Path to input file
- Returns:
Highest-scoring language code (e.g. ‘en’)
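A hedged call sketch; the audio path is illustrative, and the no-argument constructor is an assumption:

```python
from graphai.core.voice.transcribe import WhisperTranscriptionModel

model = WhisperTranscriptionModel()  # assumption: no-arg constructor
lang = model.detect_audio_segment_lang_whisper("/data/audio/lecture.mp3")  # hypothetical path
print(lang)  # highest-scoring language code, e.g. 'en'
```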
- transcribe_audio_whisper(input_filename_with_path, force_lang=None, verbose=False, strict_silence=False)
Transcribes an audio file using Whisper.
- Parameters:
input_filename_with_path – Path to input file
force_lang – Language of the audio to explicitly feed the model. None results in automatic detection.
verbose – Verbosity of the transcription
strict_silence – Whether silence detection is strict or lenient. Affects the logprob and no-speech thresholds.
- Returns:
‘text’ contains the full transcript, ‘segments’ contains a JSON-like dict of translated segments that can be used as subtitles, and ‘language’ contains the language code.
- Return type:
A dictionary with three keys
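A usage sketch built from the documented parameters and return keys; the file path is hypothetical and the constructor is an assumption:

```python
from graphai.core.voice.transcribe import WhisperTranscriptionModel

model = WhisperTranscriptionModel()  # assumption: no-arg constructor
result = model.transcribe_audio_whisper(
    "/data/audio/lecture.mp3",  # hypothetical path
    force_lang=None,            # None triggers automatic language detection
    verbose=False,
    strict_silence=True,        # tighter logprob / no-speech thresholds
)

print(result["language"])       # detected (or forced) language code
print(result["text"])           # full transcript
for segment in result["segments"]:
    ...  # JSON-like segment dicts, usable as subtitle cues
```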
- graphai.core.voice.transcribe.extract_media_segment(input_filename_with_path, output_filename_with_path, output_token, start, length)
Extracts a segment of a given video or audio file indicated by the starting time and the length.
- Parameters:
input_filename_with_path – Full path of input file
output_filename_with_path – Full path of output file
output_token – Output token
start – Starting timestamp
length – Length of segment
- Returns:
The output token if successful, None otherwise
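A call sketch using the documented parameters and return contract. The paths and token are hypothetical, and treating `start` and `length` as seconds is an assumption (the page only says "timestamp" and "length"):

```python
from graphai.core.voice.transcribe import extract_media_segment

token = extract_media_segment(
    "/data/video/lecture.mp4",       # full path of input file (hypothetical)
    "/data/video/lecture_clip.mp4",  # full path of output file (hypothetical)
    "lecture_clip",                  # output token, returned on success
    start=60.0,                      # assumption: starting timestamp in seconds
    length=30.0,                     # assumption: segment length in seconds
)
if token is None:
    print("Segment extraction failed")
```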
- graphai.core.voice.transcribe.detect_language_retrieve_from_db_and_split(input_dict, file_manager, n_divs=5, segment_length=30)
- graphai.core.voice.transcribe.detect_language_parallel(tokens_dict, i, model, file_manager)
- graphai.core.voice.transcribe.detect_language_callback(token, results_list, force)
- graphai.core.voice.transcribe.transcribe_audio_to_text(input_dict, model, file_manager, strict_silence=False)
- graphai.core.voice.transcribe.transcribe_callback(token, results, force)
- graphai.core.voice.transcribe.cache_lookup_audio_transcript(token)
- graphai.core.voice.transcribe.cache_lookup_audio_fingerprint(token)
- graphai.core.voice.transcribe.cache_lookup_audio_language(token)
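The task-level helpers above carry no docstrings here; the sketch below only shows a plausible cache-then-compute pattern around the cache lookup functions. The token value and the falsy-result-on-miss behaviour are assumptions:

```python
from graphai.core.voice.transcribe import (
    cache_lookup_audio_language,
    cache_lookup_audio_transcript,
)

token = "audio_abc123"  # hypothetical token for a previously processed audio file

# Assumption: lookups return the cached value, or a falsy result on a cache miss.
language = cache_lookup_audio_language(token)
transcript = cache_lookup_audio_transcript(token)
if not transcript:
    ...  # fall back to transcribe_audio_to_text(input_dict, model, file_manager)
```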