ava.preprocessing package
Submodules
ava.preprocessing.preprocess module
Make and save syllable spectrograms.
-
ava.preprocessing.preprocess.get_audio_filenames(audio_dir)
Return a list of sorted audio files.
-
ava.preprocessing.preprocess.get_audio_seg_filenames(audio_dir, segment_dir, p)
Return lists of sorted filenames.
-
ava.preprocessing.preprocess.get_syll_specs(onsets, offsets, audio_filename, p)
Return the spectrograms corresponding to onsets and offsets.
Parameters: - onsets (list of floats) – Syllable onsets.
- offsets (list of floats) – Syllable offsets.
- audio_filename (str) – Audio filename.
- p (dict) – A dictionary mapping preprocessing parameters to their values. TO DO: add reference.
Returns: - specs (list of {numpy.ndarray, None}) – Spectrograms.
- valid_syllables (list of int) – Indices of specs containing valid syllables.
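The two return values are meant to be used together. The sketch below shows one way to pick out the usable spectrograms. It assumes that an entry of specs is None exactly when the syllable is invalid, which the description above suggests but does not state outright. The helper name keep_valid and the placeholder strings are hypothetical and only stand in for real spectrogram arrays.

```python
def keep_valid(specs, valid_syllables):
    """Return only the entries of specs named by valid_syllables."""
    return [specs[i] for i in valid_syllables]


# Placeholder values stand in for numpy.ndarray spectrograms.
specs = ["spec_0", None, "spec_2"]

# Assumption: an index counts as valid exactly when its entry is not None.
valid_syllables = [i for i, spec in enumerate(specs) if spec is not None]

kept = keep_valid(specs, valid_syllables)
```

Keeping the indices rather than only the filtered list lets the caller match each kept spectrogram back to its onset and offset.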
-
ava.preprocessing.preprocess.process_sylls(audio_dir, segment_dir, save_dir, p, shuffle=True, verbose=True)
Extract syllables from audio_dir and save to save_dir.
Parameters: - audio_dir (str) – Directory containing audio files.
- segment_dir (str) – Directory containing segmenting decisions.
- save_dir (str) – Directory to save processed syllables in.
- p (dict) – Preprocessing parameters. TO DO: add reference.
- shuffle (bool, optional) – Shuffle by filename. Defaults to True.
- verbose (bool, optional) – Defaults to True.
-
ava.preprocessing.preprocess.read_onsets_offsets_from_file(txt_filename, p)
Read a text file to collect onsets and offsets.
Note
- The text file must have two columns separated by whitespace, with # prepended to header and footer lines.
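The file format in the note can be illustrated with a short parser. This is a minimal sketch, not the library's implementation: the function name parse_onsets_offsets is hypothetical, it takes an iterable of lines rather than a filename, it ignores the p argument, and it assumes the first column holds onsets and the second holds offsets.

```python
def parse_onsets_offsets(lines):
    """Collect onsets and offsets from two whitespace-separated columns."""
    onsets, offsets = [], []
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/footer lines, which start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"expected two columns, got: {line!r}")
        # Assumption: column 1 is the onset, column 2 is the offset.
        onsets.append(float(fields[0]))
        offsets.append(float(fields[1]))
    return onsets, offsets


example_lines = [
    "# onset offset",
    "0.10 0.25",
    "0.40\t0.55",
    "# end of file",
]
onsets, offsets = parse_onsets_offsets(example_lines)
```

To read from disk, pass an open file object, for example `with open(txt_filename) as f: parse_onsets_offsets(f)`.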
-
ava.preprocessing.preprocess.tune_syll_preprocessing_params(audio_dirs, seg_dirs, p, img_fn='temp.pdf')
Flip through spectrograms and tune preprocessing parameters.
Parameters: - audio_dirs (list of str) – Audio directories.
- seg_dirs (list of str) – Segment directories.
- p (dict) – Preprocessing parameters. TO DO: add reference.
Returns: p – Adjusted preprocessing parameters.
Return type: dict
-
ava.preprocessing.preprocess.tune_window_preprocessing_params(audio_dirs, p, img_fn='temp.pdf')
Flip through spectrograms and tune preprocessing parameters.
Parameters: - audio_dirs (list of str) – Audio directories.
- p (dict) – Preprocessing parameters. TO DO: add reference.
- img_fn (str, optional) – Where to save images. Defaults to 'temp.pdf'.
Returns: p – Adjusted preprocessing parameters.
Return type: dict
ava.preprocessing.utils module
Useful functions for preprocessing.
-
ava.preprocessing.utils.get_spec(t1, t2, audio, p, fs=32000, target_freqs=None, target_times=None, fill_value=-1000000000000.0, max_dur=None, remove_dc_offset=True)
Normalize, scale, threshold, stretch, and resize a short-time Fourier transform.
Notes
- fill_value necessary?
- Look at all references and see what can be simplified.
- Why is there a flag returned?
Parameters: - t1 (float) – Onset time.
- t2 (float) – Offset time.
- audio (numpy.ndarray) – Raw audio.
- p (dict) – Parameters. Must include keys: …
- fs (float) – Sample rate.
- target_freqs (numpy.ndarray or None, optional) – Interpolated frequencies.
- target_times (numpy.ndarray or None, optional) – Interpolated times.
- fill_value (float, optional) – Defaults to -1/EPSILON.
- max_dur (float, optional) – Maximum duration. Defaults to None.
- remove_dc_offset (bool, optional) – Whether to remove any DC offset from the audio. Defaults to True.
Returns: - spec (numpy.ndarray) – Spectrogram.
- flag (bool) – True