ava.preprocessing package

Submodules

ava.preprocessing.preprocess module

Make and save syllable spectrograms.

ava.preprocessing.preprocess.get_audio_filenames(audio_dir): Return a sorted list of audio filenames.
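As a rough illustration of what a sorted listing of audio files might look like, the sketch below filters a directory by extension and sorts the result. It is not the library's implementation: the helper names `looks_like_audio` and `list_audio_files` are hypothetical, and so is the assumption that audio files end in `.wav`.

```python
import os
import tempfile


def looks_like_audio(fn):
    """Hypothetical check: assume audio files end in '.wav'."""
    return fn.lower().endswith(".wav")


def list_audio_files(audio_dir):
    """Return sorted full paths of the files in audio_dir that look like audio."""
    names = [fn for fn in os.listdir(audio_dir) if looks_like_audio(fn)]
    return [os.path.join(audio_dir, fn) for fn in sorted(names)]


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("b.wav", "a.wav", "notes.txt"):
        open(os.path.join(d, name), "w").close()
    found = [os.path.basename(f) for f in list_audio_files(d)]
```

Sorting the names before joining them to the directory keeps the order deterministic across platforms, since `os.listdir` makes no ordering guarantee.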

ava.preprocessing.preprocess.get_audio_seg_filenames(audio_dir, segment_dir, p): Return sorted lists of audio and segment filenames.

ava.preprocessing.preprocess.get_syll_specs(onsets, offsets, audio_filename, p)

Return the spectrograms corresponding to onsets and offsets.

Parameters:

onsets (list of floats) – Syllable onsets.
offsets (list of floats) – Syllable offsets.
audio_filename (str) – Audio filename.
p (dict) – A dictionary mapping preprocessing parameters to their values. NOTE: ADD REFERENCE HERE

Returns:

specs (list of {numpy.ndarray, None}) – Spectrograms
valid_syllables (list of int) – Indices of specs containing valid syllables.
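Since specs may contain None entries, a caller would presumably use valid_syllables to pick out the usable spectrograms. The sketch below uses made-up values in place of the real return values (nested lists stand in for numpy arrays), and it assumes that None marks a syllable that could not be processed.

```python
# Made-up stand-ins for the two return values described above.
specs = [[[0.1, 0.2]], None, [[0.3, 0.4]]]
valid_syllables = [0, 2]

# Keep only the spectrograms whose indices are flagged as valid.
valid_specs = [specs[i] for i in valid_syllables]
```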

ava.preprocessing.preprocess.is_audio_file(fn)

ava.preprocessing.preprocess.process_sylls(audio_dir, segment_dir, save_dir, p, shuffle=True, verbose=True)

Extract syllables from audio_dir and save to save_dir.

Parameters:

audio_dir (str) – Directory containing audio files.
segment_dir (str) – Directory containing segmenting decisions.
save_dir (str) – Directory to save processed syllables in.
p (dict) – Preprocessing parameters. TO DO: add reference.
shuffle (bool, optional) – Shuffle by filename. Defaults to True.
verbose (bool, optional) – Defaults to True.

ava.preprocessing.preprocess.read_onsets_offsets_from_file(txt_filename, p)

Read a text file to collect onsets and offsets.

Note

The text file must have two columns separated by whitespace, with # prepended to header and footer lines.
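The file format described in the note can be parsed along the following lines. This is a minimal sketch rather than the library's code: the function name `read_onsets_offsets` is hypothetical, it ignores the `p` argument that the real function takes, and it assumes the first column holds onsets and the second holds offsets.

```python
import os
import tempfile


def read_onsets_offsets(txt_filename):
    """Parse two whitespace-separated columns, skipping '#' lines and blanks."""
    onsets, offsets = [], []
    with open(txt_filename) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            onset, offset = line.split()[:2]
            onsets.append(float(onset))
            offsets.append(float(offset))
    return onsets, offsets


# Write a small example file with a header and a footer, then read it back.
example = "# onset offset\n0.10 0.25\n0.40\t0.62\n# end of file\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(example)
try:
    onsets, offsets = read_onsets_offsets(tmp.name)
finally:
    os.remove(tmp.name)
```

Calling `split()` with no arguments accepts any mix of spaces and tabs between the two columns.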

ava.preprocessing.preprocess.tune_syll_preprocessing_params(audio_dirs, seg_dirs, p, img_fn='temp.pdf')

Flip through spectrograms and tune preprocessing parameters.

Parameters:

audio_dirs (list of str) – Audio directories.
seg_dirs (list of str) – Segment directories.
p (dict) – Preprocessing parameters. ADD REFERENCE
img_fn (str, optional) – Where to save images. Defaults to `'temp.pdf'`.

Returns:

p (dict) – Adjusted preprocessing parameters.

ava.preprocessing.preprocess.tune_window_preprocessing_params(audio_dirs, p, img_fn='temp.pdf')

Flip through spectrograms and tune preprocessing parameters.

Parameters:

audio_dirs (list of str) – Audio directories.
p (dict) – Preprocessing parameters. ADD REFERENCE
img_fn (str, optional) – Where to save images. Defaults to `'temp.pdf'`.

Returns:

p (dict) – Adjusted preprocessing parameters.

ava.preprocessing.utils module

Useful functions for preprocessing.

ava.preprocessing.utils.get_spec(t1, t2, audio, p, fs=32000, target_freqs=None, target_times=None, fill_value=-1000000000000.0, max_dur=None, remove_dc_offset=True)

Normalize, scale, threshold, stretch, and resize a short-time Fourier transform.

Notes

Is fill_value necessary?
Look at all references and see what can be simplified.
Why is a flag returned?

Parameters:

t1 (float) – Onset time.
t2 (float) – Offset time.
audio (numpy.ndarray) – Raw audio.
p (dict) – Parameters. Must include keys: …
fs (float, optional) – Sample rate. Defaults to 32000.
target_freqs (numpy.ndarray or None, optional) – Interpolated frequencies.
target_times (numpy.ndarray or None, optional) – Interpolated times.
fill_value (float, optional) – Defaults to -1/EPSILON, i.e. -1e12.
max_dur (float, optional) – Maximum duration. Defaults to None.
remove_dc_offset (bool, optional) – Whether to remove any DC offset from the audio. Defaults to True.

Returns:

spec (numpy.ndarray) – Spectrogram.
flag (bool) – True
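To make the roles of t1, t2, fs and remove_dc_offset concrete, the sketch below converts onset and offset times to sample indices and mean-centres the selected samples. It covers only this preliminary step, not the spectrogram computation, and it is not the library's code: the function name `slice_segment` is hypothetical, and removing the offset over the selected segment rather than over the whole recording is an assumption.

```python
def slice_segment(t1, t2, audio, fs=32000, remove_dc_offset=True):
    """Return the samples between t1 and t2 (seconds), optionally mean-centred."""
    # Convert times in seconds to sample indices.
    s1, s2 = int(round(t1 * fs)), int(round(t2 * fs))
    segment = list(audio[s1:s2])
    if remove_dc_offset and segment:
        # Assumption: the DC offset is estimated as the mean of the segment.
        mean = sum(segment) / len(segment)
        segment = [x - mean for x in segment]
    return segment


# One second of a toy signal with a constant offset, sampled at 8 Hz.
audio = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
segment = slice_segment(0.25, 0.75, audio, fs=8)
```

With numpy arrays the same step reduces to slicing and subtracting the array mean; plain lists are used here only to keep the sketch dependency-free.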

Module contents

AVA preprocessing module

Contains

ava.preprocessing.preprocess: Preprocess syllable spectrograms.
ava.preprocessing.utils: Useful functions for preprocessing.