ava.preprocessing package

Submodules

ava.preprocessing.preprocess module

Make and save syllable spectrograms.

ava.preprocessing.preprocess.get_audio_filenames(audio_dir)[source]

Return a sorted list of audio filenames.
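
As a rough illustration of what such a listing might look like, here is a minimal sketch. It is not the package's implementation: the helper name `list_audio_files`, the `AUDIO_EXTENSIONS` tuple, and the choice of `.wav` as the only audio extension are all assumptions made for the example.

```python
import os
import tempfile

# Assumption: which file extensions count as audio for this sketch.
AUDIO_EXTENSIONS = (".wav",)


def list_audio_files(audio_dir):
    """Return sorted full paths of the audio files in audio_dir (hypothetical helper)."""
    return [
        os.path.join(audio_dir, fn)
        for fn in sorted(os.listdir(audio_dir))
        if fn.lower().endswith(AUDIO_EXTENSIONS)
    ]


# Build a throwaway directory with two audio files and one other file.
with tempfile.TemporaryDirectory() as audio_dir:
    for fn in ("b.wav", "a.wav", "notes.txt"):
        open(os.path.join(audio_dir, fn), "w").close()
    names = [os.path.basename(fn) for fn in list_audio_files(audio_dir)]

print(names)  # ['a.wav', 'b.wav']
```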

ava.preprocessing.preprocess.get_audio_seg_filenames(audio_dir, segment_dir, p)[source]

Return sorted lists of audio filenames and segment filenames.

ava.preprocessing.preprocess.get_syll_specs(onsets, offsets, audio_filename, p)[source]

Return the spectrograms corresponding to onsets and offsets.

Parameters:
  • onsets (list of floats) – Syllable onsets.
  • offsets (list of floats) – Syllable offsets.
  • audio_filename (str) – Audio filename.
  • p (dict) – A dictionary mapping preprocessing parameters to their values.
Returns:

  • specs (list of {numpy.ndarray, None}) – Spectrograms
  • valid_syllables (list of int) – Indices of specs containing valid syllables.
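
Because specs may contain None entries, callers will typically want to keep only the entries listed in valid_syllables. The sketch below shows one way to do that; `keep_valid` is a hypothetical helper, and plain nested lists stand in for the numpy arrays that get_syll_specs would return.

```python
def keep_valid(specs, valid_syllables):
    """Return only the spectrograms whose indices appear in valid_syllables."""
    return [specs[i] for i in valid_syllables]


# Placeholder data: nested lists stand in for spectrogram arrays,
# and None marks a syllable that produced no usable spectrogram.
specs = [[[0.1, 0.2]], None, [[0.3, 0.4]]]
valid_syllables = [0, 2]

valid_specs = keep_valid(specs, valid_syllables)
print(len(valid_specs))  # 2
```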

ava.preprocessing.preprocess.is_audio_file(fn)[source]
ava.preprocessing.preprocess.process_sylls(audio_dir, segment_dir, save_dir, p, shuffle=True, verbose=True)[source]

Extract syllables from audio_dir and save to save_dir.

Parameters:
  • audio_dir (str) – Directory containing audio files.
  • segment_dir (str) – Directory containing segmenting decisions.
  • save_dir (str) – Directory to save processed syllables in.
  • p (dict) – Preprocessing parameters.
  • shuffle (bool, optional) – Shuffle by filename. Defaults to True.
  • verbose (bool, optional) – Defaults to True.
ava.preprocessing.preprocess.read_onsets_offsets_from_file(txt_filename, p)[source]

Read a text file to collect onsets and offsets.

Note

  • The text file must have two columns separated by whitespace, with # prepended to header and footer lines.
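
To make the expected file layout concrete, the following sketch writes a small file in that format and parses it in plain Python. It is an illustration only, not the package's implementation: `read_onsets_offsets` is a hypothetical stand-in that omits the p argument, and the example values are made up.

```python
import os
import tempfile


def read_onsets_offsets(txt_filename):
    """Collect onsets and offsets from a two-column text file (hypothetical helper)."""
    onsets, offsets = [], []
    with open(txt_filename) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and header/footer lines marked with '#'.
            if not line or line.startswith("#"):
                continue
            onset, offset = line.split()[:2]
            onsets.append(float(onset))
            offsets.append(float(offset))
    return onsets, offsets


# Example file: '#' header and footer, columns separated by a space or a tab.
example = "# onset offset\n0.10 0.25\n0.40\t0.55\n# end of file\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(example)

onsets, offsets = read_onsets_offsets(tmp.name)
os.remove(tmp.name)
print(onsets, offsets)  # [0.1, 0.4] [0.25, 0.55]
```
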
ava.preprocessing.preprocess.tune_syll_preprocessing_params(audio_dirs, seg_dirs, p, img_fn='temp.pdf')[source]

Flip through spectrograms and tune preprocessing parameters.

Parameters:
  • audio_dirs (list of str) – Audio directories.
  • seg_dirs (list of str) – Segment directories.
  • p (dict) – Preprocessing parameters.
  • img_fn (str, optional) – Where to save images. Defaults to 'temp.pdf'.
Returns:

p – Adjusted preprocessing parameters.

Return type:

dict

ava.preprocessing.preprocess.tune_window_preprocessing_params(audio_dirs, p, img_fn='temp.pdf')[source]

Flip through spectrograms and tune preprocessing parameters.

Parameters:
  • audio_dirs (list of str) – Audio directories.
  • p (dict) – Preprocessing parameters.
  • img_fn (str, optional) – Where to save images. Defaults to 'temp.pdf'.
Returns:

p – Adjusted preprocessing parameters.

Return type:

dict

ava.preprocessing.utils module

Useful functions for preprocessing.

ava.preprocessing.utils.get_spec(t1, t2, audio, p, fs=32000, target_freqs=None, target_times=None, fill_value=-1000000000000.0, max_dur=None, remove_dc_offset=True)[source]

Norm, scale, threshold, stretch, and resize a short-time Fourier transform.

Notes

  • fill_value necessary?
  • Look at all references and see what can be simplified.
  • Why is there a flag returned?
Parameters:
  • t1 (float) – Onset time.
  • t2 (float) – Offset time.
  • audio (numpy.ndarray) – Raw audio.
  • p (dict) – Parameters. Must include keys: …
  • fs (float, optional) – Sample rate. Defaults to 32000.
  • target_freqs (numpy.ndarray or None, optional) – Interpolated frequencies.
  • target_times (numpy.ndarray or None, optional) – Interpolated times.
  • fill_value (float, optional) – Defaults to -1/EPSILON (-1e12 in the signature).
  • max_dur (float, optional) – Maximum duration. Defaults to None.
  • remove_dc_offset (bool, optional) – Whether to remove any DC offset from the audio. Defaults to True.
Returns:

  • spec (numpy.ndarray) – Spectrogram.
  • flag (bool) – True
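
The scaling and thresholding steps named in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the function's actual code: `normalize_spec`, its `min_val` and `max_val` arguments, the value of `EPSILON`, and the reading of "threshold" as clipping to [0, 1] are all hypothetical, and the required keys of p are not reproduced here.

```python
import numpy as np

EPSILON = 1e-12  # assumption: small constant that keeps log() away from zero


def normalize_spec(magnitudes, min_val, max_val):
    """Log-scale a magnitude spectrogram, rescale it, and clip it to [0, 1].

    min_val and max_val are hypothetical arguments giving the log-magnitude
    values mapped to 0 and 1; they are not documented keys of p.
    """
    spec = np.log(np.abs(magnitudes) + EPSILON)
    spec = (spec - min_val) / (max_val - min_val)
    return np.clip(spec, 0.0, 1.0)


# Toy magnitudes whose logs are roughly 0, 1, 2, and 3.
magnitudes = np.array([[1.0, np.e], [np.e ** 2, np.e ** 3]])
spec = normalize_spec(magnitudes, min_val=1.0, max_val=2.0)
print(spec.shape)  # (2, 2)
```

Values below min_val end up at 0 and values above max_val end up at 1, so the output always lies in [0, 1] regardless of the input's dynamic range.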

Module contents

AVA preprocessing module

Contains

ava.preprocessing.preprocess
Preprocess syllable spectrograms.
ava.preprocessing.utils
Useful functions for preprocessing.