wai.annotations release 0.7.8

A new release of wai.annotations is out now: 0.7.8

This release contains three major updates:

  • additional domain for audio classification, uses the -ac suffix for domain-specific plugins

  • wai.annotations.audio module added for processing and augmenting audio data

  • wai.annotations.generic module makes it easier for plugging in your own Python classes (derived from wai.annotations classes), as you don't have to write a lot of boiler-plate code to integrate it into the framework (the wrappers take care of that!) - check out the manual for examples

Here is a detailed overview of all the changes since the 0.7.7 release:

  • wai.annotations.core updated to 0.1.8

    • Added new audio domain for classification using suffix -ac

    • Added dataset reader for audio files: from-audio-files-sp, from-audio-files-ac

    • Added dataset writer for audio files: to-audio-files-sp, to-audio-files-ac

    • Added dummy sink for audio files: to-void-ac

    • Added ISP for selecting a sub-sample from the stream: sample

  • wai.annotations.subdir updated to 1.0.1

    • added reader/writer for audio classification: from-subdir-ac and to-subdir-ac

  • wai.annotations.audio added for audio support (currently at 1.0.1)

    • audio-info-ac: sink for collating/outputting information on the audio classification files

    • audio-info-sp: sink for collating/outputting information on the speech files

    • convert-to-mono: ISP for converting MP3/OGG/FLAC/WAV to mono WAV

    • convert-to-wav: ISP for converting MP3/OGG/FLAC to WAV

    • mel-spectrogram: XDC for generating plot from a mel spectrogram (outputs image classification instance)

    • mfcc-spectrogram: XDC for generating plots from Mel-frequency cepstral coefficients (outputs image classification instance).

    • pitch-shift: augmentation ISP for shifting the pitch

    • resample-audio: ISP for resampling MP3/OGG/FLAC/WAV

    • stft-spectrogram: XDC for generating plot from a short-time fourier-transform spectrogram (outputs image classification instance)

    • time-stretch: augmentation ISP for time-stretching audio (speed up/slow down)

    • trim-audio: ISP for trimming silence from audio

  • wai.annotations.generic added (currently at 1.0.0)

    • generic-source-ac: wrapper around a user-supplied source class for audio classification

    • generic-source-ic: wrapper around a user-supplied source class for image classification

    • generic-source-is: wrapper around a user-supplied source class for image segmentation

    • generic-source-od: wrapper around a user-supplied source class for object detection

    • generic-source-sp: wrapper around a user-supplied sourceclass for speech

    • generic-isp-ac: wrapper around a user-supplied ISP class for audio classification

    • generic-isp-ic: wrapper around a user-supplied ISP class for image classification

    • generic-isp-is: wrapper around a user-supplied ISP class for image segmentation

    • generic-isp-od: wrapper around a user-supplied ISP class for object detection

    • generic-isp-sp: wrapper around a user-supplied ISP class for speech

    • generic-sink-ac: wrapper around a user-supplied sink class for audio classification

    • generic-sink-ic: wrapper around a user-supplied sink class for image classification

    • generic-sink-is: wrapper around a user-supplied sink class for image segmentation

    • generic-sink-od: wrapper around a user-supplied sink class for object detection

    • generic-sink-sp: wrapper around a user-supplied sinkclass for speech