Tagged | audio-processing
-
Recommending music to new users
(deezer.io) -
Recreating Natural Voices for People with Speech Impairments
(ai.googleblog.com) -
SoundStream: An End-to-End Neural Audio Codec
(ai.googleblog.com) -
Integrating with Telephone Networks to Enable Real-Time AI Services
(developer.nvidia.com) -
Detecting explicit content in songs
(deezer.io) -
Improving Audio Quality in Duo with WaveNetEQ
(ai.googleblog.com) #deep-learning #machine-learning #audio-processing #research
-
How to Deploy Real-Time Text-to-Speech Applications on GPUs Using TensorRT
(devblogs.nvidia.com) -
LiTr: A lightweight video/audio transcoder for Android
(engineering.linkedin.com) -
The On-Device Machine Learning Behind Recorder
(ai.googleblog.com) -
How to Build Domain Specific Automatic Speech Recognition Models on GPUs
(devblogs.nvidia.com) -
Develop Smaller Speech Recognition Models with NVIDIA’s NeMo Framework
(devblogs.nvidia.com) -
DeepSpeech 0.6: Mozilla’s Speech-to-Text Engine Gets Fast, Lean, and Ubiquitous
(hacks.mozilla.org) -
SPICE: Self-Supervised Pitch Estimation
(ai.googleblog.com) -
Audio and Visual Quality Measurement using Fréchet Distance
(ai.googleblog.com) #data-science #algorithms #audio-processing #research #video-processing
-
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model
(ai.googleblog.com) -
Working with ESP32 Audio Sampling
(www.toptal.com) -
Generate Natural Sounding Speech from Text in Real-Time
(devblogs.nvidia.com) -
Assessing the Quality of Long-Form Synthesized Speech
(ai.googleblog.com) -
Joint Speech Recognition and Speaker Diarization via Sequence Transduction
(ai.googleblog.com) -
Project Euphonia’s Personalized Speech Recognition for Non-Standard Speech
(ai.googleblog.com) -
Presentation: Functional Composition
(www.infoq.com) -
Deep Active Noise Cancellation
(towardsdatascience.com) -
Presentation: Deep Learning with Audio Signals: Prepare, Process, Design, Expect
(www.infoq.com) -
Improving Instagram’s Music Audio Quality
(instagram-engineering.com) -
SoundCloud Is Playing the Oboe
(developers.soundcloud.com) -
Presentation: wav2letter++: Facebook's Fast Open-source Speech Recognition System
(www.infoq.com) #deep-learning #data-science #NLP #audio-processing #research
-
Web Audio for Electric Guitar: How to Connect Instrument
(itnext.io) -
Speech Emotion Recognition with Convolution Neural Network
(towardsdatascience.com) #signal-processing #machine-learning #NLP #neural-net #audio-processing
-
Introducing Translatotron: An End-to-End Speech-to-Speech Translation Model
(ai.googleblog.com) -
Engineering a Studio Quality Experience With High-Quality Audio at Netflix
(medium.com) -
SpecAugment: A New Data Augmentation Method for Automatic Speech Recognition
(ai.googleblog.com) -
Programming by voice in 2019
(blog.logrocket.com) -
How To Make A Speech Synthesis Editor
(www.smashingmagazine.com) -
An All-Neural On-Device Speech Recognizer
(ai.googleblog.com) -
Data Visualization in Music
(towardsdatascience.com) -
Implementing AudioWorklets with React
(hackernoon.com) -
Classify Songs Genres From Audio Data
(towardsdatascience.com) -
Real-Time Noise Suppression Using Deep Learning
(towardsdatascience.com) #deep-learning #signal-processing #AI #GPU #audio-processing
-
Introducing Wav2letter++
(towardsdatascience.com) -
Audio Classification using FastAI and On-the-Fly Frequency Transforms
(towardsdatascience.com) -
LPCNet: DSP-Boosted Neural Speech Synthesis
(hacks.mozilla.org) -
Accurate Online Speaker Diarization with Supervised Learning
(ai.googleblog.com) -
WaveNet: Google Assistant’s Voice Synthesizer.
(towardsdatascience.com) -
Making beats with generative design
(becominghuman.ai) -
Real-Time Noise Suppression Using Deep Learning
(devblogs.nvidia.com) #deep-learning #algorithms #mobile #real-time #audio-processing
-
Neural Networks For Music: A Journey Through Its History
(towardsdatascience.com) -
Significantly faster generation and training for AI-based audio systems
(code.fb.com) -
Introducing Oboe: A C++ library for low latency audio
(android-developers.googleblog.com) -
Mixed Precision Training for NLP and Speech Recognition with OpenSeq2Seq
(devblogs.nvidia.com) -
Speaker Diarization — The Squad Way
(hackernoon.com) -
Speech Classification Using Neural Networks: The Basics
(towardsdatascience.com) -
Streaming RNNs in TensorFlow
(hacks.mozilla.org) -
Google’s Next Generation Music Recognition
(ai.googleblog.com) -
Synesthesia: The Sound of Style
(multithreaded.stitchfix.com) -
Generating Music: when simple probabilities outperform neural networks
(towardsdatascience.com) -
Into a better Speech Synthesis Technology
(becominghuman.ai) -
Algorithmic Reverb and Web Audio API
(itnext.io) -
Hacking Facebook: Audio Focus for 360 Video
(hackernoon.com) -
Facebook researchers use AI to turn whistles into orchestral music, and power other musical “translations”
(research.fb.com) -
Expressive Speech Synthesis with Tacotron
(ai.googleblog.com) -
Looking to Listen: Audio-Visual Speech Separation
(ai.googleblog.com) -
Nv-Wavenet: Better Speech Synthesis Using GPU-Enabled WaveNet Inference
(devblogs.nvidia.com) -
Looking to Listen: Audio-Visual Speech Separation
(research.googleblog.com) -
Visualizing Beethoven’s Oeuvre, Part I: Scraping and cleaning data from IMSLP
(towardsdatascience.com) -
Expressive Speech Synthesis with Tacotron
(research.googleblog.com) -
What’s wrong with spectrograms and CNNs for audio processing?
(towardsdatascience.com) -
Getting Started With The Web MIDI API
(www.smashingmagazine.com) -
How to do Real Time Trigger Word Detection with Keras
(hackernoon.com) -
Neural Voice Cloning with a Few Samples
(research.baidu.com) #deep-learning #machine-learning #audio-processing #research
-
How To Build An Audio Processor In Your Browser
(hackernoon.com) -
The promise of AI in audio processing
(towardsdatascience.com) -
Tacotron 2: Generating Human-like Speech from Text
(research.googleblog.com) -
Improving End-to-End Models For Speech Recognition
(research.googleblog.com) -
Machine Learning WAVE Files with TensorFlow
(becominghuman.ai) #deep-learning #machine-learning #tensor-flow #audio-processing
-
A Journey to <10% Word Error Rate
(hacks.mozilla.org) -
Web Audio API Series 1 — Introduction
(hackernoon.com) -
Deep Speech 3: Exploring Neural Transducers for End-to-End Speech Recognition
(research.baidu.com) #AI #machine-learning #neural-net #audio-processing #research
-
Humming with the bot
(blog.buildo.io) -
RNNoise: Using Deep Learning for Noise Suppression
(hacks.mozilla.org) #deep-learning #machine-learning #neural-net #audio-processing
-
Introduction to the SHMAVPlayerInterface
(tech.showmax.com) -
A Brief Introduction to Audio and Video Encoding
(spin.atomicobject.com)