Audio Encoder and Decoder

APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding

Abstract: This paper introduces a novel neural audio codec targeting high waveform sampling rates and low bitrates named APCodec, which seamlessly integrates the strengths of parametric codecs and ...

Tech Times

Baidu OCR Breaks Long-Document Memory Wall: New Architecture Beats DeepSeek

Open-source OCR from Baidu eliminates the GPU memory wall that limits long-document parsing. Unlimited OCR uses a constant KV ...

Streaming Media

Miri Technologies Ships V410 Live 4K Video Encoder/Decoder

Miri Technologies Inc. has begun shipping its highly anticipated V410 live 4K video encoder/decoder for streaming, IP-based production workflows and AV-over-IP distribution. Winner of a 2026 NAB Show ...

Canada

Real People Using Fake People: Public Use of Deepfake Technology

Synthesizing realistic audio, images, and videos using algorithms has always been essential in Signal Processing, Computer Graphics, and Computer Vision. When using pre-artificial intelligence (AI) ...

note

Gemma 4 12B In-Depth: A New Model Bringing Full-Scale Multimodality to Laptops with an Encoder-Free Design

Gemma 4 12B is a new model in the Gemma 4 family announced by Google on June 3, 2026. It is positioned as an "encoder-free unified multimodal model optimized for laptops." The official blog (Google ...

the-decoder

With Nemotron 3 Nano Omni, Nvidia reveals what really goes into a modern multimodal model

Nvidia has released Nemotron 3 Nano Omni, an open AI model that processes text, images, video, and audio and is built for agentic applications. Training involved 717 billion tokens. Much of the ...

Radio World

Barix Extends Transport Options for Multi-Engine IP Encoder

Barix will unveil its latest Instreamer and Exstreamer devices for AoIP transport at the upcoming NAB Show. The manufacturer is highlighting flexible configurations for its MultiCoder M400 and LX400 ...

VentureBeat

Google releases Gemma 4 under Apache 2.0 — and that license change may matter more than benchmarks

For the past two years, enterprises evaluating open-weight models have faced an awkward trade-off. Google's Gemma line consistently delivered strong performance, but its custom license — with usage ...

How the Encoder–Decoder Architecture Actually Works

The encoder–decoder architecture sits quietly behind many of the most impactful AI systems we use today—machine translation, speech recognition, text summarization, and modern large language models.

GitHub

flinkerlab/neural_speech_decoding

Our ECoG to Speech decoding framework is initially described in A Neural Speech Decoding Framework Leveraging Deep Learning and Speech Synthesis. We present a novel deep learning-based neural speech ...

the-decoder

Lightricks open-sources AI video model LTX-2, challenges Sora and Veo

Israeli company Lightricks has open-sourced its 19-billion-parameter model LTX-2. The system generates synchronized audio-video content from text descriptions and claims to be faster than competitors.
