Arxiv Papers

We propose Gaze-LLE, a transformer framework for gaze target estimation, utilizing a frozen DINOv2 encoder for streamlined feature extraction, achieving state-of-the-art performance across multiple benchmarks. https://arxiv.org/abs//2412.09586 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers

Duration:00:07:34

Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders

Duration:00:12:07

[QA] Representing Long Volumetric Video with Temporal Gaussian Hierarchy

This paper introduces the Temporal Gaussian Hierarchy, a novel 4D representation for efficiently reconstructing long volumetric videos, optimizing memory usage and rendering quality compared to existing methods. https://arxiv.org/abs//2412.09608

Duration:00:08:08

Representing Long Volumetric Video with Temporal Gaussian Hierarchy

Duration:00:31:10

[QA] Uncertainty-aware Knowledge Tracing

The Uncertainty-Aware Knowledge Tracing model (UKT) improves student learning assessment by incorporating uncertainty in interactions, outperforming existing models in predicting knowledge states across various datasets. https://arxiv.org/abs//2501.05415

Duration:00:07:36

Uncertainty-aware Knowledge Tracing

Duration:00:20:48

[QA] The GAN is dead; long live the GAN! A Modern Baseline GAN

This paper challenges the notion that GANs are hard to train, presenting R3GAN, a simplified, modernized GAN architecture that outperforms existing models on various datasets. https://arxiv.org/abs//2501.05441

Duration:00:07:18

The GAN is dead; long live the GAN! A Modern Baseline GAN

Duration:00:25:35

[QA] Supervision-free Vision-Language Alignment

SVP enhances vision-language models' performance without curated data, achieving significant improvements in captioning, object recall, and hallucination control across various tasks. https://arxiv.org/abs//2501.04568

Duration:00:07:55

Supervision-free Vision-Language Alignment

Duration:00:19:10

[QA] Grokking at the Edge of Numerical Stability

This paper explores grokking in deep learning, linking delayed generalization to Softmax Collapse and proposing solutions to enable grokking without regularization through new activation functions and training algorithms. https://arxiv.org/abs//2501.04697

Duration:00:07:50

Grokking at the Edge of Numerical Stability

Duration:00:16:50

[QA] ComMer: a Framework for Compressing and Merging User Data for Personalization

ComMer is a framework that efficiently personalizes Large Language Models by compressing user documents into compact representations, improving performance in skill learning tasks while facing challenges in knowledge-intensive applications. https://arxiv.org/abs//2501.03276

Duration:00:07:21

ComMer: a Framework for Compressing and Merging User Data for Personalization

Duration:00:16:03

[QA] Entropy-Guided Attention for Private LLMs

This paper addresses privacy concerns in proprietary language models by optimizing transformer architectures for private inference, focusing on the role of nonlinearities and introducing entropy-guided mechanisms for improved performance. https://arxiv.org/abs//2501.03489

Duration:00:07:57

Entropy-Guided Attention for Private LLMs

Duration:00:13:20

[QA] Easing Optimization Paths: a Circuit Perspective

The paper explores using mechanistic interpretability to enhance gradient descent training in AI, aiming to reduce compute costs and mitigate harmful behaviors through efficient learning curricula. https://arxiv.org/abs//2501.02362

Duration:00:07:33

Easing Optimization Paths: a Circuit Perspective

Duration:00:09:58

[QA] Randomly Sampled Language Reasoning Problems Reveal Limits of LLMs

This study evaluates LLMs' language understanding using novel tasks from deterministic finite automata, revealing they struggle compared to basic models when faced with unfamiliar languages. https://arxiv.org/abs//2501.02825

Duration:00:07:15

Randomly Sampled Language Reasoning Problems Reveal Limits of LLMs