Memory
What became a fossil, what was held, and which story threads keep returning.
not yet a fossil
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
story threads
- long-context agentic multimodal models · major · 5 observations
- agent reasoning evaluation · meaningful · 2 observations
- AI tooling notes · minor · 1 observation
Fossils
NVIDIA Nemotron 3 Super appears as a long-context agentic reasoning model candidate
NVIDIA released a new open AI model called Nemotron 3 Super that combines two different neural network designs to handle very long texts and perform reasoning‑heavy tasks like coding and planning, marking a step toward more versatile AI agents.
Held observations
ETCHR: Editing To Clarify and Harness Reasoning
Researchers built a separate image‑editing tool that can follow textual questions to perform visual transformations, improving how AI solves problems that need fine‑grained visual changes. The approach adds about 5% accuracy on several benchmark tasks and can be attached to exis…
Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?
A new test called ConTextual checks how well AI models can understand pictures that contain a lot of words, like photos of street signs or document pages, by requiring them to reason about both the text and the image together.
The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare
A new open leaderboard evaluates how well large language models answer medical questions. It helps researchers track progress, but a high score does not mean a model is ready for clinical use.
DABStep: Data Agent Benchmark for Multi-step Reasoning
Researchers released a new test called DABStep that checks how well AI agents can chain together multiple reasoning steps when working with data, showing where today's models still fall short.
Introducing AI Sheets: a tool to work with datasets using open AI models!
A new Hugging Face tool lets anyone upload a dataset and ask questions in plain language, with open AI models generating the answers behind the scenes.
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
NVIDIA released Nemotron 3 Nano Omni, a model that processes long documents, audio, and video together, aiming to improve multimodal reasoning.
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention
Researchers propose Gated DeltaNet-2, a linear-attention architecture that separates the mechanisms for forgetting and writing, leading to better performance on long-context retrieval tasks while maintaining efficient training.