Never miss a new edition of The Variable, our weekly newsletter featuring a top-notch selection of editors’ picks, deep dives, community news, and more.
It’s very difficult to tell what phase of the hype cycle we are in for any given AI tool. Things are moving fast: a concept that just weeks ago seemed cutting edge can now appear stale, while an approach that was headed towards obsolescence might suddenly make a comeback.
Retrieval-augmented generation is an interesting case in point. It dominated conversations a couple of years ago, quickly attracted a vocal crowd of skeptics, splintered into multiple types and flavors, and inspired a cottage industry of enhancements.
These days, it seems to have landed somewhere midway between exciting and mundane. It’s a technique used by millions of practitioners, but no longer producing endless buzz.
To help us make sense of the current state of RAG, we turn to our expert authors, who cover some of its current challenges, use cases, and recent innovations.
Chunk Size as an Experimental Variable in RAG Systems
We begin our exploration with Sarah Schürch’s enlightening and detailed look into chunking—the process of splitting longer documents into shorter, more easily digestible segments—and its potential effects on the retrieval step in your LLM pipelines.
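For readers new to the idea, here is a minimal sketch (not taken from the linked article) of one common chunking strategy: fixed-size character windows with overlap, where the `chunk_size` and `overlap` values are illustrative defaults you would tune experimentally.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character chunks.

    Overlap preserves context that would otherwise be cut at chunk
    boundaries, at the cost of some duplicated content in the index.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

In a real RAG pipeline you would typically chunk on semantic boundaries (sentences, paragraphs, headings) rather than raw character counts, which is exactly the kind of trade-off the article treats as an experimental variable.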
Retrieval for Time-Series: How Looking Back Improves Forecasts
Can we apply the power of RAG beyond text? Sara Nobrega introduces us to the emerging idea of retrieval-augmented forecasting for time-series data.
When Does Adding Fancy RAG Features Work?
How complex should your RAG systems actually be? Ida Silfverskiöld presents her latest testing, aiming to find the right balance between performance, latency, and cost.
This Week’s Most-Read Stories
Catch up with three articles that resonated with a wide audience in the past few days.
How LLMs Handle Infinite Context With Finite Memory, by Moulik Gupta
Why Supply Chain is the Best Domain for Data Scientists in 2026 (And How to Learn It), by Samir Saci
HNSW at Scale: Why Your RAG System Gets Worse as the Vector Database Grows, by Partha Sarkar
Other Recommended Reads
We hope you explore some of our other recent must-reads on a diverse range of topics.
- Federated Learning, Part 1: The Basics of Training Models Where the Data Lives, by Parul Pandey
- YOLOv1 Loss Function Walkthrough: Regression for All, by Muhammad Ardi
- How to Improve the Performance of Visual Anomaly Detection Models, by Aimira Baitieva
- The Geometry of Laziness: What Angles Reveal About AI Hallucinations, by Javier Marin
- The Best Data Scientists Are Always Learning, by Jarom Hulet
Contribute to TDS
The last few months have produced strong results for participants in our Author Payment Program, so if you’re thinking about sending us an article, now’s as good a time as any!