When developing a web interface, it’s common to deal with large volumes of data. One of the ways to optimize data retrieval is through pagination, loading records incrementally, for instance, as the user scrolls down the page. With each request, a fixed number of items are fetched (limit), starting from a specific position in the list (offset).
SELECT *
FROM <table>
OFFSET <Y>
LIMIT <X>;
The simplest, and most widely used, solution is to add LIMIT and OFFSET clauses to your SQL queries. However, this approach quickly becomes problematic as data volumes increase. In this article, I'll explain why this method isn't optimal in PostgreSQL, and how to replace it with a more efficient index-based pagination strategy.
LIMIT OFFSET
Using LIMIT 10 OFFSET 1000 to fetch page 101 might seem harmless, but PostgreSQL is doing much more behind the scenes than you think: to honor the OFFSET, it still has to produce the first 1,000 rows in order and throw them away before returning the 10 you asked for. That discarded work grows linearly with the offset, so the deeper the page, the slower the query.
Keyset pagination
Instead of using an arbitrary offset, we can paginate using a known value from an index, typically an auto-incremented ID or a timestamp (created_at). This is commonly referred to as keyset pagination.
Example
OFFSET-based pagination (not recommended)
SELECT *
FROM users
ORDER BY id
LIMIT 10
OFFSET 1000;
Keyset pagination (recommended)
SELECT *
FROM users
WHERE id > 12345
ORDER BY id
LIMIT 10;
The idea is to use the last value from the previous page (e.g., id = 12345) to load the next one. This query jumps straight to the right position in the index and reads only the rows it returns, so its cost stays constant no matter how deep you paginate.
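In practice, the flow looks like this (a sketch assuming the same users table with an index on id): the first page is fetched without a cursor, and each subsequent request reuses the last id seen.

-- First page: no cursor yet, just take the first 10 rows
SELECT *
FROM users
ORDER BY id
LIMIT 10;

-- Next page: resume right after the last id returned above
-- (12345 here, i.e. the id of the last row of the previous page)
SELECT *
FROM users
WHERE id > 12345
ORDER BY id
LIMIT 10;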
Benchmarks
To put some numbers on this, let's use a table of about 2 million rows and try to retrieve rows 1,000,000 to 1,000,010.
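If you'd like to reproduce a comparable benchmark yourself, one way is to generate a similar dataset with generate_series (a sketch; test_ean and its EAN-like ids are hypothetical, not the actual client_data.ean table):

CREATE TABLE test_ean (
    id text PRIMARY KEY,
    label text
);

-- Roughly 2 million rows with sortable, EAN-like text ids
INSERT INTO test_ean (id, label)
SELECT lpad(n::text, 13, '0'), 'row ' || n
FROM generate_series(1, 2000000) AS n;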
OFFSET-Based Query (not recommended)
SELECT *
FROM "client_data"."ean"
ORDER BY id
OFFSET 1000000
LIMIT 10;
Execution Plan Analysis:
The execution plan shows an Index Scan on ean_pkey, but without any filtering condition. PostgreSQL walks the index from the beginning, fetches the first 1,000,010 rows, discards the first 1,000,000, and returns the 10 that follow.
➡️ This leads to a very high total cost (~548k) and makes deep navigation extremely inefficient.
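You can inspect the plan yourself by prefixing the query with EXPLAIN ANALYZE (standard PostgreSQL; the output shows the estimated cost and actual timing of each node):

EXPLAIN ANALYZE
SELECT *
FROM "client_data"."ean"
ORDER BY id
OFFSET 1000000
LIMIT 10;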
Keyset-Based Query (recommended)
SELECT *
FROM "client_data"."ean"
WHERE id > '2609901045790'
ORDER BY id
LIMIT 10;
Execution Plan Analysis:
Thanks to the condition id > ..., PostgreSQL triggers an Index Scan on ean_pkey. Execution starts right after the specified key and reads only the 10 relevant rows. The plan shows a minimal cost (0.56 to 7.40), indicating efficient use of the index. There is no need to scan the entire index, making the query highly performant and stable, even at large scale.
This method comes with a few constraints: it requires a unique, indexed column with a stable sort order (typically a primary key or a timestamp); it doesn't allow jumping directly to an arbitrary page number; and the client has to carry the boundary value (the cursor) between requests. However, in most cases, these trade-offs are minor compared to the performance and reliability gains.
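For instance, if your sort column isn't unique (several rows can share the same created_at), a common workaround, sketched here on a hypothetical orders table, is to add the primary key as a tie-breaker and compare on the composite key:

-- Requires an index on (created_at, id) to stay efficient
SELECT *
FROM orders
WHERE (created_at, id) > ('2024-01-01 12:00:00', 98765)
ORDER BY created_at, id
LIMIT 10;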
To illustrate a practical use case, here's how we implemented keyset pagination at EagleAI, using Python FastAPI for the backend and Next.js for the frontend.
The backend returns the requested page of products; from it, the frontend keeps the IDs of the first and last items. To change pages, the frontend sends the appropriate reference ID (last_seen_id, the first or last item in the list depending on direction) back to the backend, along with the direction ("next" or "prev"). Here's the backend logic:
from typing import Literal

from sqlalchemy import select
from sqlalchemy.orm import Session

# Product is the application's SQLAlchemy ORM model for the products table.


def get_page(
    db_session: Session,
    limit: int,
    last_seen_id: str | None = None,
    direction: Literal["next", "prev"] = "next",
):
    """Get a page of products."""
    is_next = direction == "next"
    # Sort on the keyset column, reversed when paging backwards.
    order = Product.id.asc() if is_next else Product.id.desc()
    query = select(Product)
    if last_seen_id:
        # Resume right after (or before) the boundary row of the previous page.
        comparator = Product.id > last_seen_id if is_next else Product.id < last_seen_id
        query = query.where(comparator)
    query = query.order_by(order).limit(limit)
    products = db_session.execute(query).scalars().all()
    # A "prev" page is fetched in reverse order, so restore display order.
    return products if is_next else list(reversed(products))
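For reference, with direction="next" and a last_seen_id set, this builds SQL roughly equivalent to the keyset pattern shown earlier (assuming the Product model maps to a products table; the exact output depends on your models and dialect):

SELECT products.*
FROM products
WHERE products.id > :last_seen_id
ORDER BY products.id ASC
LIMIT :limit;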
If your database contains more than a few thousand rows, or if you care about delivering a fast, smooth user experience, it's time to abandon LIMIT OFFSET.
Switching to index-based pagination is simple, and it’s a change that pays off in the long run.