Discussion:
Bug? Pagination with MySQL using LIMIT/OFFSET incorrectly.
samc
2008-04-22 14:09:33 UTC
Permalink
I'm pretty sure I've found a bug with Sequel 1.4 when used with MySql
(5.0.51a in my case).

I'm using pagination to get a page of my Location entities:
@locations = Location.order(:name).paginate(page_num, NUM_PER_PAGE)

Having turned on logging of the SQL that Sequel outputs, I see that if
page_num = 1 it runs:
SELECT COUNT(*) FROM location;
SELECT COUNT(*) FROM location LIMIT 10 OFFSET 0;
SELECT * FROM location ORDER BY 'name' LIMIT 10 OFFSET 0

This returns correct results, though I wonder why it needs to perform
the first two queries. The last one alone would suffice to return the
data. That would mean 1 DB round trip rather than 3, which would be a
worthwhile optimisation.

If page_num = 2 it runs:
SELECT COUNT(*) FROM location;
SELECT COUNT(*) FROM location LIMIT 10 OFFSET 10;

And then it returns an empty result set. I don't know why it's
checking the count at all, to be honest, but I can see the problem
with the SQL in any case. In MySQL the LIMIT applies *after* the
count. So it gets the count for the whole location table (12 rows in
my case) to give a 1 row set, then it applies the LIMIT to that set.
So if OFFSET is anything other than 0 we get an empty set. The SQL is
of course wrong even in the page_num = 1 case, it just happens not to
interfere with returning the page of data.

This is a big problem for me right now, so I'm keen to hear your
thoughts.

Thanks

Sam
Jeremy Evans
2008-04-22 20:26:09 UTC
Permalink
Post by samc
I'm pretty sure I've found a bug with Sequel 1.4 when used with MySql
(5.0.51a in my case).
@locations = Location.order(:name).paginate(page_num, NUM_PER_PAGE)
Having turned on logging of the SQL that Sequel outputs, I see that if
SELECT COUNT(*) FROM location;
SELECT COUNT(*) FROM location LIMIT 10 OFFSET 0;
SELECT * FROM location ORDER BY 'name' LIMIT 10 OFFSET 0
This returns correct results, though I wonder why it needs to perform
the first two queries. The last one alone would suffice to return the
data. That would mean 1 DB round trip rather than 3, which would be a
worthwhile optimisation.
It does the first query to get the total number of records, which it
uses to determine if this page is the last page. I don't think this
is a good idea unless you actually ask for the last page, so I'm open
to changing the pagination code so that doesn't execute a query for a
count of the records unless you ask for something that would require
it.

The only reason you would be seeing the second query is if you are
calling paginate on a dataset that has already been called with
paginate/limit:

irb(main):013:0> i.order(:value).paginate(2, 10).paginate(1,10).all
I, [2008-04-22T13:17:48.935554 #16926] INFO -- : SELECT COUNT(*)
FROM items
I, [2008-04-22T13:17:49.045572 #16926] INFO -- : SELECT COUNT(*)
FROM items LIMIT 10 OFFSET 10
I, [2008-04-22T13:17:49.142098 #16926] INFO -- : SELECT * FROM
items ORDER BY `value` LIMIT 10 OFFSET 0
irb(main):015:0> i.order(:value).limit(10,2).paginate(1,10).all
I, [2008-04-22T13:23:28.986637 #16926] INFO -- : SELECT COUNT(*)
FROM items LIMIT 10 OFFSET 2
I, [2008-04-22T13:23:29.082591 #16926] INFO -- : SELECT * FROM
items ORDER BY `value` LIMIT 10 OFFSET 0

There should probably be a check so that calling paginate on a dataset
that already has a limit set raises an error. I'm not sure why
Location.dataset would already have a limit set, but my guess is
somehow that happened.

Jeremy
samc
2008-04-22 22:33:19 UTC
Permalink
I certainly think it would be good to not get the count unless it is
really necessary, to save those oh so precious DB hits. I don't see
why it's necessary to do a separate count even if it is the last page.
But I'm very new to Sequel so am probably just not understanding the
way it has to work in order to try and be 'lazy'.

With enormous help from Manveru on #ramaze I've figured out what was
going wrong in my case to generate the failing SQL. My HTML template
was using @locations.count to determine if there was any data to
display. I find that if I use @locations.current_page_record_count
instead then it doesn't trigger the problem.

That said, this seems overly subtle and likely to trip people up. It
was suggested on #ramaze that paginate should return a duplicate
dataset with options set correctly, thus avoiding the possibility of
getting the error.
Jeremy Evans
2008-04-22 23:57:13 UTC
Permalink
Post by samc
I certainly think it would be good to not get the count unless it is
really necessary, to save those oh so precious DB hits. I don't see
why it's necessary to do a separate count even if it is the last page.
But I'm very new to Sequel so am probably just not understanding the
way it has to work in order to try and be 'lazy'.
It's not necessary, and the SQL query to get the record count will
only be called if necessary in Sequel 2.0.
Post by samc
With enormous help from Manveru on #ramaze I've figured out what was
going wrong in my case to generate the failing SQL. My HTML template
instead then it doesn't trigger the problem.
That would explain the issue.
Post by samc
That said, this seems overly subtle and likely to trip people up. It
was suggested on #ramaze that paginate should return a duplicate
dataset with options set correctly, thus avoiding the possibility of
getting the error.
paginate does return a cloned dataset, with the limit and offset
already set. However, you were calling count on this cloned dataset,
which causes an issue. It may make sense for the pagination module to
override count to be an alias of current_page_record_count. That
sounds like a good idea for 2.0.

Jeremy
Mark V
2008-04-23 01:17:27 UTC
Permalink
Post by Jeremy Evans
Post by samc
I certainly think it would be good to not get the count unless it is
really necessary, to save those oh so precious DB hits. I don't see
why it's necessary to do a separate count even if it is the last page.
But I'm very new to Sequel so am probably just not understanding the
way it has to work in order to try and be 'lazy'.
It's not necessary, and the SQL query to get the record count will
only be called if necessary in Sequel 2.0.
Post by samc
With enormous help from Manveru on #ramaze I've figured out what was
going wrong in my case to generate the failing SQL. My HTML template
instead then it doesn't trigger the problem.
That would explain the issue.
Post by samc
That said, this seems overly subtle and likely to trip people up. It
was suggested on #ramaze that paginate should return a duplicate
dataset with options set correctly, thus avoiding the possibility of
getting the error.
paginate does return a cloned dataset, with the limit and offset
already set. However, you were calling count on this cloned dataset,
which causes an issue. It may make sense for the pagination module to
override count to be an alias of current_page_record_count. That
sounds like a good idea for 2.0.
I've not delved into the paginate code, all I'd ask is in making any
changes, that the 'each_page' method continues to work :)
Cheers
Post by Jeremy Evans
Jeremy
sam carr
2008-04-23 08:53:48 UTC
Permalink
Post by Jeremy Evans
paginate does return a cloned dataset, with the limit and offset
already set. However, you were calling count on this cloned dataset,
which causes an issue. It may make sense for the pagination module to
override count to be an alias of current_page_record_count. That
sounds like a good idea for 2.0.
That alias sounds like the way forward. I look forward to 2.0 with bated
Jeremy Evans
2008-04-23 18:34:38 UTC
Permalink
Post by sam carr
Post by Jeremy Evans
paginate does return a cloned dataset, with the limit and offset
already set. However, you were calling count on this cloned dataset,
which causes an issue. It may make sense for the pagination module to
override count to be an alias of current_page_record_count. That
sounds like a good idea for 2.0.
That alias sounds like the way forward. I look forward to 2.0 with bated
breath!
I modified paginate and each_page so that raise an error if the
dataset already has a limit or offset. I think the alias will fix the
remaining issue, but as 1.5 is almost ready to be released, I don't
want to make that change yet.

Jeremy

Loading...