I was debugging a reporting module that I wrote. The database query looks like the following:
select units_sold, segment, substring(transacted_on, 1, 7) as transacted_on from sales s where foo = 'bar' group by segment, transacted_on
The sales table contains multiple records for a given month. transacted_on is a timestamp field, so we group by month (MySQL date format) so we can have the sales figures by segment and month. Based on the above query, the results were off by significant values, it was as if the entries at the end of the table were not being analyzed.
I checked the SQL query (there were joins, and where clauses that have been removed) the whole morning. It was thoroughly unproductive, because the data was correct, indeed. I wondered if I was dead tired or hungover from Sunday. Or that somehow this bug had slipped into MySQL which produced incorrect results for me, even though the query was absolutely correct.
It turns out that the query was wrong. MySQL developers: you are forgiven, hehe. I am thoroughly embarrassed of course.
select sum(units_sold) as units_sold, segment, substring(transacted_on, 1, 7) as transacted_on from sales s where foo = 'bar' group by segment, transacted_on
wrapping units_sold in the aggregate function sum() solved the problem immediately. If you’re like me, wondering why the results don’t turn out correctly when you’ve have group by clause, just look at the select clause. If sum(), count() and the like are not there, chances are the query is not correct.