Monthly Archives: June 2017

MySQL and ON DUPLICATE KEY UPDATE

I’ve been working on some MySQL stuff recently, specifically storing a snapshot of the price of bitcoin every 5 minutes from an online exchange.  As you can imagine, the amount of data this can generate is quite large, and becomes more than I need.  I mean, I don’t care what the price for Bitcoin was exactly 3 weeks, 2 days, 11 hours and 20 minutes ago was.  I do care about averages and trends though.

So, I decided to aggregate my data to hourly averages, along with a min/max for each hour.  Then I can delete any of the 5 minute interval data older than a week or so, and use the aggregated data to do my analysis and graphs.

In order to do this I wanted to run a query that would average all the entries within the current hour, and then update a row with these updated averages, along with min/max values.  I also wanted the query to insert a new row if it was a new hour.  After a little research I knew I needed to figure out INSERT INTO… ON DUPLICATE KEY UPDATE as my pattern.

Everything seemed to be going great, until I needed to store calculated fields in an updated row.  I couldn’t get the SQL statement to actually update the row.

It turns out, when you use a SELECT to generate the data to be INSERTed, MySQL creates a temporary table with the data in it, and I couldn’t find the name of the table, so trying to update the values with entries from the SELECT would fail with ‘No such column’ error.

I fixed this by storing the calculated results in specific variables, and in turn using those variables to update the existing row.

INSERT INTO tickerHourly
 (ID, time, product_id, priceAvg, priceMax, priceMin, volumeAvg, counter)
 SELECT
 (select ID from tickerHourly WHERE product_id = ticker.product_id 
 AND hour(time) = hour(UTC_TIMESTAMP()) and date(time) = date(UTC_TIMESTAMP())),
 FROM_UNIXTIME(60*60*ROUND(UNIX_TIMESTAMP(time)/(60*60))),
 product_id,
 @avg := AVG(price),
 @max := MAX(price),
 @min := MIN(price),
 @volumeAvg := AVG(volume),
 @counter := count(*)
 FROM ticker where hour(time) = hour(UTC_TIMESTAMP())
 AND date(time) = date(UTC_TIMESTAMP())
 AND product_id = ?
 group by date(time),hour(time),product_id
 ON DUPLICATE KEY UPDATE
 tickerHourly.counter = @counter,
 tickerHourly.priceAvg = @avg,
 tickerHourly.priceMax = @max,
 tickerHourly.priceMin = @min,
 tickerHourly.volumeAvg = @volumeAvg;

So the line ‘@avg := AVG(price)’ takes the average of all the price entries and stores it in the local variable avg, which I then use as part of the UPDATE clause.  This was my missing information.