The Continuous Query in TDengine

TDengine is a time-series database designed and optimized for IoT. You can get it on GitHub. This blog will the Continuous Query feature in TDengine.

Continuous Query is a query executed by TDengine periodically with a sliding window, it is a simplified stream computing driven by timers. Continuous query can be applied to a table or a STable automatically and periodically, and the result set can be passed to the application directly via call back function, or written into a new table in TDengine. The query is always executed on a specified time window (window size is specified by parameter interval), and this window slides forward while time flows (the sliding period is specified by parameter sliding).

Continuous query of TDengine adopts time-driven mode, which can be defined directly by TAOS SQL without additional operation. Using continuous query, results can be generated conveniently and quickly according to the time window, thus down sampling the original collected data. After the user defines a continuous query through TAOS SQL, TDengine automatically pulls up the query at the end of the last complete time period and pushes the calculated results to the user or writes them back to TDengine.

The continuous query provided by TDengine differs from the time window calculation in ordinary stream computing in the following ways:

  • Unlike the real-time feedback calculated results of stream computing, continuous query only starts calculation after the time window is closed. For example, if the time period is 1 day, the results of that day will only be generated after 23:59:59.
  • If a history record is written to the time interval that has been calculated, the continuous query will not recalculate and will not push the results to the user again. For the mode of writing back to TDengine, the existing calculated results will not be updated.
  • Using the mode of continuous query pushing results, the server does not cache the client’s calculation status, nor does it provide Exactly-Once semantic guarantee. If the user’s application side crashed, the continuous query pulled up again would only recalculate the latest complete time window from the time pulled up again. If writeback mode is used, TDengine can ensure the validity and continuity of data writeback.

How to use continuous query

  • create table meters (ts timestamp, current float, voltage int, phase float) tags (location binary(64), groupId int);
  • create table D1001 using meters tags ("Beijing.Chaoyang", 2);
  • create table D1002 using meters tags ("Beijing.Haidian", 2);
  • ...

We already know that the average voltage of these meters can be counted with one minute as the time window and 30 seconds as the forward increment through the following SQL statement.

  • select avg(voltage) from meters interval(1m) sliding(30s);

Every time this statement is executed, all data will be recalculated. If you need to execute every 30 seconds to incrementally calculate the data of the latest minute, you can improve the above statement as following, using a different startTime each time and executing it regularly:

  • select avg(voltage) from meters where ts > {startTime} interval(1m) sliding(30s);

There is no problem with this, but TDengine provides a simpler method, just add create table {tableName} as before the initial query statement, for example:

  • create table avg_vol as select avg(voltage) from meters interval(1m) sliding(30s);

A new table named avg_vol will be automatically created, and then every 30 seconds, TDengine will incrementally execute the SQL statement after as and write the query result into this table. The user program only needs to query the data from avg_vol. For example:

It should be noted that the minimum value of the query time window is 10 milliseconds, and there is no upper limit of the time window range.

In addition, TDengine also supports users to specify the starting and ending times of a continuous query. If the start time is not entered, the continuous query will start from the time window where the first original data is located; If no end time is entered, the continuous query will run permanently; If the user specifies an end time, the continuous query stops running after the system time reaches the specified time. For example, a continuous query created with the following SQL will run for one hour and then automatically stop.

  • create table avg_vol as select avg(voltage) from meters where ts > now and ts <= now + 1h interval(1m) sliding(30s);

It should be noted that now in the above example refers to the time when continuous queries are created, not the time when queries are executed, otherwise, queries cannot be stopped automatically. In addition, in order to avoid the problems caused by delayed writing of original data as much as possible, there is a certain delay in the calculation of continuous queries in TDengine. In other words, after a time window has passed, TDengine will not immediately calculate the data of this window, so it will take a while (usually not more than 1 minute) to find the calculation result.

Manage the Continuous Query

TAOS Data's flagship product, TDengine, is an open-source AIoT big data platform GitHub: https://github.com/taosdata/TDengine