Window Types
In RioDB, ALL window implementations are sliding windows. RioDB does not implement tumbling windows, because tumbling windows are less precise.
That said, several window variations are available. Windows can differ in their range, their DataType, and their source. They can also optionally be conditional and partitioned.
Window Range
The range of elements in a window can be either fixed-size or time-based.
For example:
- Fixed-Size: The past 10,000 bids on the stock TSLA. Or the bids ranging from the 10,000th most recent to the 2,000th most recent.
See Fixed-Size Windows.
- Time-Based: All bids on the stock TSLA during the past 10 minutes. Or all bids from 10 minutes ago until 2 minutes ago.
See Time-based Windows.
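The two kinds of range can be illustrated with a minimal Python sketch. This is not RioDB syntax or RioDB's implementation; the class names, method names, and sample data are hypothetical, and the time-based version assumes messages arrive in timestamp order and only evicts old elements when a new one arrives.

```python
from collections import deque


class FixedSizeWindow:
    """Keeps the most recent `size` values (hypothetical helper, not RioDB code)."""

    def __init__(self, size):
        self.values = deque(maxlen=size)  # the oldest value is evicted automatically

    def add(self, value):
        self.values.append(value)

    def average(self):
        return sum(self.values) / len(self.values)


class TimeBasedWindow:
    """Keeps values whose timestamp falls within the last `span_seconds`."""

    def __init__(self, span_seconds):
        self.span = span_seconds
        self.items = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        self.items.append((timestamp, value))
        # Evict everything that is older than the window span.
        while self.items[0][0] <= timestamp - self.span:
            self.items.popleft()

    def average(self):
        return sum(value for _, value in self.items) / len(self.items)


# Fixed-size range: the last 3 bids.
last_three = FixedSizeWindow(size=3)
for bid in [10.0, 20.0, 30.0, 40.0]:
    last_three.add(bid)
print(last_three.average())  # 30.0, the average of 20, 30 and 40

# Time-based range: bids from the last 10 minutes (600 seconds).
last_ten_minutes = TimeBasedWindow(span_seconds=600)
for ts, bid in [(0, 10.0), (300, 20.0), (700, 30.0)]:
    last_ten_minutes.add(ts, bid)
print(last_ten_minutes.average())  # 25.0, because the bid at t=0 has expired
```

In both cases the window slides with every new element: each arrival may push an old element out, so the aggregate always reflects the current range.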
Window DataType
Each window can only track stats for ONE SINGLE field, and this field can be:
- Numeric (like bid_price, kilometers_per_hour, request_response_time, timestamp_epoch), or
- String (like stock_symbol, error_code, requester_ip_address, email_address).
Normally, you do not need to specify the DataType of a window. RioDB identifies the DataType automatically based on the stream message value that you choose for the window. However, the DataType does determine which aggregations are available.
Note that while windows of String values support some aggregations (COUNT, COUNT_DISTINCT, MAX, MIN), they do NOT support mathematical aggregations such as AVERAGE, SUM, VARIANCE, and SLOPE.
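As a rough illustration of why the DataType matters, the following Python sketch computes only the aggregations that make sense for the element type. The `aggregate` function is hypothetical and not part of RioDB, and the lexicographic ordering used here for the MIN and MAX of strings is an assumption of the sketch.

```python
import statistics


def aggregate(values):
    """Return the aggregations that are meaningful for the element type."""
    stats = {
        "count": len(values),
        "count_distinct": len(set(values)),
        "min": min(values),  # lexicographic order for strings (assumption)
        "max": max(values),
    }
    # Arithmetic aggregations are only defined for numeric values.
    if all(isinstance(v, (int, float)) for v in values):
        stats["sum"] = sum(values)
        stats["average"] = statistics.fmean(values)
    return stats


numeric_stats = aggregate([3, 1, 2, 2])
string_stats = aggregate(["TSLA", "GOOG", "TSLA"])

print(numeric_stats["average"])   # 2.0
print("sum" in string_stats)      # False: no SUM for a window of strings
print(string_stats["count_distinct"])  # 2
```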
Window Source: Stream Field vs Expression
Usually, a window is populated with a value taken directly from one of the stream message fields. For example, if the message has the field “bid”, you can create a window to run stats using “bid”. However, a window can also be populated with values computed by an expression, such as a function call. For example, suppose the stream provides a “speed” field with values in miles-per-hour, but you want to populate the window with values converted into meters-per-second. An expression can do the conversion before each value enters the window. Expressions are also useful when the value you need is not in the message at all: if the stream messages provide GPS x,y coordinates, but you want the window to aggregate stats on the distance traveled between messages, an expression can compute that distance.
See From Expression.
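The two expression examples above can be sketched in Python as follows. This is only an illustration of the idea, not RioDB syntax; the message layout and variable names are hypothetical, and the GPS coordinates are treated as points on a flat plane for simplicity.

```python
import math
from collections import deque

MPH_TO_MPS = 0.44704  # one mile-per-hour expressed in meters-per-second

# Window populated by a unit-conversion expression instead of the raw field.
speed_window = deque(maxlen=3)
for message in [{"speed": 10.0}, {"speed": 20.0}, {"speed": 30.0}, {"speed": 40.0}]:
    speed_window.append(message["speed"] * MPH_TO_MPS)

# Window populated by an expression that needs the previous message:
# the distance traveled between consecutive x,y positions.
distance_window = deque(maxlen=100)
previous = None
for message in [{"x": 0.0, "y": 0.0}, {"x": 3.0, "y": 4.0}, {"x": 3.0, "y": 10.0}]:
    point = (message["x"], message["y"])
    if previous is not None:
        distance_window.append(math.dist(previous, point))
    previous = point

print(len(speed_window))     # 3 converted speeds are kept
print(sum(distance_window))  # total distance traveled: 5 + 6
```

Either way, the window itself only ever sees the computed value, so every aggregation runs on the converted or derived numbers.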
Conditional Elements
You don’t have to push values from every message into a window. The window can be populated with filtered values.
For example:
- Only populate the window with response_time from POST requests, ignoring GET requests.
- Only populate the window with bids on stock TSLA, ignoring all other stocks.
- Only populate the window with error_code from production servers, ignoring test servers.
See Conditional Windows.
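Conceptually, a conditional window is a filter placed in front of the window. A minimal Python sketch, using hypothetical field names and a hypothetical `on_message` handler rather than RioDB syntax:

```python
from collections import deque

response_times = deque(maxlen=1000)  # window over the last 1000 matching values


def on_message(message):
    # Only POST requests feed the window; every other message is ignored.
    if message["method"] == "POST":
        response_times.append(message["response_time"])


for msg in [
    {"method": "GET", "response_time": 12},
    {"method": "POST", "response_time": 80},
    {"method": "POST", "response_time": 120},
    {"method": "GET", "response_time": 9},
]:
    on_message(msg)

print(sum(response_times) / len(response_times))  # 100.0
```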
Partitioned Windows
Suppose a stream message contains a stock SYMBOL and a stock PRICE, and you want a window to track running aggregations for each stock SYMBOL, such as the average price of TSLA, the average price of AAPL, the average price of GOOG, and so on. Instead of manually defining a window for each stock symbol, you can define one window that is partitioned using the stock symbol as the key. With this, you instantiate the window only once and let RioDB track the stats for each SYMBOL for you. Other examples:
- Aggregate an API request statistic for each requester_ip_address.
- Aggregate web site clickstream statistics for each session_id.
- Aggregate videogame statistics for each player.
Partitioned windows allow you to define the window once, and let RioDB keep track of many keys behind the scenes.
See Partitioned Windows.
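One way to picture a partitioned window is as a map from key to window, where a new window is created the first time a key is seen. The sketch below is a Python illustration under that assumption; it is not how RioDB is necessarily implemented, and all names and sample prices are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SIZE = 2  # fixed-size range: the last 2 prices per symbol

# One window definition; a separate deque is created lazily for each key.
windows = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))


def on_message(symbol, price):
    """Route the price to its symbol's window and return the running average."""
    window = windows[symbol]
    window.append(price)
    return sum(window) / len(window)


ticks = [("TSLA", 100.0), ("GOOG", 50.0), ("TSLA", 110.0), ("TSLA", 130.0)]
for symbol, price in ticks:
    on_message(symbol, price)

averages = {symbol: sum(w) / len(w) for symbol, w in windows.items()}
print(averages)  # {'TSLA': 120.0, 'GOOG': 50.0}
```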
Conclusion
Note that ALL of the variations above can be combined. For example, you could define a single window that has a time-based range, tracks numeric computed values, accepts only stream records that match certain conditions, and is partitioned by a key field from the stream record.
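To make the combination concrete, here is a small Python sketch that stacks all four variations in one handler. It is an illustration only, not RioDB syntax; the field names (`ts`, `symbol`, `side`, `price`, `quantity`) and the computed value are hypothetical, and each partition is assumed to receive messages in timestamp order, with expired elements removed only when that partition gets a new message.

```python
from collections import defaultdict, deque

SPAN_SECONDS = 600  # time-based range: the last 10 minutes

# Partitioned: one deque of (timestamp, value) pairs per stock symbol.
windows = defaultdict(deque)


def on_message(msg):
    # Conditional: only bids are allowed into the window.
    if msg["side"] != "bid":
        return
    window = windows[msg["symbol"]]
    # Expression: store a computed numeric value, not a raw field.
    window.append((msg["ts"], msg["price"] * msg["quantity"]))
    # Time-based range: drop elements older than the span.
    while window[0][0] <= msg["ts"] - SPAN_SECONDS:
        window.popleft()


messages = [
    {"ts": 0, "symbol": "TSLA", "side": "bid", "price": 10.0, "quantity": 2},
    {"ts": 100, "symbol": "TSLA", "side": "ask", "price": 11.0, "quantity": 5},
    {"ts": 400, "symbol": "GOOG", "side": "bid", "price": 50.0, "quantity": 1},
    {"ts": 700, "symbol": "TSLA", "side": "bid", "price": 12.0, "quantity": 3},
]
for msg in messages:
    on_message(msg)

print(list(windows["TSLA"]))  # [(700, 36.0)]
print(list(windows["GOOG"]))  # [(400, 50.0)]
```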