People Count Aggregation Algorithm
This document describes exact aggregation behavior.
It focuses on rules, boundary handling, and calculation flow.
Purpose
Aggregation answers one question:
Given event time range, assignments, sensor interval counts, resets, and fixed aggregation window size, what should cumulative area count be at each stored window?
Core Outcome
For each area, system stores a sequence of windows.
Each window has:
- start time
- end time
- cumulative count at end of that window
Count is cumulative, not per-window-only.
Terms
Window
Fixed aggregation period, such as 10:00-10:10.
Interval net
count_in - count_out
Flipped interval net
If assignment direction is flipped, interval net becomes negative of normal net.
Reset value
Explicit starting count that takes effect from reset timestamp forward.
Boundary Rules
These boundaries are important because many edge cases depend on them.
Event range
Aggregation only exists inside event range.
In practice, windows are generated while:
window_start < event_end
Window boundaries
Window start is inclusive.
Window end is exclusive.
Window rule is:
window.start <= ts_from < window.end
So:
- ts_from = window.start means interval is included in that window
- ts_from = window.end means interval is excluded from that window and belongs to next one
Assignment boundaries
Assignment start is inclusive.
Assignment end is exclusive.
Assignment rule is:
assignment.active_from <= ts_from < assignment.active_to
So interval starting exactly at active_from is eligible.
Interval starting exactly at active_to is not eligible.
Interval selection anchor
Interval inclusion is decided by ts_from.
System does not split one interval across multiple windows.
That means ts_to does not decide which window gets interval.
Only ts_from decides.
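A minimal sketch of the anchor rule, assuming windows are contiguous and sorted by start time. The function name `window_index` and its parameters are hypothetical and not taken from the implementation. It locates the single window that owns an interval using only `ts_from`; `ts_to` never appears.

```python
from bisect import bisect_right
from datetime import datetime

def window_index(window_starts, event_end, ts_from):
    """Return the index of the window that owns ts_from, or None if outside the event range.

    window_starts must be sorted; each window ends where the next one starts,
    and the last window ends at event_end.
    """
    if not window_starts or ts_from < window_starts[0] or ts_from >= event_end:
        return None
    # bisect_right puts ts_from after an equal start, so an interval starting
    # exactly on a boundary belongs to the window that begins there.
    return bisect_right(window_starts, ts_from) - 1

starts = [datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 10), datetime(2024, 1, 1, 10, 20)]
end = datetime(2024, 1, 1, 10, 30)
idx_inside = window_index(starts, end, datetime(2024, 1, 1, 10, 5))
idx_boundary = window_index(starts, end, datetime(2024, 1, 1, 10, 10))
idx_event_end = window_index(starts, end, end)
```

An interval running 10:05-10:15 therefore lands only in the 10:00-10:10 window, even though it extends past 10:10.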
High-Level Flow
Window Construction
Window size comes from PEOPLECOUNT_AGGREGATION_GRANULARITY.
Example with 10 minute windows:
- 10:00-10:10
- 10:10-10:20
- 10:20-10:30
Natural window end is:
min(window_start + granularity, event_end)
If reset falls inside that natural window, window is cut early so reset becomes boundary.
Pseudocode
current = event_start
while current < event_end:
    natural_end = min(current + granularity, event_end)
    reset_inside = first reset with timestamp between current and natural_end
    if reset_inside exists and reset_inside.timestamp > current:
        window_end = reset_inside.timestamp
    else:
        window_end = natural_end
    emit window(current, window_end, reset_value_if_exactly_at_current)
    current = window_end
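The pseudocode above can be turned into a runnable sketch. The function name `build_windows` and the representation of resets as a `{timestamp: value}` dict (one value per timestamp, already resolved) are assumptions for illustration. The sketch follows the pseudocode literally, so the window after a mid-window cut is measured from the reset timestamp; it does not settle whether the real implementation realigns to the original grid.

```python
from datetime import datetime, timedelta

def build_windows(event_start, event_end, granularity, resets):
    """Return a list of (start, end, reset_value_or_None) tuples.

    resets: dict mapping reset timestamp -> reset value (hypothetical shape).
    """
    reset_times = sorted(resets)
    windows = []
    current = event_start
    while current < event_end:
        natural_end = min(current + granularity, event_end)
        # First reset strictly inside the natural window, if any.
        inside = next((t for t in reset_times if current < t < natural_end), None)
        window_end = inside if inside is not None else natural_end
        # A reset exactly at the window start supplies the starting value.
        windows.append((current, window_end, resets.get(current)))
        current = window_end
    return windows

day = datetime(2024, 1, 1)
example_windows = build_windows(
    day.replace(hour=12, minute=50),
    day.replace(hour=13, minute=10),
    timedelta(minutes=10),
    {day.replace(hour=12, minute=50): 0, day.replace(hour=13, minute=5): 10},
)
```

With the event ending at 13:10, this produces 12:50-13:00, 13:00-13:05, and 13:05-13:10, with the last window carrying reset value 10.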
Reset Rules
There are three reset sources:
- Event start reset: always exists. Value is 0 at event start.
- Single reset: one-time reset at explicit timestamp.
- Recurring reset: repeating reset generated inside event range from local reset time and timezone.
Same-Timestamp Priority
If more than one reset lands at same timestamp, priority from highest to lowest is:
- single reset
- event start reset
- recurring reset
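The priority order can be sketched as a small resolver. The `ResetSource` enum, the `(timestamp, source, value)` tuple shape, and the function name are hypothetical; only the ordering itself comes from the list above. When two resets of the same source share a timestamp, this sketch keeps the first one seen, which is an assumption.

```python
from datetime import datetime
from enum import IntEnum

class ResetSource(IntEnum):
    # Lower number means higher priority.
    SINGLE = 0
    EVENT_START = 1
    RECURRING = 2

def resolve_resets(resets):
    """Collapse (timestamp, source, value) tuples to {timestamp: winning value}."""
    winners = {}
    for ts, source, value in resets:
        if ts not in winners or source < winners[ts][0]:
            winners[ts] = (source, value)
    return {ts: value for ts, (_, value) in winners.items()}

t0 = datetime(2024, 1, 1, 10, 0)
all_three = resolve_resets([
    (t0, ResetSource.RECURRING, 3),
    (t0, ResetSource.EVENT_START, 0),
    (t0, ResetSource.SINGLE, 7),
])
without_single = resolve_resets([
    (t0, ResetSource.RECURRING, 3),
    (t0, ResetSource.EVENT_START, 0),
])
```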
Reset Effect
If reset timestamp equals window start, that window starts from reset value.
If reset timestamp lands inside window, current window ends at reset time and next window starts from reset value.
Pseudocode
if reset exists exactly at window.start:
    window_start_value = reset.reset_value
else:
    window_start_value = previous_window_count
Assignment Eligibility
Assignment is first checked at window level.
Assignment is considered active for window when it overlaps that window.
Practical overlap check:
assignment.active_from <= window.end
assignment.active_to >= window.start
This is only candidate selection.
Actual interval inclusion is stricter.
In other words:
- overlap says assignment is worth checking
- interval rule decides whether each interval is actually counted
Authoritative Interval Inclusion Rule
An interval is included only when all conditions are true:
- interval.ts_from >= window.start
- interval.ts_from < window.end
- interval.ts_from >= assignment.active_from
- interval.ts_from < assignment.active_to
Consequences:
- Pre-assignment sensor data is ignored.
- Interval at exact assignment end is ignored.
- Missing intervals contribute zero.
- One interval is counted in at most one window.
Short version:
window.start <= ts_from < window.end and assignment.active_from <= ts_from < assignment.active_to
Pseudocode
include interval when:
    interval.ts_from >= window.start
    and interval.ts_from < window.end
    and interval.ts_from >= assignment.active_from
    and interval.ts_from < assignment.active_to
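The two-stage check (loose window-level overlap, then the strict per-interval rule) can be sketched side by side. The function names and the `at` helper are hypothetical. The example picks the case where the two stages disagree: an assignment that ends exactly where the window begins passes candidate selection, yet no interval from it can be included.

```python
from datetime import datetime

def is_candidate(active_from, active_to, window_start, window_end):
    """Loose window-level overlap check; touching boundaries still count."""
    return active_from <= window_end and active_to >= window_start

def is_included(ts_from, active_from, active_to, window_start, window_end):
    """Strict rule: both ranges are half-open and keyed on ts_from only."""
    return (window_start <= ts_from < window_end
            and active_from <= ts_from < active_to)

def at(hour, minute):
    # Hypothetical helper: a timestamp on an arbitrary fixed day.
    return datetime(2024, 1, 1, hour, minute)

window = (at(10, 20), at(10, 30))
assignment = (at(10, 0), at(10, 20))  # ends exactly at window start
candidate = is_candidate(*assignment, *window)
included = is_included(at(10, 20), *assignment, *window)
```

Here `candidate` is true while `included` is false, which is why the overlap check is only a pre-filter.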
Net Contribution Rule
For each included interval:
- compute base net = count_in - count_out
- if assignment is flipped, multiply by -1
- add result to window net
Pseudocode
window_net = 0
for each active assignment:
    for each eligible interval:
        net = interval.count_in - interval.count_out
        if assignment.direction_flipped:
            net = -net
        window_net += net
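A runnable version of the net calculation, as a sketch. The `Interval` and `Assignment` dataclasses and the idea that each assignment carries its own interval list are assumptions made for illustration; the field names mirror the ones used in this document.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Interval:
    ts_from: datetime
    count_in: int
    count_out: int

@dataclass
class Assignment:
    active_from: datetime
    active_to: datetime
    direction_flipped: bool = False
    intervals: list = field(default_factory=list)

def window_net(window_start, window_end, assignments):
    """Sum signed nets of every interval that passes the strict inclusion rule."""
    total = 0
    for a in assignments:
        for iv in a.intervals:
            eligible = (window_start <= iv.ts_from < window_end
                        and a.active_from <= iv.ts_from < a.active_to)
            if not eligible:
                continue
            net = iv.count_in - iv.count_out
            total += -net if a.direction_flipped else net
    return total

w_start, w_end = datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 10)
normal = Assignment(w_start, w_end, False, [Interval(w_start, 10, 5)])
flipped = Assignment(w_start, w_end, True, [Interval(w_start, 3, 1)])
net_both = window_net(w_start, w_end, [normal, flipped])  # +5 and -2
```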
Cumulative Count Rule
Windows are processed in chronological order.
For each window:
- determine window start value
- calculate window net
- add window net to start value
- store result as cumulative count for that window
Pseudocode
current_count = previous_stable_count
for each window in time order:
    if reset exists at window.start:
        start_value = reset.reset_value
    else:
        start_value = current_count
    current_count = start_value + window_net
    store(window.start, window.end, current_count)
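The carry-forward loop can be sketched in a few lines. The tuple shape `(start, end, reset_value_or_None, window_net)`, the name `cumulative_counts`, and the example nets (+3 and +2) are hypothetical; the starting count of 42 and the reset to 10 echo the mid-window reset scenario used elsewhere in this document.

```python
def cumulative_counts(windows, seed_count=0):
    """Return (start, end, cumulative_count) rows for windows given in time order.

    windows: iterable of (start, end, reset_value_or_None, window_net).
    seed_count: count carried in from the last stable window before the first one.
    """
    rows = []
    current = seed_count
    for start, end, reset_value, net in windows:
        # A reset at the window start replaces the carried-over count.
        start_value = reset_value if reset_value is not None else current
        current = start_value + net
        rows.append((start, end, current))
    return rows

rows = cumulative_counts(
    [("13:00", "13:05", None, 3), ("13:05", "13:10", 10, 2)],
    seed_count=42,
)
```

The first window ends at 45 (42 carried in plus 3), and the second ends at 12 (reset to 10 plus 2). Checking `reset_value is not None` rather than truthiness matters because a reset to 0 is valid.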
Re-Aggregation Rules
System does not rebuild everything on every run.
It first checks whether stored rows are still valid.
Checksum invalidation
Stored rows include checksum of area configuration used when they were created.
If current checksum does not match stored checksum, those rows are deleted.
This protects aggregation from stale results after changes such as:
- assignment timing changes
- assignment direction flip changes
- event start or end changes
- reset changes
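The document does not specify how the checksum is computed, so the following is only one plausible sketch: hash a canonical JSON form of the area configuration. The function name, the choice of SHA-256, and the config fields shown are all assumptions.

```python
import hashlib
import json

def config_checksum(config: dict) -> str:
    """Hash a canonical JSON rendering so key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical configuration shapes, for illustration only.
stored = {"event_end": "10:30", "assignments": [{"sensor": "s1", "flipped": False}]}
reordered = {"assignments": [{"flipped": False, "sensor": "s1"}], "event_end": "10:30"}
flipped = {"event_end": "10:30", "assignments": [{"sensor": "s1", "flipped": True}]}

rows_still_valid = config_checksum(stored) == config_checksum(reordered)
rows_stale = config_checksum(stored) != config_checksum(flipped)
```

Any change to a field that feeds the hash, such as a direction flip, yields a different checksum and would mark the stored rows for deletion.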
Window size invalidation
System examines existing stored window lengths.
If median stored window size differs from current configured size, all stored rows for that area are deleted.
This protects history from being mixed across different aggregation step sizes.
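A sketch of the median check, assuming stored windows are available as `(start, end)` pairs and that the comparison is exact. The function name and the exact-equality choice are assumptions. Using the median means a few shortened windows (from resets or the event end) do not trigger invalidation.

```python
from datetime import datetime, timedelta
from statistics import median

def window_size_changed(stored_windows, configured):
    """True when the median stored window length differs from the configured size."""
    if not stored_windows:
        return False  # nothing stored, nothing to invalidate
    lengths = [(end - start).total_seconds() for start, end in stored_windows]
    return median(lengths) != configured.total_seconds()

base = datetime(2024, 1, 1, 13, 0)
stored = [
    (base - timedelta(minutes=20), base - timedelta(minutes=10)),
    (base - timedelta(minutes=10), base),
    (base, base + timedelta(minutes=5)),                          # cut short by a reset
    (base + timedelta(minutes=5), base + timedelta(minutes=10)),  # remainder after reset
    (base + timedelta(minutes=10), base + timedelta(minutes=20)),
]
unchanged = window_size_changed(stored, timedelta(minutes=10))
changed = window_size_changed(stored, timedelta(minutes=5))
```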
Incremental tail recalculation
After invalidation checks, system recalculates only tail portion of history.
Current behavior:
- older stable windows are kept
- recalculation starts around second-last stored window
- seed count comes from around third-last stored window
Reason is practical: recent windows are most likely to change because data may arrive late or current period may still be incomplete.
Edge Cases
1. Sensor has pre-assignment data
Example:
- assignment starts at 10:05
- interval 10:00-10:10 exists
Result:
Interval is ignored because inclusion is based on ts_from, and 10:00 < 10:05.
2. Assignment starts inside window
Example:
- window 10:00-10:10
- assignment starts at 10:07
- interval starts at 10:00
Result:
Interval is ignored.
Even though assignment overlaps window, interval start happened before assignment start.
3. Assignment ends inside window
Example:
- window 10:20-10:30
- assignment ends at 10:25
- interval starts at 10:20
Result:
Interval counts because interval starts before assignment end.
If another interval starts exactly at 10:25, it is ignored.
4. Reset lands inside natural window
Example:
- natural window 13:00-13:10
- reset at 13:05
Result:
System splits into:
- 13:00-13:05
- 13:05-13:10
Second window starts from reset value.
5. Reset at event start
Event start already creates implicit reset to 0.
If single reset exists at exact event start, single reset wins.
If recurring reset also lands at exact event start, event start wins over recurring reset.
6. Recurring reset with timezone
Recurring reset is defined in local timezone, but generated occurrences are converted to UTC before aggregation.
This means reset happens at expected local wall-clock time even when stored and processed in UTC.
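A sketch of occurrence generation, assuming the recurrence is daily (the document only says "repeating") and that the event range is given as timezone-aware UTC datetimes. The function name and parameters are hypothetical. In practice `tz` would typically be a `zoneinfo.ZoneInfo` such as `ZoneInfo("Europe/Berlin")` so daylight-saving shifts are handled; the example uses a fixed UTC+2 offset so it runs without a timezone database.

```python
from datetime import datetime, time, timedelta, timezone

def recurring_resets_utc(event_start_utc, event_end_utc, local_time, tz):
    """Return UTC timestamps of a daily local-time reset that fall inside the event range."""
    day = event_start_utc.astimezone(tz).date()
    last_day = event_end_utc.astimezone(tz).date()
    occurrences = []
    while day <= last_day:
        # Build the wall-clock time in the local zone, then convert to UTC.
        local_dt = datetime.combine(day, local_time, tzinfo=tz)
        utc_dt = local_dt.astimezone(timezone.utc)
        if event_start_utc <= utc_dt < event_end_utc:
            occurrences.append(utc_dt)
        day += timedelta(days=1)
    return occurrences

local_zone = timezone(timedelta(hours=2))  # stand-in for a real ZoneInfo
resets_utc = recurring_resets_utc(
    datetime(2024, 6, 1, tzinfo=timezone.utc),
    datetime(2024, 6, 3, tzinfo=timezone.utc),
    time(6, 0),
    local_zone,
)
```

A 06:00 local reset in a UTC+2 zone is emitted as 04:00 UTC on each day inside the event range.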
7. Sparse or missing sensor data
If no eligible interval exists in a window, window net is 0.
Stored cumulative count simply carries forward unchanged.
8. Multiple sensors on same area
Each active assignment contributes independently.
Final window net is sum of all included interval nets across all active assignments.
Worked Examples
Example A: Basic assignment boundary
Given:
- window size 10 minutes
- event 10:00-10:30
- assignment 10:05-10:25
- interval nets:
  - 10:00-10:10 -> +5
  - 10:10-10:20 -> +6
  - 10:20-10:30 -> +2
  - 10:25-10:35 -> +9
Result:
- 10:00-10:10 -> net 0, cumulative 0
- 10:10-10:20 -> net +6, cumulative 6
- 10:20-10:30 -> net +2, cumulative 8
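Example A can be reproduced with a short script. The helper `t` and the tuple layouts are hypothetical; the numbers are the ones given above. The +5 interval is dropped because its ts_from (10:00) precedes the assignment start, and the +9 interval is dropped because its ts_from (10:25) equals the exclusive assignment end.

```python
from datetime import datetime

def t(hhmm):
    # Hypothetical helper: parse a clock time onto an arbitrary fixed day.
    return datetime.strptime("2024-01-01 " + hhmm, "%Y-%m-%d %H:%M")

windows = [(t("10:00"), t("10:10")), (t("10:10"), t("10:20")), (t("10:20"), t("10:30"))]
active_from, active_to = t("10:05"), t("10:25")
# (ts_from, net) pairs; ts_to is irrelevant to inclusion.
intervals = [(t("10:00"), 5), (t("10:10"), 6), (t("10:20"), 2), (t("10:25"), 9)]

results = []
count = 0  # event start reset
for w_start, w_end in windows:
    net = sum(
        n for ts_from, n in intervals
        if w_start <= ts_from < w_end and active_from <= ts_from < active_to
    )
    count += net
    results.append((w_start.strftime("%H:%M"), net, count))
```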
Example B: Flipped direction
Given one included interval with:
- count_in = 10
- count_out = 5
- assignment flipped
Normal net would be +5.
Flipped net becomes -5.
Example C: Mid-window reset
Given:
- window size 10 minutes
- natural window 13:00-13:10
- current count before window is 42
- reset to 10 at 13:05
Result:
- 13:00-13:05 starts from 42
- 13:05-13:10 starts from 10
Those are two separate windows with separate cumulative outcomes.
Non-Authoritative Helper Logic
Some helper/debug views use simpler calculations for diagnostics.
Those helpers are not part of the aggregation algorithm.
In particular, debug count helpers may ignore:
- direction flips
- assignment active periods
So they should not be treated as authoritative occupancy history.
Summary
Authoritative aggregation logic is built on five strict ideas:
- fixed event-bounded windows
- reset-aware window splitting
- strict interval inclusion by ts_from
- assignment-aware and direction-aware net calculation
- cumulative count carry-forward across windows
If those five ideas are preserved, behavior stays aligned with current implementation.