
Buckets

Real-time, code-defined membership groups. A user joins the moment their data matches, leaves when it stops — every join or leave fires an event that can trigger a journey, and the bucket itself carries colocated reactions and live member access.

What a bucket is

A bucket is a named group of users, defined in code, whose membership is computed in real time. You declare a predicate — "used the key feature 10+ times in the last 30 days," "on trial with 3 days left and not yet converted," "went quiet for 7 days" — and Hogsend keeps the membership current as events flow in. A user joins the moment their data satisfies the predicate and leaves the moment it stops.

Buckets are a peer primitive to journeys. Where a journey is a durable flow you author with defineJourney(), a bucket is a declarative membership you author with defineBucket() — same authoring shape, same client/worker wiring, same "this is your content, the engine never imports it" boundary. A bucket's core is still its criteria: there's no top-level run function the way a journey has one. But the object defineBucket() returns isn't only criteria — it's the membership plus a small, cohesive surface bolted onto it. Once you've declared who's in a state, three things hang off the same definition:

  1. Bind a journey to the transition with a typed ref (bucket.entered / bucket.left).
  2. Attach a colocated reaction to a transition with bucket.on("enter" | "leave" | "dwell", …).
  3. Query the membership directly with bucket.count() / bucket.has() / bucket.members().

The rest of this page walks those three in turn, after the transition model they all share.

"Segment" is only a discoverability synonym — the title and glossary keep it so you can find this page. Everywhere else we say Buckets, deliberately: they're a real-time orchestration primitive for journeys, not a CDP audience-sync surface. More on that below.

Join and leave fire an event

The thing that makes buckets useful is what happens on a transition. When a user joins or leaves a bucket, Hogsend emits a first-class event through the same ingestion spine every other event travels:

  • Join emits bucket:entered:<id> — e.g. bucket:entered:power-users
  • Leave emits bucket:left:<id> — e.g. bucket:left:power-users

These are real events. They land in user_events, they get pushed to Hatchet, and they run through the exit-condition check — exactly like a feature.used or a payment_failed. That has one powerful consequence:

A membership change can directly trigger a journey. A journey whose trigger.event is bucket:entered:power-users starts when — and only when — a user joins that bucket. The journey side needs zero special wiring: Hatchet routes the transition event to the journey by exact event-name match, the same way it routes everything else. And because bucket:left:<id> flows through the exit path too, a journey can list exitOn: [{ event: "bucket:left:power-users" }] and a user automatically exits the moment they no longer qualify. (You won't usually hand-write those strings — section 1 shows the typed refs that do it for you.)

So the mental model is a clean two-step composition:

  1. A bucket answers "who is in this state right now?"
  2. A journey answers "what do we do when someone enters or leaves that state?"

Transitions fire only on a change — never per evaluation, never for a user who's already a stable member. A user who satisfies the criteria on every event for a month produces exactly one bucket:entered, not thousands.
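To make "fires only on a change" concrete, here is a minimal sketch of edge-triggered membership. It is an in-memory model, not the engine's storage or API: `applyEvaluation` and the `Set` of member ids are hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory model of edge-triggered membership.
// The real engine persists membership; this only shows the transition rule.
function applyEvaluation(
  members: Set<string>,
  bucketId: string,
  userId: string,
  matches: boolean,
): string | null {
  const wasMember = members.has(userId);
  if (matches && !wasMember) {
    members.add(userId);
    return `bucket:entered:${bucketId}`; // boundary crossed: join
  }
  if (!matches && wasMember) {
    members.delete(userId);
    return `bucket:left:${bucketId}`; // boundary crossed: leave
  }
  return null; // stable member or stable non-member: nothing is emitted
}

// Six evaluations for one user, but only three membership changes.
const members = new Set<string>();
const emitted: string[] = [];
for (const matches of [true, true, true, false, false, true]) {
  const transition = applyEvaluation(members, "power-users", "user_1", matches);
  if (transition !== null) emitted.push(transition);
}
```

The repeated `true` and `false` evaluations return `null`, which is the property that keeps a month of matching events down to a single bucket:entered.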

1. Bind a journey to the transition

The most direct way to act on a membership change is to trigger (or exit) a separate journey on the transition event. The bucket object carries the two event names as typed refs so a typo can't silently break the wire:

// power-users.ts (the bucket) exports the bucket object…
export const powerUsers = defineBucket({ meta: { id: "power-users", /* … */ } });

// …and a journey reads the literal-typed transition refs off it:
import { powerUsers } from "../buckets/power-users.js";

trigger: { event: powerUsers.entered }, // "bucket:entered:power-users"
exitOn:  [{ event: powerUsers.left }],  // "bucket:left:power-users"

bucket.entered is the literal type `bucket:entered:${Id}` and bucket.left is `bucket:left:${Id}`, both derived synchronously from the bucket's own id when defineBucket() runs. They're plain strings at runtime — byte-identical to the events the engine emits — but TypeScript ties them to the bucket, so you can't reference a bucket that doesn't exist or a transition you spelled wrong. (trigger.event is typed string, so a bare string literal would compile and silently route nothing; the ref makes the wire checkable.)
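For readers curious how a literal-typed ref can be derived from an id, the following is a rough sketch using TypeScript template literal types. `transitionRefs` is a hypothetical helper written for this page; it is a guess at the shape, not the code inside defineBucket().

```typescript
// Hypothetical helper: derive literal-typed transition names from a bucket id.
// The generic parameter captures the id as a string literal type.
function transitionRefs<Id extends string>(id: Id) {
  return {
    entered: `bucket:entered:${id}` as `bucket:entered:${Id}`,
    left: `bucket:left:${id}` as `bucket:left:${Id}`,
  };
}

const refs = transitionRefs("power-users");

// The ref is a plain string at runtime but a literal type at compile time,
// so it can be assigned where that exact event name is required.
const trigger: { event: "bucket:entered:power-users" } = { event: refs.entered };

// A misspelled target would fail to compile:
// const bad: { event: "bucket:entered:power-user" } = { event: refs.entered };
```

The commented-out line is the case the typed ref protects against: with a bare string the mismatch would compile and simply never route.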

The typed refs replace the old string helpers bucketEntered("id") / bucketLeft("id") and the hand-maintained BucketId union. Those still ship — deprecated for one release, then removed — so existing apps don't break, but new code should reach for bucket.entered / bucket.left. The generic, any-bucket events Events.BUCKET_ENTERED / Events.BUCKET_LEFT are unchanged: use them when a journey should fire on a transition from any bucket and branch on the payload, rather than binding to one specific bucket.

This is the right tool when the reaction is a real, ongoing flow — a multi-step winback sequence, a sequence that other triggers can also start, or anything you'd want to find under its own name in Studio. For the common case where the reaction belongs to the bucket and there's exactly one of it, there's a more colocated option.

2. Attach a colocated reaction with .on()

bucket.on(kind, opts?, handler) lets you write the reaction next to the criteria that triggers it, in the same file. There are three kinds — "enter", "leave", and "dwell":

powerUsers
  .on("enter", { firstEntryOnly: true }, async (user, ctx) => {
    // ctx.entryCount, ctx.isFirstEntry — plus the full JourneyContext
    await ctx.sleep({ duration: hours(1) });
    await sendEmail({
      to: user.email,
      userId: user.id,
      template: Templates.ACTIVATION_NUDGE,
      subject: "You're flying — here's a power tip",
    });
  })
  .on("leave", { reason: "criteria" }, async (user, ctx) => {
    // ctx.reason is "criteria" | "maxDwell" | "manual"
  })
  .on("dwell", { every: days(7) }, async (user, ctx) => {
    // ctx.dwellCount — fired by the reconcile cron for continuous members
  });

.on() returns the bucket, so calls chain. There's no .subscribe() step and no registration: the reaction is wired the instant the module loads, and it ships with the bucket in your buckets/ array.

A reaction is not a listener you pile onto. Each .on() call desugars to a real, durable journey — tagged with sourceBucketId so the engine and Studio know which bucket owns it — whose trigger is the matching transition event. Because it's a genuine journey, it inherits the entire enrollment-guard stack, the active-state dedup, and the durable JourneyContext for free. The handler has the exact (user, ctx) shape of a journey run; the only addition is a few read-only extras on ctx, layered on by a spread so the engine's canonical context is never mutated:

  • enter: ctx.entryCount and ctx.isFirstEntry. The { firstEntryOnly: true } option is a filter, not a separate event: a re-entry still enrolls, then returns early inside run.
  • leave: ctx.reason ("criteria" | "maxDwell" | "manual"), carried on the leave event. { reason } filters to one or more reasons.
  • dwell: ctx.dwellCount, the elapsed-interval ordinal (more on dwell below).
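The option filters above can be pictured as small guards evaluated at the top of the generated journey's run. This sketch is an assumption about their behavior, not engine code: the function names are hypothetical, it assumes entryCount is 1 on a user's first entry, and it assumes { reason } accepts either a single reason or an array.

```typescript
type LeaveReason = "criteria" | "maxDwell" | "manual";

// Hypothetical guard for { firstEntryOnly }: a re-entry still enrolls,
// but the handler body is skipped when this returns false.
function shouldRunEnter(opts: { firstEntryOnly?: boolean }, entryCount: number): boolean {
  const isFirstEntry = entryCount === 1; // assumption: counts start at 1
  return !opts.firstEntryOnly || isFirstEntry;
}

// Hypothetical guard for { reason }: no filter means every reason runs.
function shouldRunLeave(
  opts: { reason?: LeaveReason | LeaveReason[] },
  reason: LeaveReason,
): boolean {
  if (opts.reason === undefined) return true;
  const allowed = Array.isArray(opts.reason) ? opts.reason : [opts.reason];
  return allowed.indexOf(reason) !== -1;
}
```

Under these assumptions, a second entry with { firstEntryOnly: true } is enrolled and then exits immediately, and a leave caused by maxDwell is ignored by a handler filtered to "criteria".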

The design rule is one canonical reaction per transition. .on("enter", …) is the single, blessed "this is what this bucket does when someone enters." If you need a second divergent reaction to the same transition — a different audience, a different schedule, a flow other triggers also start — that's a sign it's a real journey: write a normal defineJourney({ meta: { trigger: { event: bucket.entered } } }) and bind it as in section 1. .on() is for the colocated common case, not an event bus.

Dwell: react to time spent in a bucket

enter and leave fire on a boundary crossing. dwell is the third edge: it fires while a member stays in the bucket. You give it exactly one of two schedules:

  • { after: duration } — one-shot, fires once when the member has been continuously in the bucket for duration.
  • { every: duration } — recurring, fires once per elapsed interval (coalesced, so a missed sweep doesn't fire twice).

Dwell is cron-resolution, not instant. It's driven by the engine's reconcile sweep, gated on continuous membership — a leave-and-rejoin resets the clock, because a re-join is a fresh membership. ctx.dwellCount is the elapsed-interval ordinal: 1 for a one-shot after, incrementing across intervals for every.
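As an illustration of the elapsed-interval ordinal and of coalescing, here is a small sketch of the arithmetic a sweep might perform for an { every } schedule. The helper names are hypothetical, and the reading of "coalesced" (one fire per sweep, carrying the latest ordinal) is an assumption rather than documented engine behavior.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: whole `every` intervals elapsed since the dwell anchor.
function elapsedIntervals(anchorMs: number, nowMs: number, everyMs: number): number {
  return Math.max(0, Math.floor((nowMs - anchorMs) / everyMs));
}

// Hypothetical sweep step: returns the dwellCount to fire with, or null when
// nothing new has elapsed. Several missed intervals collapse into one fire.
function nextDwellCount(
  lastFiredCount: number,
  anchorMs: number,
  nowMs: number,
  everyMs: number,
): number | null {
  const current = elapsedIntervals(anchorMs, nowMs, everyMs);
  return current > lastFiredCount ? current : null;
}

// A member anchored at t = 0 with { every: days(7) }:
const beforeFirstInterval = nextDwellCount(0, 0, 6 * DAY_MS, 7 * DAY_MS);
const firstInterval = nextDwellCount(0, 0, 7 * DAY_MS, 7 * DAY_MS);
const afterMissedSweep = nextDwellCount(1, 0, 22 * DAY_MS, 7 * DAY_MS);
```

In this model a sweep that runs late on day 22 fires once with the third ordinal instead of firing separately for the second and third intervals.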

You might ask why this isn't just .on("enter") + ctx.sleep({ duration: days(30) }). The difference is the existing population. A durable sleep inside an enter handler only schedules the future: it covers people who enter after you ship the reaction. Dwell fires for everyone already in the bucket. It reads each member's dwell anchor (coalesce(dwellAnchorAt, enteredAt), a historical anchor derived during backfill), so the moment you deploy a { after: days(30) } dwell on a dormancy bucket, it fires for people who have already been dormant for thirty days, not thirty days from now. That backfill-derived clock is dwell's whole reason to exist.
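The anchor rule can be sketched in a few lines. The field names dwellAnchorAt and enteredAt come from the text above; the row shape, the millisecond timestamps, and the helper names are assumptions made for the example, and one-shot bookkeeping (remembering that an { after } dwell already fired) is left out.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical membership row; timestamps are epoch milliseconds.
interface MembershipRow {
  enteredAtMs: number;
  dwellAnchorAtMs: number | null; // set during backfill when history is known
}

// coalesce(dwellAnchorAt, enteredAt)
function dwellAnchorMs(row: MembershipRow): number {
  return row.dwellAnchorAtMs ?? row.enteredAtMs;
}

// True once the member has been continuously in the bucket for `afterMs`.
function isAfterDwellDue(row: MembershipRow, afterMs: number, nowMs: number): boolean {
  return nowMs - dwellAnchorMs(row) >= afterMs;
}

const now = 100 * DAY_MS;
// Row created at deploy time, but backfill found 45 days of prior dormancy.
const backfilled: MembershipRow = { enteredAtMs: now, dwellAnchorAtMs: now - 45 * DAY_MS };
// Row created at deploy time with no earlier history.
const fresh: MembershipRow = { enteredAtMs: now, dwellAnchorAtMs: null };
```

With a 30-day { after } schedule, the backfilled row is due on the first sweep while the fresh row waits the full thirty days, which is the gap a sleep inside an enter handler cannot cover.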

A colocated dwell reaction reads like this — a weekly nudge for continuous members, driven by the cron over the existing population:

// src/buckets/power-users.ts — colocated dwell reaction
powerUsers.on("dwell", { every: days(7) }, async (user) => {
  await sendEmail({
    to: user.email,
    userId: user.id,
    journeyStateId: user.stateId,
    template: Templates.ACTIVATION_NUDGE,
    subject: "Your weekly power-user tip",
    journeyName: user.journeyName,
  });
});

In Studio, every reaction a bucket generates is grouped under that bucket by its sourceBucketId, so the bucket's enter/leave/dwell handlers read as part of the bucket — not as orphan journeys scattered across the journey list.

3. Query the membership

Because a bucket is a materialized set, you can read it directly off the bucket object — useful from inside a journey, a custom route, or a one-off script:

const { data: total }    = await powerUsers.count();        // number | null
const { data: isMember } = await powerUsers.has(userId);    // boolean
const page = await powerUsers.members({ limit: 50 });       // { data, error, count, cursor }
for await (const m of powerUsers.membersIterator()) { /* paged internally */ }

Every method returns a Supabase-shaped { data, error } result and never throws — a query failure lands in error, not an exception. members() adds a per-call count and a keyset cursor for the next page.

The one hard rule: member access never returns an unbounded array. members() is capped at 100 rows per page and you walk the rest with the cursor; membersIterator() is the convenience over that — it pages internally so you can for await the whole population without ever holding more than a page in memory. There is deliberately no bucket.all() that hands back every member at once. This keeps a bucket a read primitive for acting on members in code, not a bulk-export hatch — which is consistent with the next section's stance on what buckets are and aren't.
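The page cap and cursor walk can be modelled with a short keyset-pagination sketch. It is synchronous and in-memory for brevity, whereas the real methods are async and return { data, error }; `membersPage`, `iterateMembers`, and the use of the member id as the cursor are all assumptions for illustration.

```typescript
const MAX_PAGE = 100; // the documented per-page cap

interface Page {
  data: string[];
  cursor: string | null; // null when there is no next page
}

// Hypothetical keyset page over ids sorted ascending: "rows after the cursor".
function membersPage(
  sortedIds: readonly string[],
  opts: { limit?: number; cursor?: string | null } = {},
): Page {
  const limit = Math.min(Math.max(opts.limit ?? MAX_PAGE, 1), MAX_PAGE);
  const after = opts.cursor ?? null;
  const rest = after === null ? sortedIds : sortedIds.filter((id) => id > after);
  const data = rest.slice(0, limit);
  const cursor = rest.length > limit ? data[data.length - 1] ?? null : null;
  return { data, cursor };
}

// Hypothetical iterator: walks every page, holding one page at a time.
function* iterateMembers(sortedIds: readonly string[], pageSize = MAX_PAGE): Generator<string> {
  let cursor: string | null = null;
  do {
    const page: Page = membersPage(sortedIds, { limit: pageSize, cursor });
    yield* page.data;
    cursor = page.cursor;
  } while (cursor !== null);
}

// 250 zero-padded ids so string order matches numeric order.
const ids = Array.from({ length: 250 }, (_, i) => "u" + ("0000" + i).slice(-4));
const firstPage = membersPage(ids, { limit: 500 }); // request above the cap
```

Asking for 500 rows still yields a 100-row page plus a cursor, and the iterator reaches all 250 members by following cursors rather than by loading the set at once.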

Criteria are the conditions you already know

A bucket's membership predicate is the same composable condition model journeys use for trigger filters, exit rules, and mid-journey branching — there's no new language to learn:

  • Property: plan == "trial", trial_days_left <= 3, converted != true
  • Event: "did feature.used in the last 7 days," or a count threshold like "fired feature.used 10+ times in the last 30 days"
  • Composite: arbitrary and / or nesting of the above

Inclusion and exclusion live in the same tree — a neq or an event check: "not_exists" is just another leaf, no separate exclusion list. The canonical absence predicate ("did not do X in the last N days") is exactly how you express dormancy or churn risk.
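To show how property, event, and composite leaves can share one tree, here is a minimal evaluator sketch. The `Condition` shape, the operator names other than neq and not_exists, and the `Snapshot` type are hypothetical; the Conditions guide defines the real model.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
type Scalar = string | number | boolean;

// Hypothetical condition tree: inclusion and exclusion are just leaves.
type Condition =
  | { type: "property"; key: string; op: "eq" | "neq" | "lte"; value: Scalar }
  | { type: "event"; name: string; check: "exists" | "not_exists" | "count_gte"; withinDays?: number; count?: number }
  | { type: "and" | "or"; conditions: Condition[] };

interface Snapshot {
  properties: Record<string, Scalar | undefined>;
  events: { name: string; atMs: number }[];
}

function evaluate(c: Condition, s: Snapshot, nowMs: number): boolean {
  switch (c.type) {
    case "and":
      return c.conditions.every((x) => evaluate(x, s, nowMs));
    case "or":
      return c.conditions.some((x) => evaluate(x, s, nowMs));
    case "property": {
      const v = s.properties[c.key];
      if (c.op === "eq") return v === c.value;
      if (c.op === "neq") return v !== c.value; // an unset property counts as "not equal"
      return typeof v === "number" && typeof c.value === "number" && v <= c.value;
    }
    case "event": {
      // No window means "ever"; otherwise only events inside the window count.
      const since = c.withinDays === undefined ? -Infinity : nowMs - c.withinDays * DAY_MS;
      const n = s.events.filter((e) => e.name === c.name && e.atMs >= since).length;
      if (c.check === "exists") return n > 0;
      if (c.check === "not_exists") return n === 0;
      return n >= (c.count ?? 1);
    }
  }
}

// "On trial with 3 days left and not yet converted"
const trialEnding: Condition = {
  type: "and",
  conditions: [
    { type: "property", key: "plan", op: "eq", value: "trial" },
    { type: "property", key: "trial_days_left", op: "lte", value: 3 },
    { type: "property", key: "converted", op: "neq", value: true },
  ],
};
```

The exclusion (converted != true) sits in the same and node as the inclusions, which is the "no separate exclusion list" point in practice.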

Email-engagement conditions (open/click history) are not allowed in bucket criteria in v1. They key on the recipient address rather than the user id, which can't be validated when a bucket is defined, so the engine rejects them at registration. Everything else from the Conditions guide applies.

Dynamic vs manual

Every bucket today is dynamic (the default): membership is recomputed from criteria automatically. You write the predicate; the engine maintains the set — joins and most leaves inline as events arrive, time-based leaves via the reconcile cron.

The kind discriminator also declares "manual", but that mode is not implemented in v1 — a bucket with kind: "manual" is rejected loudly at registration rather than accepted as a set that can never gain members. The forward-compat plan is a bucket whose membership is changed only by explicit API call or import (a hand-curated beta cohort, a list imported from elsewhere). Until then, every bucket needs a criteria predicate.

kind: "manual" throws at registration in v1: "kind:"manual" buckets are not implemented in v1; use a dynamic bucket (kind:"dynamic" with criteria) instead." Don't reach for it yet — express the set as a dynamic criteria predicate.

Real-time, where PostHog's cohorts can't be

This is the gap buckets exist to fill, and it's worth being precise about it.

PostHog computes dynamic cohorts on a batch schedule — roughly every 24 hours. That's the right design for analytics: scanning the full event store to recompute a behavioral cohort is expensive, and for "which users belong to this audience" a daily refresh is fine. But it means PostHog cohort membership is stale by up to a day, and PostHog has no native "this user just joined/left this cohort" webhook to act on. For lifecycle automation, where the whole point is to react now — the trial expires today, the payment just failed, the power user went quiet this morning — a 24-hour batch lag is the difference between a timely nudge and a wasted one.

Buckets close that gap by computing membership off Hogsend's own ingested event stream, in real time:

  • Joins and most leaves are immediate. When an event arrives, the ingestion pipeline re-evaluates the buckets that event could affect and flips membership inline — sub-second, as part of processing the event that caused it.
  • Time-based leaves are reconciled by a cron. Some leaves are triggered by the clock, not an event: "no longer active in the last 7 days" becomes false simply because seven days passed with nothing happening. No inbound event will ever signal that, so an engine-owned reconcile pass sweeps time-based buckets and emits those absence leaves. A late or missed sweep only delays a leave — it never corrupts membership.
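The reconcile pass for an "active in the last N days" bucket can be sketched as a sweep over members' last qualifying event. This is a simplified in-memory model under assumed names (`reconcileSweep`, a `Map` of user id to last-activity timestamp); the engine's actual sweep, storage, and scheduling are not shown here.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical sweep: members maps userId -> time of last qualifying event (ms).
// Anyone whose last activity fell outside the window leaves; nobody joins here.
function reconcileSweep(
  members: Map<string, number>,
  bucketId: string,
  windowMs: number,
  nowMs: number,
): { userId: string; event: string }[] {
  const leaves: { userId: string; event: string }[] = [];
  members.forEach((lastActiveMs, userId) => {
    if (nowMs - lastActiveMs > windowMs) {
      members.delete(userId);
      leaves.push({ userId, event: `bucket:left:${bucketId}` });
    }
  });
  return leaves;
}

const activeMembers = new Map<string, number>([
  ["quiet_user", 1 * DAY_MS], // last active on day 1
  ["busy_user", 9 * DAY_MS], // last active on day 9
]);

// Sweep on day 10 with a 7-day window, then again with nothing changed.
const firstSweep = reconcileSweep(activeMembers, "active-7d", 7 * DAY_MS, 10 * DAY_MS);
const secondSweep = reconcileSweep(activeMembers, "active-7d", 7 * DAY_MS, 10 * DAY_MS);
```

Because the sweep only compares stored timestamps against the current time, running it late produces the same leave later, and running it twice produces nothing extra.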

The boundary is real-time vs batch recompute, not window length. A 30-day rolling window is still a real-time bucket; what makes it Hogsend's job is that Hogsend computes it off its own stream with its own condition engine, the instant the data changes. PostHog keeps what it's best at — cohorts that must scan PostHog's own analytics store, detection over events Hogsend never ingests, and anything a team would rather author once in PostHog's cohort UI. The two are complementary: a PostHog cohort can still fire a journey via the PostHog webhook with no bucket involved. Buckets are for the real-time slice, where membership-change-as-a-trigger and automatic exitOn are the point.

Buckets observe; they don't author, and they don't sync

Buckets inherit Hogsend's two governing stances.

Code-first, observe-not-author. A bucket is a TypeScript file, exactly like a journey or a webhook source. There is no visual bucket builder and there never will be — Studio shows you a bucket's size, its enter/leave rate over time, which journeys it feeds, and the reactions it owns (grouped under it via sourceBucketId), but you author the criteria and the reactions in code. Definitions live in your repo, you own them, and the engine never imports them. See Philosophy for the full engine-vs-content line.

Not a CDP. Once you have a materialized membership table, entered/left events, and an arbitrary condition language, you have most of the machinery of a customer data platform. The one thing that keeps Hogsend from being one is a deliberate refusal: a bucket transition emits an event into Hogsend's own journey system first. To fan that transition out to an external system, you use the same outbound spine every other Hogsend event rides — bucket.entered / bucket.left are two of the 13 outbound catalog events, so a destination (PostHog, Segment, Slack, a CRM, a warehouse) can subscribe to them with the engine's durable retry/backoff/DLQ delivery. What a bucket itself does not carry is a baked-in destination connector — first-class per-bucket sync would turn the bucket primitive into a generic CDP audience-sync surface, which is an explicit non-goal.

There is exactly one bucket-level external sync, and it's off by default: a bucket can optionally $set a boolean PostHog person property (hogsend_bucket_<id>, overridable via postHogPropertyKey) on join and clear it on leave. That gives a PostHog cohort a person-property membership signal it can evaluate in real time — something PostHog can't compute on its own — while staying inside the "PostHog detects, Hogsend acts" loop. It's a no-op without a PostHog API key, and it's the only bucket-level sync that exists. It is not a destination connector, and it isn't on unless you turn it on.

A bucket in code

The starter app ships one example bucket — power-users, a behavioral, time-based group (a 30-day rolling window):

// src/buckets/power-users.ts — behavioral, time-based (30-day rolling window)
import { days, defineBucket } from "@hogsend/engine";
import { Events } from "../journeys/constants/index.js";

export const powerUsers = defineBucket({
  meta: {
    id: "power-users",
    name: "Power users",
    description: "Used the key feature 10+ times in the last 30 days.",
    enabled: true,
    // Rolling 30-day window → time-based: the reconcile cron sweeps the leave
    // when the window rolls past, since no event signals it.
    timeBased: true,
    entryLimit: "once_per_period",
    entryPeriod: days(7),
    criteria: (b) => b.event(Events.FEATURE_USED).within(days(30)).atLeast(10),
  },
});

The shipped file colocates a set of .on("enter" | "dwell" | "leave", …) reactions but ships them commented out, so a fresh app doesn't email power users by default — uncomment one to wire the reaction the instant the module loads. An absence bucket (dormancy / churn risk) is the other shape you'll reach for; it's the canonical home for a dwell reaction, because the absence leave is owned by the reconcile cron and dwell fires off the backfilled anchor for the existing population:

// An absence bucket — was active once, but NOT in the last 7 days.
import { days, defineBucket } from "@hogsend/engine";
import { Events } from "../journeys/constants/index.js";

export const wentDormant = defineBucket({
  meta: {
    id: "went-dormant",
    name: "Went dormant",
    enabled: true,
    timeBased: true,
    fastExpiry: true,
    criteria: (b) =>
      b.all(
        b.event(Events.FEATURE_USED).exists(),
        b.event(Events.FEATURE_USED).within(days(7)).notExists(),
      ),
  },
});

The unbounded exists() floor is what excludes never-active signups — a bare not_exists within 7d would join every brand-new user who has yet to fire feature.used.
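The effect of the floor is easy to see with two plain predicates over a user's feature.used timestamps. Both functions are hypothetical stand-ins for the criteria above, written only to contrast the two shapes.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Absence predicate with the floor: active at some point, but not in the window.
function isDormant(eventTimesMs: number[], nowMs: number, windowMs: number): boolean {
  const everActive = eventTimesMs.length > 0; // the unbounded exists() floor
  const activeRecently = eventTimesMs.some((t) => t >= nowMs - windowMs);
  return everActive && !activeRecently;
}

// The same predicate without the floor: a bare "not_exists within window".
function isDormantWithoutFloor(eventTimesMs: number[], nowMs: number, windowMs: number): boolean {
  return !eventTimesMs.some((t) => t >= nowMs - windowMs);
}

const today = 100 * DAY_MS;
const week = 7 * DAY_MS;
const brandNewSignup: number[] = []; // has never fired feature.used
const lapsedUser = [today - 20 * DAY_MS]; // last used 20 days ago
const currentUser = [today - 2 * DAY_MS]; // used 2 days ago
```

The brand-new signup matches the floorless predicate even though they were never active, while the floored version admits only the lapsed user.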

power-users and an absence bucket like went-dormant are time-based — a rolling window means the clock alone can flip membership, so the reconcile cron owns their leaves. A pure-property bucket (e.g. "on trial with 3 days left, not converted") is real-time only, since a property bucket can only change when an event carries a new property value. The criteria are just the conditions you already write everywhere else; what's new is that the predicate defines a group — and that group is something you can bind a journey to, attach a colocated reaction to, and query, all off the same definition.

For authoring detail — re-entry policy, anti-flap (minDwell / maxDwell), the full defineBucket() reference, and the registration steps — see the Buckets guide.
