Last deployment:

Turfi Platform Documentation

Official Turfi documentation portal for users, admins, and developers.

Back to support

Documentation Search

Search only within Turfi documentation pages.

C

Data Architecture

Registry identity standardization, lifecycle status, club licensing, entity resolution, and structural data architecture.

Turfi depends on structured sports data that stays readable, governable, and reusable across imports, registries, and downstream engines. This document explains the identity, registry, resolution, and data governance layers (lifecycle, contracts, safe deletion—not federation or competition governance) that make that model reliable. See Platform Philosophy and Scope for product scope.

It sits between platform architecture and domain engines because data contracts shape how imports, competition workflows, player identity, and admin operations all behave. As the platform grows, this file should remain the long-term source of truth for entity structure and platform data governance.

Shared status legend: [docs/_shared/status-legend.md](./_shared/status-legend.md)


Core Entity Architecture

Turfi is built around a layered sports data model. Instead of treating players, teams, competitions, and match events as one flat set of records, the platform separates identity, competition structure, and match intelligence into distinct layers that can evolve independently.

This matters because sports data changes constantly. Players move clubs, teams are restructured, organizations change names, and registrations expire. Turfi is designed so those identity changes do not destabilize the competition record as modeled on Turfi or erase what happened on the field. The platform keeps that structure coherent for display and derived outputs, while ensuring historical match data remains readable even if the entities behind it change or disappear later.

Core Hierarchy

At the highest level, Turfi models sports competition as a connected hierarchy:

Organization ↓ Club ↓ Team ↓ Competition ↓ Group ↓ Game ↓ Event ↓ Moment ↓ Media

Each layer has a distinct purpose in the platform:

Organization The governing entity that owns clubs, venues, and competitions.

Club A sports club that manages teams and participates in competitions.

Team A roster participating in competitions.

Competition A tournament or league within a season.

Group A subdivision of a competition used to organize teams and standings.

Game A match played between teams.

Event A discrete action during a match such as a goal, card, or substitution.

Moment An extracted highlight or key match moment derived from events.

Media Video or media assets associated with matches and highlights.

This hierarchy gives Turfi a stable way to describe how real-world sports activity flows from governance and participation down to match actions and media outputs.

Entity Communication and Centralized Address Model

Turfi now uses a direct communication-plus-address-reference model for the core governance entities that operators manage most often in Admin Data:

  • organizations (shown as Associations in the frontend UI)
  • leagues
  • clubs

The database table name remains organizations, but the admin and support UX label that registry as Associations.

The architecture is intentionally split by responsibility:

  • communication data belongs directly to the entity row
  • location data belongs to the centralized addresses table

Entity-Owned Communication Fields

These entity tables now store their own communication fields directly:

  • phone
  • email
  • website
  • address_id

This means an Association, League, or Club can be filtered and edited directly by its communication metadata without looking up a separate contact record first.

Centralized Address Record

The authoritative location record lives in addresses.

Address fields:

  • address_line1
  • address_line2
  • city
  • province
  • postal_code
  • country
  • latitude
  • longitude

Relationship model:

  • organizations.address_id -> addresses.id
  • leagues.address_id -> addresses.id
  • clubs.address_id -> addresses.id

Schema diagram:

Associations (organizations)   Leagues   Clubs
        |                        |         |
        | address_id             |         | address_id
        +-----------+------------+---------+
                    |
                    v
                addresses

Practical rule:

  • if an entity has no address data yet, address_id remains NULL
  • the entity still keeps phone, email, and website directly on its own row
  • city and province shown in registries are derived from the linked address row rather than stored as the authoritative location on the entity itself

This keeps communication data lightweight and entity-owned while ensuring location data stays normalized, reusable, and geocoding-friendly.

Import and Registry Implications

For Associations, Leagues, and Clubs:

  • imports may supply address fields inline
  • the import engine creates an addresses row when address fields are present
  • the returned address UUID is written back to address_id
  • if address fields are absent, the entity import leaves address_id as NULL

In the registry UI:

  • Phone
  • Email
  • Website
  • City
  • Province

are the core display fields for this model, with City and Province resolved from addresses.

Contact Record Model

Turfi still maintains reusable contacts records for person-centric contact datasets and shared operational workflows.

Contact records keep a structured person-name model:

  • contacts.first_name
  • contacts.last_name
  • contacts.full_name

full_name remains the display value and is generated from first_name + last_name, while backward-compatible fields such as contact_name remain synchronized so older views and workflows do not break during the transition.

Infrastructure Layer

Turfi treats infrastructure as its own normalized layer rather than flattening facilities into club or business rows.

Core infrastructure rules:

  • venues represent physical facilities or complexes
  • turfs represent playable field surfaces attached to a venue
  • one venue may have many turfs
  • turf records must always resolve to a parent venue
  • businesses remain part of the entity layer even when they operate at, manage, or service venues

This distinction matters because venue identity, field characteristics, business ownership, and contact/address relationships evolve independently. A complex can change operators, a turf can be resurfaced, and a business can move venues without forcing Turfi to remodel the facility itself.

Player Model

Players are modeled differently from teams because a player identity must survive across changing team contexts. In Turfi, a player exists independently and then becomes associated with teams through registrations.

Player ↓ Player Registration ↓ Team

A player registration stores the context of that player's participation in a specific season, competition, club, or organization. Rather than treating a team assignment as permanent identity, Turfi records the relationship as something that belongs to a competitive context.

Registrations may contain references such as:

  • player_id
  • team_id
  • club_id
  • organization_id
  • competition_id
  • season_id

This design allows the platform to preserve historical participation even if the player later transfers, the team changes, or the club structure is reorganized.

Historical Data Protection

One of the most important principles in Turfi is simple:

Identity may disappear. History must survive.

Turfi never treats historical match records as disposable. Match history tables are preserved even when linked identities are removed from the active system. Instead of deleting the history, the platform detaches identity references while keeping readable snapshot information behind.

For example, if a player is deleted, historical match records do not disappear. The record can retain the captured player name through snapshot fields such as:

  • player_name_text

The identity reference may be detached:

  • player_id = NULL
  • player_name_text = "Ronaldo"

That means goals, lineups, events, moments, and statistics remain understandable long after the original identity link is gone. Historical sports truth is preserved even when operational identity changes.

Competition Structure

Competitions are organized in a predictable layered structure:

Season ↓ Competition ↓ Group ↓ Teams ↓ Games ↓ Standings

A season defines the broader time window. Within that season, competitions represent leagues or tournaments. Groups divide competitions into manageable subdivisions, usually for scheduling and standings. Teams participate inside those groups, games produce results, and standings are calculated from those results.

Standings are not treated as manually edited records of truth. They are derived from games. When official match results change, standings are rebuilt so the competition table always reflects the current official game record.

Moment and Media Pipeline

Turfi’s highlight system is built on top of match intelligence rather than isolated video editing. The platform moves from structured match activity into candidate highlights and then into published media outputs.

Game Event ↓ Moment Candidate / Approved Event ↓ Video Moment ↓ Highlight Clip ↓ Player Highlight

An event recorded during a match, such as a goal or card, can generate a moment candidate. That candidate may be elevated into a highlight moment after additional validation or editorial logic. From there, the system can associate the highlight with actual video clips and reusable media assets.

This pipeline allows Turfi to connect what happened in the game to what gets surfaced in highlights, scouting, recaps, and future media workflows.

Current media-intelligence rollout:

game_events
    ↓
video_moment_generation_queue
    ↓
video_moments
    ↓
video_clip_generation_queue
    ↓
highlight_clips
    ↓
player_highlight_items
    ↓
v_player_highlights

Video Ingestion and Stats Mapping Tables

The video ingestion architecture extends the existing moment/media pipeline by adding a governed source-video and raw-import layer ahead of canonical event truth.

Proposed tables:

  • game_videos — source video registry for all footage attached to a game
  • video_ingestion_jobs — ingestion, transcoding, and provider-import job tracking
  • video_event_imports — raw provider events before canonical normalization
  • game_event_sources — lineage from canonical events back to source videos or provider events
  • video_player_resolution_queue — unresolved player labels awaiting review
  • video_moment_generation_queue — approved events waiting for moment creation
  • video_moments — reusable generated or linked match moments
  • video_clip_generation_queue — generated moments waiting for segment extraction
  • highlight_clips — extracted clip outputs for highlight playback
  • player_highlight_items — player-facing clip associations used for highlight feeds
  • v_video_moment_generation_jobs — worker-facing read view for moment jobs
  • v_video_clip_generation_jobs — worker-facing read view for clip jobs
  • v_player_highlights — ranked player-highlight read model

Architectural rule:

  • provider feeds are not canonical truth
  • canonical truth remains approved game_events
  • derived outputs such as player_game_stats, video_moments, highlight_clips, player_highlight_items, and highlight_collections must rebuild from the approved event layer

This keeps external stats providers, manual tagging, and future AI detection aligned with the same governed data model used by the rest of Turfi.

Entity Deletion Principles

Turfi protects sports history by avoiding destructive cascading deletes across historical match data. The platform favors detachment over destruction so the record of competition remains intact.

Examples:

Player deletion player_id references are set to null in historical tables while snapshot fields preserve the player name.

Team deletion team_id references may be detached but match history remains readable through snapshot fields.

Organization deletion Dependent entities are detached while historical match data remains intact.

This approach protects the integrity of sports data across the platform. Operational entities can be cleaned up, archived, or removed when necessary, but the historical record of competition remains durable and readable for future analysis.

Entity Resolver Service

Purpose

The Entity Resolver Service converts human-readable values (for example club or team names) into internal platform IDs so imports and integrations do not require administrators to provide UUIDs in source files.

This service is shared infrastructure used by import workflows and future admin/API integration utilities.

Service Location

lib/entityResolver.ts

Supported Entity Types

The resolver currently supports:

  • Players
  • Teams
  • Clubs
  • Competitions
  • Venues
  • Turfs
  • Businesses
  • Seasons

Resolution Contract

Primary resolver function:

resolveEntity(entityType, value)

Response model:

  • status: resolved | ambiguous | not_found
  • entity_id
  • entity_name
  • candidates[]

Example:

Input:

resolveEntity("club", "Montreal CF")

Possible resolved output:

{ status: "resolved", entity_id: "c1e2b3", entity_name: "CF Montreal" }

Matching Strategy

Resolution follows a tiered strategy:

  1. Exact match (= on configured entity field)
  2. Case-insensitive exact match (ILIKE value)
  3. Fuzzy candidate search (ILIKE %value% + ranked similarity)

Candidate results are limited to five entries for ambiguous/fuzzy outcomes.

If matching remains ambiguous, the workflow returns candidates for explicit user resolution.

Entity Mapping Configuration

The resolver uses a shared entity configuration map (table + name field) for each supported entity:

  • players -> players.name
  • teams -> teams.name
  • clubs -> clubs.name
  • competitions -> competitions.name
  • venues -> venues.name
  • turfs -> turfs.name
  • businesses -> businesses.name
  • seasons -> seasons.name

This makes the resolver reusable and extensible without duplicating matching logic in each importer.

Import Engine Integration

During import processing, mapped rows are checked for human-readable reference fields (for example team, club, competition, venue, season).

If a corresponding *_id field is missing, the Import Engine calls the resolver and attempts to populate the ID automatically.

Flow:

  1. CSV row parsed and mapped
  2. Reference field detected
  3. Resolver executes exact/case-insensitive/fuzzy lookup
  4. If resolved, *_id is applied to the row
  5. If ambiguous, row is marked failed and candidate names are returned
  6. If not found, row is marked failed for manual resolution

Player-related resolution now uses the player identity infrastructure (player_identity_index and resolver views) instead of repeatedly joining raw registration/team/club tables in every workflow.

Exception Resolution Support

The resolver supports a per-job resolution map that stores manual or previously resolved mappings:

input_value -> resolved_entity_id

Before running database matching logic, the resolver checks this map. This ensures repeated values (for example Montreal CF in many rows) are resolved once and reused consistently.

Performance Considerations

Resolver performance protections include:

  • in-job cache for repeated lookups
  • strict limits on fuzzy candidate sets
  • exact lookup before fuzzy operations
  • candidate truncation to top results

This keeps large imports efficient while still supporting ambiguous-match review workflows.

Player Identity Infrastructure

Status: IMPLEMENTED

Purpose

Player Identity Infrastructure provides a dedicated identity layer for fast player lookup and consistent player matching across platform workflows.

It is designed to support:

  • fast player lookup
  • duplicate detection
  • profile claim suggestions
  • recruiting search
  • scouting search
  • import matching

Architecture Overview

Turfi does not repeatedly query multiple base tables for identity matching in every feature path. Instead, base player and roster context is aggregated into a canonical identity index:

players + player_registrations + teams + clubs -> player_identity_index

Higher-level systems then query identity-focused resolver views:

  • api_player_resolver
  • api_player_resolver_search

These views are consumed by:

  • Import Engine reference matching
  • claim suggestion workflows
  • duplicate detection workflows
  • future scouting/recruiting resolution tools

Database Model

Base sources:

  • players
  • player_registrations
  • teams
  • clubs

Canonical identity layer:

  • player_identity_index

Resolver read models:

  • api_player_resolver
  • api_player_resolver_search

Core index fields:

  • player_id
  • first_name
  • last_name
  • full_name
  • birth_year
  • team_id
  • team_name
  • club_id
  • club_name

Workflow

  1. Player, registration, team, or club data changes.
  2. Identity index refresh process updates player_identity_index.
  3. Resolver views expose normalized identity rows.
  4. Import/claim/search features query resolver views instead of rebuilding joins.

Example Scenarios

  • Import row includes Danny Malouin and 2008; resolver finds canonical player identity row quickly.
  • Claim suggestion flow presents player candidates with team and club context.
  • Scouting search surfaces consistent identity records across current season rosters.

Future Considerations

  • richer fuzzy identity scoring
  • phonetic/alias matching
  • federation ID linkage
  • confidence scoring surfaced in UI workflows

Data Architecture Overview

This section documents the entity hierarchy and relationship model validated during ingestion testing.

Entity Hierarchy

Organization
    └── Club (belongs to organization)
            └── Team (belongs to club and organization)
                    └── Player (references club via club_text for import compatibility)
                            └── Game (references teams)
                                    └── Game Events (reference players)

Relationship Model

  • Organizations represent associations, clubs, academies, leagues, federations, and similar governing bodies. They are the top-level structural entity.
  • Clubs belong to organizations. A club is a member or child entity of an organization.
  • Teams belong to clubs and organizations. Teams participate in competitions and games.
  • Players reference clubs through club_text for import compatibility. During ingestion, player records can use human-readable club names that are resolved to club_id.
  • Games reference teams (home and away). Games belong to competitions and seasons.
  • Game events reference players. Events such as goals, assists, and substitutions link to player_id (or player_name_text before resolution).

This hierarchy establishes the canonical data model for sports infrastructure and ensures consistent relationships from organization down to individual game events.

Admin Data Architecture

Purpose

The Admin Data Architecture is the operator-facing layer that keeps Turfi's structured sports data editable without inventing a different tool for each entity type. It gives administrators one consistent way to search, filter, load, edit, import, and govern the records that feed competition, player, venue, and commercial workflows.

Architecture Overview

In the current codebase, the Data workspace is split into Registries and Operations.

  • Registries are the record-management pages for core entities.
  • Operations are the cleanup, resolution, duplicate-review, and import-history tools that support those registries.

Each current registry route mounts the shared RegistryDataGrid with a registry-specific RegistryConfig. That gives Turfi one common grid engine while still allowing each entity to define its own filters, columns, forms, save logic, and governed deletion behavior.

The shared interaction model includes:

  • filtering
  • search
  • pagination
  • row selection
  • bulk operations

Key implementation sources:

  • Shared registry framework: components/admin/registry/RegistryDataGrid.tsx
  • Shared registry controller contract: components/admin/registry/useRegistryController.ts
  • Per-registry definitions: lib/admin/registryConfigs/*.ts
  • Current registry and operations navigation: lib/admin/dataNavigationConfig.ts
  • Admin entity read/write map: lib/admin/dataManagementConfig.ts

Data Workspace Navigation

The implemented Data workspace navigation is grouped horizontally in the page header area and mirrored in the admin sidebar.

Current registry groups:

  • Governance
  • Participants
  • Infrastructure
  • Commercial

Current operations groups:

  • Entity Resolution
  • Imports

Data Registry System

The Data Registry System is the operational layer that manages Turfi's reference entities. It exists so the platform has one authoritative place to store, edit, import, and govern the records that every other engine depends on.

Core registries:

  • organizations
  • leagues
  • seasons
  • competitions
  • clubs
  • teams
  • players
  • venues
  • turfs
  • businesses

Registry standardization now requires every core registry entity to follow the same identity contract where applicable:

  • id — UUID primary key
  • key — normalized system identifier
  • slug — URL-safe identifier
  • name — human-readable label
  • normalized_name — duplicate-detection and search helper where the entity type needs it
  • status — lifecycle state
  • created_at
  • updated_at

This contract applies across organizations, leagues, seasons, competitions, clubs, teams, players, venues, turfs, and businesses. Not every entity needs normalized_name, but every registry table must support status.

Normalized Name System:

  • organizations, clubs, and leagues now maintain a canonical normalized_name column
  • normalized_name is generated by the database with normalize_text() through the set_normalized_name() trigger function
  • the value strips accents, punctuation, and case differences so duplicate detection, registry search, import matching, and resolver suggestions compare the same canonical identity
  • application code must never write normalized_name directly; inserts and updates only send the human-facing identity fields and let database triggers maintain the canonical value

Registry responsibilities:

  • entity storage
  • admin editing
  • import template generation
  • import resolution
  • duplicate resolution
  • lifecycle governance through status rather than hard-delete-first workflows

Registries are the foundation of all operational engines because they hold the reference entities that imports, competitions, media, discovery, and admin workflows all depend on.

Operations (Admin Workspace)

Operations are workflow tools that process or clean data. They do not represent direct entity registries. Operations pages currently surface unresolved entities, duplicate-resolution review, import launch, and import history rather than registry-style CRUD grids.

OperationRouteDescription
Unresolved Entities/admin/data/operations/entity-resolverMatch unresolved entities to existing records or create new ones
Entity Resolution/admin/data/operations/duplicatesReview duplicate groups across every registry and merge or delete records safely
Import/admin/data/operations/import-engineLaunch CSV / League Package import (redirects to /admin/imports/new)
Import History/admin/data/operations/import-historyView import jobs and outcomes (redirects to /admin/imports/jobs)

Difference between Registries and Operations: Registries store and edit records; Operations resolve, deduplicate, import, and audit them.

Avatar Menu Integration

The user avatar dropdown (profile menu) includes a My section for admin users with quick links to:

  • Registries/admin/data/registries
  • Operations/admin/data/operations

This allows admins to access data tools directly from the user menu without navigating through the workspace menus.

Admin Data Page Structure

Each data page includes:

  • Page Header: shows entity title and primary actions such as Import, Download Template, and Add New Record
  • Filter Bar: supports attribute filtering and autocomplete lookups for reference entities
  • Data Grid: sortable tabular records with pagination and row selection controls
  • Bulk Actions Surface: appears when selection is active and exposes batch operations

Bulk Operations

Administrators can select multiple rows and perform bulk actions such as:

  • bulk edit
  • bulk delete

The selection system supports three modes:

  • manual row selection
  • select all rows on the current page
  • select all rows matching the current filter

Bulk operations operate on the full filtered dataset when appropriate.

Player Deletion Workflow

When an admin opens a player record in the Admin Data grid, the system retrieves dependency information using fn_player_dependency_summary. The UI displays a summary of affected records: match events, statistics, media, recruitment targets, scouting notes, and related counts. The admin can then choose Archive, Safe Delete, or Force Delete.

Archive hides the player from discovery while keeping the profile and all references. Safe Delete removes the player profile and detaches match history references (SET player_id = NULL) while preserving all match records. Force Delete uses the same detachment strategy but is restricted to elevated administrator roles (system_admin, super_admin) and requires additional confirmation, including typing a confirmation phrase to enable the action.

Workflow: Admin opens player record → System fetches fn_player_dependency_summary → UI shows affected records → Admin chooses Archive, Safe Delete, or Force Delete → Force Delete requires elevated roles and type-to-confirm.

Field Label Mapping

The platform stores mappings between database fields and user-friendly labels. These mappings are stored in the ui_field_labels table. This allows administrative interfaces to display readable field names instead of raw database column names.

Example: birth_year → Birth Year

Labels support multiple languages including:

  • English
  • French
  • Spanish

ID Finder

The ID Finder allows administrators to search for internal entity identifiers when they need to reference platform records. This tool is especially useful when preparing imports or debugging entity relationships.

Import Integration

Registry pages link directly into the Import System through shared header actions. The same header also exposes template download and entity-resolution triggers so an operator can move from browsing a registry to importing or cleaning it without changing workspace context.

  • Download CSV templates from /api/templates/csv?entity=...
  • Open the import wizard through /admin/imports/new?entity=...
  • Map CSV columns to destination fields from api_import_entity_adapters
  • Run preview validation before execution
  • Persist import jobs, rows, and file metadata for review

Template Generation System

Import templates should be treated as a schema-aligned operational contract rather than hand-maintained spreadsheet headers. Template generation exists so the CSV contract shown to operators always matches the underlying registry schema.

Template generation should:

  • read registry schema directly from the database
  • include required fields
  • include normalized relationship references instead of raw foreign-key UUIDs
  • exclude system columns such as id, created_at, and updated_at

This keeps templates aligned with the real database structure and reduces drift between imports, registries, and documentation.

Interactions with Other Engines

  • Player Engine: admin registries manage player identity and profile datasets.
  • Competition Engine: admin registries manage competitions, seasons, teams, and related structure.
  • Media Engine: moderation and media-linked admin workflows integrate with shared admin surfaces.
  • Import System: admin registries provide direct import and template entry points.
  • Discovery Engine: admin-managed entity quality directly affects search and discovery outputs.

Registries Architecture

The Registries system is Turfi's operational record layer. It is where admins manage the core entities that power player identity, competition structure, venues, commercial relationships, and downstream import and resolution workflows.

This section reflects the current implementation in the admin workspace, registry config files, import services, and checked-in SQL. It documents the registries that are actually visible today, the tables and views they use, the order data should usually be imported, and how those registries connect to Operations tools.

Visible Registry Inventory

The current top-level registry menu is:

  • Governance: Associations, Leagues, Seasons, Competitions
  • Participants: Clubs, Teams, Players
  • Infrastructure: Venues, Turfs
  • Commercial: Businesses

Source-of-truth files for the registry system:

  • lib/admin/dataNavigationConfig.ts
  • lib/admin/dataManagementConfig.ts
  • lib/admin/registryConfigs/*.ts
  • components/admin/registry/useRegistryController.ts
  • components/admin/registry/RegistryDataGrid.tsx

Registry Catalog

RegistryPrimary table / viewPurposeKey relationships
Associationspublic.organizations plus derived joinsGoverning-body registry used to manage association-style organizations that connect clubs and competitionsclubs.organization_id -> organizations.id, competitions.organization_id -> organizations.id
Leaguespublic.competitions, read through public.api_competitions_profilesLeague-style view over competition recordscompetitions.season_id -> seasons.id, competitions.region_id -> regions.id
Seasonspublic.seasons, read through public.api_seasonsSeason windows and scheduling periodscompetitions.season_id -> seasons.id
Competitionspublic.competitions, read through public.api_competitions_profilesCore competition records used by standings, scheduling, and gamesseason_id, organization_id, region_id
Clubspublic.clubs, read through public.api_club_profilesClub records, optional licensing metadata, and organization linkageclubs.organization_id -> organizations.id, clubs.license_level_id -> club_license_levels.id, teams.club_id -> clubs.id
Teamspublic.teams, read through public.api_team_profilesCompetition-facing team recordsteams.club_id -> clubs.id, downstream references from standings.team_id, games.home_team_id, games.away_team_id
Playerspublic.players, read through public.api_player_profiles_extendedAthlete identity records used across competition, media, and recruitment flowsplayer_registrations.player_id -> players.id, player_registrations.team_id -> teams.id
Venuespublic.venues, read through public.api_venuesVenue records and venue-level operating metadataBusiness and ownership context is exposed through API views
Turfspublic.turfs, read through public.api_turfsField-level records attached to venuesturfs.venue_id -> venues.id
Businessespublic.businesses, read through public.api_businessesCommercial entities tied to venue operationsVenue linkage is primarily surfaced through venue_name in views

Important implementation nuance:

  • Associations is a derived registry over organizations, not a dedicated base table.
  • Leagues is backed by public.leagues; Competitions is backed by public.competitions.
  • organizations still exists in older admin entity plumbing, and the regions route still exists, but neither is currently surfaced as a first-class visible registry tab.

Competition Hierarchy

Turfi treats competitions as the operational root for schedules, groups, rounds, games, and standings, while still allowing league context when it exists.

Hierarchy:

Organization → League (optional) → Competition → Games

Rules:

  • every competition belongs to an organization through competitions.organization_id
  • competitions may optionally belong to a league through competitions.league_id
  • competition type is now normalized through competitions.competition_type_id -> competition_types.id
  • competition format is now normalized through competitions.competition_format_id -> competition_formats.id
  • if competition_types.key = league, then league_id must be present
  • if competition_types.key is tournament, showcase, or friendly_series, then league_id may be NULL

Examples:

  • League competition: League1 Quebec Senior Men belongs to Soccer Quebec, resolves league_id -> League1 Quebec, uses season 2026 Summer, type League, format Round Robin
  • Standalone competition: Quebec U17 Showcase belongs to Soccer Quebec, leaves league_id = NULL, uses season 2026 Summer, type Showcase, format Group + Knockout

Registry Dependency Hierarchy

Turfi's registries are not a single strict tree because some UI surfaces are derived views, but there is still a practical dependency order that reduces failed reference resolution during import.

Recommended import sequence:

  1. Foundation: organizations, seasons, businesses
  2. Infrastructure: venues, then turfs
  3. Competition structure: leagues, then competitions
  4. Participant structure: clubs, then teams
  5. Identity layer: players, then player_registrations
  6. Match and derived layers after registry setup: team_competitions, competition_phases, competition_groups, competition_rounds, games, game_events, standings, media_assets

Important caveats:

  • Associations is derived from organizations, so association imports conceptually belong with the organization layer.
  • Players are not part of the generic registry identity trigger and continue to use the separate Player Engine identity pipeline.

Identity Model

Turfi now uses one standardized registry identity contract with two automation paths. Core registry entities share the same conceptual column model, while players continue to use a dedicated identity pipeline because they also participate in duplicate detection, resolver review, and merge workflows.

  • Standardized registry contract: organizations, leagues, seasons, competitions, clubs, teams, players, venues, turfs, businesses
  • Generic registry DB trigger: turfi_registry_identity() generates missing key and slug and updates updated_at for non-player registry entities
  • Player identity engine: players, player_registrations, player resolver views, duplicate detection, merge operations
  • Canonical name triggers now maintain normalized_name automatically for organizations, clubs, and leagues
  • Final identity model: uuid = database identity, key = import identity, slug = URL identity, name = human-readable label, status = lifecycle state

Status Lifecycle Standard

Status is now a core registry lifecycle attribute rather than an optional presentation field.

Valid values:

  • active
  • inactive
  • archived

Lifecycle rules:

  • All registry tables must support status
  • New rows default to active
  • status must be NOT NULL
  • Registry pages should default to status = active
  • Registry pages must still allow operators to filter for inactive and archived
  • Historical records should be preserved through lifecycle changes instead of relying on hard deletes

The players, teams, venues, and turfs tables specifically received standardized status columns with NOT NULL, default active, check constraints, and supporting indexes so registry filtering remains efficient at scale.

Category Lookup Model

Turfi now uses normalized category lookups across the registry and competition layers:

  • leagues.gender_id -> genders.id
  • competitions.age_group_id -> age_groups.id
  • competitions.gender_id -> genders.id
  • competitions.competition_type_id -> competition_types.id
  • competitions.competition_format_id -> competition_formats.id
  • teams.age_group_id -> age_groups.id
  • teams.gender_id -> genders.id
  • players.gender_id -> genders.id

Final category intent:

  • League: may be mixed or category-specific
  • Competition: age-group and gender aware
  • Team: age-group and gender specific
  • Player: gender specific

Transition note for teams:

  • teams.age_group and teams.gender remain temporarily for backward compatibility
  • teams.age_group_id and teams.gender_id are now the source of truth
  • a database sync trigger keeps the legacy text fields aligned during the transition period

Category labels are not hard-coded in application logic. They must resolve through the centralized multilingual lookup-label layer so admin forms, registries, and imports render consistent labels across languages.

Centralized Address and Contact Relationships

Turfi now treats reusable addresses and contacts as centralized records that can be linked to registry entities through relationship tables instead of copying role text onto each entity.

Relationship tables:

  • entity_addresses
  • entity_contacts

Role lookups:

  • entity_addresses.address_role_id -> address_roles.id
  • entity_contacts.contact_role_id -> contact_roles.id

Model rules:

  • addresses stores normalized location records
  • contacts stores normalized person/contact records
  • relationship-specific semantics such as "primary", "mailing", "operations", or "booking" belong on the link tables
  • deprecated text columns entity_addresses.address_role and entity_contacts.contact_role remain transitional only and must not be used by application code

Club Licensing Architecture

Club licensing is now aligned to the same centralized lookup architecture used elsewhere in the platform.

Lookup table:

  • club_license_levels

Lookup structure:

  • id
  • key
  • created_at

Example keys:

  • grassroots
  • regional
  • provincial
  • national

Important rules:

  • Human-readable license labels are not stored in club_license_levels
  • Multilingual labels belong in the centralized ui_field_labels system
  • club_license_level_translations is deprecated architecture and must not be recreated
  • clubs.license_level_id is an optional descriptive foreign key to club_license_levels.id
  • Licensing is display and analytics metadata only; it does not enforce competition eligibility, federation compliance, or roster restrictions

Turfi is not a federation compliance platform. Club licensing exists to support profile display, scouting filters, analytics, and discovery context without turning the registry layer into a rule-enforcement system.

Import Template System

The Download Template action returns header-only CSV templates generated from lib/imports/templateRegistry.ts. Registry exports use the same normalized contract so exported datasets remain portable across Turfi environments.

The download endpoint is:

  • app/api/templates/csv/route.ts

Current template coverage:

EntityExported columns
playersfirst_name, last_name, birth_year, team, club
teamskey, name, slug, club_key, organization_key, age_group, gender
clubskey, name, slug, organization_key, short_name, address_line1, address_line2, city, province, postal_code, country, phone, email, website, logo_url
leagueskey, name, slug, organization_key, league_type_key, gender_key, address_line1, address_line2, city, province, postal_code, country, phone, email, website, status
competitionskey, name, slug, organization_key, league_key, season_key, age_group_key, gender_key, type, format, start_date, end_date
seasonskey, name, slug, start_date, end_date
organizationskey, name, slug, parent_organization_key, address_line1, address_line2, city, province, postal_code, country, phone, email, website
regionsname, code
venueskey, name, slug, owner_organization_key, venue_type, city, province, country, location
turfskey, name, slug, venue_key, surface_type, field_format, is_indoor, is_lit, length_m, width_m
businesseskey, name, slug, venue_key, business_type, city, province, country, website, status

How templates and exports work in practice:

  • The downloaded CSV contains only a header row and no sample data.
  • Required fields are not marked in the CSV itself.
  • The import wizard loads destination fields and required fields at runtime from api_import_entity_adapters.
  • The preferred relationship convention is a normalized reference, usually _key, such as team_key, club_key, season_key, and organization_key.
  • Registry exports never emit raw relationship UUIDs when a normalized reference is available; entity links export readable references and lookup links export stable lookup keys.
  • Runtime resolver rules in lib/imports/importService.ts translate _key, _slug, and fallback name fields into *_id fields before adapter writes.
  • Competition imports always require organization_key; league_key remains optional unless the competition row is a league competition.
  • Competition CSV columns type and format are portable lookup keys that resolve into competition_type_id and competition_format_id during import.
  • Competition exports emit competition_types.key and competition_formats.key instead of deprecated text columns or internal IDs.
  • For registry entities, name is required while key and slug are optional because the database may generate them.
  • Optional descriptive lookup metadata such as club licensing should resolve through centralized lookup keys when surfaced by adapters, not through table-specific translation tables.
  • For organizations, leagues, and clubs, address columns in the template are import-time source fields that create or resolve an addresses row and then set address_id on the entity.
  • phone, email, and website remain direct entity fields for those registries.
  • Standalone addresses and contacts imports still exist for reusable normalized records elsewhere in the platform.

Known mismatch in the current implementation:

  • Templates exist for organizations and regions, but the current import wizard entity picker does not expose those entity types.
  • The wizard supports player_registrations, but there is no downloadable CSV template for that entity.

Relationship to Operations

Registries and Operations are designed to support each other. Registries are where records are maintained. Operations is where unresolved, duplicated, and imported data is processed after or between registry edits.

Current relationships include:

  • Entity resolution: registry pages expose a Resolve action and the unresolved queue lives at /admin/data/operations/entity-resolver
  • Duplicate detection: the Entity Resolution page at /admin/data/operations/duplicates reviews duplicate groups across associations, leagues, seasons, competitions, clubs, teams, players, venues, turfs, businesses, addresses, and contacts
  • Import history: /admin/data/operations/import-history redirects to the shared import jobs surface backed by import_jobs and import_rows
  • Import jobs: execution persists to import_jobs, import_rows, and import_job_files
  • Post-import duplicate detection: duplicate groups can be reviewed after imports so newly introduced duplicates can be consolidated before they spread through downstream workflows

Development Reset Workflow

This is the single authoritative development reset workflow for clearing operational data during testing without breaking the underlying platform infrastructure.

The script clears registry entities, competition structure, games, derived match data, media references, and standings so the platform returns to a clean import-ready state. It preserves lookup tables, labels, roles, permissions, system configuration, and broader infrastructure tables.

Use this workflow when testing Data Registry imports, testing resolver and duplicate detection behavior, or resetting a local development environment back to a clean baseline. The development admin account danny@digitalsteam.ca is preserved by design.

Rules for use:

  • Run it only against a local or throwaway development database
  • Do not run it against shared dev, staging, or production unless that wipe is explicitly intended
  • This reset is intentionally scoped to registry and match-domain data so core platform configuration remains intact
  • The reset temporarily disables trg_prevent_direct_team_delete, trg_prevent_direct_organization_delete, and trg_games_rebuild_standings
  • Lookup tables, labels, roles, permissions, and core platform infrastructure must not be deleted by this workflow
begin;

-- MEDIA

delete from public.media_assets;

-- MATCH DATA

delete from public.match_state_snapshots;
delete from public.game_action_context;
delete from public.game_actions;
delete from public.game_events;
delete from public.game_participations;
delete from public.player_game_stats;
delete from public.player_reported_games;

-- STANDINGS

delete from public.standings;

-- GAMES

alter table public.games disable trigger trg_games_rebuild_standings;

delete from public.games;

alter table public.games enable trigger trg_games_rebuild_standings;

-- COMPETITION STRUCTURE

delete from public.team_competitions;
delete from public.competition_groups;
delete from public.competition_rounds;
delete from public.competition_phases;
delete from public.competition_stages;

-- CORE COMPETITION DATA

delete from public.competitions;
delete from public.leagues;
delete from public.seasons;

-- CORE SPORTS ENTITIES

alter table public.teams disable trigger trg_prevent_direct_team_delete;
alter table public.organizations disable trigger trg_prevent_direct_organization_delete;

delete from public.players;
delete from public.teams;
delete from public.clubs;
delete from public.turfs;
delete from public.venues;
delete from public.regions;
delete from public.organizations;

alter table public.teams enable trigger trg_prevent_direct_team_delete;
alter table public.organizations enable trigger trg_prevent_direct_organization_delete;

commit;

-- REMOVE ALL USERS EXCEPT ADMIN

delete from auth.users
where email <> 'danny@digitalsteam.ca';

-- USER VALIDATION

select email
from auth.users;

Data Registry Delete Script

This script clears all registry data used by the Turfi platform during development or testing. It removes competitions, games, players, teams, clubs, organizations, media, and related event stacks while preserving the database schema.

The script also removes all authentication users except the main administrator.

  • This script is intended for development environments only.
  • It should not be executed in production.
  • The order of deletion is carefully designed to avoid foreign key conflicts.
  • Certain triggers are temporarily disabled to allow safe cleanup.
begin;

--------------------------------------------------
-- MEDIA FIRST (prevents anchor constraint issues)
--------------------------------------------------

delete from public.media_assets;

--------------------------------------------------
-- EVENT STACK
--------------------------------------------------

delete from public.match_state_snapshots;
delete from public.game_action_context;
delete from public.game_actions;
delete from public.game_events;
delete from public.game_participations;
delete from public.player_game_stats;
delete from public.player_reported_games;

--------------------------------------------------
-- STANDINGS CACHE
--------------------------------------------------

delete from public.standings;

--------------------------------------------------
-- GAMES (disable standings rebuild trigger)
--------------------------------------------------

alter table public.games disable trigger trg_games_rebuild_standings;

delete from public.games;

alter table public.games enable trigger trg_games_rebuild_standings;

--------------------------------------------------
-- COMPETITION STRUCTURE
--------------------------------------------------

delete from public.team_competitions;
delete from public.competition_groups;
delete from public.competition_rounds;
delete from public.competition_phases;
delete from public.competition_stages;

--------------------------------------------------
-- COMPETITION ENTITIES
--------------------------------------------------

delete from public.competitions;
delete from public.seasons;

--------------------------------------------------
-- CORE ENTITY STACK
--------------------------------------------------

alter table public.teams disable trigger trg_prevent_direct_team_delete;
alter table public.organizations disable trigger trg_prevent_direct_organization_delete;

delete from public.players;
delete from public.teams;
delete from public.clubs;
delete from public.turfs;
delete from public.venues;
delete from public.regions;
delete from public.organizations;

alter table public.teams enable trigger trg_prevent_direct_team_delete;
alter table public.organizations enable trigger trg_prevent_direct_organization_delete;

commit;

--------------------------------------------------
-- REMOVE ALL AUTH USERS EXCEPT DANNY
--------------------------------------------------

delete from auth.users
where email <> 'danny@digitalsteam.ca';

--------------------------------------------------
-- VALIDATION
--------------------------------------------------

select 'players' as table_name, count(*) as row_count from public.players
union all
select 'teams', count(*) from public.teams
union all
select 'clubs', count(*) from public.clubs
union all
select 'organizations', count(*) from public.organizations
union all
select 'regions', count(*) from public.regions
union all
select 'venues', count(*) from public.venues
union all
select 'turfs', count(*) from public.turfs
union all
select 'seasons', count(*) from public.seasons
union all
select 'competitions', count(*) from public.competitions
union all
select 'team_competitions', count(*) from public.team_competitions
union all
select 'competition_groups', count(*) from public.competition_groups
union all
select 'competition_rounds', count(*) from public.competition_rounds
union all
select 'competition_phases', count(*) from public.competition_phases
union all
select 'competition_stages', count(*) from public.competition_stages
union all
select 'games', count(*) from public.games
union all
select 'game_events', count(*) from public.game_events
union all
select 'game_actions', count(*) from public.game_actions
union all
select 'game_action_context', count(*) from public.game_action_context
union all
select 'game_participations', count(*) from public.game_participations
union all
select 'player_game_stats', count(*) from public.player_game_stats
union all
select 'player_reported_games', count(*) from public.player_reported_games
union all
select 'match_state_snapshots', count(*) from public.match_state_snapshots
union all
select 'media_assets', count(*) from public.media_assets
union all
select 'standings', count(*) from public.standings
order by table_name;

--------------------------------------------------
-- USER CHECK
--------------------------------------------------

select email
from auth.users
order by email;

Turfi Registry Governance and Entity Lifecycle

Turfi is a sports data platform where historical integrity is critical. Matches, player statistics, standings, events, and highlights must remain readable even when structural entities such as organizations, clubs, or teams are removed. Because of this requirement, Turfi uses a controlled entity lifecycle system instead of direct database deletions. This system ensures that administrators can safely remove structural entities while preserving the historical record for compliance, analytics, and user experience.

Platform Philosophy

Sports platforms must preserve historical data. When a club merges, a team is disbanded, or an organization restructures, the match history, player statistics, and highlight references from past seasons must remain intact. Traditional cascade deletion causes:

  • Lost match history — Games become orphaned or disappear when teams are deleted
  • Broken statistics — Player and team stats lose their context when referenced entities vanish
  • Missing highlight references — Video moments and highlights lose team or player identity
  • Invalid player records — Career timelines and participation records become unreadable

A goal scored in 2020 should still display the scorer's name and team in 2030, even if that team no longer exists. Turfi's approach combines:

  • Snapshot identity — Storing names at event creation in fields such as player_name_text and team_name_text
  • Safe entity deletion — Detaching references (SET NULL) instead of deleting rows
  • Dependency inspection — Showing impact before deletion so admins understand consequences
  • Hierarchical governance — Organizations contain clubs, clubs contain teams, teams participate in games

Every deletion follows a controlled path that preserves historical readability.

Platform Hierarchy Model

The structural entity hierarchy is: Organization → Club → Team → Player.

  • Organizations represent federations, leagues, or governing bodies
  • Clubs belong to organizations
  • Teams belong to clubs
  • Players participate on teams

Competition data sits alongside this structure: competitions define seasons and structures; team_competitions register teams into competitions with optional group and seed context; games are played between registered teams; standings track results; events record goals, assists, substitutions; moments capture highlight candidates; stats aggregate player and team performance.

This registration layer makes the competition graph explicit: teams may appear in multiple competitions, competitions may include teams from multiple organizations, and game creation must validate competition membership through team_competitions rather than inferring eligibility from standings or club ownership.

When a structural entity is deleted, competition data must remain. Games, events, and stats reference teams and players; those references are detached (SET NULL) rather than deleted. Snapshot fields such as team_name_text and player_name_text preserve human-readable identity for display.

Historical Identity Protection

Turfi uses snapshot fields so historical records stay readable after entity deletion. Example fields: player_name_text in game_events, team_name_text where applicable. When an event is created, the system stores the current name.

Before deletion:

team_id = UUID
team_name_text = CF Montreal U17

After deletion:

team_id = NULL
team_name_text = CF Montreal U17

The platform's identity resolution logic: if the entity ID exists, display the live profile name; if NULL but snapshot exists, display the snapshot; if both NULL, display "Unidentified".

Database Safety Architecture

The database enforces safe deletion through functions and (where implemented) triggers.

Snapshot triggers such as trg_snapshot_player_name_on_event copy names into snapshot fields on insert.

Dependency inspection functions return counts and records before deletion:

FunctionPurpose
entity_dependency_summary(entity_type, entity_id)Returns a JSON object with counts: matches_affected, standings_affected, player_stats_affected, events_affected, highlights_affected, competitions_affected, teams_affected, clubs_affected. Use before any structural deletion to show impact to the admin.
entity_dependency_details(entity_type, entity_id)Returns grouped records by dependency_type. Use when the admin clicks "View impacted records" to show which specific rows will be detached.
delete_entity_safe(entity_type, entity_id)Detaches all references (SET NULL) and then deletes the entity. Supported types: organization, club, team. Must always be used instead of direct DELETE for these entities.

Direct Delete Protection

The platform design requires that direct DELETE on organizations, clubs, and teams must never be used by the application. Database triggers (trg_prevent_direct_delete_organizations, trg_prevent_direct_delete_clubs, trg_prevent_direct_delete_teams) raise an exception on direct DELETE, forcing all removal through delete_entity_safe. The purpose is to prevent accidental destructive actions—a developer or script that runs DELETE FROM teams would orphan games, standings, and events. The governance system ensures every deletion follows the correct detachment sequence. The delete_entity_safe function sets a session variable to bypass these triggers for its internal DELETEs.

Dependency Inspection System

Before deletion, the admin UI calls entity_dependency_summary to fetch counts (matches_count, standings_count, events_count, competitions_count, teams_count, clubs_count). Optionally, entity_dependency_details returns the actual records grouped by type. This information powers the deletion confirmation dialog: admins see how many matches, standings, and events are affected, and can expand to view specific records. Transparency prevents accidental destructive actions.

Universal Deletion Engine

delete_entity_safe(entity_type, entity_id) is the universal deletion function for structural entities.

  • Team deletion: games (home_team_id, away_team_id) and standings are SET NULL; game_participations and match_lineups are detached; then the team row is deleted.
  • Club deletion: teams are detached (club_id SET NULL); then the club is deleted.
  • Organization deletion: clubs are detached (organization_id SET NULL); then the organization is deleted.

The platform must always use this function for organizations, clubs, and teams.

Admin Registry Deletion Workflow

  1. Step 1: Call entity_dependency_summary when the admin clicks Delete
  2. Step 2: Optionally call entity_dependency_details when the admin clicks "View impacted records"
  3. Step 3: Display deletion impact (counts and optionally records) to the admin
  4. Step 4: Require explicit confirmation (e.g. "Type DELETE to confirm")
  5. Step 5: Call delete_entity_safe to perform the deletion

This workflow ensures transparency and prevents accidental destructive actions. Every deletion is preceded by impact visibility and explicit confirmation.

Application Layer Rules

The application must never execute:

  • DELETE FROM teams
  • DELETE FROM clubs
  • DELETE FROM organizations

All deletions for these entities must go through delete_entity_safe(). Players use archive_player, delete_player_safe, or force_delete_player. Database triggers may enforce this rule by raising on direct DELETE.

Deployment Across Platform Entities

The entity lifecycle model applies to organizations, clubs, teams, and players. Admin tools and APIs must respect the governance system. Any new structural entity type added to the platform must follow the same lifecycle model: dependency inspection, safe deletion function, and no direct DELETE from application code.

Developer Guidelines

  • Never delete structural entities directly
  • Always inspect dependencies before deletion
  • Always preserve historical identity via snapshot fields
  • Always use the universal deletion engine (delete_entity_safe for org/club/team)

Consensus Driven Development Principles

Cross-cutting rules aligned with Platform Philosophy and Scope (Consensus Engine, Truth Layers):

  • Do not assume a single external source of truth for games, events, or stats—design for multiple inputs and reconciliation.
  • Preserve raw inputs and provenance where feasible so disputes can be audited.
  • Support multiple inputs for the same logical game or event (duplicate media, conflicting scores) with explicit resolution paths.
  • Separate proposed from canonical structured data using status, review, or promotion rules appropriate to the domain.
  • Trace canonical facts to underlying events, media, or imports for stats and public outputs.
  • Tolerate conflict—prefer explicit rules and queues over silent last-write wins.

These sit alongside registry-specific rules (deletion, lifecycle) above.

Implementation Check

Periodically verify: all deletion operations use delete_entity_safe() or player lifecycle RPCs; no direct DELETE on organizations, clubs, or teams in services or APIs; admin interfaces call entity_dependency_summary and entity_dependency_details; entity lifecycle logic is applied consistently.

Entity Resolution System

Status: IMPLEMENTED

Turfi's Entity Resolution System now includes a platform-wide duplicate detection and merge engine for Data Registry entities, exposed in Admin -> Operations -> Entity Resolution.

This system exists to:

  • maintain clean master data
  • prevent duplicate entities during imports and repeated admin edits
  • allow administrators to consolidate duplicate records safely
  • preserve relational integrity while entities are merged

Duplicate groups are detected by the system and presented for administrator review. An operator selects the record to keep, reviews dependency impact, and merges the remaining duplicates into that surviving record.

Supported Registry Entities

The duplicate-resolution engine covers the same entities surfaced in Registries:

Entity TypeDatabase TableUI Label
organizationorganizationsAssociations
leagueleaguesLeagues
seasonseasonsSeasons
competitioncompetitionsCompetitions
clubclubsClubs
teamteamsTeams
playerplayersPlayers
venuevenuesVenues
turfturfsTurfs
businessbusinessesBusinesses
addressaddressesAddresses
contactcontactsContacts

Duplicate Detection

Duplicate groups are generated from shared detection signals rather than one hard-coded rule per entity.

Detection inputs may include:

  • normalized name comparison
  • slug similarity
  • city matching
  • shared contact information
  • string similarity scoring

The system groups records into duplicate groups when those signals suggest that multiple rows likely describe the same real-world entity. Each group contains two or more candidate records and a confidence score that helps the operator prioritize review.

Detection flow:

Registry Records
      ↓
Duplicate Detection
      ↓
Duplicate Groups
      ↓
Admin Review
      ↓
Merge Engine

Merge Engine

Core merge function:

public.merge_entity_safe(entity_type, keep_id, merge_ids)

Parameters:

  • entity_type — the entity category being merged
  • keep_id — the surviving record that remains after the merge
  • merge_ids — array of duplicate record IDs that should be consolidated into the surviving record

Behavior:

  • all supported foreign key references are reassigned to the kept record
  • duplicate rows are deleted after rewiring completes
  • merge activity is logged for audit review

The merge UI also previews dependency impact before confirmation through entity_dependency_counts(...), so operators can see what will be affected before the transaction is committed.

Relationship Integrity

Turfi preserves relational integrity by using an entity reference map inside the merge engine. Before a duplicate record is deleted, the system determines where that entity type is referenced and rewires those relationships to the surviving record.

Examples:

  • organizations -> clubs.organization_id
  • clubs -> teams.club_id
  • teams -> games.home_team_id and games.away_team_id
  • venues -> turfs.venue_id
  • leagues -> competitions.league_id

This prevents dangling references and allows the platform to consolidate duplicate master data without breaking downstream competition, infrastructure, or contact relationships.

Merge Logging

Merge operations are recorded in:

entity_merge_log

Documented fields:

  • entity_type
  • kept_entity_id
  • merged_entity_id
  • action_type
  • performed_at
  • performed_by
  • notes

This audit trail gives administrators a durable record of who merged what, when the action happened, and what notes or context were attached to the operation.

Safety Mechanisms

All merge operations run inside database transactions.

Safety guarantees:

  • the merge either completes fully or not at all
  • foreign key rewiring happens before duplicate rows are removed
  • if any step fails, the transaction is rolled back
  • partial merges are therefore not allowed to persist

This transaction-first design is essential because duplicate resolution touches many dependent tables and cannot leave the platform in a half-rewired state.

Resolution Architecture Diagram

Entity Resolution Engine
        ├── Duplicate Detection
        ├── Duplicate Groups
        └── Merge Engine
                 ├── Relationship Rewiring
                 └── Merge Log

Related Unresolved Reference Resolver

Turfi also keeps the unresolved text-reference resolver for non-duplicate workflows. That companion system automatically attaches entity IDs when import or ingestion data contains plain-text names that can be matched safely, and it surfaces unresolved references through api_entity_resolution_queue when manual review is needed.

Player Career Timeline System

Status: IMPLEMENTED

The player career timeline is generated automatically via database triggers when game events are inserted.

Trigger: rebuild_player_career_timeline()

When events such as goals, assists, or substitutions are inserted into game_events, a trigger executes rebuild_player_career_timeline(). This function creates or updates entries in:

Table: player_career_entries

FieldDescription
player_idLink to the player
entry_typeType of career event (goal, assist, appearance, etc.)
entry_dateDate of the event
game_idLink to the game
source_typeOrigin of the entry (game_event, import, etc.)

This system powers player timelines and highlight feeds. Career entries drive statistics, activity feeds, and scouting views.

Ingestion Testing and Validation

Status: IMPLEMENTED (validated workflows)

This section documents the testing workflow used to validate the ingestion system and the systems that were successfully validated.

Test Dataset Creation Procedure

Steps performed to validate the ingestion pipeline:

  1. Create test organization — Insert an organization representing an association or league
  2. Create clubs — Create clubs that belong to the organization
  3. Create teams — Create teams belonging to clubs and the organization
  4. Insert players — Add players with club_text for import compatibility
  5. Create game — Create a game referencing the teams
  6. Insert game events — Add goals, assists, or other events referencing players
  7. Verify career timeline generation — Confirm player_career_entries are populated by the trigger
  8. Test auto resolver with exact match — Insert event with player_name_text matching existing player; verify auto-link
  9. Test unresolved entity behavior — Insert event with player_name_text = "Josh Trembley" (intentional typo); verify entry appears in api_entity_resolution_queue; resolve via Entity Resolver

System Validation Results

The following systems were successfully validated:

SystemStatus
Organization hierarchyValidated
Club and team relationshipsValidated
Player creationValidated
Game creationValidated
Game event ingestionValidated
Player career timeline triggerValidated
Auto entity resolutionValidated
Resolution queue generationValidated

Future Data Governance Systems

Status: PLANNED

The following systems are still planned for the Data -> Operations workspace:

SystemDescription
Import Error MonitoringDashboard and alerting for import failures, partial completions, and resolution queue growth

These future systems now sit alongside the implemented Entity Resolution and Import Engine workflows rather than replacing them.

Platform Data Model

This section summarizes the core data entities developers work with across Turfi and shows how they connect in current architecture.

Users

Represents authenticated platform accounts and profile state.

Stored in

  • auth.users
  • users
  • user_roles
  • roles

Connected to

  • api_user_workspace_summary
  • auth session lifecycle
  • onboarding and account management flows

Players

Represents athlete identity and player-facing performance/media context.

Stored in

  • api_player_profiles
  • api_player_competition_analytics
  • api_player_matches
  • api_player_trends

Connected to

  • api_games
  • api_player_game_stats
  • api_player_media
  • api_player_activity

Teams

Represents competitive team entities participating in games and competitions.

Stored in

  • api_games (team context fields in current app read model)

Connected to

  • competitions (through game context)
  • games
  • match timeline/stat contexts

Competitions

Represents leagues/tournaments and competition context for games and player analytics.

Stored in

  • api_games (competition context fields)
  • api_player_competition_analytics

Connected to

  • teams
  • games
  • player competition analytics

Games

Represents official matches and their canonical runtime context.

Stored in

  • api_games

Connected to

  • api_match_timeline_unified
  • api_player_game_stats
  • video_moments
  • highlight_collections

Media

Represents playable media assets used in match and player experiences.

Stored in

  • api_player_media
  • highlight_collections

Connected to

  • players
  • games
  • moments
  • broadcast/playback surfaces

Moments

Represents timestamped clips/events tied to a game timeline.

Stored in

  • video_moments
  • api_match_timeline_unified (timeline context)

Connected to

  • games
  • media/highlights
  • match intelligence views

Activities

Represents feed-style activity records shown in player/social contexts.

Stored in

  • api_player_activity

Connected to

  • players
  • media events
  • match/competition event outcomes

Event and Trigger Architecture

This subsection documents database-driven automation flows that connect events, derived context, and system side effects.

Trigger Flow Pattern

Game event inserted Trigger/function pipeline runs Related context rows update Candidate/moment/highlight records update Downstream feed/stat/standing views reflect the new state

Match event insertion

Entry tables

  • game_events
  • game_actions

Derived tables updated by automation pipeline

  • match_event_context
  • play_sequences
  • play_sequence_actions
  • player_game_stats
  • player_competition_stats

Functions/triggers responsible

  • The repository confirms the data model for this pipeline, but does not expose explicit trigger/function names for these match-domain tables.
  • Use the Supabase trigger catalog (pg_trigger + pg_proc) in the linked project to list the exact trigger/function identifiers for this flow.

Play sequence updates

Entry/derived tables

  • play_sequences
  • play_sequence_actions
  • game_action_context

Functions/triggers responsible

  • Sequence update triggers/functions are schema-driven, but exact names are not currently present in repository code exports.
  • Operational behavior is visible through table relationships and consumed read models.

Moment candidate creation

Entry/derived tables

  • moment_candidates
  • game_actions
  • game_events
  • match_event_context

Functions/triggers responsible

  • Candidate-creation trigger/function names are not currently exposed in repository files.
  • The linked schema confirms candidate tables and downstream highlight/moment tables used by this workflow.

Highlight detection

Entry/derived tables

  • moment_candidates
  • highlight_moments
  • video_moments
  • highlight_collections
  • highlight_items

Functions/triggers responsible

  • Highlight detection persistence is represented at table level; explicit trigger/function names are not exported in the current code snapshot.
  • Current app implementation consumes generated moments/highlights through read endpoints.

Standings recalculation

Entry/derived tables

  • games
  • game_events
  • standings

Functions/triggers responsible

  • Standings update automation is reflected in schema tables and app read paths.
  • Exact standings trigger/function names are not exposed in repository code and should be confirmed directly from Supabase trigger metadata.

Activity feed event generation

Entry/derived tables

  • game_events
  • highlight_collections
  • activity_feed
  • activity_comments
  • activity_likes

Functions/triggers responsible

  • Feed-generation workflow is represented by schema tables and consumed activity views.
  • Explicit trigger/function identifiers for feed event insertion are not present in current repository exports.

Confirmed trigger-backed identity workflows

These trigger/function mappings are explicitly confirmed in project schema snapshots:

  • auth.users -> trigger on_auth_user_created -> function handle_new_auth_user()
  • auth.users -> trigger on_auth_user_updated_sync_public_user -> function sync_auth_user_verification()
  • public.user_roles -> trigger trg_refresh_workspace_members_from_user_roles -> function trg_refresh_workspace_members_from_user_roles()
  • public.users -> trigger assign_default_player_role / trg_assign_default_player_role -> function assign_default_player_role()
  • public.users -> trigger set_users_updated_at -> function set_updated_at()
  • public.users -> trigger trg_update_user_location -> function update_user_location()

Registry Identity Engine

Turfi registry entities use a standardized identity model so imports, URLs, admin editing, and cross-system references all point at the same conceptual record.

Registry identity fields:

  • id
  • key
  • slug
  • name
  • normalized_name where applicable
  • status
  • created_at
  • updated_at

Identity roles:

  • uuid = database identity
  • key = import identity
  • slug = URL identity
  • name = human readable label
  • normalized_name = duplicate-detection/search identity where needed
  • status = lifecycle identity

Database identity automation:

  • turfi_slugify()
  • turfi_registry_identity()

Trigger behavior:

  • If slug is missing, generate it from name
  • If key is missing, generate it from name
  • Always update updated_at
  • Lifecycle-aware registry UIs default to status = active

Registry tables covered by this system:

  • organizations
  • leagues
  • seasons
  • competitions
  • clubs
  • teams
  • players
  • venues
  • turfs
  • businesses

Players still use their specialized identity pipeline for resolver, claim, and duplicate workflows, but they are part of the standardized registry column model and lifecycle policy.