Full-Stack Engineering

Database Schema Design for Production Applications

Uvin Vindula·January 6, 2025·12 min read

Last updated: April 14, 2026

TL;DR

Your database schema is the foundation everything else is built on. Get it wrong, and you'll spend months patching queries, fighting migrations, and explaining to clients why their app is slow. Get it right, and everything — from API design to frontend data fetching — falls into place naturally. This article covers the exact PostgreSQL schema design patterns I use in every production application I build through Supabase: naming conventions that scale, data type decisions you can't easily undo, relationship design, indexing strategy, the UUID vs serial debate, timestamps and soft deletes, migration workflows, and the common mistakes I see developers make repeatedly. Every pattern comes from real projects — EuroParts Lanka, uvin.lk, and FreshMart — that are serving real users right now.

Schema Design Principles

Before I write a single CREATE TABLE statement, I follow five principles that have saved me from painful rewrites on every project I've shipped.

1. Design for queries, not for storage.

Most developers design schemas by thinking about what data they need to store. That's backwards. I start by listing every query the application will need to run, then design tables that make those queries fast and simple. On EuroParts Lanka, I knew the most common query would be "find all parts compatible with this vehicle make, model, and year." That single requirement shaped the entire parts schema — the compatibility junction table, the indexes, even the column order.

2. Normalise until it hurts, then denormalise where it matters.

Third normal form is the starting point, not the end goal. I normalise aggressively to eliminate data duplication, then selectively denormalise where query performance demands it. On FreshMart, the order items table stores a snapshot of the product price at the time of purchase — that's intentional denormalisation. If the product price changes tomorrow, historical orders must remain accurate.

3. Every table gets a clear owner.

If I can't answer "who is responsible for the data in this table?" in one sentence, the schema is wrong. On uvin.lk, the lesson_progress table is owned by the user — it tracks their individual progress through course content. The lessons table is owned by the system — it's authored content that users consume but never modify. This distinction drives everything from RLS policies to API permissions.

4. Constraints are documentation that the database enforces.

I use NOT NULL, CHECK, UNIQUE, and foreign key constraints aggressively. They're not just data integrity tools — they're living documentation. When another developer reads my schema, the constraints tell them the business rules without needing a separate spec document.

5. Plan for the schema to change.

Every schema will evolve. I design with migration-friendliness in mind: avoid columns that are expensive to alter, use sensible defaults, and never paint myself into a corner with overly rigid constraints that will need to be dropped later.

Naming Conventions

Naming consistency is the cheapest investment with the highest return in database design. Here are the conventions I use across every project.

Tables: plural, snake_case.

sql

-- Yes
CREATE TABLE products (...);
CREATE TABLE order_items (...);
CREATE TABLE vehicle_compatibilities (...);

-- No
CREATE TABLE Product (...);
CREATE TABLE OrderItem (...);
CREATE TABLE tbl_products (...);

Plural because a table is a collection. Snake_case because PostgreSQL folds unquoted identifiers to lowercase anyway — fighting it with camelCase or PascalCase just creates confusion.

Columns: singular, snake_case, no table prefix.

sql

-- Yes
CREATE TABLE products (
  id UUID PRIMARY KEY,
  name TEXT NOT NULL,
  sku VARCHAR(50) UNIQUE NOT NULL,
  unit_price NUMERIC(10, 2) NOT NULL
);

-- No (don't prefix columns with table name)
CREATE TABLE products (
  product_id UUID PRIMARY KEY,
  product_name TEXT NOT NULL,
  product_sku VARCHAR(50) NOT NULL
);

The one exception: foreign keys. I always suffix foreign key columns with _id and prefix with the referenced table name in singular form:

sql

CREATE TABLE order_items (
  id UUID PRIMARY KEY,
  order_id UUID NOT NULL REFERENCES orders(id),
  product_id UUID NOT NULL REFERENCES products(id),
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  unit_price NUMERIC(10, 2) NOT NULL
);

Indexes: describe what they index.

sql

CREATE INDEX idx_products_sku ON products(sku);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE UNIQUE INDEX idx_users_email ON users(email);

Enums: singular, snake_case.

sql

CREATE TYPE order_status AS ENUM (
  'pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled'
);

These conventions aren't arbitrary. They're the result of working across multiple projects where consistency between EuroParts Lanka's parts catalog, uvin.lk's learning platform, and FreshMart's e-commerce system means I can context-switch without mentally translating naming patterns.

Data Types — Get Them Right First Time

Changing a column's data type on a table with millions of rows is painful. It locks the table, rewrites every row, and can take minutes on large datasets. Get the types right from the start.

Money: NUMERIC, never FLOAT.

sql

-- Correct: exact decimal arithmetic
unit_price NUMERIC(10, 2) NOT NULL,
total_amount NUMERIC(12, 2) NOT NULL,

-- Wrong: floating point introduces rounding errors
unit_price FLOAT NOT NULL, -- 0.1 + 0.2 = 0.30000000000000004

On EuroParts Lanka, a single rounding error in parts pricing would cascade through invoices, reports, and accounting. NUMERIC(10, 2) gives me exact decimal math with up to 99,999,999.99 — more than enough for auto parts pricing in LKR, GBP, or USD.

Text: TEXT over VARCHAR in most cases.

sql

-- For unconstrained text
description TEXT,
notes TEXT,

-- For text with a business-rule length constraint
sku VARCHAR(50) NOT NULL,
phone VARCHAR(20),
postal_code VARCHAR(10),

PostgreSQL stores TEXT and VARCHAR identically internally. The only reason to use VARCHAR(n) is when the length constraint is a genuine business rule — like SKU codes that must be under 50 characters — not an arbitrary guess.

Timestamps: TIMESTAMPTZ, always.

sql

-- Correct: timezone-aware
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

-- Wrong: loses timezone context
created_at TIMESTAMP NOT NULL DEFAULT NOW(),

I learned this the hard way. EuroParts Lanka has customers in Sri Lanka (UTC+5:30) and suppliers in the UK (UTC+0/+1). Without TIMESTAMPTZ, order timestamps become ambiguous. PostgreSQL stores everything in UTC internally and converts on output — but only if you use TIMESTAMPTZ.

Boolean: for two states. Enum: for three or more.

sql

-- Two states: boolean
is_active BOOLEAN NOT NULL DEFAULT true,
is_verified BOOLEAN NOT NULL DEFAULT false,

-- Three or more states: enum
status order_status NOT NULL DEFAULT 'pending',

Don't use a VARCHAR column with string values like 'active' or 'inactive'. Enums are type-safe, self-documenting, and PostgreSQL stores them as integers internally — far more efficient.

JSON: JSONB, never JSON.

sql

-- Correct: binary format, indexable, faster queries
metadata JSONB DEFAULT '{}',

-- Wrong: stored as text, must be reparsed on every access
metadata JSON DEFAULT '{}',

On uvin.lk, I store lesson metadata (estimated duration, difficulty tags, prerequisite hints) in a JSONB column. It's flexible enough to evolve without migrations, and I can index specific paths with GIN indexes when query patterns emerge.

Relationships and Foreign Keys

Relationships are where schema design gets real. Here's how I handle each type, with examples from production projects.

One-to-Many: the workhorse.

sql

-- EuroParts Lanka: one category has many products
CREATE TABLE categories (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  parent_id UUID REFERENCES categories(id),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE products (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  category_id UUID NOT NULL REFERENCES categories(id),
  name TEXT NOT NULL,
  sku VARCHAR(50) UNIQUE NOT NULL,
  unit_price NUMERIC(10, 2) NOT NULL,
  stock_quantity INTEGER NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
  is_active BOOLEAN NOT NULL DEFAULT true,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Notice the parent_id self-reference on categories — this gives me a tree structure for nested categories (Engine Parts > Pistons > Piston Rings) without needing a separate hierarchy table.

Many-to-Many: junction tables with payload.

sql

-- EuroParts Lanka: products compatible with multiple vehicles
CREATE TABLE vehicles (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  make TEXT NOT NULL,
  model TEXT NOT NULL,
  year_start INTEGER NOT NULL,
  year_end INTEGER,
  engine_code VARCHAR(20),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE product_vehicle_compatibilities (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  product_id UUID NOT NULL REFERENCES products(id) ON DELETE CASCADE,
  vehicle_id UUID NOT NULL REFERENCES vehicles(id) ON DELETE CASCADE,
  notes TEXT,
  verified_at TIMESTAMPTZ,
  UNIQUE(product_id, vehicle_id)
);

The junction table has its own id, notes, and verified_at columns — it carries payload beyond just the relationship. The UNIQUE constraint prevents duplicate compatibility entries, and ON DELETE CASCADE ensures orphaned rows are cleaned up automatically.

One-to-One: rare, but intentional.

sql

-- uvin.lk: user profile extends the auth user
CREATE TABLE user_profiles (
  id UUID PRIMARY KEY REFERENCES auth.users(id) ON DELETE CASCADE,
  display_name TEXT,
  bio TEXT,
  avatar_url TEXT,
  timezone TEXT NOT NULL DEFAULT 'UTC',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

The primary key is also the foreign key — this enforces the one-to-one relationship at the database level. On uvin.lk, the auth.users table is managed by Supabase Auth, so I extend it with a user_profiles table for application-specific data.

Foreign key actions: be explicit.

sql

-- CASCADE: delete child rows when parent is deleted
REFERENCES orders(id) ON DELETE CASCADE

-- RESTRICT: prevent deletion if children exist (default, but be explicit)
REFERENCES categories(id) ON DELETE RESTRICT

-- SET NULL: nullify the FK when parent is deleted
REFERENCES assigned_to(id) ON DELETE SET NULL

I choose the action based on the business rule, not convenience. On FreshMart, deleting a product should not cascade-delete historical order items — that would destroy financial records. The order items reference is ON DELETE RESTRICT.

Indexes — When and What to Index

Indexes are the single most impactful performance tool in PostgreSQL. They're also the most misunderstood.

Rule 1: Index foreign keys. Always.

sql

-- Every FK column gets an index
CREATE INDEX idx_products_category_id ON products(category_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);

PostgreSQL does not automatically index foreign keys. Without these indexes, every JOIN and every ON DELETE CASCADE triggers a sequential scan on the child table. On EuroParts Lanka, adding the missing FK index on order_items.product_id dropped a product detail page query from 340ms to 8ms.

Rule 2: Index columns you filter and sort by.

sql

-- FreshMart: customers browse products by category, sorted by price
CREATE INDEX idx_products_category_active
  ON products(category_id, unit_price)
  WHERE is_active = true;

This is a partial index — it only indexes active products. Since 90% of queries filter by is_active = true, the index is smaller and faster than indexing all products.

Rule 3: Composite indexes follow the query pattern.

sql

-- EuroParts Lanka: find parts by vehicle make + model + year
CREATE INDEX idx_vehicles_make_model_year
  ON vehicles(make, model, year_start);

Column order matters. The leftmost columns are used first. If you query by make alone, this index helps. If you query by model alone, it doesn't.

Rule 4: Use EXPLAIN ANALYZE before creating, and after.

sql

-- Before: check what the query planner is doing
EXPLAIN ANALYZE
SELECT p.name, p.sku, p.unit_price
FROM products p
JOIN product_vehicle_compatibilities pvc ON pvc.product_id = p.id
JOIN vehicles v ON v.id = pvc.vehicle_id
WHERE v.make = 'Toyota' AND v.model = 'Corolla' AND v.year_start <= 2020
ORDER BY p.unit_price ASC;

I run EXPLAIN ANALYZE on every slow query before reaching for an index. Sometimes the issue is a missing WHERE clause, not a missing index. Sometimes the index already exists but PostgreSQL isn't using it because the table is too small for an index scan to be worthwhile.

Rule 5: Don't over-index.

Every index slows down INSERT, UPDATE, and DELETE operations because the index must be maintained alongside the table. On FreshMart, the cart_items table has high write throughput — items are added and removed constantly. I keep indexes minimal on that table: just the foreign keys and a composite unique constraint.

sql

CREATE TABLE cart_items (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  cart_id UUID NOT NULL REFERENCES carts(id) ON DELETE CASCADE,
  product_id UUID NOT NULL REFERENCES products(id),
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  added_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE(cart_id, product_id)
);

CREATE INDEX idx_cart_items_cart_id ON cart_items(cart_id);
-- No additional indexes — this table is write-heavy

UUID vs Serial IDs

This is one of the most debated topics in PostgreSQL schema design. I've used both extensively and settled on UUIDs as my default for every new project.

Why I use UUIDs:

sql

-- UUID primary keys
CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES auth.users(id),
  status order_status NOT NULL DEFAULT 'pending',
  total_amount NUMERIC(12, 2) NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

No ID enumeration. Serial IDs expose your data volume and make IDOR attacks trivial. If order 1547 exists, an attacker knows to try 1548. With UUIDs, guessing a valid ID is effectively impossible.

Client-side generation. I can generate IDs on the client before inserting, which simplifies optimistic UI updates and offline-first patterns. On uvin.lk, lesson progress is tracked with client-generated UUIDs that sync when the connection returns.

Multi-system compatibility. EuroParts Lanka integrates with supplier systems that have their own IDs. UUIDs avoid conflicts when merging data across systems.

Supabase alignment. Supabase uses UUIDs for auth.users IDs. Using UUIDs everywhere means consistent ID types across the entire schema — no type casting between UUID and INTEGER in joins.

The trade-offs I accept:

UUIDs are 16 bytes vs 4 bytes for integers. On a table with 10 million rows, that's ~120MB of extra storage for the ID column alone. For my projects, this is negligible.
UUIDs are harder to read in logs and debugging. I mitigate this by always logging additional context (user email, order number) alongside IDs.
Random UUIDs cause B-tree index fragmentation. PostgreSQL's gen_random_uuid() generates v4 UUIDs which are random. For extremely high-write tables, I'd consider UUIDv7 (time-ordered), but I haven't hit this ceiling on any project yet.

When I still use serial:

sql

-- Sequence-based IDs for human-readable references
CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  order_number SERIAL UNIQUE, -- human-readable: #1001, #1002
  -- ...
);

On EuroParts Lanka, customers reference orders by a human-readable order number (EP-1547), not by UUID. The UUID is the primary key for system use; the order number is for human use.

Timestamps and Soft Deletes

Every table in every project I build gets timestamps. No exceptions.

The standard timestamp pattern:

sql

CREATE TABLE products (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  -- ... business columns ...
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Auto-update updated_at with a trigger
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_updated_at
  BEFORE UPDATE ON products
  FOR EACH ROW
  EXECUTE FUNCTION update_updated_at();

I create this trigger function once and reuse it across every table. On Supabase, I add this in my initial migration and apply the trigger to every table that has updated_at.

Soft deletes: use them where business logic demands it.

sql

CREATE TABLE products (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  sku VARCHAR(50) UNIQUE NOT NULL,
  -- ... other columns ...
  deleted_at TIMESTAMPTZ, -- NULL means active, timestamp means soft-deleted
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Default view excludes soft-deleted rows
CREATE VIEW active_products AS
SELECT * FROM products WHERE deleted_at IS NULL;

On EuroParts Lanka, I soft-delete products rather than hard-deleting them because historical order items reference those products. If I hard-delete a product, the order history becomes incomplete. Soft deletes preserve referential integrity while hiding the product from active listings.

On FreshMart, I also soft-delete user accounts. When a customer requests account deletion (GDPR compliance), I soft-delete the account, anonymise PII, but preserve order history for financial reporting.

When I hard-delete:

Cart items: ephemeral data with no historical value.
Session records: no reason to keep expired sessions.
Temporary upload records: cleaned up after processing.

The rule is simple: if any other table references it, or if the business needs a historical record, soft-delete. Otherwise, hard-delete.

Migration Strategy

Schema changes in production are the highest-risk database operations. Here's how I handle them.

Supabase migrations with version control:

bash

# Generate a new migration
supabase migration new add_product_tags

# This creates a timestamped file
# supabase/migrations/20250106090000_add_product_tags.sql

Every migration is a SQL file in version control. No manual schema changes in production, ever. If it's not in a migration file, it doesn't exist.

Migration structure:

sql

-- supabase/migrations/20250106090000_add_product_tags.sql

-- Add tags column to products (JSONB for flexible tagging)
ALTER TABLE products
ADD COLUMN tags JSONB DEFAULT '[]';

-- Create GIN index for tag queries
CREATE INDEX idx_products_tags ON products USING GIN (tags);

-- Backfill existing products with empty tags
-- (Already handled by DEFAULT, but explicit for clarity)
UPDATE products SET tags = '[]' WHERE tags IS NULL;

-- Add constraint to ensure tags is always an array
ALTER TABLE products
ADD CONSTRAINT chk_products_tags_is_array
CHECK (jsonb_typeof(tags) = 'array');

My migration rules:

One concern per migration. Don't mix adding a column with modifying a trigger. If something goes wrong, I need to know exactly which change caused it.

Always additive first. Add new columns with defaults before modifying application code. This prevents downtime — the old code works with the new schema because the new columns have defaults.

Never rename in production. Instead, add the new column, migrate data, update application code, then drop the old column in a later migration. On uvin.lk, when I renamed course_modules to modules, I did it in three migrations spread across two deployments.

Test migrations on a branch database. Supabase branching lets me test migrations against a copy of production data before applying them to the real database.

bash

# Create a branch for testing
supabase db branch create feature/product-tags

# Apply and test the migration
supabase db push --db-url $BRANCH_URL

# If everything works, merge to main
supabase db branch merge feature/product-tags

Common Schema Mistakes

These are the mistakes I see most often when reviewing schemas — and the ones I've made myself early in my career.

Mistake 1: Storing calculated values.

sql

-- Wrong: storing a value that can be calculated
CREATE TABLE orders (
  id UUID PRIMARY KEY,
  subtotal NUMERIC(12, 2),
  tax_amount NUMERIC(12, 2),
  total_amount NUMERIC(12, 2) -- This is just subtotal + tax_amount
);

-- Better: calculate at query time or use a generated column
CREATE TABLE orders (
  id UUID PRIMARY KEY,
  subtotal NUMERIC(12, 2) NOT NULL,
  tax_rate NUMERIC(5, 4) NOT NULL DEFAULT 0.0,
  total_amount NUMERIC(12, 2) GENERATED ALWAYS AS (
    subtotal + (subtotal * tax_rate)
  ) STORED
);

Generated columns keep the calculation in the database, always consistent, without requiring application-level logic.

Mistake 2: The "God table."

I once saw a users table with 47 columns — everything from login credentials to shipping preferences to notification settings. The fix is simple: split by concern.

sql

-- Split by concern
CREATE TABLE users (id UUID PRIMARY KEY, email TEXT UNIQUE NOT NULL, ...);
CREATE TABLE user_profiles (id UUID PRIMARY KEY REFERENCES users(id), ...);
CREATE TABLE user_preferences (id UUID PRIMARY KEY REFERENCES users(id), ...);
CREATE TABLE user_addresses (id UUID PRIMARY KEY, user_id UUID REFERENCES users(id), ...);

Mistake 3: No constraints.

sql

-- Dangerous: accepts any garbage data
CREATE TABLE order_items (
  quantity INTEGER,
  unit_price NUMERIC
);

-- Correct: constraints enforce business rules
CREATE TABLE order_items (
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  unit_price NUMERIC(10, 2) NOT NULL CHECK (unit_price >= 0)
);

Every column should answer: Can this be null? What values are valid? Is this unique? The constraints encode business rules that the application layer might forget to check.

Mistake 4: Premature JSON columns.

sql

-- Tempting but dangerous: stuffing structured data into JSON
CREATE TABLE orders (
  id UUID PRIMARY KEY,
  customer_data JSONB, -- name, email, phone, address all in JSON
  items JSONB          -- array of order items in JSON
);

JSON is great for truly unstructured or variable data — metadata, feature flags, configuration. It's terrible for core business data that you need to query, join, and enforce constraints on. If you know the shape of the data, use proper columns and tables.

Mistake 5: Ignoring time zones.

I covered this in the data types section, but it bears repeating: always use TIMESTAMPTZ. I've debugged timezone issues on three different projects, and every single one was caused by using TIMESTAMP instead of TIMESTAMPTZ.

My Schema Design Process

Here's the exact process I follow when designing a schema for a new project. I'll use the uvin.lk learning platform as a real example.

Step 1: List the entities.

I start by listing every noun in the product requirements:

Users, Modules, Lessons, Lesson Progress, Achievements, Certificates

Step 2: Define the relationships.

User --(has many)--> Lesson Progress
Module --(has many)--> Lessons
Lesson --(has many)--> Lesson Progress
User --(has many)--> Achievements
User --(has many)--> Certificates
Module --(has many)--> Certificates (completion cert per module)

Step 3: Write the schema.

sql

-- uvin.lk: learning platform schema

CREATE TABLE modules (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  description TEXT,
  difficulty TEXT NOT NULL CHECK (difficulty IN ('beginner', 'intermediate', 'advanced')),
  estimated_hours NUMERIC(4, 1),
  is_published BOOLEAN NOT NULL DEFAULT false,
  sort_order INTEGER NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE lessons (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  module_id UUID NOT NULL REFERENCES modules(id) ON DELETE CASCADE,
  title TEXT NOT NULL,
  slug TEXT NOT NULL,
  content TEXT,
  video_url TEXT,
  sort_order INTEGER NOT NULL DEFAULT 0,
  duration_minutes INTEGER,
  is_published BOOLEAN NOT NULL DEFAULT false,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE(module_id, slug)
);

CREATE TABLE lesson_progress (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
  lesson_id UUID NOT NULL REFERENCES lessons(id) ON DELETE CASCADE,
  status TEXT NOT NULL DEFAULT 'not_started'
    CHECK (status IN ('not_started', 'in_progress', 'completed')),
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  last_position_seconds INTEGER DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE(user_id, lesson_id)
);

-- Indexes for common queries
CREATE INDEX idx_lessons_module_id ON lessons(module_id);
CREATE INDEX idx_lesson_progress_user_id ON lesson_progress(user_id);
CREATE INDEX idx_lesson_progress_lesson_id ON lesson_progress(lesson_id);
CREATE INDEX idx_lesson_progress_status ON lesson_progress(user_id, status);

Step 4: Write the critical queries and EXPLAIN ANALYZE them.

sql

-- "Show module progress for a user" — the most common query on uvin.lk
SELECT
  m.id,
  m.title,
  m.slug,
  COUNT(l.id) AS total_lessons,
  COUNT(lp.id) FILTER (WHERE lp.status = 'completed') AS completed_lessons,
  ROUND(
    COUNT(lp.id) FILTER (WHERE lp.status = 'completed')::NUMERIC
    / NULLIF(COUNT(l.id), 0) * 100, 1
  ) AS progress_percentage
FROM modules m
LEFT JOIN lessons l ON l.module_id = m.id AND l.is_published = true
LEFT JOIN lesson_progress lp ON lp.lesson_id = l.id AND lp.user_id = $1
WHERE m.is_published = true
GROUP BY m.id
ORDER BY m.sort_order;

I run this against test data, check the query plan, and adjust indexes if needed. The idx_lesson_progress_status composite index exists specifically because of this query.

Step 5: Add RLS policies (Supabase-specific).

sql

-- Users can only read their own progress
ALTER TABLE lesson_progress ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users read own progress"
  ON lesson_progress FOR SELECT
  USING (auth.uid() = user_id);

CREATE POLICY "Users insert own progress"
  ON lesson_progress FOR INSERT
  WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users update own progress"
  ON lesson_progress FOR UPDATE
  USING (auth.uid() = user_id)
  WITH CHECK (auth.uid() = user_id);

RLS policies are part of the schema design, not an afterthought. I write them alongside the table definitions because they're fundamental to how the data is accessed.

Step 6: Document and commit.

Every schema gets a migration file in version control with clear comments explaining the business logic behind each constraint and design decision.

Key Takeaways

Design for queries first. List every query your app will run before writing CREATE TABLE. The schema should make your most common queries simple and fast.

Use TIMESTAMPTZ, NUMERIC for money, TEXT over VARCHAR, JSONB over JSON. These data type decisions are expensive to change later. Get them right on day one.

Index foreign keys and query filter columns. But don't over-index write-heavy tables. Always verify with EXPLAIN ANALYZE.

UUIDs for primary keys. They prevent ID enumeration, enable client-side generation, and align with Supabase's auth system.

Constraints are your best friend. NOT NULL, CHECK, UNIQUE, and foreign keys encode business rules that the database enforces even when the application code forgets.

Soft-delete when history matters. Use deleted_at TIMESTAMPTZ for data referenced by other tables or required for compliance. Hard-delete ephemeral data.

One concern per migration. Additive changes first, destructive changes later. Never rename columns directly in production.

RLS policies are schema design. On Supabase, write Row Level Security policies alongside your table definitions, not as an afterthought.

Normalise by default, denormalise with intention. Every denormalisation should have a documented reason — usually query performance or historical accuracy.

The schema is the contract. Everything else — APIs, frontend, business logic — is built on top of it. Invest the time to get it right.

If you're building a production application and need help designing a database schema that scales — or if you've got a schema that's already causing pain and needs refactoring — check out my services. I design PostgreSQL schemas through Supabase for startups and businesses across Sri Lanka, the UK, and globally. Every project gets the same rigour I apply to my own: proper constraints, strategic indexes, migration workflows, and RLS policies that actually protect your data.

*Written by Uvin Vindula↗ — Full-stack and Web3 engineer building production applications from Sri Lanka and the UK. I design PostgreSQL schemas through Supabase for every project I ship, including EuroParts Lanka↗, uvin.lk↗, and FreshMart. Follow my work at @IAMUVIN↗.*

Working on a Web3 or AI project?

Let's talk↗

Uvin Vindula

Web3 and AI engineer based in Sri Lanka and the UK. Author of The Rise of Bitcoin. Director of Blockchain and Software Solutions at Terra Labz. Founder of uvin.lk — Sri Lanka's Bitcoin education platform with 10,000+ learners.

hello@iamuvin.com uvin.lk↗LinkedIn↗