---
title: A/B Test Setup
category: product
entity_type: skill
price: $15
canonical: https://forgehouse.ai/skills/ab-test-setup/
lang: en
hreflang_alt: https://forgehouse.ai/tr/skiller/ab-test-setup/
last_updated: 2026-06-20
---

# A/B Test Setup

> Plan, design, or implement an A/B test or experiment.

A disciplined framework for planning, designing, and analyzing A/B tests that produce statistically valid, actionable results. It enforces hypothesis-first design, pre-committed sample sizes, and the validity guardrails most teams skip, so your 'winning' variant is real, not noise.

## Use cases
- Writing a strong, falsifiable hypothesis before touching the page
- Calculating required sample size and test duration up front
- Choosing between client-side, server-side, and feature-flag implementation
- Catching a broken experiment with a sample ratio mismatch check
- Segmenting results by device and source to catch Simpson's paradox
- Documenting every test into a searchable learning repository
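
To make the segmentation use case concrete, here is a minimal sketch of Simpson's paradox. The traffic numbers are invented for illustration, and the helper names (`pooled_rates`, `segment_rates`) are hypothetical, not part of the skill. Variant B converts better in every segment and still looks worse in the pooled total, because most of its traffic came from the low-converting segment.

```python
# Invented example data: (users, conversions) per variant within each segment.
segments = {
    "desktop": {"A": (800, 80), "B": (200, 24)},
    "mobile": {"A": (200, 4), "B": (800, 24)},
}


def rate(users, conversions):
    """Conversion rate as a fraction."""
    return conversions / users


def segment_rates(data):
    """Conversion rate per variant inside each segment."""
    return {
        name: {variant: rate(*counts) for variant, counts in by_variant.items()}
        for name, by_variant in data.items()
    }


def pooled_rates(data):
    """Conversion rate per variant after summing all segments together."""
    totals = {}
    for by_variant in data.values():
        for variant, (users, conversions) in by_variant.items():
            u, c = totals.get(variant, (0, 0))
            totals[variant] = (u + users, c + conversions)
    return {variant: rate(u, c) for variant, (u, c) in totals.items()}


if __name__ == "__main__":
    print("per segment:", segment_rates(segments))
    print("pooled:", pooled_rates(segments))
```

A split this lopsided between segments would also be a reason to question the randomization itself before trusting either reading.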

## Benefits
- Stop shipping false positives from peeking and early stopping
- Spend limited test slots on the highest-impact changes via ICE scoring
- Trust your results because validity is verified before any winner is declared
- Compound learnings by spreading winning patterns across the whole site

## What’s included
- Hypothesis structure template (observation, change, effect, audience, metric)
- Sample-size reference tables and duration formula
- Primary, secondary, and guardrail metric selection guide
- Sample ratio mismatch (SRM) chi-square check to run before any analysis
- Pre-launch checklist and peeking-problem safeguards
- Test documentation and learning-repository templates
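
As a rough illustration of the sample ratio mismatch check listed above, the sketch below runs a chi-square goodness-of-fit test on two variants using only the Python standard library. It assumes exactly two variants (one degree of freedom), and the function names `srm_p_value` and `has_srm` are hypothetical. The 0.01 cutoff is the one this page uses in its validity step.

```python
import math


def srm_p_value(control_count, variant_count, expected_control_share=0.5):
    """p-value of a chi-square goodness-of-fit test for two variants (df = 1)."""
    total = control_count + variant_count
    expected = (
        total * expected_control_share,
        total * (1 - expected_control_share),
    )
    observed = (control_count, variant_count)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control_count, variant_count, expected_control_share=0.5, threshold=0.01):
    """True when the observed split deviates from the designed split."""
    p = srm_p_value(control_count, variant_count, expected_control_share)
    return p < threshold


if __name__ == "__main__":
    # A designed 50/50 split that arrived as 5000 vs 5300 assignments.
    print(srm_p_value(5000, 5300), has_srm(5000, 5300))
```

With more than two variants the same statistic applies, but the p-value needs a general chi-square distribution, which the standard library does not provide.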

## Who it’s for
Growth, CRO, and product teams that want experiments grounded in statistical rigor instead of gut feeling and premature wins.

## How it runs
Most experiments die from peeking and wishful math. This skill starts with a written hypothesis and a locked sample size, then earns its verdict step by step:
1. Writes the hypothesis in a fixed frame before anything else: because of [observation], we believe [change] will cause [outcome] for [audience], measured by [metric]. No written hypothesis, no test.
2. Calculates the required sample size up front from the baseline conversion rate, the minimum detectable effect, a 95% confidence level (5% significance level) and 80% power, then derives the test duration from daily traffic. The test does not stop before that number is reached.
3. Defines three metric layers: one primary metric that calls the test, secondary metrics that explain why it moved, and guardrail metrics (revenue, bounce, downstream conversion) that kill the test if they degrade.
4. Picks the implementation path (client-side, server-side or feature flag) and walks the pre-launch checklist: variants QA'd, tracking verified, users see the same variant on return visits.
5. Runs a sample ratio mismatch (SRM) check before reading any result: pulls the actual assignment counts per variant and runs a chi-square test against the designed split. A p-value below 0.01 means randomization or tracking is broken, and the result is not interpreted at all.
6. Reads the outcome on three axes: sample size reached, statistically significant, and practically meaningful. It then segments by device, traffic source and new/returning to catch Simpson's paradox before any winner is declared, and logs the test in the learning repository so failed ideas are never re-run.
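
The sizing step above can be sketched in a few lines. This is a minimal example under stated assumptions, not the skill's own tables: it uses the common normal-approximation formula for a two-sided test of two proportions, an equal traffic split, and a minimum detectable effect expressed as a relative lift. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect a relative lift on a baseline rate."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_variant, variants, daily_visitors):
    """Days needed to fill every variant, assuming all daily traffic enters the test."""
    return math.ceil(n_per_variant * variants / daily_visitors)


if __name__ == "__main__":
    n = sample_size_per_variant(baseline=0.05, relative_mde=0.20)
    print(n, duration_days(n, variants=2, daily_visitors=1000))
```

With a 5% baseline and a 20% relative lift, this sketch lands a little over 8,000 visitors per variant. Halving the detectable lift roughly quadruples that number, which is why the minimum detectable effect is fixed before launch.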

## FAQ
### Does this lock me into a specific testing platform like Optimizely or VWO?
No. It covers client-side, server-side and feature-flag implementations, so you pick the method that fits your stack. The framework sits above the tool, not inside one.

### I can already tell when a variant is winning, so why pre-commit a sample size?
Calling a winner early is the easiest way to mistake noise for a result, which is what the pre-committed sample size prevents. Without it, the same data can look like a win one day and a loss the next.
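
A small simulation shows the effect. The sketch below is illustrative only and all of its parameters (run count, arm size, number of looks, conversion rate, seed) are assumptions. It runs A/A tests, where both arms share the same true conversion rate, and compares how often a pooled two-proportion z-test reports significance at any of ten interim looks versus at the final look alone.

```python
import math
import random


def z_significant(conv_a, conv_b, n, z_crit=1.96):
    """Two-sided pooled two-proportion z-test with n users in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False
    return abs(conv_a - conv_b) / n / se > z_crit


def simulate(runs=400, n_per_arm=1000, looks=10, rate=0.10, seed=7):
    """Return (false-positive rate with peeking, false-positive rate at the end)."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        ever_significant = False
        final_significant = False
        for look in range(1, looks + 1):
            # Both arms draw from the same rate, so any "win" is a false positive.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            final_significant = z_significant(conv_a, conv_b, look * step)
            ever_significant = ever_significant or final_significant
        peeking_hits += ever_significant
        fixed_hits += final_significant
    return peeking_hits / runs, fixed_hits / runs


if __name__ == "__main__":
    peeking, fixed = simulate()
    print(f"peeking: {peeking:.1%}  fixed sample: {fixed:.1%}")
```

Stopping at the first significant look counts every run that was significant at any look, so the peeking rate can never fall below the fixed-sample rate, and in practice it sits well above the nominal 5%.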

### Will it build the variant and run the experiment for me?
No. It plans the hypothesis, sizes the test and analyzes the outcome. Building the variant and serving traffic stay on your side.

## Price
$15, one-time, no subscription. VAT included.

Related guide: [AI Google Ads and Meta Ads management](https://forgehouse.ai/guides/ai-google-ads-management/)
