{
  "assumption_density": 0,
  "assumptions": [
    "The $15K/month budget is a hard ceiling that cannot be negotiated upward",
    "5TB/day ingestion rate is sustained and will not decrease significantly",
    "The 3-person data engineering team has or can acquire sufficient ClickHouse operational skills within 60 days",
    "Dashboard query patterns are predictable enough to align with MergeTree ORDER BY keys for sub-second performance",
    "AWS i3.2xlarge instances remain available and priced at approximately $1,400/month each"
  ],
  "confidence": 0.82,
  "id": "a4ae1128-a590-4759-8825-c1f07635e6b6",
  "next_action": "Write a Terraform module provisioning 3x i3.2xlarge instances in a single AZ with ClickHouse Keeper, ReplicatedMergeTree table schemas matching the top 5 dashboard query patterns, and an S3 lifecycle policy for 90-day data tiering. Target a working proof-of-concept cluster within 5 business days.",
  "question": "Snowflake vs self-hosted ClickHouse for a 5TB/day analytics pipeline. Team of 3 data engineers, $15K/month budget, need sub-second dashboard queries.",
  "question_fit_score": 0,
  "rejected_alternatives": [
    {
      "path": "Snowflake with concurrent data modeling overhaul (b001)",
      "rationale": "Snowflake's volume-proportional pricing is a structural budget wall at 5TB/day. Storage alone reaches $15K/month within 4-5 months regardless of query optimization. The $15K budget is a hard constraint that Snowflake cannot satisfy long-term at this ingestion rate. Data model redesign addresses query speed but not the cost escalation problem."
    },
    {
      "path": "Conditional recommendation — Snowflake if simplicity, ClickHouse if cost (b002)",
      "rationale": "Killed in round 1. Provided no architecture, no cost analysis, no specific configuration. Restated the question as a conditional without resolving it. Structurally empty."
    },
    {
      "path": "Reduce data volume first before choosing platform (b005)",
      "rationale": "Killed in round 3. The 70% reduction target is aspirational with no evidence. A 4-week initiative consuming all 3 engineers produces zero pipeline progress with no guarantee of success. Even at 1.5TB/day, Snowflake storage costs still compound past $15K within ~14 months. Delays the hard decision without removing the structural constraint."
    },
    {
      "path": "Challenge to ClickHouse operational feasibility with 3 engineers (b004)",
      "rationale": "Auto-pruned as low-confidence. Raised a valid concern about operational burden that b003 already addresses explicitly with its 20-30% ops time estimate and ClickHouse Cloud escape hatch mitigation."
    }
  ],
  "reversal_conditions": [
    {
      "condition": "Budget increases to $30K+/month or Snowflake offers a negotiated enterprise rate below $10/TB compressed storage",
      "flips_to": "Snowflake becomes viable — its managed service model eliminates the operational burnout risk that is ClickHouse's primary failure mode for a 3-person team"
    },
    {
      "condition": "Team drops to 2 or fewer engineers, or ops burden exceeds 40% of team capacity for 2+ consecutive sprints",
      "flips_to": "Migrate to ClickHouse Cloud managed service as the pre-planned escape hatch, trading higher cost for reduced operational burden"
    },
    {
      "condition": "Ingestion rate drops below 1TB/day due to successful data volume reduction or business scope change",
      "flips_to": "Re-evaluate Snowflake — at lower volumes, storage costs stay within budget and the managed service benefit outweighs ClickHouse operational overhead"
    }
  ],
  "unresolved_uncertainty": [
    "Whether the 3-person team has sufficient ClickHouse operational expertise — if all three are Snowflake/Redshift-experienced with zero ClickHouse exposure, the 60-day ramp-up estimate may be optimistic",
    "Schema evolution failure mode was cut off in b003's output — the full severity and mitigation for ALTER TABLE blocking on 10B+ row ReplicatedMergeTree tables was not fully articulated",
    "The $23/TB Snowflake compressed storage pricing is the current list rate; negotiated enterprise pricing could extend Snowflake's budget viability, though likely not enough to overcome the structural gap at 5TB/day",
    "Query pattern diversity is unknown. The sub-100ms estimate assumes dashboard queries align with MergeTree ORDER BY keys; ad-hoc exploratory queries on non-indexed dimensions may exceed the sub-second SLA",
    "b004's pruned argument that a Keeper failure during peak ingest could cause hours of recovery was not fully rebutted; this remains a real operational risk for a 3-person team"
  ],
  "url": "https://vectorcourt.com/v/a4ae1128-a590-4759-8825-c1f07635e6b6",
  "verdict": "Deploy self-hosted ClickHouse on 3x AWS i3.2xlarge instances ($4,200/month compute) with S3 tiered storage for data older than 90 days (~$3,000/month at steady state), totaling $7,200-$9,000/month. Use the ReplicatedMergeTree engine with ClickHouse Keeper (not ZooKeeper). This stays well within the $15K budget, whereas Snowflake structurally cannot: at 5TB/day, Snowflake storage alone reaches $15K/month within 4-5 months, with compute credits pushing the total to $20K-$35K/month by month 6. ClickHouse on NVMe delivers sub-100ms latency on typical dashboard aggregations over billion-row tables with proper MergeTree ORDER BY keys, beating the sub-second requirement by 10x. Critical failure mode: operational burnout. A 3-person team will spend 20-30% of its time on ops, and if one engineer leaves, the remaining two face an unsustainable burden. Mitigation: document runbooks aggressively in the first 60 days and keep ClickHouse Cloud as an escape hatch. Provision with Ansible/Terraform and enforce a 90-day tiered storage policy to keep NVMe under 80% capacity.",
  "verdict_core": {
    "recommendation": "Deploy self-hosted ClickHouse on a 3-node cluster using i3.2xlarge instances with ReplicatedMergeTree engine and ClickHouse Keeper, with S3 tiered storage for data older than 90 days.",
    "mechanism": "Snowflake's volume-proportional pricing hits the $15K/month budget ceiling within 4-5 months at 5TB/day ingestion: stored data grows by 150TB/month, and at $23/TB compressed storage each month adds about $3,450 to the recurring storage bill before any compute. ClickHouse on NVMe-backed instances delivers sub-100ms dashboard aggregations at a fixed ~$7,200-$9,000/month total cost, leaving budget headroom for monitoring and backup infrastructure.",
    "tradeoffs": [
      "20-30% of a 3-person team's time consumed by cluster operations (upgrades, disk monitoring, replication lag), creating fragility if any engineer leaves",
      "Schema evolution on ReplicatedMergeTree with 10B+ rows can take hours and block inserts, requiring careful migration planning",
      "No managed service safety net — cluster failures require in-house expertise to resolve, with no vendor SLA"
    ],
    "failure_modes": [
      "Operational burnout: 3-person team loses one engineer, remaining two face unsustainable on-call burden with no managed fallback",
      "Schema evolution pain: ALTER TABLE on ReplicatedMergeTree with 10B+ rows blocks inserts for hours",
      "NVMe capacity breach: without enforced 90-day tiered storage policy, local NVMe fills past 80% and query performance degrades"
    ],
    "thresholds": [
      "5TB/day ingestion rate",
      "$7,200-$9,000/month total infrastructure cost",
      "$15K/month hard budget ceiling",
      "Sub-100ms typical dashboard aggregation latency on NVMe",
      "3x AWS i3.2xlarge instances at ~$1,400/month each ($4,200/month compute)",
      "~$3,000/month S3 tiered storage at steady state",
      "90-day hot/warm storage boundary",
      "80% NVMe capacity threshold",
      "20-30% engineer time on operations"
    ]
  },
  "verdict_type": ""
}