The SLA Math: What 99.9% Actually Means
Every SaaS platform promises uptime SLAs. Most quote 99.9%. That number sounds impressive until you do the math: 99.9% uptime allows for 8 hours, 45 minutes, and 36 seconds of downtime per year. That is your entire budget for maintenance windows, deployments, infrastructure failures, and DDoS attacks combined.
If you promise 99.95%, your total annual downtime budget drops to 4 hours and 22 minutes. At 99.99%, you get 52 minutes for the entire year.
SLA Target    Annual Downtime    Monthly Downtime    Per-Incident Budget
──────────────────────────────────────────────────────────────────────────
99.9%         8h 45m 36s         43m 49s             ~15 minutes
99.95%        4h 22m 48s         21m 54s             ~7 minutes
99.99%        52m 33s            4m 23s              ~2 minutes
A single unmitigated DDoS attack can consume your entire annual downtime budget in one afternoon. If your detection takes 5 minutes and your mitigation takes another 10, a single incident costs you 15 minutes - one third of your monthly budget at 99.9%.
This is why detection speed is not a nice-to-have for SaaS platforms. It is a contractual obligation.
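The budget arithmetic is easy to reproduce. The sketch below assumes a 365-day year and a month of one twelfth of that; tables that use an average calendar year will differ by a second or two. The function names are illustrative, not part of any product API.

```python
from datetime import timedelta

YEAR = timedelta(days=365)  # assumption: non-leap year


def downtime_budget(sla_percent: float, period: timedelta = YEAR) -> timedelta:
    """Return the downtime an uptime target allows over the given period."""
    if not 0 < sla_percent <= 100:
        raise ValueError("sla_percent must be greater than 0 and at most 100")
    return period * (1 - sla_percent / 100)


def budget_share(incident: timedelta, budget: timedelta) -> float:
    """Return the fraction of a downtime budget that one incident consumes."""
    return incident / budget


if __name__ == "__main__":
    for target in (99.9, 99.95, 99.99):
        print(f"{target}%: {downtime_budget(target)} per year")

    # 5 minutes of detection plus 10 minutes of mitigation, against a 99.9% month.
    monthly = downtime_budget(99.9, YEAR / 12)
    incident = timedelta(minutes=5) + timedelta(minutes=10)
    print(f"Incident share of monthly budget: {budget_share(incident, monthly):.0%}")
```

Running the same calculation against your own SLA target and your real detection and mitigation times shows how many incidents a month you can actually afford.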
Multi-Tenant Vulnerability: One Customer's Enemy Is Everyone's Problem
SaaS platforms have a unique DDoS exposure that single-tenant applications do not face. Your infrastructure is shared. Your database clusters, API gateways, load balancers, and compute nodes serve hundreds or thousands of customers simultaneously. When one customer's adversary launches a DDoS attack, every customer on the same infrastructure feels the impact.
Consider the attack surface of a typical multi-tenant SaaS deployment:
- Shared API endpoints: A volumetric flood targeting api.yourapp.com affects every customer, not just the target.
- Shared load balancers: Connection-state exhaustion on your ALB or nginx tier degrades service for all tenants.
- Shared databases: If the attack triggers excessive error logging, retries, or connection churn, your database cluster becomes the bottleneck.
- Custom domains: If your platform supports custom domains (CNAME to your infrastructure), attackers can target the custom domain and still hit your shared infrastructure.
- Noisy neighbors: A customer in a regulated industry (finance, gaming, crypto) may attract nation-state-level adversaries whose attack capabilities far exceed what your infrastructure was designed to absorb.
The multi-tenant model means that your DDoS exposure is the union of all your customers' threat profiles, not just your own. One enterprise customer with a determined adversary can put your entire platform at risk.
DDoS attacks on SaaS platforms often trigger cascading failures. The initial flood saturates network capacity, which causes health checks to fail, which triggers auto-scaling (consuming more cloud budget), which causes connection pool exhaustion on the database tier. By the time the attack ends, your monthly cloud bill has doubled and your incident-response team is still debugging a database replication lag that started 45 minutes after the attack.
The Problem with Cloud-Native-Only Protection
If you run on AWS, you already have AWS Shield Standard. It is free, it is automatic, and it protects against common L3/L4 attacks. So why do you need anything else?
Because Shield Standard does not alert you. It absorbs what it can and drops the rest. You get no notification, no metrics, no visibility into what happened. You cannot tell your customers "we detected and mitigated a 50 Gbps UDP flood in 2 seconds" because you do not know it happened. Your monitoring might show elevated latency or error rates, but you have no DDoS-specific telemetry.
Here is what each cloud provider's built-in protection actually gives you:
Provider      Free Tier / Paid Tier                    What You Do NOT Get
──────────────────────────────────────────────────────────────────────────────────
AWS Shield    L3/L4 absorption (Standard)              Alerts, metrics, L7 protection
              Shield Advanced: $3,000/mo + DRT         Per-resource, single-cloud only
GCP Armor     L3/L4 basic (Standard Tier)              Custom rules, adaptive protection
              Cloud Armor Plus: $200/mo + per-rule     Single-cloud, no BGP integration
Azure         Basic DDoS (free, auto-enabled)          Alerting, telemetry, custom policies
              DDoS Protection: ~$2,944/mo              Single-cloud, per-VNet pricing
The enterprise tiers solve the problem - but at enterprise prices. AWS Shield Advanced costs $3,000 per month as a base fee before data transfer charges. Azure DDoS Protection runs about $2,944/month per subscription. These prices make sense for a single large enterprise protecting one cloud environment. They do not make sense for a SaaS platform running 40 nodes across three cloud providers, where the combined base fees alone would exceed $6,000/month just for DDoS protection.
More critically, cloud-native protection is single-cloud by design. If you run on AWS and GCP (or have bare-metal colocation for latency-sensitive workloads), you need separate protection for each environment with no unified view. You cannot correlate an attack hitting your AWS frontend with degraded performance on your GCP backend because the two systems do not talk to each other.
Detection Speed: 1 Second vs 5 Minutes
Most network monitoring tools rely on flow sampling (NetFlow/sFlow/IPFIX) for traffic analysis. A typical flow export interval is 1-5 minutes, with a sampling rate of 1:1000 to 1:4096. This means your monitoring system sees a statistical sample of your traffic, updated every few minutes.
For capacity planning and trend analysis, flow sampling is fine. For DDoS detection, it is dangerously slow.
A modern DDoS attack can ramp to full volume in under 10 seconds. If your detection relies on 5-minute flow exports, the attack has been running at full blast for nearly 5 minutes before you even know it is happening. At 50 Gbps (6.25 GB per second), that is roughly 1.8 terabytes of attack traffic that hits your infrastructure before your first alert fires.
Flowtriq uses per-second packet analysis instead of sampled flow data. Every node monitors traffic continuously and reports metrics every second. Detection triggers within 1-2 seconds of attack onset. The difference matters:
Detection Method        Time to Detect    Traffic Absorbed Before Alert (at 10 Gbps)
─────────────────────────────────────────────────────────────────────────────────
5-min flow sampling     ~5 minutes        375 GB
1-min flow sampling     ~1 minute         75 GB
Flowtriq (per-sec)      1-2 seconds       1.25-2.5 GB
For a SaaS platform operating under tight SLA budgets, the difference between 1-second and 5-minute detection is the difference between a non-event and a customer-facing outage.
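The figures in this section come from one multiplication: attack rate in bytes per second times the detection delay. A minimal sketch, assuming the attack runs at full rate for the whole delay (real attacks ramp up, so this is an upper bound); the function name is illustrative.

```python
def traffic_before_alert_gb(attack_gbps: float, detection_seconds: float) -> float:
    """Gigabytes of attack traffic delivered before the first alert fires.

    Assumes a constant attack rate over the entire detection delay.
    """
    if attack_gbps < 0 or detection_seconds < 0:
        raise ValueError("rate and delay must be non-negative")
    gigabytes_per_second = attack_gbps / 8  # 8 bits per byte
    return gigabytes_per_second * detection_seconds


if __name__ == "__main__":
    for label, delay in (("5-min flow export", 300), ("1-min flow export", 60), ("per-second", 1)):
        print(f"{label}: {traffic_before_alert_gb(10, delay)} GB at 10 Gbps")
```

Plugging in your own uplink size and your monitoring tool's real export interval gives a quick estimate of what reaches your infrastructure before anyone is paged.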
Dynamic Baselines: Traffic That Grows with Your Customers
SaaS traffic is inherently unpredictable. A product launch, a viral blog post, a seasonal spike, or onboarding a large enterprise customer can double your traffic overnight. Static thresholds ("alert if PPS exceeds 500K") break in this environment because they cannot distinguish between a DDoS attack and legitimate growth.
Flowtriq builds dynamic baselines per node that continuously learn your normal traffic patterns. The baseline accounts for:
- Time-of-day patterns: Your API traffic peaks during business hours in your customers' time zones. What looks like a spike at 3 AM is normal at 10 AM.
- Day-of-week seasonality: B2B SaaS platforms often see 2-3x more traffic on weekdays than weekends. A Monday morning surge is not an attack.
- Growth trends: As your customer base grows, so does your baseline. The system adjusts automatically so you do not need to manually update thresholds every month.
- Per-protocol baselines: DNS traffic, HTTPS traffic, and WebSocket traffic each have their own baseline. A spike in UDP traffic while TCP remains normal is a stronger signal than a uniform increase across all protocols.
Detection fires when traffic deviates from the learned baseline by a configurable multiple - not when it crosses a fixed number. This means fewer false positives during legitimate traffic growth and faster detection during actual attacks, because the system knows what "normal" looks like for each node at each hour of each day.
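To make the idea concrete, here is one way a baseline of this kind could be built. It is a simplified illustration, not Flowtriq's actual algorithm: it keeps an exponentially weighted average of packets per second for each weekday, hour, and protocol, and flags a sample that exceeds a multiple of that average. The class name, the smoothing factor, the multiple, and the floor are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class HourlyBaseline:
    """Per (weekday, hour, protocol) moving average of packets per second."""

    multiple: float = 4.0    # hypothetical: alert above 4x the learned level
    alpha: float = 0.1       # hypothetical smoothing factor for the average
    min_pps: float = 1000.0  # hypothetical floor so near-idle slots ignore noise
    levels: dict[tuple[int, int, str], float] = field(default_factory=dict)

    def observe(self, when: datetime, protocol: str, pps: float) -> bool:
        """Return True if pps is anomalous for this slot; otherwise learn from it."""
        key = (when.weekday(), when.hour, protocol)
        baseline = self.levels.get(key)
        if baseline is None:
            self.levels[key] = pps  # the first sample seeds the slot
            return False
        if pps > max(baseline, self.min_pps) * self.multiple:
            return True  # anomalous samples are not folded into the baseline
        # Normal samples nudge the baseline, so gradual growth is absorbed.
        self.levels[key] = (1 - self.alpha) * baseline + self.alpha * pps
        return False
```

With this structure, 150K PPS of UDP is unremarkable in a slot that usually carries 100K and anomalous in a slot that usually carries 5K, which is the behaviour a static threshold cannot express.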
4-Tier Auto-Escalation: From Firewall to Cloud Scrubbing
Not every attack requires the same response. A 50 Kpps UDP flood from a handful of sources can be handled by local firewall rules. A 50 Gbps volumetric attack requires network-edge filtering. Flowtriq's auto-escalation chain selects the right mitigation level automatically based on attack volume and characteristics:
- Tier 1 - Local firewall (iptables/nftables): Rules deployed directly on the target node. Handles small attacks within the server's NIC capacity. Zero external dependencies, sub-second deployment.
- Tier 2 - BGP FlowSpec: Surgical filtering rules injected via BGP to upstream routers. Stops attack traffic at the network edge while keeping the target IP fully reachable for legitimate users. Works when the attack has identifiable L3/L4 signatures.
- Tier 3 - BGP blackhole (RTBH): Full blackhole of the target IP at the network edge. Used when attack volume exceeds link capacity or traffic is too generic for FlowSpec filtering. The target IP goes offline, but the rest of your infrastructure is protected.
- Tier 4 - Cloud scrubbing redirect: Traffic diversion to a cloud-based scrubbing service for deep inspection. Used for the largest attacks or application-layer floods that require payload inspection.
The escalation is automatic and bidirectional. As attack volume increases, Flowtriq escalates to more aggressive mitigation. As the attack subsides, it de-escalates: removing the blackhole, then the FlowSpec rules, then the local firewall rules. Every action is logged in the audit trail with exact timestamps.
Why auto-de-escalation matters: A forgotten RTBH announcement will keep your service offline long after the attack ends. A stale FlowSpec rule can silently block legitimate traffic for days. Flowtriq monitors traffic continuously and removes mitigation rules when they are no longer needed - and notifies your team when it does.
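The escalate-fast, step-down-slowly behaviour can be sketched as a small state machine. This is an illustration under stated assumptions, not Flowtriq's implementation: it escalates on volume alone (the real selection also weighs attack characteristics), the packets-per-second thresholds are invented for the example, and it assumes the rate is measured upstream of the mitigation so that an active blackhole does not make the attack look finished.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    NONE = 0
    LOCAL_FIREWALL = 1
    FLOWSPEC = 2
    RTBH = 3
    SCRUBBING = 4


def default_thresholds() -> dict[Tier, int]:
    # Hypothetical packets-per-second thresholds, for illustration only.
    return {
        Tier.LOCAL_FIREWALL: 50_000,
        Tier.FLOWSPEC: 1_000_000,
        Tier.RTBH: 10_000_000,
        Tier.SCRUBBING: 50_000_000,
    }


@dataclass
class EscalationLadder:
    thresholds_pps: dict[Tier, int] = field(default_factory=default_thresholds)
    hold_seconds: int = 60  # calm time required before stepping down one tier
    tier: Tier = Tier.NONE
    calm_seconds: int = 0

    def target_for(self, pps: int) -> Tier:
        """Highest tier whose threshold the current rate meets."""
        met = [t for t, limit in self.thresholds_pps.items() if pps >= limit]
        return max(met, default=Tier.NONE)

    def tick(self, pps: int) -> Tier:
        """Feed one per-second sample and return the tier now in force."""
        target = self.target_for(pps)
        if target > self.tier:
            self.tier, self.calm_seconds = target, 0  # escalate immediately
        elif target < self.tier:
            self.calm_seconds += 1
            if self.calm_seconds >= self.hold_seconds:
                self.tier = Tier(self.tier - 1)  # step down one tier at a time
                self.calm_seconds = 0
        else:
            self.calm_seconds = 0
        return self.tier
```

The hold period is the design choice that matters: escalation is immediate, but each step down requires sustained calm, which avoids flapping while still guaranteeing that no rule outlives the attack indefinitely.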
Multi-Cloud and Hybrid: Single Pane Across Everything
Most SaaS platforms outgrow a single cloud provider. You might run compute on AWS, use GCP for machine learning workloads, keep your database on Azure, and maintain bare-metal servers at Equinix for latency-sensitive APIs. Your DDoS protection needs to cover all of it.
Flowtriq's node agent is a lightweight process that runs on any Linux server regardless of where it is hosted. Deploy it on an EC2 instance, a GKE pod, an Azure VM, or a bare-metal server in a colocation facility. Every node reports to the same dashboard, giving you:
- Unified attack visibility: See all attacks across all environments in a single timeline. Correlate an attack hitting your AWS frontend with performance degradation on your GCP backend.
- Consistent detection: The same dynamic baselines, the same detection thresholds, and the same classification engine across every environment. No gaps between cloud providers.
- Cross-environment escalation: If an attack targets your AWS infrastructure, Flowtriq can proactively tighten detection sensitivity on your GCP and Azure nodes because coordinated multi-cloud attacks are increasingly common.
- Infrastructure-agnostic mitigation: FlowSpec and RTBH work the same way regardless of whether the node is in a cloud VPC or a colocation rack. Local firewall rules use iptables/nftables on all Linux servers.
This matters because cloud-native DDoS tools are, by definition, blind to your other environments. AWS Shield knows nothing about your GCP nodes. Azure DDoS Protection cannot see your bare-metal servers. Flowtriq sees everything.
Alert Routing: The Right People at the Right Time
When a DDoS attack hits your payment processing service at 2 AM, you do not want to wake up your frontend team. You want the on-call infrastructure engineer who owns that service to get a PagerDuty alert, while the broader team gets a Slack notification they can review in the morning.
Flowtriq's notification channels support per-service routing:
- Slack: Route alerts to specific channels based on which node or service is under attack. #alerts-payments for your payment nodes, #alerts-api for your API tier.
- PagerDuty: Trigger incidents with severity levels mapped to attack volume. A 100 Kpps flood triggers a P3; a 10 Mpps attack triggers a P1 and pages the on-call engineer.
- Discord: For teams that use Discord for operations communication.
- OpsGenie: Full integration with alert routing, schedules, and escalation policies.
- Email: Detailed incident summaries sent to stakeholders who need to know but do not need to act.
- SMS: Critical alerts for scenarios where internet-dependent channels might be unreachable during a large attack.
- Webhooks: Custom HTTP endpoints for integration with your internal tooling, runbooks, or ChatOps bots.
Channels are configured per notification policy, so you can create rules like "if the attack exceeds 1 Mpps and targets a production node tagged payments, page PagerDuty and post to #incidents; otherwise, post to #alerts-general."
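The example rule above maps naturally onto a small routing function. The sketch below is not Flowtriq's policy format, which is configured in the dashboard; the `AttackEvent` type and the channel strings are hypothetical stand-ins used only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackEvent:
    pps: int                            # peak packets per second
    environment: str                    # e.g. "production" or "staging"
    tags: frozenset[str] = frozenset()  # labels attached to the target node


def route(event: AttackEvent) -> list[str]:
    """Return the channels to notify for an attack event."""
    is_critical = (
        event.pps > 1_000_000
        and event.environment == "production"
        and "payments" in event.tags
    )
    if is_critical:
        return ["pagerduty", "slack:#incidents"]
    return ["slack:#alerts-general"]


if __name__ == "__main__":
    payments_attack = AttackEvent(2_300_000, "production", frozenset({"payments"}))
    print(route(payments_attack))
```

Keeping the rule as a pure function of the event makes it easy to test routing changes before an attack is the first thing to exercise them.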
Audit Logs, PCAP, and Post-Incident Reviews
Detecting and mitigating an attack is only half the job. After the attack ends, you need to answer questions from your customers, your leadership, and possibly your compliance team. What happened? When did you detect it? How long did it last? What did you do about it? Could it happen again?
Flowtriq captures the evidence you need for post-incident reviews:
- Full audit log: Every detection, escalation, mitigation action, and de-escalation is logged with exact timestamps, the triggering metrics, and the action taken. This is your forensic timeline.
- PCAP captures: On-demand or automatic packet captures during incidents. Download the raw PCAP file for deep analysis in Wireshark or tcpdump. Identify the exact attack vectors, source IP distributions, and payload patterns.
- Incident timeline: A structured view of each incident showing detection time, classification, escalation steps, peak attack volume, duration, and resolution. Exportable for stakeholder communication.
- Per-second metrics: PPS and BPS data at 1-second granularity during the attack window. See exactly when the attack started ramping, when mitigation kicked in, and when traffic returned to baseline.
This data is not just for your internal teams. When a customer asks "was there an outage between 2:15 and 2:22 AM?", you can respond with precise information: "We detected a 2.3 Mpps UDP flood at 02:15:03, deployed FlowSpec filtering at 02:15:05, and attack traffic was fully mitigated by 02:15:08. Your service experienced approximately 2 seconds of elevated latency. Here is the incident report."
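Turning timestamped audit entries into the durations customers ask about is a small calculation. The sketch below uses the timestamps from the example response; the action names and the list-of-tuples layout are assumptions for illustration, not an actual export format, and it does not handle incidents that cross midnight.

```python
from datetime import datetime

# Hypothetical audit-log entries, using the timestamps from the example above.
AUDIT_LOG = [
    ("02:15:03", "detected"),
    ("02:15:05", "flowspec_deployed"),
    ("02:15:08", "mitigated"),
]


def seconds_since_detection(log: list[tuple[str, str]]) -> dict[str, float]:
    """Map each audit action to its offset in seconds from detection."""
    times = {action: datetime.strptime(ts, "%H:%M:%S") for ts, action in log}
    start = times["detected"]
    return {action: (t - start).total_seconds() for action, t in times.items()}


if __name__ == "__main__":
    for action, offset in seconds_since_detection(AUDIT_LOG).items():
        print(f"{action}: +{offset:.0f}s")
```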
Auto-Generated Incident Reports
Your customers expect transparency. When your status page shows a degradation event, they want to know what happened. Writing incident reports manually is time-consuming and error-prone, especially when you are still dealing with the aftermath of an attack.
Flowtriq generates structured incident reports automatically after each event. Each report includes:
- Attack classification (UDP flood, SYN flood, DNS amplification, etc.) and vector breakdown
- Timeline with detection, mitigation, and resolution timestamps
- Peak attack volume (PPS and BPS) and duration
- Mitigation actions taken at each escalation tier
- Affected nodes and services
- Customer impact assessment based on service degradation metrics
Reports are shareable via unique links. You can send the link directly to affected customers or post it to your status page. The report shows exactly what happened and what your platform did about it - turning a negative customer experience into a demonstration of your operational maturity.
SaaS customers do not expect zero incidents. They expect transparency and competent response. An auto-generated incident report that shows 2-second detection and sub-5-second mitigation builds more trust than a perfect uptime record with no visibility into how it is maintained.
Real Scenario: 40 Nodes Across 3 Cloud Providers
Consider a B2B SaaS platform running a typical multi-cloud deployment:
Environment         Nodes    Role
────────────────────────────────────────────────
AWS (us-east-1)     18       API servers, worker nodes
GCP (us-central)    12       ML pipeline, search cluster
Azure (westus2)     6        Database replicas, analytics
Equinix (NY5)       4        Low-latency trading API
────────────────────────────────────────────────
Total               40 nodes
This platform serves 800+ enterprise customers, processes 2 billion API requests per day, and contractually guarantees 99.95% uptime. Their annual downtime budget is 4 hours and 22 minutes.
The protection options:
Option A: Cloud-native protection on each provider
- AWS Shield Advanced: $3,000/mo base + data transfer
- GCP Cloud Armor Plus: $200/mo + per-policy + per-request charges
- Azure DDoS Protection: $2,944/mo
- Equinix bare metal: no built-in DDoS protection
- Total: $6,000-8,000+/month, no unified dashboard, no coverage for bare metal, no cross-cloud correlation
Option B: Flowtriq across all environments
- 40 nodes at $9.99/node/month: $399.60/month
- Annual billing (40 nodes at $7.99/node/month): $319.60/month
- Every feature included: detection, classification, auto-escalation, PCAP, audit logs, incident reports, unlimited notification channels, unlimited team members
- Unified dashboard across all 4 environments
- Bare metal covered with the same agent as cloud nodes
                           Cloud-Native (combined)        Flowtriq (40 nodes)
──────────────────────────────────────────────────────────────────────────────────────
Monthly cost               $6,000-8,000+                  $399.60 ($319.60 billed annually)
Bare metal coverage        None                           Full
Unified dashboard          No (3 separate consoles)       Yes
Cross-cloud correlation    No                             Yes
Detection speed            Varies (provider-dependent)    1-2 seconds (all nodes)
Auto-escalation            No (manual per provider)       4-tier automatic
PCAP capture               Limited / manual               On-demand + auto
Vendor lock-in             Yes (per cloud provider)       None
Option B costs roughly 93-95% less at monthly billing, covers infrastructure that cloud-native tools cannot reach, and provides a single operational view across the entire deployment.
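The comparison is simple enough to check yourself. The sketch below uses the per-node prices quoted in Option B and the $6,000-8,000 range quoted for Option A; the function names are illustrative, and `Decimal` is used so the currency arithmetic stays exact.

```python
from decimal import Decimal

MONTHLY_PRICE = Decimal("9.99")  # per node, monthly billing (from Option B)
ANNUAL_PRICE = Decimal("7.99")   # per node per month, annual billing (from Option B)


def flowtriq_monthly_cost(nodes: int, annual_billing: bool = False) -> Decimal:
    """Monthly cost for a node count at the quoted per-node price."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    price = ANNUAL_PRICE if annual_billing else MONTHLY_PRICE
    return price * nodes


def savings_percent(cost: Decimal, alternative: Decimal) -> Decimal:
    """Percentage saved by paying cost instead of alternative, to one decimal."""
    return ((1 - cost / alternative) * 100).quantize(Decimal("0.1"))


if __name__ == "__main__":
    cost = flowtriq_monthly_cost(40)
    print(f"40 nodes, monthly billing: ${cost}")
    print(f"40 nodes, annual billing:  ${flowtriq_monthly_cost(40, annual_billing=True)}")
    for alternative in (Decimal("6000"), Decimal("8000")):
        print(f"Saving vs ${alternative}: {savings_percent(cost, alternative)}%")
```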
No Vendor Lock-In
Cloud-native DDoS protection ties your mitigation strategy to a single provider. If you decide to migrate workloads from AWS to GCP, your Shield Advanced configuration does not come with you. Your detection rules, escalation policies, historical data, and operational runbooks are all provider-specific.
Flowtriq runs on your infrastructure, not your cloud provider's. The node agent is a standard Linux process that works identically on EC2, GCE, Azure VMs, DigitalOcean droplets, Hetzner dedicated servers, or a Raspberry Pi in your office closet. Your detection baselines, escalation policies, notification channels, and incident history stay consistent regardless of where your nodes run.
This means you can:
- Migrate between cloud providers without rebuilding your DDoS protection.
- Add bare-metal or edge nodes without purchasing a separate product.
- Scale across providers based on cost and performance, not based on which providers have DDoS protection you already configured.
- Maintain a single incident history and audit trail even as your infrastructure evolves.
Transparent Per-Node Pricing
Flowtriq's pricing model is intentionally simple:
- $9.99 per node per month (monthly billing)
- $7.99 per node per month (annual billing - save 20%)
- 7-day free trial with all features, no credit card required
- Every feature included at every tier: detection, classification, auto-escalation, PCAP, audit logs, reports, all notification integrations, unlimited team members
There are no base fees, no per-request charges, no data transfer surcharges, and no feature gates. A 5-node deployment costs $49.95/month. A 100-node deployment costs $999/month. The math is simple because your cloud bill is already complicated enough.
Unlimited seats included: Flowtriq does not charge per user. Your entire engineering, DevOps, and security team can access the dashboard with role-based permissions (owner, admin, analyst, read-only) at no additional cost. DDoS visibility should not be limited to the one person who owns the budget.
Getting Started
Protecting a SaaS platform from DDoS attacks does not require an enterprise contract, a dedicated TAM, or a six-figure annual commitment. It requires per-second detection, dynamic baselines that understand your traffic patterns, automatic escalation that responds faster than any human can, and a dashboard that shows you everything across every environment.
Flowtriq deploys in minutes. Install the node agent, point it at your dashboard, and your first baseline starts learning immediately. Detection is active from the first second. Auto-escalation policies can be configured from the dashboard with sensible defaults that work out of the box.
Start with a free 7-day trial - all features, no credit card, no sales call. Deploy on one node or forty. See what per-second DDoS detection looks like when it is actually built for SaaS operations.