The Six Most Common Culprits

When traffic spikes after 10pm and something slows down, the instinct is to check for an attack. Sometimes that instinct is right. More often it is not, and treating a backup job or a BGP event like a DDoS wastes time and introduces unnecessary risk — the wrong mitigation applied to the wrong problem can take down a service that was perfectly functional despite the underlying issue.

Here are the six causes I have seen most frequently in post-incident reviews, roughly in order of how often they actually occur.

1. Scheduled backup jobs saturating the uplink

Backup jobs are almost universally configured to run at night. rsync, Restic, Borg, and Veeam agents all default to off-peak scheduling. A single node running a nightly backup to a remote target can easily consume 200–800 Mbps of uplink for 30–90 minutes. On a 1G uplink shared with production traffic, that causes noticeable performance degradation. On a 10G uplink shared across a rack, it is usually invisible until three nodes start their backups simultaneously.
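To see why a single node matters, work the numbers. A sketch with hypothetical values (backup size and achieved throughput are assumptions; substitute your own):

```shell
# Hypothetical nightly backup: 300 GB pushed at 500 Mbps of uplink
backup_gb=300
link_mbps=500
secs=$(( backup_gb * 8 * 1000 / link_mbps ))   # GB -> gigabits -> megabits / Mbps
echo "transfer time: $(( secs / 60 )) minutes"  # 80 minutes of saturated uplink
```

At these rates one job alone occupies the link for over an hour, squarely inside the 30–90 minute window described above.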

The telltale signature: sustained high BPS with low PPS. Backup tools write data in large sequential chunks, typically 32–64 KB per application write, which the kernel segments into full-MTU 1,500-byte Ethernet frames on the wire. This produces a BPS-to-PPS ratio of roughly 1 Mbps per 83 PPS. An actual DDoS flood typically inverts this: high PPS with moderate or low BPS, especially for UDP floods.
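The 83 PPS figure falls straight out of the arithmetic:

```shell
# 1 Mbps = 125,000 bytes/s; at full-MTU 1,500-byte frames:
pps_per_mbps=$(( 1000000 / 8 / 1500 ))
echo "$pps_per_mbps PPS per Mbps"   # 83
```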

2. BGP route flaps during low-traffic windows

ISPs and transit providers schedule maintenance during off-peak hours. BGP sessions reset, routes are withdrawn and re-advertised, and your traffic temporarily takes a longer path or gets briefly blackholed. The effect at the application layer looks like packet loss and increased latency, not necessarily bandwidth saturation. A BGP flap typically lasts 30–180 seconds and may repeat if the upstream is cycling through maintenance tasks.

Check your router or looking glass for BGP UPDATE messages around the time of the incident. If you see prefix withdrawals from your transit provider's ASN, that is your answer.
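If your edge runs FRR or a similar routing daemon, session resets also show up in its logs as adjacency changes. A sketch against sample log lines (the format below is illustrative; real formats vary by platform and daemon, so adjust the pattern):

```shell
# Illustrative bgpd log excerpt; real log format varies by router/daemon
cat > /tmp/bgp_sample.log <<'EOF'
Jan 12 01:14:02 rtr1 bgpd[812]: %ADJCHANGE: neighbor 203.0.113.1 Down Notification received
Jan 12 01:15:31 rtr1 bgpd[812]: %ADJCHANGE: neighbor 203.0.113.1 Up
EOF
# Down/Up pairs inside the incident window point at a flap, not an attack
flaps=$(grep -c 'ADJCHANGE' /tmp/bgp_sample.log)
echo "adjacency changes: $flaps"
```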

3. ISP maintenance windows

Most ISP maintenance windows are scheduled between 11pm and 4am local time. During this window providers reduce link speeds, perform optical work, or upgrade line cards. The effect ranges from brief packet loss to sustained speed reduction while traffic is rerouted through backup paths. Your colocation contract or SLA usually includes a maintenance window schedule; check whether the incident time aligns.

4. Opportunistic DDoS attacks when NOC staff is minimal

This one is real and increasingly common. Booter services are fully automated, and experienced attackers know that the window between midnight and 6am is when response times are slowest. Attack traffic launched at 2am has a much better chance of persisting for a useful amount of time before being mitigated. The attacks are often shorter and lower-volume than daytime attacks — designed to cause disruption rather than overwhelm defenses — because the goal is to maximize impact per dollar spent on the booter service.

5. CDN cache purges and re-warming

Large CDN networks purge and re-warm caches on schedules that often land at night. When a CDN edge PoP's cache is cold, all cache misses fall back to your origin servers. If your CDN has a scheduled purge at 11pm, your origin may suddenly see 10–30x its normal request rate for 20–40 minutes while caches re-warm. This looks like an HTTP flood to a naive detection system.
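One quick way to confirm this is to bucket origin access-log requests by minute and look for a step change at the purge time. A sketch using inline sample data (assumes the standard combined log format; the log path on your origin will differ):

```shell
# Sample combined-format log lines; point the awk at your real origin log
cat > /tmp/origin_sample.log <<'EOF'
203.0.113.5 - - [12/Jan/2025:23:01:13 +0000] "GET /a HTTP/1.1" 200 512
203.0.113.6 - - [12/Jan/2025:23:01:47 +0000] "GET /b HTTP/1.1" 200 512
203.0.113.7 - - [12/Jan/2025:23:02:02 +0000] "GET /a HTTP/1.1" 200 512
EOF
# Requests per minute; a sudden 10-30x jump right after a purge is re-warming
awk -F'[][]' '{print substr($2, 1, 17)}' /tmp/origin_sample.log | sort | uniq -c
```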

6. Batch processing and database maintenance jobs

Database VACUUM operations, index rebuilds, log shipping, and ETL batch jobs are commonly scheduled overnight. These can saturate I/O and CPU, which causes network-layer symptoms (timeout retransmissions, TCP backlog exhaustion) even though the bottleneck is on disk. Network traffic may actually look normal while the system is degraded.
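On Linux you can confirm the "I/O-bound host, healthy network" pattern by watching TCP retransmissions climb while interface throughput stays flat. A sketch reading the kernel's SNMP counters (the column is looked up from the header line rather than hardcoded, since field order can differ across kernels):

```shell
# Cumulative TCP retransmitted segments from /proc/net/snmp
retrans=$(awk '/^Tcp:/ { if (!hdr) { for (i = 1; i <= NF; i++) if ($i == "RetransSegs") col = i; hdr = 1 } else print $col }' /proc/net/snmp)
echo "RetransSegs: $retrans"
# Sample this twice a few seconds apart: a fast-rising delta while BPS looks
# normal points at host-side I/O pressure, not a network event
```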

Sampling /proc/net/dev to Distinguish Traffic Types

Before pulling up any monitoring dashboard, start with raw kernel counters. /proc/net/dev exposes cumulative byte and packet counts per interface, updated in real time. The following one-liner samples the receive-side counters twice, one second apart, and computes the delta:

IF=eth0
# Grab receive-side bytes and packets. The tr handles kernels that print
# "eth0:12345" with no space after the colon, which breaks naive field splitting.
read B1 P1 <<<"$(grep "^ *$IF:" /proc/net/dev | tr : ' ' | awk '{print $2, $3}')"
sleep 1
read B2 P2 <<<"$(grep "^ *$IF:" /proc/net/dev | tr : ' ' | awk '{print $2, $3}')"
echo "BPS: $(( (B2 - B1) * 8 )) | PPS: $(( P2 - P1 ))"

Run this every few seconds during a degradation event. The BPS-to-PPS ratio tells you a lot immediately. Divide BPS by 8 to get bytes per second, then divide by PPS to get average packet size in bytes.

  • Average packet size > 900 bytes: Likely legitimate bulk transfer (backup, CDN origin fill, database replication). Not a typical DDoS flood.
  • Average packet size 60–120 bytes: Consistent with UDP flood packets or SYN packets. Investigate further.
  • Average packet size 1,400–1,500 bytes: Near-MTU packets at high rate. Could be amplification attack traffic (DNS, NTP, memcached amplification responses are large) or legitimate bulk transfer. Check source port distribution next.
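During an incident it helps to have these thresholds wrapped in a small helper. A sketch; the cutoffs are the same rough heuristics as the list above, not hard rules:

```shell
# Classify a traffic sample by average packet size
classify() {
  local avg=$(( $1 / 8 / $2 ))   # bits/s -> bytes/s -> avg bytes per packet
  if   [ "$avg" -ge 1400 ]; then echo "near-MTU: amplification or bulk; check source ports"
  elif [ "$avg" -gt 900  ]; then echo "bulk transfer: backup, CDN fill, or replication"
  elif [ "$avg" -le 120  ]; then echo "small packets: possible UDP or SYN flood"
  else                           echo "mixed: keep sampling"
  fi
}
classify 800000000 70000    # ~1,428-byte average
classify 500000000 60000    # ~1,041-byte average
classify 90000000 140000    # ~80-byte average
```

Feed it the BPS and PPS values from the /proc/net/dev delta above.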

Stop guessing at 2am

Flowtriq detects attacks like this in under 2 seconds, classifies them automatically, and alerts your team instantly. 7-day free trial.

Start Free Trial →

Using Flowtriq Analytics to Find Recurring Patterns

The most valuable thing you can do after a late-night slowdown is determine whether it has happened before. One-time events are usually infrastructure or ISP issues. Recurring events at the same time of night are almost always scheduled jobs or recurring attacks.

In the Flowtriq analytics dashboard, navigate to Analytics → Traffic Overview and set the time range to the past 14 days. Look at the hourly traffic heatmap. Legitimate scheduled jobs appear as consistent bands at specific hours across multiple days — a backup job running at 1am every night will show up as a faint horizontal stripe at that hour. Attack traffic is irregular across days even if it tends to cluster in off-peak hours.

The incident timeline view is even more direct. Filter incidents to the past 30 days and sort by start time. If you see five incidents clustered between 11pm and 2am, click through to each. A consistent attack type and similar PPS values across multiple nights are the signature of a persistent attacker running scheduled attack campaigns, a common pattern from booter services that allow recurring attack scheduling.

One practical tip: check your crontab files before assuming an attack. Run for u in $(cut -f1 -d: /etc/passwd); do crontab -l -u $u 2>/dev/null; done to dump all user crontabs on a host. You will often find backup scripts, log rotation jobs, or monitoring agents that nobody remembers configuring, all scheduled for the same time window.
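To narrow a crontab dump to the incident window, filter on the hour field. A sketch over sample entries (the job paths below are hypothetical):

```shell
# Sample crontab lines; pipe your real crontab dump in instead
cat > /tmp/crons.txt <<'EOF'
0 1 * * * /usr/local/bin/backup.sh
*/5 * * * * /usr/bin/monitor-agent
30 23 * * 0 /opt/etl/run_batch.sh
EOF
# Jobs whose hour field lands between 23:00 and 02:59
awk '$2 ~ /^(23|0|1|2)$/ {printf "%02d:%02d %s\n", $2, $1, $6}' /tmp/crons.txt
```

Note this only matches plain numeric hour fields; ranges, lists, and step values (e.g. `1-3` or `*/2`) would need a fuller cron parser.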

A Triage Decision Tree

When performance degrades after 10pm, work through this in order:

  1. Check /proc/net/dev delta. High BPS with large average packet size — check for backup jobs and CDN misses before escalating. High PPS with small average packet size — likely attack traffic, proceed to step 2.
  2. Check for active backup or batch jobs: ps aux | grep -E 'rsync|restic|borg|mysqldump|pg_dump'. If present and running, correlate their start time with the degradation.
  3. Check BGP session state on your router or via your transit provider's looking glass. Any recent UPDATE events?
  4. Open Flowtriq. If an incident is active, you have attack traffic confirmed with classification, duration, and PPS. If no Flowtriq incident is open, the issue is very likely infrastructure-side, not an attack.
  5. If attack confirmed: review the incident's attack type and PCAP, apply appropriate mitigation, and document the timeline for the post-mortem.

This sequence takes under 3 minutes to execute. It prevents the most common mistake in late-night incident response: calling out the security team for what is actually a runaway rsync job, or conversely, spending 20 minutes debugging a database maintenance window while an attack continues unchecked.

Protect your infrastructure with Flowtriq

Per-second DDoS detection, automatic attack classification, PCAP forensics, and instant multi-channel alerts. $9.99/node/month.

Start your free 7-day trial →