Monitor your website's availability from global locations and get instant alerts when downtime occurs.
What is Uptime Monitoring?
Uptime monitoring continuously checks whether your website is accessible:
- Synthetic monitoring: Regular checks from multiple locations
- Real-user monitoring: Detect issues from actual visitors
- Instant alerts: Know immediately when problems occur
- Historical tracking: Availability percentage over time
Setting Up Uptime Monitoring
Add a Monitor
- Go to Performance → Uptime
- Click Add Monitor
- Configure:
Monitor Configuration
Name: [Production Website]
URL: [https://example.com]
Check Settings:
Interval: [1 minute]
Timeout: [30 seconds]
Method: [GET]
Expected Response:
Status: [200]
Contains: [optional text]
Locations:
[x] US East (Virginia)
[x] US West (Oregon)
[x] Europe (Frankfurt)
[x] Asia (Singapore)
[x] Australia (Sydney)
Check Types
| Type | Description | Use Case |
|---|---|---|
| HTTP(S) | Web page check | Websites, APIs |
| TCP | Port connectivity | Databases, services |
| DNS | Domain resolution | DNS health |
| Ping | ICMP ping | Basic connectivity |
| SSL | Certificate check | SSL expiry monitoring |
Dashboard Overview
Uptime Status
Current: ✓ All Systems Operational
Last 30 Days:
████████████████████████████████ 99.95%
Incidents: 2
Total Downtime: 23 minutes
Status by Location
Location Status
| Location | Status | Latency | Last Check |
|---|---|---|---|
| US East | ✓ Up | 45ms | 30s ago |
| US West | ✓ Up | 52ms | 30s ago |
| Europe | ✓ Up | 120ms | 30s ago |
| Asia | ✓ Up | 180ms | 30s ago |
| Australia | ✓ Up | 210ms | 30s ago |
Uptime Timeline
Last 24 Hours
00:00 ████████████████████████ ✓
04:00 ████████████████████████ ✓
08:00 ████████████████████░░░░ ⚠ Degraded (2 min)
12:00 ████████████████████████ ✓
16:00 ████████████████████████ ✓
20:00 ████████████████████████ ✓
Check Configuration
HTTP Check Options
HTTP Check Settings
Request:
Method: GET / POST / HEAD
URL: https://example.com/health
Headers:
Authorization: Bearer xxx
User-Agent: Zenovay-Monitor/1.0
Body (for POST):
{"ping": true}
Validation:
Expected Status: 200-299
Response Contains: "healthy"
Response Time: < 5000ms
Follow Redirects: Yes
Verify SSL: Yes
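The validation rules above can be pictured as a small function that inspects a response and reports what failed. The Python sketch below is illustrative only: `HttpExpectation` and `validate_response` are hypothetical names chosen for this example, not part of the product, and the defaults simply mirror the sample settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpExpectation:
    """Validation rules mirroring the sample settings (hypothetical names)."""
    min_status: int = 200
    max_status: int = 299
    contains: Optional[str] = "healthy"
    max_response_ms: float = 5000.0


def validate_response(status: int, body: str, elapsed_ms: float,
                      expect: HttpExpectation) -> list:
    """Return a list of failure reasons; an empty list means the check passed."""
    failures = []
    if not (expect.min_status <= status <= expect.max_status):
        failures.append(
            f"status {status} outside {expect.min_status}-{expect.max_status}")
    if expect.contains is not None and expect.contains not in body:
        failures.append(f"body does not contain {expect.contains!r}")
    # "Response Time: < 5000ms" means anything at or above the limit fails.
    if elapsed_ms >= expect.max_response_ms:
        failures.append(
            f"response took {elapsed_ms:.0f}ms, limit is {expect.max_response_ms:.0f}ms")
    return failures
```

Returning every failure reason, rather than stopping at the first, makes the resulting alert easier to act on.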
TCP Check
TCP Check Settings
Host: db.example.com
Port: 5432
Timeout: 10 seconds
Expected: Connection successful
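For intuition, a TCP check amounts to attempting a connection within the timeout and recording whether it succeeded. A minimal Python sketch, assuming a helper named `tcp_check` (hypothetical, not a product API):

```python
import socket
import time


def tcp_check(host: str, port: int, timeout_s: float = 10.0):
    """Try to open a TCP connection; return (success, elapsed milliseconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and applies the timeout to the connect.
        with socket.create_connection((host, port), timeout=timeout_s):
            ok = True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        ok = False
    return ok, (time.monotonic() - start) * 1000.0
```

For example, `tcp_check("db.example.com", 5432)` would report whether the port accepts connections. It does not confirm that the database behind the port is healthy.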
DNS Check
DNS Check Settings
Domain: example.com
Record Type: A / AAAA / CNAME / MX
Expected Value: 93.184.216.34 (optional)
DNS Server: 8.8.8.8 (or default)
SSL Certificate Check
Domain: example.com
Alert Before Expiry: 30 days
Verify Chain: Yes
Check OCSP: Yes
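The "Alert Before Expiry" setting reduces to comparing the certificate's expiry date with the current time. The sketch below shows only that arithmetic; it does not cover chain verification or OCSP, and the function names are hypothetical. It assumes the `notAfter` string format that Python's `ssl.SSLSocket.getpeercert()` returns.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter field.

    `not_after` looks like "Jan 20 12:00:00 2026 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400.0


def should_alert(not_after: str, now: datetime, alert_before_days: int = 30) -> bool:
    """True once the certificate is inside the alert window or already expired."""
    return days_until_expiry(not_after, now) <= alert_before_days
```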
Check Intervals
| Plan | Minimum Interval | Locations |
|---|---|---|
| Pro | 1 minute | 3 |
| Scale | 30 seconds | 5 |
| Enterprise | 10 seconds | 10 |
Choosing Interval
| Interval | Best For |
|---|---|
| 10s | Critical production systems |
| 30s | Important services |
| 1m | Standard monitoring |
| 5m | Less critical services |
Alerting
Alert Configuration
Uptime Alert Settings
Trigger Alert When:
Failure Count: [2] consecutive failures
From Locations: [Any 2 of 5]
Notify:
[x] Email: ops@example.com
[x] Slack: #incidents
[x] PagerDuty: On-call
[x] SMS: +1-555-0123
Alert On:
[x] Site Down
[x] SSL Expiring (30 days)
[x] Slow Response (> 5s)
[x] Site Recovered
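One way to read the trigger settings above: a location counts as failing once its most recent checks have all failed, and the alert fires when enough locations are failing at the same time. The Python sketch below encodes that interpretation. It is an assumption about how the two settings combine, and the function names are hypothetical.

```python
def failing_locations(history: dict, failure_count: int = 2) -> list:
    """Locations whose most recent `failure_count` checks all failed.

    `history` maps a location name to its check results, oldest first,
    where True means the check passed.
    """
    failing = []
    for location, results in history.items():
        recent = results[-failure_count:]
        if len(recent) == failure_count and not any(recent):
            failing.append(location)
    return sorted(failing)


def should_trigger(history: dict, failure_count: int = 2,
                   min_locations: int = 2) -> bool:
    """True when at least `min_locations` locations are currently failing."""
    return len(failing_locations(history, failure_count)) >= min_locations
```

Requiring agreement across locations is what filters out a single region's network problem.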
Alert Message
🚨 DOWNTIME ALERT
Monitor: Production Website
URL: https://example.com
Status: DOWN
Details:
Error: Connection timeout
Duration: 30 seconds
Locations Affected: US East, Europe
Timeline:
10:15:00 - First failure (US East)
10:15:30 - Confirmed (Europe)
10:15:30 - Alert triggered
[View Incident →]
Recovery Alert
✓ RECOVERY
Monitor: Production Website
URL: https://example.com
Status: UP
Downtime Duration: 8 minutes
Affected Locations: US East, Europe
Recovery Time: 10:23:00
[View Incident Report →]
Incident Management
Incident Timeline
Incident #124 - Production Website
Timeline:
10:15:00 First failure detected (US East)
10:15:30 Confirmed by second location
10:15:30 Alert sent to on-call
10:16:00 Acknowledged by @john
10:20:00 Root cause identified: database failover
10:23:00 Service restored
10:23:00 Recovery alert sent
Duration: 8 minutes
Impact: 2,340 users
Root Cause: Database failover
Status Page Integration
Connect to your status page:
- Go to Settings → Integrations
- Select a status page provider:
  - Statuspage.io
  - Cachet
  - Custom webhook
- Configure auto-update rules
Maintenance Windows
Schedule Maintenance
Maintenance Window
Name: Database Upgrade
Start: 2025-01-20 02:00 UTC
End: 2025-01-20 04:00 UTC
Affected Monitors:
[x] Production Website
[x] API Endpoint
During Maintenance:
[x] Pause monitoring
[x] Suppress alerts
[ ] Show on status page
Recurring Maintenance
Recurring Schedule
Name: Weekly Backup Window
Frequency: Every Sunday
Time: 03:00 - 04:00 UTC
Actions:
[x] Pause monitoring
[x] Suppress alerts
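Both kinds of window come down to a time comparison in UTC. The following Python sketch shows one way to test whether a timestamp falls inside a one-off or weekly window. The function names are hypothetical, and it assumes windows are half-open (the end time itself is outside the window).

```python
from datetime import datetime, time, timezone


def in_one_off_window(now: datetime, start: datetime, end: datetime) -> bool:
    """True if `now` falls in [start, end); all values must be timezone-aware."""
    return start <= now < end


def in_weekly_window(now: datetime, weekday: int, start: time, end: time) -> bool:
    """True if `now` (converted to UTC) falls in a weekly window.

    `weekday` uses datetime.weekday() numbering: Monday is 0, Sunday is 6.
    This sketch does not handle windows that cross midnight.
    """
    now_utc = now.astimezone(timezone.utc)
    return now_utc.weekday() == weekday and start <= now_utc.time() < end
```

With the examples above, a check at 2025-01-20 03:00 UTC falls inside the "Database Upgrade" window, and a check on any Sunday at 03:30 UTC falls inside the "Weekly Backup Window".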
Global Monitoring Locations
Available Locations
| Region | Locations |
|---|---|
| North America | Virginia, Oregon, Ohio, Montreal |
| Europe | Frankfurt, London, Paris, Amsterdam |
| Asia Pacific | Singapore, Tokyo, Sydney, Mumbai |
| South America | São Paulo |
Location Strategy
- Minimum: Use at least 3 locations for redundancy
- Global sites: Choose locations that match where your users are
- Regional check: Add a nearby location for accurate latency readings
Response Time Tracking
Latency Metrics
Response Time (Last 24 Hours)
| Location | Avg | P95 | Max |
|---|---|---|---|
| US East | 45ms | 120ms | 450ms |
| US West | 52ms | 135ms | 520ms |
| Europe | 120ms | 280ms | 890ms |
| Asia | 180ms | 420ms | 1.2s |
Latency Alerts
Response Time Alert
Condition: Average response > 2 seconds
Duration: 5 minutes
Locations: Any 2 of 5
Sustained high latency is often an early warning of full downtime.
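As a rough illustration of the rule above, the Python sketch below averages the most recent samples per location and counts how many locations exceed the threshold. It assumes a 1-minute check interval, so five samples approximate the 5-minute duration. The function names are hypothetical.

```python
from statistics import mean


def slow_locations(samples_ms: dict, window: int = 5,
                   threshold_ms: float = 2000.0) -> list:
    """Locations whose average over the last `window` samples exceeds the threshold."""
    slow = []
    for location, samples in samples_ms.items():
        recent = samples[-window:]
        # Skip locations that do not yet have a full window of samples.
        if len(recent) == window and mean(recent) > threshold_ms:
            slow.append(location)
    return sorted(slow)


def latency_alert(samples_ms: dict, window: int = 5,
                  threshold_ms: float = 2000.0, min_locations: int = 2) -> bool:
    """True when at least `min_locations` locations are slow at once."""
    return len(slow_locations(samples_ms, window, threshold_ms)) >= min_locations
```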
API Endpoint Monitoring
Monitor API Health
API Health Check
URL: https://api.example.com/health
Method: GET
Headers:
Authorization: Bearer xxx
Expected:
Status: 200
Response:
{
"status": "healthy",
"database": "connected",
"cache": "connected"
}
Multi-Step Checks
Multi-Step API Check
Step 1: Login
POST /api/auth/login
Body: {"user": "monitor", "pass": "xxx"}
Store: token = response.token
Step 2: Fetch Data
GET /api/users/me
Header: Authorization: Bearer ${token}
Expect: Status 200
Step 3: Logout
POST /api/auth/logout
Header: Authorization: Bearer ${token}
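The three steps above can be sketched as a function that passes the stored token from one step to the next. This Python example is a hypothetical illustration, not the product's implementation: `send` is an injected transport so the flow can be exercised without a network. Treating a non-200 login as a failure and logout as best-effort are both assumptions.

```python
from typing import Callable, Optional, Tuple

# A transport takes (method, path, headers, json_body) and returns (status, json_response).
Transport = Callable[[str, str, dict, Optional[dict]], Tuple[int, dict]]


def run_multi_step_check(send: Transport, user: str, password: str) -> bool:
    """Run login, fetch, logout in order; return True if login and fetch succeed."""
    # Step 1: log in and store the token from the response.
    status, body = send("POST", "/api/auth/login", {}, {"user": user, "pass": password})
    token = body.get("token")
    if status != 200 or not token:
        return False
    auth = {"Authorization": f"Bearer {token}"}

    # Step 2: fetch data with the stored token and expect status 200.
    status, _ = send("GET", "/api/users/me", auth, None)
    fetched = status == 200

    # Step 3: log out even if step 2 failed, so the session is not left open.
    send("POST", "/api/auth/logout", auth, None)
    return fetched
```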
Reports & Analytics
Uptime Report
Monthly Uptime Report - January 2025
Overall Uptime: 99.95%
Total Downtime: 22 minutes
Incidents: 3
Availability by Week:
Week 1: 100.00%
Week 2: 99.92%
Week 3: 100.00%
Week 4: 99.88%
Top Incidents:
1. Database failover (8 min)
2. CDN issue (10 min)
3. DNS propagation (4 min)
SLA Tracking
SLA Status
Target: 99.9% (43.8 min/month allowed)
Current: 99.95% (22 min used)
Remaining: 21.8 min
Status: ✓ On Track
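The figures above follow from simple arithmetic: a 99.9% target leaves 0.1% of the period as allowed downtime. The 43.8-minute allowance matches an average-length month (365.25 days / 12, or 43,830 minutes). A short Python sketch of the calculation, with hypothetical function names:

```python
AVERAGE_MONTH_MINUTES = 365.25 * 24 * 60 / 12  # 43,830 minutes


def allowed_downtime_minutes(target_pct: float,
                             period_minutes: float = AVERAGE_MONTH_MINUTES) -> float:
    """Downtime budget for the period at the given SLA target."""
    return period_minutes * (1 - target_pct / 100.0)


def uptime_pct(downtime_minutes: float,
               period_minutes: float = AVERAGE_MONTH_MINUTES) -> float:
    """Availability percentage given the downtime accrued in the period."""
    return 100.0 * (1 - downtime_minutes / period_minutes)


def remaining_budget_minutes(target_pct: float, downtime_minutes: float,
                             period_minutes: float = AVERAGE_MONTH_MINUTES) -> float:
    """Budget left; a negative value means the SLA target has been missed."""
    return allowed_downtime_minutes(target_pct, period_minutes) - downtime_minutes
```

With a 99.9% target and 22 minutes of downtime, this gives a budget of about 43.8 minutes, uptime of about 99.95%, and about 21.8 minutes remaining.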
Export Data
Export uptime data:
- CSV: Raw check results
- PDF: Formatted report
- API: Programmatic access
Best Practices
Monitoring Strategy
- Monitor critical paths: Homepage, checkout, API
- Use multiple locations: Detect regional issues
- Set appropriate intervals: Balance coverage vs. cost
- Define escalation: Clear alert routing
- Regular review: Analyze incidents monthly
What to Monitor
| Type | Examples |
|---|---|
| Public pages | Homepage, product pages |
| Critical flows | Checkout, signup, login |
| APIs | Public and internal endpoints |
| Infrastructure | Database, cache, CDN |
| Third-party | Payment provider, auth service |
Alert Best Practices
- Require 2+ failures before alerting
- Use multiple locations to confirm
- Set up escalation for unacknowledged alerts
- Include recovery notifications
- Have a clear on-call rotation
Troubleshooting
False Positives
Causes:
- Single location network issues
- Timeout set too short
- Rate limiting by server
- Geographic routing changes
Solutions:
- Require multiple location failures
- Increase timeout
- Allowlist the monitoring IPs so checks are not rate limited
- Add jitter to checks
Missing Alerts
Check:
- Alert configuration enabled
- Notification channels working
- Not in maintenance window
- Failure threshold met
Inconsistent Results
Review:
- Geographic variations
- Time-of-day patterns
- CDN behavior
- DNS resolution