grants/PRODUCTION_READINESS.md
gdegelas a05331128b Atlas Green Morocco — grant strategy platform
- Full grant strategy framework for renewable energy & green hydrogen
- AI-powered grant studio, partner outreach, financial modeling
- Umami analytics with data-performance tracking
- Live Degelas metrics connected to solar.degelas.be
- Trilingual (EN/FR/AR) with i18n support
- Dockerized with Nginx frontend + Express API proxy
2026-06-01 09:44:03 +00:00

# Production Readiness Checklist
## Pre-Deployment Verification
### Code Quality
- [x] TypeScript compilation passes (`npm run typecheck`)
- [x] Build succeeds without errors (`npm run build`)
- [x] Bundle size acceptable (<300 KB gzipped)
- [x] No console.log statements in production code
- [x] Error boundaries in place for React components
### Security
- [x] API keys never exposed to frontend
- [x] CORS configured with allowed origins
- [x] Rate limiting enabled on API endpoints
- [x] Input validation on all API endpoints
- [x] Security headers configured (Helmet + Nginx)
- [x] Sensitive data redacted in logs
- [x] `.env` in `.gitignore`
- [x] `.env.example` provided with documentation
### Performance
- [x] Gzip compression enabled
- [x] Static assets cached (6 months)
- [x] API rate limiting configured
- [x] Request timeouts configured
- [x] Buffer limits configured
- [x] Health check endpoint available
### Reliability
- [x] Health check endpoint (`/api/health`)
- [x] Docker health checks configured
- [x] Service restart policies set
- [x] Service dependencies with health conditions
- [x] Structured logging enabled
- [x] Error handling middleware in place
- [x] 404 handler configured
### Monitoring
- [ ] Uptime monitoring configured (UptimeRobot/Pingdom)
- [ ] Log aggregation set up (optional: ELK, Grafana)
- [ ] Error alerting configured (email/Slack)
- [ ] Disk space monitoring configured
- [ ] Memory/CPU monitoring configured
### Backup & Recovery
- [x] Backup script created
- [x] Backup schedule configured (cron)
- [ ] Backup restoration tested
- [x] `.env` backup procedure documented
- [ ] Disaster recovery plan documented
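
The unchecked "Backup restoration tested" item can be exercised with a small round-trip check. The following is a minimal sketch, not the project's actual backup script: the function names `make_backup` and `verify_backup` and the archive naming scheme are assumptions made for illustration.

```bash
# Hypothetical helpers; names and archive layout are illustrative assumptions.

# Archive directory $1 into $2/backup-<timestamp>.tar.gz and print the archive path.
make_backup() {
  local src="$1" dest="$2" archive
  archive="${dest}/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest"
  tar -czf "$archive" -C "$src" .
  echo "$archive"
}

# Extract archive $1 into a scratch directory and compare it with directory $2.
# Returns 0 only if extraction succeeds and the restored tree matches the source.
verify_backup() {
  local archive="$1" src="$2" scratch status
  scratch="$(mktemp -d)"
  tar -xzf "$archive" -C "$scratch" && diff -r "$src" "$scratch" > /dev/null
  status=$?
  rm -rf "$scratch"
  return "$status"
}
```

The destination should live outside the source directory so the archive does not end up inside itself. Running `verify_backup` right after `make_backup` confirms that the archive can be restored, not only that it was written.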
### SSL/TLS
- [ ] SSL certificate obtained (Let's Encrypt)
- [ ] HTTPS redirect configured
- [ ] HSTS enabled
- [ ] SSL Labs test passed (A+ rating)
- [ ] Certificate auto-renewal tested
### Access Control
- [ ] SSH key-based auth only
- [ ] Root login disabled
- [ ] Firewall configured (UFW)
- [ ] fail2ban installed and running
- [ ] Non-root Docker user configured (recommended)
### Documentation
- [x] Deployment guide created
- [x] Security checklist created
- [x] Environment variables documented
- [x] Troubleshooting guide created
- [x] Rollback procedure documented
---
## Deployment Steps
### 1. Environment Setup
```bash
# Copy and configure environment
cp .env.example .env
nano .env # Edit with production values
```
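
Before deploying, it may help to confirm that every variable listed in `.env.example` is also defined in `.env`. The sketch below assumes both files use plain `KEY=value` lines; the function names `env_keys` and `missing_env_keys` are hypothetical.

```bash
# Hypothetical helpers; assumes plain KEY=value lines (no "export" prefix).

# Print the variable names defined in env file $1, skipping comments and blanks.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Print each key that appears in the example file ($1) but not in the real file ($2).
missing_env_keys() {
  env_keys "$1" | while IFS= read -r key; do
    grep -q "^${key}=" "$2" 2>/dev/null || echo "$key"
  done
}
```

Running `missing_env_keys .env.example .env` prints nothing when the production file covers every documented variable. It checks only that a key is present, not that its value is valid.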
### 2. SSL Certificate
```bash
# Install Certbot
sudo apt install certbot python3-certbot-nginx -y
# Obtain certificate
sudo certbot --nginx -d yourdomain.com
# Test auto-renewal
sudo certbot renew --dry-run
```
### 3. Deploy
```bash
# Build and deploy
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
# Verify
docker compose ps
curl http://localhost/api/health
```
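
Containers can take a few seconds to become healthy after `up -d`, so a single `curl` may fail spuriously. A minimal retry loop, assuming the `/api/health` endpoint from the checklist returns a 2xx status when healthy, could look like this; the function name and default values are illustrative.

```bash
# Hypothetical helper: poll a health URL until it answers with a 2xx status.
# $1 = URL, $2 = maximum attempts, $3 = seconds to wait between attempts.
wait_for_health() {
  local url="${1:-http://localhost/api/health}" attempts="${2:-10}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx)
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "health check failed after ${attempts} attempts" >&2
  return 1
}
```

Because the function returns non-zero on failure, it can gate later steps, for example `wait_for_health http://localhost/api/health 10 3 || docker compose logs --tail=50`.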
### 4. Verify HTTPS
```bash
# Test redirect
curl -I http://yourdomain.com
# Should return 301 redirect to HTTPS
# Test HTTPS
curl -I https://yourdomain.com
# Should return 200 OK
```
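
The redirect check above can also be scripted. This sketch reads the output of `curl -sI` on standard input and succeeds only for a 301 response whose `Location` header points at an `https://` URL; the function name `is_https_redirect` is an assumption.

```bash
# Hypothetical helper: reads HTTP response headers on stdin.
# Exits 0 only if the status is 301 and Location starts with https://.
is_https_redirect() {
  awk '
    { sub(/\r$/, "") }                          # strip CR from CRLF line endings
    NR == 1 { status = $2 }                     # status line: "HTTP/1.1 301 ..."
    tolower($1) == "location:" { target = $2 }  # header names are case-insensitive
    END { exit !(status == 301 && target ~ /^https:\/\//) }
  '
}
```

Usage: `curl -sI http://yourdomain.com | is_https_redirect && echo "redirect OK"`.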
### 5. Monitor
```bash
# Check logs
docker compose logs -f
# Check resource usage
docker stats
# Check disk space
df -h
```
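
The disk-space check can be turned into a pass/fail test for cron or an alerting hook. The following is a sketch only: the function names and the 80% default threshold are assumptions, not values taken from this project.

```bash
# Hypothetical helpers; the 80% default threshold is an illustrative assumption.

# Print the used-space percentage (digits only) of the filesystem holding path $1.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed if usage of $1 is at or below $2 percent; otherwise warn on stderr and fail.
disk_ok() {
  local path="${1:-/}" limit="${2:-80}" used
  used="$(disk_used_pct "$path")"
  if [ "$used" -gt "$limit" ]; then
    echo "WARN: ${path} is ${used}% full (limit ${limit}%)" >&2
    return 1
  fi
}

# Example: disk_ok / 80 || echo "disk nearly full"
```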
---
## Post-Deployment Validation
### Functional Tests
- [ ] Homepage loads correctly
- [ ] All navigation links work
- [ ] AI Studio tools functional
- [ ] Document Vault works
- [ ] Project export/import works
- [ ] Multi-language switching works (EN/FR/AR)
- [ ] API health check returns 200
### Security Tests
- [ ] HTTPS redirect works
- [ ] Security headers present (check with `curl -I`)
- [ ] Rate limiting triggers after 100 requests within the configured window
- [ ] Invalid API requests return 400
- [ ] Non-existent routes return 404
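
Two of the security tests above lend themselves to scripting. The sketch below makes two assumptions that should be checked against the real configuration: the list of expected headers (a subset of what Helmet typically sets) and the use of HTTP 429 as the rate-limit status. Both function names are hypothetical.

```bash
# Hypothetical helpers; header list and the 429 status are assumptions.

# Read response headers on stdin and print each expected security header
# that is absent. Prints nothing when all are present.
missing_headers() {
  local headers name
  headers="$(tr -d '\r' | tr '[:upper:]' '[:lower:]')"
  for name in strict-transport-security x-content-type-options x-frame-options; do
    printf '%s\n' "$headers" | grep -q "^${name}:" || echo "$name"
  done
}

# Send up to $2 requests to $1 and print the 1-based index of the first
# response with status 429, or 0 if the limit was never reached.
first_limited_request() {
  local url="$1" max="${2:-110}" i=1 code
  while [ "$i" -le "$max" ]; do
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    if [ "$code" = "429" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  echo 0
}
```

Usage: `curl -sI https://yourdomain.com | missing_headers` lists absent headers, and `first_limited_request https://yourdomain.com/api/health` reports where throttling begins. With a limit of 100 requests, the first throttled request would be expected to be number 101, provided the run fits inside one rate-limit window.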
### Performance Tests
- [ ] Page load < 3 seconds on 3G
- [ ] API response < 500 ms (excluding AI generation)
- [ ] Gzip compression active (check `Content-Encoding` header)
- [ ] Static assets cached (check `Cache-Control` header)
### Monitoring Tests
- [ ] Health check accessible
- [ ] Logs rotating properly
- [ ] Backup script runs successfully
- [ ] Uptime monitoring receiving pings
---
## Rollback Procedure
If deployment fails or issues arise:
### 1. Immediate Rollback
```bash
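# Work from the deployment directory so Compose finds the project
cd /opt/atlasgreen
# Stop current deployment
docker compose -f docker-compose.yml -f docker-compose.prod.yml down
# Revert code
git checkout <previous-tag>
# Rebuild with previous version, using the same Compose files as the deploy step
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build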
```
### 2. Verify Rollback
```bash
# Check containers
docker compose ps
# Check recent logs (add -f to keep following)
docker compose logs --tail=100
# Test functionality
curl http://localhost/api/health
```
### 3. Document Issue
- Record what went wrong
- Document steps taken to resolve
- Update deployment checklist if needed
---
## Maintenance Schedule
### Daily
- [ ] Check error logs
- [ ] Verify uptime monitoring
- [ ] Check disk space
### Weekly
- [ ] Review access logs for anomalies
- [ ] Check backup completion
- [ ] Review API usage patterns
### Monthly
- [ ] Update system packages
- [ ] Update Docker images
- [ ] Review and rotate API keys
- [ ] Test backup restoration
- [ ] Review user feedback
### Quarterly
- [ ] Security audit (dependencies, configs)
- [ ] Performance review
- [ ] Update documentation
- [ ] Review and update firewall rules
- [ ] Penetration testing (optional)
---
## Emergency Contacts
| Role | Name | Contact |
|------|------|---------|
| Primary Admin | [Name] | [Phone/Email] |
| Secondary Admin | [Name] | [Phone/Email] |
| VPS Provider Support | [Provider] | [Support URL/Phone] |
| Domain Registrar | [Registrar] | [Support URL/Phone] |
---
## Incident Response
### Severity Levels
- **P1 (Critical)**: Site down, data breach
- **P2 (High)**: Major functionality broken
- **P3 (Medium)**: Minor functionality issues
- **P4 (Low)**: Cosmetic issues, minor bugs
### Response Times
- P1: Immediate (< 15 minutes)
- P2: Within 2 hours
- P3: Within 24 hours
- P4: Within 1 week
### Escalation Path
1. Primary Admin
2. Secondary Admin
3. External consultant (if needed)
---
## Success Criteria
Deployment is considered successful when:
- [ ] All functional tests pass
- [ ] All security tests pass
- [ ] Performance benchmarks met
- [ ] Monitoring active and alerting
- [ ] Backups running successfully
- [ ] Documentation complete
- [ ] Team trained on procedures
---
**Version**: 1.0
**Last Updated**: January 2026
**Next Review**: February 2026