Comprehensive guide to monitoring system health, maintaining performance, and proactively troubleshooting your Koha deployment.


System Health Monitoring

Quick Health Check

Connect to your instance via EC2 Instance Connect and run:

# System services status
sudo systemctl status koha-plack
sudo systemctl status koha-worker
sudo systemctl status koha-zebra-daemon
sudo systemctl status apache2
sudo systemctl status mysql  # Basic/Standard only

# Disk space
df -h /
df -h /var/lib/mysql  # Basic/Standard only

# Memory usage
free -h

# CPU load
uptime

Expected output:

  • All services: active (running)
  • Disk usage: < 80%
  • Memory: At least 20% available (the "available" column of free -h, which accounts for reclaimable cache)
  • Load average: < number of CPU cores
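The thresholds above can also be checked mechanically. The following is a minimal sketch rather than part of the deployment: the function names load_ok, disk_ok, and report_health are hypothetical, and the limits simply restate the list above.

```shell
#!/bin/sh
# Hypothetical helpers that apply the thresholds listed above.

# load_ok LOAD CORES: succeeds when the load average is below the core count.
load_ok() {
  awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load < cores) }'
}

# disk_ok PERCENT: succeeds when disk usage is below 80%.
disk_ok() {
  [ "$1" -lt 80 ]
}

# report_health: evaluate the live system (1-minute load, root filesystem).
report_health() {
  load=$(cut -d ' ' -f 1 /proc/loadavg)
  cores=$(nproc)
  disk=$(df --output=pcent / | tail -n 1 | tr -dc '0-9')
  if load_ok "$load" "$cores"; then
    echo "load: ok"
  else
    echo "load: high ($load on $cores cores)"
  fi
  if disk_ok "$disk"; then
    echo "disk: ok"
  else
    echo "disk: ${disk}% used"
  fi
}
```

Calling report_health on the instance prints one line per check; the two predicate functions can be reused in other scripts.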

AWS CloudWatch Monitoring

Enable Detailed Monitoring

All tiers include basic CloudWatch metrics at 5-minute intervals. To enable detailed monitoring (1-minute intervals):

  1. Go to EC2 Console
  2. Select your instance
  3. Actions → Monitor and troubleshoot → Manage detailed monitoring
  4. Enable (additional charges apply)

Key Metrics to Watch

EC2 Instance Metrics

CPU Utilization:

  • Normal: 10-40% average
  • High: 60-80% (consider scaling up)
  • Critical: > 90% sustained

Network I/O:

  • Monitor for unusual spikes
  • Basic tier: Typically 1-10 MB/min
  • Enterprise tier: Can be higher with multiple instances

Disk I/O:

  • Read/Write Operations
  • High sustained I/O may indicate:
    • Database queries need optimization
    • Insufficient memory (swapping)
    • Need for SSD volumes

Database Metrics (Enterprise Aurora Only)

CPU Utilization:

  • Normal: < 50%
  • Review: 50-80%
  • Scale: > 80% sustained

Connections:

  • Monitor connection count
  • Default max: 100 (adjustable)
  • High connections may indicate connection pooling issues

Aurora Capacity Units (ACU):

  • Monitor scaling events
  • Adjust min/max ACU if scaling events are frequent

CloudWatch Alarms Setup

Create CPU alarm:

aws cloudwatch put-metric-alarm \
  --alarm-name koha-high-cpu \
  --alarm-description "Alert when CPU exceeds 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --dimensions Name=InstanceId,Value=i-xxxxx

# Replace i-xxxxx with your instance ID

Create disk space alarm:

# First, install CloudWatch agent
sudo apt-get install -y amazon-cloudwatch-agent

# Configure agent to monitor disk
sudo tee /opt/aws/amazon-cloudwatch-agent/etc/config.json > /dev/null << EOF
{
  "metrics": {
    "namespace": "Koha/DiskSpace",
    "metrics_collected": {
      "disk": {
        "measurement": [
          {"name": "used_percent", "unit": "Percent"}
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      }
    }
  }
}
EOF

# Start agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config \
  -m ec2 \
  -s \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json
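Once the agent is publishing data, the alarm itself still has to be created. The sketch below assumes the agent reports the metric as disk_used_percent in the Koha/DiskSpace namespace configured above. The function name create_disk_alarm, the alarm name koha-high-disk, and the AWS_CMD variable are hypothetical. The dimensions passed in must match what the agent publishes exactly (typically path, device, and fstype, plus any others the agent adds).

```shell
#!/bin/sh
# Hypothetical wrapper. Set AWS_CMD=echo to preview the command without calling AWS.
AWS_CMD="${AWS_CMD:-aws}"

# create_disk_alarm DIMENSION...
# Each DIMENSION is Name=<name>,Value=<value>. List the published dimensions with:
#   aws cloudwatch list-metrics --namespace Koha/DiskSpace --metric-name disk_used_percent
create_disk_alarm() {
  $AWS_CMD cloudwatch put-metric-alarm \
    --alarm-name koha-high-disk \
    --alarm-description "Alert when disk usage exceeds 80%" \
    --metric-name disk_used_percent \
    --namespace Koha/DiskSpace \
    --statistic Average \
    --period 300 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --evaluation-periods 2 \
    --dimensions "$@"
}
```

A call might look like create_disk_alarm Name=path,Value=/ followed by the remaining dimensions reported by list-metrics; an alarm whose dimensions do not match any published metric stays in the INSUFFICIENT_DATA state.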

Log Monitoring

Log Locations

Koha application logs:

/var/log/koha/library/intranet-error.log    # Staff interface errors
/var/log/koha/library/opac-error.log        # OPAC errors
/var/log/koha/library/plack.log             # Plack application server
/var/log/koha/library/plack-error.log       # Plack errors

System logs:

/var/log/apache2/error.log                  # Apache errors
/var/log/apache2/access.log                 # Access logs
/var/log/mysql/error.log                    # MySQL errors (Basic/Standard)
/var/log/syslog                             # System messages

Real-Time Log Monitoring

# Watch all Koha errors
sudo tail -f /var/log/koha/library/*error*.log

# Watch Apache errors
sudo tail -f /var/log/apache2/error.log

# Watch database errors (Basic/Standard)
sudo tail -f /var/log/mysql/error.log

# Search for specific errors
sudo grep -i "error" /var/log/koha/library/*.log | tail -20
sudo grep -i "fatal" /var/log/koha/library/*.log | tail -20

Log Analysis

Check for common issues:

# Database connection errors
sudo grep -c "DBI connect" /var/log/koha/library/*error*.log

# Memory exhaustion
sudo grep -c "Out of memory" /var/log/syslog

# Permission errors
sudo grep -c "Permission denied" /var/log/koha/library/*.log

# HTTP 500 responses (recorded in the access log; status is field 9 in the combined log format)
sudo awk '$9 == 500' /var/log/apache2/access.log | wc -l
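HTTP status codes are recorded per request in the access log, so a tally by status gives a quick picture of error rates. This is a minimal sketch that assumes Apache's common or combined log format, where the status code is the ninth whitespace-separated field; the function names count_status and count_5xx are hypothetical.

```shell
#!/bin/sh
# count_status LOGFILE: print each HTTP status code with its request count.
count_status() {
  awk '{ count[$9]++ } END { for (code in count) print code, count[code] }' "$1" | sort
}

# count_5xx LOGFILE: print the number of server-error (5xx) responses.
count_5xx() {
  awk '$9 ~ /^5[0-9][0-9]$/ { n++ } END { print n + 0 }' "$1"
}
```

For example, sudo sh -c '. ./status-tally.sh; count_5xx /var/log/apache2/access.log' would report the 5xx total, where status-tally.sh is a placeholder name for a file holding the functions above.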

Log Rotation

Logs are automatically rotated. Check configuration:

# Koha log rotation
cat /etc/logrotate.d/koha-common

# Apache log rotation
cat /etc/logrotate.d/apache2

Typical configuration:

  • Rotate: Daily
  • Compress: Yes
  • Retention: 14 days
  • Size limit: 100M per file

CloudWatch Logs Integration (Optional)

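Install the CloudWatch Logs agent. The awslogs package and awslogsd service shown here are the legacy agent, which AWS has deprecated in favor of the unified CloudWatch agent; confirm the package is available for your distribution before relying on these steps: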

# Install agent
sudo apt-get install -y awslogs

# Configure
sudo tee /etc/awslogs/config/koha.conf > /dev/null << EOF
[/var/log/koha/library/intranet-error.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /var/log/koha/library/intranet-error.log
buffer_duration = 5000
log_stream_name = {instance_id}/koha-intranet-error
initial_position = start_of_file
log_group_name = /koha/application

[/var/log/apache2/error.log]
datetime_format = %a %b %d %H:%M:%S.%f %Y
file = /var/log/apache2/error.log
buffer_duration = 5000
log_stream_name = {instance_id}/apache-error
initial_position = start_of_file
log_group_name = /koha/apache
EOF

# Start service
sudo systemctl start awslogsd
sudo systemctl enable awslogsd

Performance Monitoring

Database Performance (Basic/Standard)

Check slow queries:

# Enable slow query log (runtime only; set the same options in the MySQL config file to persist across restarts)
sudo mysql -e "SET GLOBAL slow_query_log = 'ON';"
sudo mysql -e "SET GLOBAL long_query_time = 2;"  # Log queries > 2 seconds

# View slow queries
sudo mysqldumpslow /var/log/mysql/mysql-slow.log | head -20

Monitor database size:

# Database size
sudo mysql -e "
  SELECT 
    table_schema AS 'Database',
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
  FROM information_schema.tables
  WHERE table_schema = 'koha_library'
  GROUP BY table_schema;
"

# Largest tables
sudo mysql -e "
  SELECT 
    table_name AS 'Table',
    ROUND((data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
  FROM information_schema.tables
  WHERE table_schema = 'koha_library'
  ORDER BY (data_length + index_length) DESC
  LIMIT 10;
"

Database Performance (Enterprise Aurora)

Monitor from RDS Console:

  1. Go to RDS Console
  2. Select your Aurora cluster
  3. Click “Monitoring” tab
  4. Review:
    • CPU utilization
    • Database connections
    • Read/Write IOPS
    • Network throughput

Performance Insights:

  • Go to RDS Console → Performance Insights
  • Analyze slow queries
  • Identify bottlenecks
  • Review wait events

Apache Performance

Check Apache status:

# Enable Apache status module (if not already enabled), then restart Apache to load it
sudo a2enmod status
sudo systemctl restart apache2

# View status
curl http://localhost/server-status

# Monitor active connections
watch -n 1 'curl -s http://localhost/server-status?auto | grep -E "Total Accesses|BusyWorkers|IdleWorkers"'

Apache worker processes:

# Check current configuration
apache2ctl -M | grep -E "mpm_|worker"

# View process count (pgrep avoids counting the grep process itself)
pgrep -c apache2
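The BusyWorkers and IdleWorkers counters reported by server-status?auto can be reduced to a single utilization figure. The sketch below is one way to do that; the function name worker_utilization is hypothetical, and it assumes the machine-readable "Key: value" output that mod_status produces for the ?auto query.

```shell
#!/bin/sh
# worker_utilization: read "server-status?auto" output on stdin and print
# the share of Apache workers that are currently busy.
worker_utilization() {
  awk -F': ' '
    $1 == "BusyWorkers" { busy = $2 }
    $1 == "IdleWorkers" { idle = $2 }
    END {
      total = busy + idle
      if (total == 0) { print "no worker data"; exit 1 }
      printf "%d%% busy (%d of %d workers)\n", busy * 100 / total, busy, total
    }'
}

# Usage:
#   curl -s "http://localhost/server-status?auto" | worker_utilization
```

A figure that stays close to 100% suggests the worker limit is being reached and the MPM settings may need review.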

Search Index Performance

Monitor Zebra:

# Check Zebra processes (pgrep -a lists PID and command line without matching itself)
pgrep -a zebra

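# Test search performance
# koha-shell starts a shell as the instance user, so the Perl code is passed via -c
time sudo koha-shell -c "perl -MC4::Search=SimpleSearch -e 'my (\$error, \$results) = SimpleSearch(q{ti:test}); print qq{Found: }, scalar @{ \$results || [] }, qq{ results\n};'" library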

Rebuild if slow:

# Incremental rebuild
sudo koha-rebuild-zebra -v library

# Full rebuild (if searches are very slow)
sudo koha-rebuild-zebra -f -v library

Automated Health Checks

Create Health Check Script

# Create script
sudo tee /usr/local/bin/koha-health-check.sh > /dev/null << 'EOF'
#!/bin/bash
# Koha Health Check Script

EMAIL="admin@yourlibrary.org"
LOG="/var/log/koha-health-check.log"
REPORT=$(mktemp)   # Holds this run's output only, so emails exclude earlier runs
trap 'rm -f "$REPORT"' EXIT
ERRORS=0

echo "=== Koha Health Check ===" >> "$REPORT"
date >> "$REPORT"

# Check Koha services
for service in koha-plack koha-worker koha-zebra-daemon apache2; do
  if ! systemctl is-active --quiet "$service"; then
    echo "ERROR: $service is not running" >> "$REPORT"
    ERRORS=$((ERRORS + 1))
  fi
done

# Check MySQL (Basic/Standard only)
if systemctl list-units --type=service --all | grep -q mysql; then
  if ! systemctl is-active --quiet mysql; then
    echo "ERROR: MySQL is not running" >> "$REPORT"
    ERRORS=$((ERRORS + 1))
  fi
fi

# Check disk space (-P keeps each filesystem on one line)
DISK_USAGE=$(df -P / | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$DISK_USAGE" -gt 80 ]; then
  echo "WARNING: Disk usage is ${DISK_USAGE}%" >> "$REPORT"
  ERRORS=$((ERRORS + 1))
fi

# Check memory (used as a percentage of total)
MEM_USAGE=$(free | awk 'NR==2 {printf "%.0f", $3/$2*100}')
if [ "$MEM_USAGE" -gt 90 ]; then
  echo "WARNING: Memory usage is ${MEM_USAGE}%" >> "$REPORT"
  ERRORS=$((ERRORS + 1))
fi

echo "Health check completed with $ERRORS errors" >> "$REPORT"
echo "---" >> "$REPORT"
cat "$REPORT" >> "$LOG"

# Send email if errors found (requires a working mail command on the instance)
if [ "$ERRORS" -gt 0 ]; then
  mail -s "Koha Health Check: $ERRORS issues found" "$EMAIL" < "$REPORT"
fi
EOF

# Make executable
sudo chmod +x /usr/local/bin/koha-health-check.sh

Schedule Health Checks

# Add to crontab (runs every hour)
sudo crontab -e

# Add this line:
0 * * * * /usr/local/bin/koha-health-check.sh

Regular Maintenance Tasks

Daily Tasks

1. Check backups:

# For Standard tier with S3
aws s3 ls s3://your-backup-bucket/ --recursive | tail -5

# For Basic/Free tier
ls -lh /var/lib/koha/backups/ | tail -5

# For Enterprise tier (three most recent snapshots)
aws rds describe-db-cluster-snapshots \
  --db-cluster-identifier your-cluster \
  --query 'reverse(sort_by(DBClusterSnapshots, &SnapshotCreateTime))[0:3].[SnapshotCreateTime,DBClusterSnapshotIdentifier,Status]'

2. Review error logs:

# Show the last 20 lines of each error log modified in the past 24 hours
sudo find /var/log/koha/library/ -name "*error*.log" -mtime -1 -exec tail -20 {} \;

3. Monitor disk space:

df -h / | awk 'NR==2 {print "Root: " $5}'
df -h /var/lib/mysql | awk 'NR==2 {print "Database: " $5}'  # Basic/Standard
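The local backup check in step 1 relies on reading a directory listing by eye. A freshness test can make the same check scriptable. This is a minimal sketch: the function name backup_is_fresh is hypothetical, and the 24-hour default is an assumption that should match your actual backup schedule.

```shell
#!/bin/sh
# backup_is_fresh DIR [HOURS]: succeeds if DIR contains at least one regular
# file modified within the last HOURS hours (default 24).
backup_is_fresh() {
  dir=$1
  hours=${2:-24}
  recent=$(find "$dir" -type f -mmin "-$((hours * 60))" -print 2>/dev/null | head -n 1)
  [ -n "$recent" ]
}

# Usage:
#   backup_is_fresh /var/lib/koha/backups || echo "WARNING: no backup in the last 24 hours"
```

The same test could be added to a health check script so that a missed backup is reported alongside service failures.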

Weekly Tasks

1. Database optimization:

# For Basic/Standard
sudo koha-mysql library << 'EOF'
-- Optimize tables
OPTIMIZE TABLE biblio;
OPTIMIZE TABLE items;
OPTIMIZE TABLE borrowers;
OPTIMIZE TABLE issues;
OPTIMIZE TABLE old_issues;

-- Check fragmentation
SELECT 
  table_name,
  ROUND(data_length / 1024 / 1024, 2) AS data_mb,
  ROUND(data_free / 1024 / 1024, 2) AS free_mb,
  ROUND((data_free / data_length) * 100, 2) AS fragmentation
FROM information_schema.tables
WHERE table_schema = 'koha_library'
  AND data_free > 0
ORDER BY fragmentation DESC;
EOF

2. Clear temporary files:

# Clear old sessions
sudo find /var/lib/koha/library/sessions/ -type f -mtime +7 -delete

# Clear temp files
sudo find /tmp -name "Koha*" -mtime +7 -delete

3. Review system updates:

# Check for available updates
sudo apt-get update
sudo apt list --upgradable

# Apply all pending package updates, not only security fixes (during maintenance window)
sudo apt-get upgrade -y

Monthly Tasks

1. Review CloudWatch metrics:

  • Check average CPU usage trends
  • Review disk I/O patterns
  • Analyze network traffic
  • Identify performance degradation

2. Database backup test:

# Test backup restoration on test instance
# See Migration Guide for detailed procedures

3. Update documentation:

  • Document any configuration changes
  • Update runbook procedures
  • Review and update alarm thresholds

4. Capacity planning:

  • Review growth trends
  • Forecast future capacity needs
  • Plan for scaling if needed

Quarterly Tasks

1. Security audit:

  • Review IAM permissions
  • Audit security group rules
  • Check for unused resources
  • Review access logs

2. Performance review:

  • Analyze slow query logs
  • Optimize database indices
  • Review Apache configuration
  • Consider tier upgrade if needed

3. Disaster recovery test:

  • Test backup restoration
  • Verify recovery procedures
  • Update DR documentation
  • Train staff on procedures

Maintenance Windows

Scheduling Maintenance

Best practices:

  • Schedule during lowest usage (typically weekend evenings)
  • Notify users 48-72 hours in advance
  • Communicate expected downtime
  • Have rollback plan ready

Communication template:

Subject: Scheduled Maintenance - [Date/Time]

Dear Library Users,

We will be performing scheduled maintenance on our library system:

Date: [Day, Month Date, Year]
Time: [Start Time] - [End Time] [Timezone]
Expected Duration: [X hours]

During this time:
- The catalog will be unavailable
- You will not be able to place holds or renew items
- All current loans will remain active

We apologize for any inconvenience.

[Your Library] IT Team

Maintenance Checklist

Before maintenance:

  • Announce maintenance window
  • Create full backup
  • Document current system state
  • Prepare rollback procedures
  • Test changes in staging (if available)

During maintenance:

  • Stop user-facing services (if needed)
  • Perform updates/changes
  • Test functionality
  • Review logs for errors
  • Verify services are running

After maintenance:

  • Monitor system for 1 hour
  • Check error logs
  • Verify all services functional
  • Announce completion
  • Document changes made

Scaling and Optimization

When to Scale Up (Basic/Standard)

Indicators:

  • CPU consistently > 70%
  • Memory usage > 85%
  • Disk I/O wait times increasing
  • Response times degrading
  • Database slow query log growing
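"Consistently" above a threshold means every sample in the review period exceeds it, not just the peak. The sketch below expresses that test for a series of readings; the function name consistently_above is hypothetical, and how the samples are gathered (for example from CloudWatch datapoints or sar output) is left to the reader.

```shell
#!/bin/sh
# consistently_above THRESHOLD: read one numeric sample per line on stdin.
# Succeeds only if there is at least one sample and every sample exceeds
# THRESHOLD; blank lines are ignored.
consistently_above() {
  awk -v limit="$1" '
    NF { seen++; if ($1 + 0 <= limit + 0) below++ }
    END { exit !(seen > 0 && below == 0) }'
}

# Usage with hypothetical CPU samples (percent):
#   printf '75\n82\n71\n' | consistently_above 70 && echo "CPU consistently > 70%"
```

Requiring every sample to exceed the limit avoids scaling up in response to a single short spike.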

Scaling procedure:

# 1. Create snapshot/backup
# 2. Stop instance
# 3. Change instance type
# 4. Start instance
# 5. Verify functionality

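# Via AWS CLI
aws ec2 stop-instances --instance-ids i-xxxxx
# The instance type can only be changed once the instance is fully stopped
aws ec2 wait instance-stopped --instance-ids i-xxxxx
aws ec2 modify-instance-attribute \
  --instance-id i-xxxxx \
  --instance-type m8g.large
aws ec2 start-instances --instance-ids i-xxxxx
aws ec2 wait instance-running --instance-ids i-xxxxx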

When to Scale Up (Enterprise)

Auto Scaling handles instance count automatically.

Adjust Aurora capacity:

# Modify Aurora cluster
aws rds modify-db-cluster \
  --db-cluster-identifier your-cluster \
  --serverless-v2-scaling-configuration MinCapacity=1.0,MaxCapacity=8.0

Add more instances to ASG:

  1. Go to EC2 Console → Auto Scaling Groups
  2. Select your ASG
  3. Edit → Desired capacity: Increase
  4. Wait for new instances to launch
  5. Verify in Target Group health checks

Monitoring Tools Summary

Tool                       Free/Basic  Standard  Enterprise  Purpose
CloudWatch Metrics         Yes         Yes       Yes         Basic monitoring
CloudWatch Logs            Optional    Optional  Optional    Centralized logging
Performance Insights       No          No        Yes         Aurora query analysis
RDS Enhanced Monitoring    No          No        Yes         Detailed DB metrics
Application Load Balancer  No          No        Yes         Request metrics
Health Check Script        Yes         Yes       Yes         Automated checks

Getting Help

For monitoring assistance:

  • Email: support@kohasupport.com
  • Subject: “Monitoring Help - [Your Library]”
  • Include: Tier, metrics, timeframe


Last Updated: December 2025