Comprehensive guide to monitoring system health, maintaining optimal performance, and proactively troubleshooting your Koha deployment.
System Health Monitoring
Quick Health Check
Connect to your instance via EC2 Instance Connect and run:
# System services status
sudo systemctl status koha-plack
sudo systemctl status koha-worker
sudo systemctl status koha-zebra-daemon
sudo systemctl status apache2
sudo systemctl status mysql # Basic/Standard only
# Disk space
df -h /
df -h /var/lib/mysql # Basic/Standard only
# Memory usage
free -h
# CPU load
uptime
Expected output:
- All services: active (running)
- Disk usage: < 80%
- Memory: At least 20% free
- Load average: < number of CPU cores
AWS CloudWatch Monitoring
Enable Detailed Monitoring
All tiers include basic CloudWatch metrics. For enhanced monitoring:
- Go to EC2 Console
- Select your instance
- Actions → Monitor and troubleshoot → Manage detailed monitoring
- Enable (additional charges apply)
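Detailed monitoring can also be toggled from the command line; a minimal sketch, assuming the AWS CLI is configured and i-xxxxx is a placeholder for your instance ID:
# Enable 1-minute detailed monitoring (additional charges apply)
aws ec2 monitor-instances --instance-ids i-xxxxx
# Revert to basic 5-minute monitoring
aws ec2 unmonitor-instances --instance-ids i-xxxxx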
Key Metrics to Watch
EC2 Instance Metrics
CPU Utilization:
- Normal: 10-40% average
- High: 60-80% (consider scaling up)
- Critical: > 90% sustained
Network I/O:
- Monitor for unusual spikes
- Basic tier: Typically 1-10 MB/min
- Enterprise tier: Can be higher with multiple instances
Disk I/O:
- Read/Write Operations
- High sustained I/O may indicate:
- Database queries need optimization
- Insufficient memory (swapping)
- Need for SSD volumes
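To spot-check these metrics without opening the console, CloudWatch can be queried directly; a sketch assuming the AWS CLI is configured and i-xxxxx is a placeholder for your instance ID:
# Average CPU utilization over the last 24 hours, in 1-hour buckets
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-xxxxx \
--start-time "$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--period 3600 \
--statistics Average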
Database Metrics (Enterprise Aurora Only)
CPU Utilization:
- Normal: < 50%
- Review: 50-80%
- Scale: > 80% sustained
Connections:
- Monitor connection count
- Default max: 100 (adjustable)
- High connections may indicate connection pooling issues
Aurora Capacity Units (ACU):
- Monitor scaling events
- Adjust min/max ACU if frequent scaling
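These Aurora metrics are published to CloudWatch under the AWS/RDS namespace and can be pulled from the CLI; a sketch where your-cluster is a placeholder, and where ServerlessDatabaseCapacity is assumed to be the Serverless v2 capacity metric name:
# Database connections over the last 6 hours
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name DatabaseConnections \
--dimensions Name=DBClusterIdentifier,Value=your-cluster \
--start-time "$(date -u -d '6 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--period 300 \
--statistics Average
# Swap --metric-name for ServerlessDatabaseCapacity to review ACU usage over time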
CloudWatch Alarms Setup
Create CPU alarm:
aws cloudwatch put-metric-alarm \
--alarm-name koha-high-cpu \
--alarm-description "Alert when CPU exceeds 80%" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 2 \
--dimensions Name=InstanceId,Value=i-xxxxx
# Replace i-xxxxx with your instance ID
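As written, the alarm records state changes but does not notify anyone. To receive email alerts, attach it to an SNS topic; a sketch in which the topic name koha-alerts, REGION, and ACCOUNT_ID are placeholders:
# Create a topic and subscribe an email address (confirm the subscription email AWS sends)
aws sns create-topic --name koha-alerts
aws sns subscribe \
--topic-arn arn:aws:sns:REGION:ACCOUNT_ID:koha-alerts \
--protocol email \
--notification-endpoint admin@yourlibrary.org
# Then re-run put-metric-alarm with the extra option:
# --alarm-actions arn:aws:sns:REGION:ACCOUNT_ID:koha-alerts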
Create disk space alarm:
# First, install the unified CloudWatch agent (if the package is not available in your
# distribution's repositories, download the .deb package per the AWS CloudWatch agent docs)
sudo apt-get install -y amazon-cloudwatch-agent
# Configure agent to monitor disk
sudo tee /opt/aws/amazon-cloudwatch-agent/etc/config.json > /dev/null << EOF
{
"metrics": {
"namespace": "Koha/DiskSpace",
"metrics_collected": {
"disk": {
"measurement": [
{"name": "used_percent", "unit": "Percent"}
],
"metrics_collection_interval": 60,
"resources": {
"*": "*"
}
}
}
}
}
EOF
# Start agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-s \
-c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json
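After starting the agent, confirm that it is running and that metrics are arriving; a quick check (the custom namespace may take a few minutes to appear):
# Agent service status
sudo systemctl status amazon-cloudwatch-agent
# Confirm the custom namespace is publishing metrics
aws cloudwatch list-metrics --namespace Koha/DiskSpace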
Log Monitoring
Log Locations
Koha application logs:
/var/log/koha/library/intranet-error.log # Staff interface errors
/var/log/koha/library/opac-error.log # OPAC errors
/var/log/koha/library/plack.log # Plack application server
/var/log/koha/library/plack-error.log # Plack errors
System logs:
/var/log/apache2/error.log # Apache errors
/var/log/apache2/access.log # Access logs
/var/log/mysql/error.log # MySQL errors (Basic/Standard)
/var/log/syslog # System messages
Real-Time Log Monitoring
# Watch all Koha errors
sudo tail -f /var/log/koha/library/*error*.log
# Watch Apache errors
sudo tail -f /var/log/apache2/error.log
# Watch database errors (Basic/Standard)
sudo tail -f /var/log/mysql/error.log
# Search for specific errors
sudo grep -i "error" /var/log/koha/library/*.log | tail -20
sudo grep -i "fatal" /var/log/koha/library/*.log | tail -20
Log Analysis
Check for common issues:
# Database connection errors
sudo grep -c "DBI connect" /var/log/koha/library/*error*.log
# Memory exhaustion
sudo grep -c "Out of memory" /var/log/syslog
# Permission errors
sudo grep -c "Permission denied" /var/log/koha/library/*.log
# HTTP 500 errors (status codes are recorded in the access log)
sudo grep -c " 500 " /var/log/apache2/access.log
Log Rotation
Logs are automatically rotated. Check configuration:
# Koha log rotation
cat /etc/logrotate.d/koha-common
# Apache log rotation
cat /etc/logrotate.d/apache2
Typical configuration:
- Rotate: Daily
- Compress: Yes
- Retention: 14 days
- Size limit: 100M per file
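If you need to adjust these defaults, they correspond to a logrotate stanza along the following lines; this is an illustrative sketch, not the configuration shipped with the packages:
/var/log/koha/library/*.log {
daily
rotate 14
maxsize 100M
compress
delaycompress
missingok
notifempty
copytruncate
}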
CloudWatch Logs Integration (Optional)
Install CloudWatch Logs agent:
# Install the legacy CloudWatch Logs agent (packaged as awslogs on Amazon Linux;
# on Debian/Ubuntu the unified CloudWatch agent's logs section is an alternative)
sudo apt-get install -y awslogs
# Configure
sudo tee /etc/awslogs/config/koha.conf > /dev/null << EOF
[/var/log/koha/library/intranet-error.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /var/log/koha/library/intranet-error.log
buffer_duration = 5000
log_stream_name = {instance_id}/koha-intranet-error
initial_position = start_of_file
log_group_name = /koha/application
[/var/log/apache2/error.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /var/log/apache2/error.log
buffer_duration = 5000
log_stream_name = {instance_id}/apache-error
initial_position = start_of_file
log_group_name = /koha/apache
EOF
# Start service
sudo systemctl start awslogsd
sudo systemctl enable awslogsd
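Once the service is running, verify that events are reaching CloudWatch Logs; a quick check assuming the log group names configured above (aws logs tail requires AWS CLI v2):
# List the Koha log groups
aws logs describe-log-groups --log-group-name-prefix /koha
# Tail recent events from the application log group
aws logs tail /koha/application --since 1h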
Performance Monitoring
Database Performance (Basic/Standard)
Check slow queries:
# Enable slow query log (GLOBAL settings apply to new connections and are lost on restart;
# add them to the MySQL configuration to make them permanent)
sudo mysql -e "SET GLOBAL slow_query_log = 'ON';"
sudo mysql -e "SET GLOBAL long_query_time = 2;" # Log queries > 2 seconds
# Confirm the slow log location, then summarize it
sudo mysql -e "SHOW VARIABLES LIKE 'slow_query_log_file';"
sudo mysqldumpslow /var/log/mysql/mysql-slow.log | head -20
Monitor database size:
# Database size
sudo mysql -e "
SELECT
table_schema AS 'Database',
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.tables
WHERE table_schema = 'koha_library'
GROUP BY table_schema;
"
# Largest tables
sudo mysql -e "
SELECT
table_name AS 'Table',
ROUND((data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.tables
WHERE table_schema = 'koha_library'
ORDER BY (data_length + index_length) DESC
LIMIT 10;
"
Database Performance (Enterprise Aurora)
Monitor from RDS Console:
- Go to RDS Console
- Select your Aurora cluster
- Click “Monitoring” tab
- Review:
- CPU utilization
- Database connections
- Read/Write IOPS
- Network throughput
Performance Insights:
- Go to RDS Console → Performance Insights
- Analyze slow queries
- Identify bottlenecks
- Review wait events
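Performance Insights can also be confirmed from the CLI; a small sketch that lists whether it is enabled on each database instance in the account:
# Check that Performance Insights is enabled on the Aurora instances
aws rds describe-db-instances \
--query 'DBInstances[].[DBInstanceIdentifier,PerformanceInsightsEnabled]' \
--output table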
Apache Performance
Check Apache status:
# Enable Apache status module (if not already enabled), then reload Apache
sudo a2enmod status
sudo systemctl reload apache2
# View status (the default configuration only allows requests from localhost)
curl http://localhost/server-status
# Monitor active connections
watch -n 1 'curl -s http://localhost/server-status?auto | grep -E "Total Accesses|BusyWorkers|IdleWorkers"'
Apache worker processes:
# Check current configuration
apache2ctl -M | grep -E "mpm_|worker"
# Count Apache processes
pgrep -c apache2
Search Index Performance
Monitor Zebra:
# Check Zebra process
ps aux | grep zebra
# Test search performance (runs a simple query through the instance's Perl environment)
time sudo koha-shell library << 'EOF'
perl -MC4::Search -e '
my ($error, $results) = C4::Search::SimpleSearch("ti:test");
print "Found: ", scalar(@$results), " results\n";
'
EOF
Rebuild if slow:
# Incremental rebuild
sudo koha-rebuild-zebra -v library
# Full rebuild (if searches are very slow)
sudo koha-rebuild-zebra -f -v library
Automated Health Checks
Create Health Check Script
# Create script
sudo tee /usr/local/bin/koha-health-check.sh > /dev/null << 'EOF'
#!/bin/bash
# Koha Health Check Script
EMAIL="admin@yourlibrary.org"
LOG="/var/log/koha-health-check.log"
ERRORS=0
echo "=== Koha Health Check ===" >> $LOG
date >> $LOG
# Check Koha services
for service in koha-plack koha-worker koha-zebra-daemon apache2; do
if ! systemctl is-active --quiet $service; then
echo "ERROR: $service is not running" >> $LOG
ERRORS=$((ERRORS + 1))
fi
done
# Check MySQL (Basic/Standard only)
if systemctl list-units --type=service --all | grep -q mysql; then
if ! systemctl is-active --quiet mysql; then
echo "ERROR: MySQL is not running" >> $LOG
ERRORS=$((ERRORS + 1))
fi
fi
# Check disk space
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ $DISK_USAGE -gt 80 ]; then
echo "WARNING: Disk usage is ${DISK_USAGE}%" >> $LOG
ERRORS=$((ERRORS + 1))
fi
# Check memory
MEM_USAGE=$(free | awk 'NR==2 {printf "%.0f", $3/$2*100}')
if [ $MEM_USAGE -gt 90 ]; then
echo "WARNING: Memory usage is ${MEM_USAGE}%" >> $LOG
ERRORS=$((ERRORS + 1))
fi
# Send email if errors found (the mail command requires an MTA, e.g. the mailutils package)
if [ $ERRORS -gt 0 ]; then
cat $LOG | mail -s "Koha Health Check: $ERRORS issues found" $EMAIL
fi
echo "Health check completed with $ERRORS errors" >> $LOG
echo "---" >> $LOG
EOF
# Make executable
sudo chmod +x /usr/local/bin/koha-health-check.sh
Schedule Health Checks
# Add to crontab (runs every hour)
sudo crontab -e
# Add this line:
0 * * * * /usr/local/bin/koha-health-check.sh
Regular Maintenance Tasks
Daily Tasks
1. Check backups:
# For Standard tier with S3
aws s3 ls s3://your-backup-bucket/ --recursive | tail -5
# For Basic/Free tier
ls -lh /var/lib/koha/backups/ | tail -5
# For Enterprise tier
aws rds describe-db-cluster-snapshots \
--db-cluster-identifier your-cluster \
--query 'DBClusterSnapshots[0:3].[SnapshotCreateTime,DBClusterSnapshotIdentifier,Status]'
2. Review error logs:
# Check for new errors since yesterday
sudo find /var/log/koha/library/ -name "*error*.log" -mtime -1 -exec tail -20 {} \;
3. Monitor disk space:
df -h / | awk 'NR==2 {print "Root: " $5}'
df -h /var/lib/mysql | awk 'NR==2 {print "Database: " $5}' # Basic/Standard
Weekly Tasks
1. Database optimization:
# For Basic/Standard
sudo koha-mysql library << 'EOF'
-- Optimize tables
OPTIMIZE TABLE biblio;
OPTIMIZE TABLE items;
OPTIMIZE TABLE borrowers;
OPTIMIZE TABLE issues;
OPTIMIZE TABLE old_issues;
-- Check fragmentation
SELECT
table_name,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(data_free / 1024 / 1024, 2) AS free_mb,
ROUND((data_free / data_length) * 100, 2) AS fragmentation
FROM information_schema.tables
WHERE table_schema = 'koha_library'
AND data_free > 0
ORDER BY fragmentation DESC;
EOF
2. Clear temporary files:
# Clear old sessions
sudo find /var/lib/koha/library/sessions/ -type f -mtime +7 -delete
# Clear temp files
sudo find /tmp -name "Koha*" -mtime +7 -delete
3. Review system updates:
# Check for security updates
sudo apt-get update
sudo apt list --upgradable
# Apply security updates (during maintenance window)
sudo apt-get upgrade -y
Monthly Tasks
1. Review CloudWatch metrics:
- Check average CPU usage trends
- Review disk I/O patterns
- Analyze network traffic
- Identify performance degradation
2. Database backup test:
# Test backup restoration on test instance
# See Migration Guide for detailed procedures
3. Update documentation:
- Document any configuration changes
- Update runbook procedures
- Review and update alarm thresholds
4. Capacity planning:
- Review growth trends
- Forecast future capacity needs
- Plan for scaling if needed
Quarterly Tasks
1. Security audit:
- Review IAM permissions
- Audit security group rules
- Check for unused resources
- Review access logs
2. Performance review:
- Analyze slow query logs
- Optimize database indices
- Review Apache configuration
- Consider tier upgrade if needed
3. Disaster recovery test:
- Test backup restoration
- Verify recovery procedures
- Update DR documentation
- Train staff on procedures
Maintenance Windows
Scheduling Maintenance
Best practices:
- Schedule during lowest usage (typically weekend evenings)
- Notify users 48-72 hours in advance
- Communicate expected downtime
- Have rollback plan ready
Communication template:
Subject: Scheduled Maintenance - [Date/Time]
Dear Library Users,
We will be performing scheduled maintenance on our library system:
Date: [Day, Month Date, Year]
Time: [Start Time] - [End Time] [Timezone]
Expected Duration: [X hours]
During this time:
- The catalog will be unavailable
- You will not be able to place holds or renew items
- All current loans will remain active
We apologize for any inconvenience.
[Your Library] IT Team
Maintenance Checklist
Before maintenance:
- Announce maintenance window
- Create full backup (see the example after this checklist)
- Document current system state
- Prepare rollback procedures
- Test changes in staging (if available)
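For the full backup step, the Koha Debian packages provide a one-command dump of the instance database and configuration; a minimal sketch for the library instance on Basic/Standard tiers (Enterprise databases are covered by RDS cluster snapshots):
# Dump the Koha database and configuration for the "library" instance
sudo koha-dump library
# Dumps are written under /var/spool/koha/library/ by default
ls -lh /var/spool/koha/library/ | tail -3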
During maintenance:
- Stop user-facing services (if needed)
- Perform updates/changes
- Test functionality
- Review logs for errors
- Verify services are running
After maintenance:
- Monitor system for 1 hour
- Check error logs
- Verify all services functional
- Announce completion
- Document changes made
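The service and log checks above can be run as a single pass; a minimal sketch (the curl check assumes the OPAC answers on the default vhost):
# Quick post-maintenance verification
for s in koha-plack koha-worker koha-zebra-daemon apache2; do
systemctl is-active --quiet "$s" && echo "$s: OK" || echo "$s: NOT RUNNING"
done
# Local response check (assumes the OPAC is served by the default vhost)
curl -s -o /dev/null -w "OPAC HTTP status: %{http_code}\n" http://localhost/
# Recent errors
sudo tail -20 /var/log/koha/library/*error*.log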
Scaling and Optimization
When to Scale Up (Basic/Standard)
Indicators:
- CPU consistently > 70%
- Memory usage > 85%
- Disk I/O wait times increasing
- Response times degrading
- Database slow query log growing
Scaling procedure:
# 1. Create snapshot/backup
# 2. Stop instance
# 3. Change instance type
# 4. Start instance
# 5. Verify functionality
# Via AWS CLI
aws ec2 stop-instances --instance-ids i-xxxxx
aws ec2 wait instance-stopped --instance-ids i-xxxxx
aws ec2 modify-instance-attribute \
--instance-id i-xxxxx \
--instance-type "{\"Value\": \"m8g.large\"}"
aws ec2 start-instances --instance-ids i-xxxxx
When to Scale Up (Enterprise)
Auto Scaling handles instance count automatically.
Adjust Aurora capacity:
# Modify Aurora cluster
aws rds modify-db-cluster \
--db-cluster-identifier your-cluster \
--serverless-v2-scaling-configuration MinCapacity=1.0,MaxCapacity=8.0
Add more instances to ASG:
- Go to EC2 Console → Auto Scaling Groups
- Select your ASG
- Edit → Desired capacity: Increase
- Wait for new instances to launch
- Verify in Target Group health checks
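The same change can be made from the CLI; a sketch where your-asg is a placeholder for the Auto Scaling group name:
# Increase desired capacity to 3 instances
aws autoscaling set-desired-capacity \
--auto-scaling-group-name your-asg \
--desired-capacity 3
# Watch the new instances register and become healthy
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names your-asg \
--query 'AutoScalingGroups[0].Instances[].[InstanceId,HealthStatus,LifecycleState]'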
Monitoring Tools Summary
| Tool | Free/Basic | Standard | Enterprise | Purpose |
|---|---|---|---|---|
| CloudWatch Metrics | ✓ | ✓ | ✓ | Basic monitoring |
| CloudWatch Logs | Optional | Optional | Optional | Centralized logging |
| Performance Insights | ✗ | ✗ | ✓ | Aurora query analysis |
| RDS Enhanced Monitoring | ✗ | ✗ | ✓ | Detailed DB metrics |
| Application Load Balancer | ✗ | ✗ | ✓ | Request metrics |
| Health Check Script | ✓ | ✓ | ✓ | Automated checks |
Getting Help
For monitoring assistance:
- Email: support@kohasupport.com
- Subject: “Monitoring Help - [Your Library]”
- Include: Tier, metrics, timeframe
Related Documentation
Last Updated: December 2025