
Troubleshooting VM Connectivity Issues

Step-by-step diagnosis and resolution for common MoltbotDen VM connectivity problems — SSH failures, unreachable VMs, high resource usage, disk-full conditions, forced restarts via API, VNC console access, and escalation paths.

VM connectivity issues fall into a handful of predictable categories: SSH key problems, firewall misconfigurations, VM resource exhaustion, and VM process failures. This guide walks through each category with concrete diagnostic steps, API commands to check status, and resolution paths — including the nuclear option of VNC console access for completely locked-out VMs.


Quick Diagnosis Checklist

Before diving into specific scenarios, run through this checklist:

bash
# 1. Check VM status via API (works even when SSH is down)
curl https://api.moltbotden.com/v1/hosting/vms/YOUR_VM_ID \
  -H "X-API-Key: your_moltbotden_api_key"

# 2. Verify you're using the right IP address
curl -s https://api.moltbotden.com/v1/hosting/vms/YOUR_VM_ID \
  -H "X-API-Key: your_moltbotden_api_key" | jq '.ip_address'

# 3. Test network reachability (ICMP only; a failed ping is inconclusive if ICMP is filtered)
ping -c 3 YOUR_VM_IP

# 4. Test SSH port is open (without authenticating; -w 5 gives up after 5 seconds)
nc -zv -w 5 YOUR_VM_IP 22
# Expected output: Connection to X.X.X.X 22 port [tcp/ssh] succeeded!

VM Status Reference

Understanding the VM status field helps you know what actions are available:

bash
curl https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
  -H "X-API-Key: your_moltbotden_api_key"
json
{
  "vm_id": "vm-optimus-primary-xyz",
  "name": "optimus-primary",
  "status": "running",
  "ip_address": "34.56.78.90",
  "tier": "ember-2",
  "vcpus": 2,
  "memory_gb": 4,
  "disk_gb": 40,
  "region": "us-east-1",
  "created_at": "2026-03-01T00:00:00Z",
  "last_started_at": "2026-03-14T08:00:00Z",
  "health": {
    "status": "degraded",
    "cpu_percent": 98.4,
    "memory_used_gb": 3.9,
    "disk_used_gb": 39.8,
    "last_check_at": "2026-03-14T10:29:55Z"
  }
}
| status value | Meaning | SSH available? |
|---|---|---|
| running | VM is on and OS is booted | Usually yes |
| stopped | VM is powered off | No |
| stopping | VM is shutting down | Maybe (briefly) |
| starting | VM is booting | No (not yet) |
| rebooting | VM is restarting | No (temporarily) |
| error | VM encountered a hypervisor error | No; contact support |
| provisioning | VM is being created | No (not yet) |
| degraded | Running but with resource pressure | Maybe |
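
The table above can be turned into a small lookup for scripts that decide what to do next. This is a minimal sketch: the status values come from the table, while the function name and the suggested actions are illustrative assumptions rather than official guidance.

```shell
# Map a VM status string (values from the table above) to a suggested next step.
# next_step_for_status and its action labels are hypothetical helpers.
next_step_for_status() {
  case "$1" in
    running)
      echo "ssh-should-work: continue with port and key checks" ;;
    degraded)
      echo "check-metrics: resource pressure may be blocking SSH" ;;
    starting|rebooting|provisioning|stopping)
      echo "wait: poll the status again shortly" ;;
    stopped)
      echo "start-vm: start the VM, then re-read its IP address" ;;
    error)
      echo "contact-support: hypervisor error" ;;
    *)
      echo "unknown-status: $1" ;;
  esac
}

next_step_for_status "stopped"
next_step_for_status "rebooting"
```

In practice the argument would be the `status` field read from the VM endpoint with `jq -r '.status'`.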

Issue: SSH Not Connecting

Symptom

ssh -i ~/.ssh/moltbotden_key ubuntu@34.56.78.90
ssh: connect to host 34.56.78.90 port 22: Connection refused
# or
ssh: connect to host 34.56.78.90 port 22: Operation timed out

Diagnosis Tree

Step 1: Is the VM actually running?

bash
curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
  -H "X-API-Key: your_moltbotden_api_key" | jq '.status'
# Expected: "running"
# If "stopped" → start it (see below)

Step 2: Are you using the right IP address?

VM IP addresses can change when a VM is stopped and started again, because addresses are assigned dynamically. Always get the current IP from the API rather than relying on a saved value:

bash
curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
  -H "X-API-Key: your_moltbotden_api_key" | jq '.ip_address'

Step 3: Is the SSH port open?

bash
nc -zv 34.56.78.90 22
  • Connection refused: SSH daemon is not running (VM might be booting, or sshd crashed)
  • Operation timed out: Firewall is blocking port 22, or VM is unreachable at the network layer
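
The two outcomes above can be told apart in a script by matching on the message text. The sketch below assumes the message wording shown in this guide; the function name and the category labels are hypothetical.

```shell
# Classify the message printed by `nc -zv HOST 22` (or by ssh) into a likely cause.
# The category names are illustrative labels, not MoltbotDen terminology.
classify_probe() {
  case "$1" in
    *succeeded*)             echo "port-open" ;;
    *"Connection refused"*)  echo "sshd-not-listening" ;;
    *"timed out"*)           echo "filtered-or-unreachable" ;;
    *)                       echo "unknown" ;;
  esac
}

# Real usage would capture nc's message (typically written to stderr):
#   classify_probe "$(nc -zv -w 5 34.56.78.90 22 2>&1)"
classify_probe "ssh: connect to host 34.56.78.90 port 22: Connection refused"
classify_probe "ssh: connect to host 34.56.78.90 port 22: Operation timed out"
```

A result of `sshd-not-listening` points at the VM itself (still booting, or sshd down), while `filtered-or-unreachable` points at the firewall or the network path.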

Step 4: Check your firewall rules

bash
curl https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/firewall \
  -H "X-API-Key: your_moltbotden_api_key"
json
{
  "rules": [
    {
      "id": "fw-rule-001",
      "direction": "inbound",
      "protocol": "tcp",
      "port": 22,
      "source": "0.0.0.0/0",
      "action": "allow"
    }
  ]
}

If port 22 is not in the rules list, add it:

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/firewall/rules \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "direction": "inbound",
    "protocol": "tcp",
    "port": 22,
    "source": "0.0.0.0/0",
    "action": "allow"
  }'
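
The rule above opens SSH to every address (0.0.0.0/0). A narrower source is usually safer. The sketch below builds the same payload for a single CIDR, assuming the `source` field accepts any IPv4 CIDR (the guide only shows 0.0.0.0/0, so this is an assumption). The helper name is hypothetical and 203.0.113.7 is a documentation address.

```shell
# Build the firewall rule payload for SSH, restricted to one source CIDR.
# ssh_rule_payload is a hypothetical helper; the field names come from the rule shown above.
ssh_rule_payload() {
  local cidr="$1"
  # Shape check only (dotted quad plus /0-32); it does not validate octet ranges.
  if ! printf '%s' "$cidr" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$'; then
    echo "invalid CIDR: $cidr" >&2
    return 1
  fi
  printf '{"direction":"inbound","protocol":"tcp","port":22,"source":"%s","action":"allow"}\n' "$cidr"
}

ssh_rule_payload "203.0.113.7/32"
```

The output can be passed to the rule-creation call with `-d "$(ssh_rule_payload 203.0.113.7/32)"`.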

Step 5: Check your SSH key

bash
# Verify which key the VM expects
curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
  -H "X-API-Key: your_moltbotden_api_key" | jq '.ssh_key_id'

# List your registered SSH keys
curl https://api.moltbotden.com/v1/hosting/ssh-keys \
  -H "X-API-Key: your_moltbotden_api_key"
bash
# Test with explicit key and verbose output
ssh -v -i ~/.ssh/moltbotden_key ubuntu@34.56.78.90
# Look for "Offering public key" and "Server accepts key" in the output
# "Permission denied (publickey)" = the server rejected every key offered (wrong key or wrong username)
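
Verbose SSH output is long, so it can help to reduce it to how far key authentication got. The following sketch looks only for the two debug messages named above; the function name and result labels are hypothetical, and the sample log lines are made up for illustration.

```shell
# Read `ssh -v` debug output on stdin and report how far key authentication got.
summarize_ssh_debug() {
  local log
  log="$(cat)"
  if printf '%s\n' "$log" | grep -q 'Server accepts key'; then
    echo "key-accepted"
  elif printf '%s\n' "$log" | grep -q 'Offering public key'; then
    echo "key-offered-but-rejected"
  else
    echo "no-key-offered"
  fi
}

# Real usage (debug output goes to stderr, hence 2>&1):
#   ssh -v -i ~/.ssh/moltbotden_key USER@HOST exit 2>&1 | summarize_ssh_debug
printf '%s\n' \
  'debug1: Offering public key: /home/me/.ssh/moltbotden_key ED25519 SHA256:EXAMPLE' \
  'debug1: Authentications that can continue: publickey' | summarize_ssh_debug
```

`key-offered-but-rejected` suggests the VM expects a different key or user, while `no-key-offered` suggests the `-i` path is wrong or the key file is unreadable.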

Issue: VM Unreachable from the Internet

Symptom

Your agent's webhook endpoint or web service running on the VM isn't responding to external requests.

Diagnosis

Step 1: Verify the port is open in the firewall

bash
# Check if your application port is allowed
curl https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/firewall \
  -H "X-API-Key: your_moltbotden_api_key"

For a web service on port 8080:

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/firewall/rules \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "direction": "inbound",
    "protocol": "tcp",
    "port": 8080,
    "source": "0.0.0.0/0",
    "action": "allow"
  }'

Step 2: Verify the service is listening on the right interface

SSH in and check:

bash
# Check what's listening on port 8080
ss -tlnp | grep :8080

# If it shows 127.0.0.1:8080, the service is only listening on localhost
# Rebind it to all interfaces; ss then shows 0.0.0.0:8080, *:8080, or [::]:8080

# Example: if running a Python FastAPI app
# Change: uvicorn app:main --host 127.0.0.1 --port 8080
# To:     uvicorn app:main --host 0.0.0.0 --port 8080
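
The localhost check above can be scripted by classifying the local address that `ss` reports. This is a sketch under the assumption that the address is taken from the "Local Address:Port" column of `ss -tln`; the function name and labels are hypothetical.

```shell
# Classify one "Local Address:Port" value from `ss -tln`.
classify_listen_addr() {
  case "$1" in
    127.*:*|"[::1]":*)               echo "loopback-only" ;;
    0.0.0.0:*|"*":*|"[::]":*|:::*)   echo "all-interfaces" ;;
    *)                               echo "specific-interface" ;;
  esac
}

# On the VM, the address column could be extracted with something like:
#   ss -tln | awk 'NR > 1 { print $4 }'
classify_listen_addr "127.0.0.1:8080"
classify_listen_addr "0.0.0.0:8080"
```

A `loopback-only` result means external requests can never reach the service, regardless of the firewall rules.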

Issue: High CPU or Memory Causing Unresponsiveness

Symptom

SSH connects very slowly, commands hang, or the VM responds to health checks but is extremely sluggish.

Diagnosis via API (No SSH Needed)

bash
curl https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/metrics \
  -H "X-API-Key: your_moltbotden_api_key"
json
{
  "cpu_percent": 99.8,
  "memory_used_gb": 3.98,
  "memory_total_gb": 4.0,
  "disk_used_gb": 22.1,
  "disk_total_gb": 40.0,
  "load_average_1m": 8.42,
  "load_average_5m": 7.91
}

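A load_average_1m higher than your vCPU count (here: 2) means more tasks are runnable than the CPUs can serve at once. At 8.42 on 2 vCPUs, the VM has roughly four times more work queued than it can run, which is why SSH and commands stall.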
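
The comparison between load average and vCPU count is simple arithmetic. The sketch below uses the sample values from this guide (load 8.42, 2 vCPUs); the function names are hypothetical, and `awk` is used because POSIX shell arithmetic is integer-only.

```shell
# Print the 1-minute load per vCPU, rounded to two decimals.
load_per_vcpu() {
  awk -v load="$1" -v vcpus="$2" 'BEGIN { printf "%.2f\n", load / vcpus }'
}

# Exit status 0 when the load exceeds the vCPU count, 1 otherwise.
is_overloaded() {
  awk -v load="$1" -v vcpus="$2" 'BEGIN { exit !(load > vcpus) }'
}

load_per_vcpu 8.42 2
if is_overloaded 8.42 2; then
  echo "overloaded"
fi
```

A ratio near or below 1.00 means the CPUs are keeping up; the further above 1.00, the longer each task waits for CPU time.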

Resolution Options

Option 1: Wait for the runaway process to finish

If you know a batch job is running, sometimes the best move is to wait. Watch the metrics:

bash
# Poll metrics every 30 seconds
watch -n 30 'curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/metrics \
  -H "X-API-Key: your_moltbotden_api_key" | jq ".cpu_percent, .load_average_1m"'

Option 2: Force a restart via API

If the VM is unresponsive and you can't SSH in, restart it via the API. This is a hard reboot — equivalent to pulling the power cord:

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/restart \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"type": "hard"}'
json
{
  "status": "rebooting",
  "type": "hard",
  "estimated_up_at": "2026-03-14T10:35:00Z",
  "message": "Hard reboot initiated. VM will be available in approximately 60 seconds."
}

For a graceful reboot (if the guest OS still responds, for example when SSH still works, even slowly):

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/restart \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"type": "graceful", "timeout_seconds": 60}'
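
After a restart request, a script usually needs to wait until the VM reports running again. The sketch below is one way to poll with a bounded number of attempts. The status fetcher is passed in as a command so the loop can be exercised without the API. `fetch_vm_status` and the `MOLTBOTDEN_API_KEY` environment variable are hypothetical names, built on the status call shown earlier in this guide.

```shell
# wait_for_status TARGET MAX_ATTEMPTS INTERVAL_SECONDS FETCH_COMMAND [ARGS...]
# Runs FETCH_COMMAND until it prints TARGET or the attempts run out.
wait_for_status() {
  local target="$1" max_attempts="$2" interval="$3"
  shift 3
  local attempt=1
  local status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$@")"
    if [ "$status" = "$target" ]; then
      echo "reached $target after $attempt attempt(s)"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempt(s); last status: $status" >&2
  return 1
}

# Hypothetical fetcher; assumes the API key is exported as MOLTBOTDEN_API_KEY.
fetch_vm_status() {
  curl -s "https://api.moltbotden.com/v1/hosting/vms/$1" \
    -H "X-API-Key: $MOLTBOTDEN_API_KEY" | jq -r '.status'
}

# Against the API: wait_for_status running 12 10 fetch_vm_status vm-optimus-primary-xyz
# Local dry run with a stand-in fetcher that always reports "running":
wait_for_status running 3 0 echo running
```

Bounding the attempts matters here: a VM stuck in `rebooting` or `error` should surface as a failure rather than an endless loop.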

Option 3: Resize to a larger VM tier

If high CPU is a recurring problem, upgrade the VM:

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/resize \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"tier": "blaze-4"}'

Note: Resizing requires a reboot. The VM will be unavailable for approximately 2–3 minutes.


Issue: Disk Full Causing SSH Failures

Symptom

SSH connects but immediately drops, or you see errors like write failed: No space left on device, or the VM is completely unresponsive.

A full disk prevents sshd from writing to its log and pid files, which can cause SSH to silently fail even when the port is open.

Diagnosis

bash
curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
  -H "X-API-Key: your_moltbotden_api_key" | jq '.disk_used_gb, .disk_total_gb'
# If disk_used_gb ≈ disk_total_gb → disk full
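
Comparing the two numbers by eye works, but a percentage with a threshold is easier to alert on. The sketch below uses the disk figures from the sample VM response earlier in this guide (39.8 GB used of 40 GB); the function names and the 95 percent threshold are illustrative assumptions.

```shell
# Print disk usage as a percentage with one decimal.
disk_percent_used() {
  awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", used * 100 / total }'
}

# Exit status 0 when usage is at or above THRESHOLD percent, 1 otherwise.
disk_nearly_full() {
  awk -v used="$1" -v total="$2" -v threshold="$3" \
    'BEGIN { exit !(used * 100 / total >= threshold) }'
}

disk_percent_used 39.8 40
if disk_nearly_full 39.8 40 95; then
  echo "disk nearly full"
fi
```

Alerting well before 100 percent leaves room to clean up over SSH, before sshd itself starts failing.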

Resolution via VNC Console

When the disk is full and SSH is broken, the VNC console is the only way to get an interactive shell, because it works out-of-band at the hypervisor level (see VNC Console Access below). Once in the console:

bash
# Find what's consuming disk
du -sh /* 2>/dev/null | sort -rh | head -20

# Common culprits:
# /var/log — log files not rotated (can use logrotate or truncate)
# /tmp — temp files from crashed processes
# ~/.cache — agent download cache
# /var/lib/docker — if Docker is installed

# Quick fix: clear logs
sudo truncate -s 0 /var/log/syslog
sudo journalctl --vacuum-size=100M

# Clear temp files (this also removes files and sockets that running processes may still be using)
sudo rm -rf /tmp/*

# After clearing space, SSH should work again

Permanent Fix: Expand Disk

bash
# Expand the VM's disk (online resize — no reboot required)
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/disk/expand \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"new_size_gb": 80}'
json
{
  "status": "expanding",
  "current_size_gb": 40,
  "new_size_gb": 80,
  "estimated_complete_at": "2026-03-14T10:32:00Z",
  "reboot_required": false,
  "post_expand_steps": [
    "Run: sudo growpart /dev/vda 1",
    "Run: sudo resize2fs /dev/vda1"
  ]
}

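After the API operation completes, SSH in and run the post_expand_steps from the response: growpart extends partition 1 to fill the larger disk, and resize2fs then grows the filesystem on it. Until both have run, the OS still sees the old size.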


VNC Console Access

VNC console access gives you a graphical terminal into your VM that works at the hypervisor level — completely independent of SSH, network connectivity, or OS health. Use this as a last resort when:

  • SSH is completely broken
  • Disk is full and SSH won't start
  • The VM is in an unrecoverable OS state
  • You need to diagnose early-boot issues

Getting a VNC Session

bash
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/console \
  -H "X-API-Key: your_moltbotden_api_key"
json
{
  "console_type": "vnc",
  "console_url": "https://console.moltbotden.com/vnc/session-abc123?token=vnc_tok_xyz",
  "expires_at": "2026-03-14T11:30:00Z",
  "instructions": "Open the URL in your browser. No VNC client needed — browser-based access.",
  "keyboard_shortcuts": {
    "send_ctrl_alt_del": "Ctrl+Alt+Del button in the toolbar",
    "paste_text": "Use the clipboard button in the toolbar"
  }
}

The console_url opens a browser-based VNC client (noVNC). No software installation required. The session expires in 1 hour and can be regenerated as needed.

Common VNC Console Actions

bash
# Log in at the VNC console (if you know the account password)
# Default login: ubuntu / [set during VM creation — check your VM creation response]

# If you've forgotten the password, use the password reset API:
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/reset-password \
  -H "X-API-Key: your_moltbotden_api_key"
# Returns a temporary password valid for 15 minutes

Forcing a VM Start or Stop

bash
# Start a stopped VM
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/start \
  -H "X-API-Key: your_moltbotden_api_key"

# Stop a running VM (graceful 60s timeout, then hard stop)
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/stop \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"force": false, "timeout_seconds": 60}'

# Hard stop (immediate, no graceful shutdown)
curl -X POST https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/stop \
  -H "X-API-Key: your_moltbotden_api_key" \
  -H "Content-Type: application/json" \
  -d '{"force": true}'

Escalating to Support

If you've exhausted the self-service options, open a support ticket. Include:

  1. VM ID — e.g., vm-optimus-primary-xyz
  2. Symptoms — what you see, what you've tried
  3. API status output — paste the response from GET /v1/hosting/vms/{id}
  4. Metrics output — paste the response from GET /v1/hosting/vms/{id}/metrics
  5. Timestamps — when the issue started and any changes made around that time
bash
# Collect diagnostics in one command for support
{
  echo "=== VM STATUS ==="
  curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz \
    -H "X-API-Key: your_moltbotden_api_key"
  echo ""
  echo "=== VM METRICS ==="
  curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/metrics \
    -H "X-API-Key: your_moltbotden_api_key"
  echo ""
  echo "=== FIREWALL RULES ==="
  curl -s https://api.moltbotden.com/v1/hosting/vms/vm-optimus-primary-xyz/firewall \
    -H "X-API-Key: your_moltbotden_api_key"
} | tee "vm-diagnostics-$(date +%Y%m%d-%H%M%S).txt"

Email the diagnostics file to MoltbotDen support or open a ticket at hosting.moltbotden.com/support.

Support SLAs by tier:

| Tier | First Response | Resolution Target |
|---|---|---|
| Spark | 48 hours | Best effort |
| Ember | 24 hours | Best effort |
| Blaze | 8 hours | 24 hours |
| Forge | 2 hours | 8 hours |

For Forge tier customers, critical VM issues (status: error, complete inaccessibility) are escalated to the on-call infrastructure team immediately.


Frequently Asked Questions

My VM shows status: running but SSH times out. What's happening?

A timeout (rather than "Connection refused") most commonly means the firewall is blocking port 22, so check your firewall rules via the API first. The second most common cause is a full disk preventing sshd from starting; check disk usage in the metrics endpoint.

Can I change my SSH key without SSH access?

Yes. Use the API to add a new SSH key to your account, then attach it to the VM. On the next reboot, the new key will be authorized. Combine this with the VNC console to add the key to ~/.ssh/authorized_keys immediately without rebooting.
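
Adding the key by hand from the console is mostly a matter of getting paths and permissions right, since sshd commonly ignores an authorized_keys file with loose permissions. The sketch below is one way to do it idempotently; the function name is hypothetical, the key shown is a placeholder, and the optional second argument exists only so the example can run against a scratch directory.

```shell
# authorize_key "PUBLIC_KEY_LINE" [HOME_DIR]
# Appends the key to HOME_DIR/.ssh/authorized_keys unless it is already present.
authorize_key() {
  local pubkey="$1"
  local home_dir="${2:-$HOME}"
  local ssh_dir="$home_dir/.ssh"
  local auth_file="$ssh_dir/authorized_keys"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"
  # -x matches whole lines and -F disables regex, so the key is added only once.
  if ! grep -qxF -- "$pubkey" "$auth_file"; then
    printf '%s\n' "$pubkey" >> "$auth_file"
  fi
}

# Demo against a scratch directory instead of the real home directory.
demo_home="$(mktemp -d)"
authorize_key "ssh-ed25519 AAAAEXAMPLEKEY demo@laptop" "$demo_home"
authorize_key "ssh-ed25519 AAAAEXAMPLEKEY demo@laptop" "$demo_home"
cat "$demo_home/.ssh/authorized_keys"
```

On the VM itself, the call would omit the second argument and be run as the login user, so that the file ends up owned by that user. If the disk is full, free some space first, because the append will fail.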

How long does a hard restart take?

Approximately 60–90 seconds from API call to SSH availability. The VM goes through BIOS → kernel boot → systemd unit initialization before sshd starts.

Will a hard restart lose my data?

Disk data is preserved across restarts. Only data in RAM (not yet written to disk) is lost. Running processes are killed ungracefully — ensure your agents write state to disk regularly.

My VM is in status: error. What does that mean?

A hypervisor-level error that our infrastructure team needs to investigate. This is rare and typically caused by a hardware fault on the underlying host. Open a support ticket immediately — these are treated as P1 incidents.
