Troubleshooting
This guide covers common errors and how to resolve them.
Connection Errors
"certificate signed by unknown authority"
Cause: Client doesn't trust the CA that signed the server's certificate.
Solutions:
- Same certificate directory: Ensure both nodes use certificates from the same CA.
    // Both nodes should generate certs in the same directory
    // OR share the same CA files
    WithCertDir("./shared-certs")
- Check CA files: Verify ca-cert.pem exists and is the same on both sides.
- Custom CA: If using external certificates, ensure the CA is in the trust pool:
    opts := &aegis.TLSOptions{
        Source:   aegis.CertSourceFile,
        CAFile:   "/path/to/ca.crt",
        CertFile: "/path/to/node.crt",
        KeyFile:  "/path/to/node.key",
    }
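To confirm outside aegis that a node certificate really chains to a given CA, the check can be reproduced with the standard library's crypto/x509. The sketch below is illustrative only and does not use the aegis API: newCert and trustedBy are hypothetical helpers, and the certificates are generated in memory instead of being read from the certificate directory.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert (hypothetical helper) issues a short-lived certificate.
// With a nil parent the certificate is self-signed and marked as a CA.
func newCert(cn string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(time.Now().UnixNano()),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
	}
	signer, signerKey := tmpl, key
	if parent == nil {
		tmpl.IsCA = true
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// trustedBy returns nil when leaf chains to ca, and the verification error otherwise.
func trustedBy(leaf, ca *x509.Certificate) error {
	pool := x509.NewCertPool()
	pool.AddCert(ca)
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

func main() {
	caA, keyA, err := newCert("ca-a", nil, nil)
	if err != nil {
		panic(err)
	}
	caB, _, err := newCert("ca-b", nil, nil)
	if err != nil {
		panic(err)
	}
	leaf, _, err := newCert("node-1", caA, keyA)
	if err != nil {
		panic(err)
	}
	fmt.Println("against issuing CA:  ", trustedBy(leaf, caA))
	fmt.Println("against unrelated CA:", trustedBy(leaf, caB))
}
```

The second check fails because the leaf was signed by a CA that is not in the trust pool, which is the situation two nodes end up in when each generates its own CA.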
"connection refused"
Cause: Server not running or wrong address.
Solutions:
- Server started? Ensure node.StartServer() was called.
- Correct address? Check the peer's address matches the server's listening address.
    // Server
    WithAddress("localhost:8443")
    // Client peer info must match
    node.AddPeer(aegis.PeerInfo{
        ID:      "server-node",
        Type:    aegis.NodeTypeGeneric,
        Address: "localhost:8443",
    })
- Firewall: Check if the port is blocked.
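A plain TCP dial is a quick way to separate "nothing is listening at that address" from TLS or application-level problems. The following is a minimal sketch using only the standard library; probe and probeLocalListener are hypothetical helpers, and the throwaway local listener stands in for a running server.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probe (hypothetical helper) tries to open a TCP connection to addr
// and returns the dial error, if any.
func probe(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

// probeLocalListener opens a listener on a free local port and probes it
// once while it is open and once after it has been closed.
func probeLocalListener() (whileOpen, afterClose, setupErr error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	addr := ln.Addr().String()
	whileOpen = probe(addr, time.Second)
	if err := ln.Close(); err != nil {
		return nil, nil, err
	}
	afterClose = probe(addr, time.Second)
	return whileOpen, afterClose, nil
}

func main() {
	whileOpen, afterClose, err := probeLocalListener()
	if err != nil {
		panic(err)
	}
	fmt.Println("while listening:", whileOpen)
	fmt.Println("after close:    ", afterClose)
}
```

In practice, calling probe with the peer's configured address (for example "localhost:8443" from the snippet above) shows whether the port is reachable at all before looking at certificates.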
"context deadline exceeded"
Cause: Connection or RPC timeout.
Solutions:
- Network reachable? Verify nodes can reach each other.
- Increase timeout:
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()
- Check server load: Server may be overwhelmed.
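When deciding whether to raise a timeout, it helps to tell a timeout apart from other failures. The sketch below shows one way to do that with the standard context and errors packages; callWithTimeout and slowCall are hypothetical names, and slowCall merely simulates a call that takes a fixed time. Note that gRPC clients usually report timeouts as a status with the DeadlineExceeded code, so a real RPC error may need to be checked that way instead.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callWithTimeout runs fn under a deadline and reports whether the
// returned error is (or wraps) context.DeadlineExceeded.
func callWithTimeout(timeout time.Duration, fn func(ctx context.Context) error) (timedOut bool, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	err = fn(ctx)
	return errors.Is(err, context.DeadlineExceeded), err
}

// slowCall (hypothetical) stands in for an operation that needs d to finish
// and gives up early if the context is done first.
func slowCall(d time.Duration) func(ctx context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	timedOut, err := callWithTimeout(20*time.Millisecond, slowCall(200*time.Millisecond))
	fmt.Println("short timeout: timedOut =", timedOut, "err =", err)

	timedOut, err = callWithTimeout(200*time.Millisecond, slowCall(10*time.Millisecond))
	fmt.Println("long timeout:  timedOut =", timedOut, "err =", err)
}
```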
Service Discovery Errors
"no providers available for service"
Cause: No nodes in topology provide the requested service.
Solutions:
- Service declared? Provider must declare the service:
    WithServices(aegis.ServiceInfo{Name: "identity", Version: "v1"})
- Topology synced? Consumer's topology must include the provider:
    // Add provider as peer
    node.AddPeer(aegis.PeerInfo{...})
    // Sync topology
    node.SyncTopology(ctx, providerID)
    // Now query should work
    providers := node.Topology.GetServiceProviders("identity", "v1")
- Version match? Service name AND version must match exactly.
"node has no TLS configuration"
Cause: Attempting to connect without TLS setup.
Solutions:
- Use NodeBuilder: Always use NewNodeBuilder() instead of NewNode().
- Specify cert source:
    WithCertDir("./certs")
    // OR
    WithTLSOptions(&aegis.TLSOptions{...})
Certificate Errors
"certificate has expired"
Cause: Node certificate past validity period.
Solutions:
- Regenerate certificates: Delete old certs and restart node.
    rm certs/node-1-cert.pem certs/node-1-key.pem
- Check CA expiry: CA certificates expire after 365 days by default.
- Disable expiry check (development only):
    opts := &aegis.TLSOptions{
        AllowExpired: true, // NOT for production
    }
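Expiry can also be checked programmatically before it causes connection failures. The sketch below reads a PEM certificate and reports how much validity is left, using only the standard library. timeUntilExpiry and selfSignedPEM are hypothetical helpers; selfSignedPEM only exists to produce test input, and in practice the bytes would come from a file such as certs/node-1-cert.pem.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// timeUntilExpiry parses one PEM-encoded certificate and returns the time
// left before NotAfter. The result is negative for an expired certificate.
func timeUntilExpiry(pemBytes []byte) (time.Duration, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return 0, errors.New("no PEM certificate block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return 0, err
	}
	return time.Until(cert.NotAfter), nil
}

// selfSignedPEM (demo helper) creates a self-signed certificate that expires
// validDays from now; a negative value yields an already expired certificate.
func selfSignedPEM(validDays int) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo-node"},
		NotBefore:    now.Add(-48 * time.Hour),
		NotAfter:     now.Add(time.Duration(validDays) * 24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	for _, days := range []int{30, -1} {
		pemBytes, err := selfSignedPEM(days)
		if err != nil {
			panic(err)
		}
		left, err := timeUntilExpiry(pemBytes)
		if err != nil {
			panic(err)
		}
		fmt.Printf("validity %d days: expired=%t remaining=%s\n", days, left < 0, left.Round(time.Hour))
	}
}
```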
"certificate does not contain any IP SANs"
Cause: Certificate missing Subject Alternative Names for the connection address.
Solutions:
- Use hostname: Connect via hostname, not IP, if cert has DNS SANs only.
- Regenerate with correct SANs: Delete cert and let aegis regenerate with proper SANs.
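To see which names a certificate actually covers, its SANs can be inspected with crypto/x509. The sketch below is a standalone illustration, not aegis code: certWithSANs is a hypothetical helper that builds a self-signed certificate in memory, and VerifyHostname is the standard library method that applies the same name matching a TLS client uses.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// certWithSANs (hypothetical helper) creates a self-signed certificate whose
// Subject Alternative Names are the given DNS names and IP addresses.
func certWithSANs(dnsNames, ips []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo-node"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	for _, s := range ips {
		ip := net.ParseIP(s)
		if ip == nil {
			return nil, fmt.Errorf("invalid IP address %q", s)
		}
		tmpl.IPAddresses = append(tmpl.IPAddresses, ip)
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := certWithSANs([]string{"localhost"}, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("DNS SANs:", cert.DNSNames, "IP SANs:", cert.IPAddresses)
	// Matching by hostname works; matching by IP fails because the
	// certificate carries no IP SANs.
	fmt.Println("localhost:", cert.VerifyHostname("localhost"))
	fmt.Println("127.0.0.1:", cert.VerifyHostname("127.0.0.1"))
}
```

A certificate that lists the address in both forms, for example `certWithSANs([]string{"localhost"}, []string{"127.0.0.1"})`, passes both checks.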
Topology Issues
Topology not syncing
Cause: Version comparison or connectivity issues.
Debug steps:
- Check versions:
    log.Printf("Local version: %d", node.Topology.GetVersion())
    // After sync
    log.Printf("Local version after sync: %d", node.Topology.GetVersion())
- Verify peer connection:
    peer, exists := node.GetPeer(peerID)
    if !exists {
        log.Println("Peer not found")
    } else if !peer.IsConnected() {
        // Only inspect the peer once we know it exists
        log.Println("Peer not connected")
    }
- Manual sync:
    err := node.SyncTopology(ctx, peerID)
    if err != nil {
        log.Printf("Sync failed: %v", err)
    }
Stale topology data
Cause: Topology not refreshed after changes.
Solutions:
- Periodic sync:
    ticker := time.NewTicker(30 * time.Second)
    for range ticker.C {
        node.SyncTopologyWithAllPeers(ctx)
    }
- Event-driven sync: Sync when peers connect/disconnect.
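A periodic loop like the one above runs forever unless something stops it. One common pattern, sketched below with the standard library only, is to select on the context so the loop ends on shutdown and the ticker is released. runEvery and countRuns are hypothetical names; in a real node the task passed to runEvery would be a closure around the sync call.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"
)

// runEvery calls fn on every tick until ctx is cancelled or times out,
// then returns ctx.Err(). A failing fn is logged and the loop continues.
func runEvery(ctx context.Context, interval time.Duration, fn func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // release the ticker when the loop ends

	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if err := fn(ctx); err != nil {
				log.Printf("periodic task failed: %v", err)
			}
		}
	}
}

// countRuns (demo helper) runs a counting task every intervalMs milliseconds
// for totalMs milliseconds and returns how many times it ran.
func countRuns(intervalMs, totalMs int) int {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(totalMs)*time.Millisecond)
	defer cancel()

	n := 0
	_ = runEvery(ctx, time.Duration(intervalMs)*time.Millisecond, func(context.Context) error {
		n++
		return nil
	})
	return n
}

func main() {
	fmt.Println("runs in 250ms at a 50ms interval:", countRuns(50, 250))
}
```

The exact run count depends on scheduling, so the example prints it without assuming a specific value.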
Health Check Issues
Health status not updating
Cause: Health checker not being called.
Solutions:
- Call CheckHealth explicitly:
    err := node.CheckHealth(ctx, checker)
- Periodic health checks:
    go func() {
        ticker := time.NewTicker(10 * time.Second)
        for range ticker.C {
            node.CheckHealth(ctx, checker)
        }
    }()
Debugging Strategies
Enable gRPC logging
import "google.golang.org/grpc/grpclog"
grpclog.SetLoggerV2(grpclog.NewLoggerV2(os.Stdout, os.Stdout, os.Stderr))
Inspect certificates
# View certificate details
openssl x509 -in certs/node-1-cert.pem -text -noout
# Verify certificate chain
openssl verify -CAfile certs/ca-cert.pem certs/node-1-cert.pem
Check connection state
peer, exists := node.GetPeer(peerID)
if exists {
    // Only read the connection when the peer is known
    state := peer.Conn.GetState()
    log.Printf("Connection state: %v", state)
    // IDLE, CONNECTING, READY, TRANSIENT_FAILURE, SHUTDOWN
}
Next Steps
- Services Guide — Registering domain services
- Certificates Guide — Certificate management details