TSG Troubleshooting Guide
This guide helps you diagnose and resolve common issues when deploying and operating TSG (TNO Security Gateway).
General Troubleshooting Steps
1. Check System Status
Verify Kubernetes cluster health:
kubectl cluster-info
kubectl get nodes
kubectl get pods --all-namespaces
Check TSG deployment status:
# List all TSG-related pods
kubectl get pods -n tsg-ecosystem
# Check specific namespace
kubectl get all -n tsg-ecosystem
Verify ingress configuration:
kubectl get ingress -A
kubectl describe ingress <ingress-name> -n <namespace>
2. Check Logs
Application logs:
# Control Plane logs
kubectl logs -l app.kubernetes.io/name=tsg-control-plane -n tsg-ecosystem --tail=100
# Data Plane logs
kubectl logs -l app.kubernetes.io/name=tsg-http-data-plane-plane -n tsg-ecosystem --tail=100
# Wallet logs
kubectl logs -l app.kubernetes.io/name=tsg-wallet -n tsg-ecosystem --tail=100
System logs:
# Ingress controller logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
# cert-manager logs
kubectl logs -n cert-manager -l app=cert-manager
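The per-component log commands above can be wrapped in a small loop. This is a minimal sketch, not part of the TSG CLI: `tsg_component_logs` is a hypothetical helper name, and the label values and default namespace are copied from the commands in this guide, so adjust them if your deployment differs.

```shell
# Print recent logs for each TSG component listed in this guide.
# Assumption: label values and namespace match the commands shown above.
tsg_component_logs() {
  ns="${1:-tsg-ecosystem}"   # namespace to inspect
  lines="${2:-100}"          # log lines to show per pod
  for name in tsg-control-plane tsg-http-data-plane-plane tsg-wallet; do
    echo "=== ${name} ==="
    # --prefix tags every log line with its pod and container name
    kubectl logs -l "app.kubernetes.io/name=${name}" -n "$ns" \
      --tail="$lines" --prefix || echo "no logs for ${name}" >&2
  done
}
```

Calling `tsg_component_logs` with no arguments uses the `tsg-ecosystem` namespace and the last 100 lines per pod; `tsg_component_logs my-namespace 500` overrides both.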
CLI Tool Issues
Installation Problems
Error: npm install -g @tsg-dsp/cli fails
Solution:
# Clear npm cache
npm cache clean --force
# Update npm
npm install -g npm@latest
# Install with specific Node.js version
nvm use 22
npm install -g @tsg-dsp/cli@latest
Error: tsg: command not found
Solution:
# Check npm global path
npm config get prefix
# Add to PATH (add to ~/.bashrc or ~/.zshrc)
export PATH="$(npm config get prefix)/bin:$PATH"
# Verify installation
which tsg
tsg --version
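To confirm whether a missing PATH entry is really the cause, the check can be scripted. The sketch below assumes a Unix-like npm layout in which global binaries live in `<prefix>/bin`; `path_contains` and `check_npm_bin_on_path` are hypothetical helper names.

```shell
# Succeeds if the given directory is an exact entry in PATH.
path_contains() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Report whether npm's global bin directory is on PATH.
# Assumption: global binaries are installed into <prefix>/bin.
check_npm_bin_on_path() {
  bin_dir="$(npm config get prefix)/bin"
  if path_contains "$bin_dir"; then
    echo "ok: ${bin_dir} is on PATH"
  else
    echo "missing: add ${bin_dir} to PATH"
    return 1
  fi
}
```

If `check_npm_bin_on_path` reports the directory as missing, the `export PATH=...` line above is the likely fix; if it reports `ok`, the CLI package itself probably did not install.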
Configuration Validation Errors
Error: Invalid YAML syntax
Solution:
# Validate YAML syntax
yamllint ecosystem.yaml
# Check for common issues
cat -A ecosystem.yaml # Shows hidden characters, such as tabs (^I) and line endings ($)
Error: Invalid ingress host format
Solution:
ingress:
  host: "dataspace.example.com" # Valid domain
  # NOT: "http://dataspace.example.com" # Invalid
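A quick pre-check of the host value can catch this before running the CLI. The function below is a rough sketch, not the CLI's own validation logic: `is_valid_ingress_host` is a hypothetical name, and it assumes the host must be a bare, dot-separated DNS name with no scheme, port, or path.

```shell
# Succeeds only for a bare DNS hostname: no scheme, port, path, or whitespace.
# Assumption: the host must contain at least one dot (e.g. example.com).
is_valid_ingress_host() {
  host="$1"
  case "$host" in
    ""|*://*|*/*|*:*|*" "*) return 1 ;;   # empty, scheme, path, port, or space
  esac
  # Dot-separated labels of letters, digits, and hyphens,
  # each starting and ending with a letter or digit.
  printf '%s\n' "$host" | grep -Eq \
    '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)+$'
}
```

For example, `is_valid_ingress_host "dataspace.example.com"` succeeds, while `is_valid_ingress_host "http://dataspace.example.com"` fails because of the scheme.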
Deployment Issues
Bootstrap Command Failures
Error: Output directory not writable
Solution:
# Check permissions
ls -la ./output
# Create directory with correct permissions
mkdir -p output
chmod 755 output
# Or specify different output directory
tsg bootstrap ecosystem -o /tmp/tsg-output
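The permission check can also be done up front, before invoking the bootstrap command. A minimal sketch, with `ensure_writable_dir` as a hypothetical helper name:

```shell
# Create the directory if needed and confirm the current user can write to it.
ensure_writable_dir() {
  dir="$1"
  mkdir -p "$dir" 2>/dev/null || { echo "cannot create ${dir}" >&2; return 1; }
  if [ -w "$dir" ]; then
    echo "writable: ${dir}"
  else
    echo "not writable: ${dir}" >&2
    return 1
  fi
}
```

One way to use it is `ensure_writable_dir ./output && tsg bootstrap ecosystem -o ./output`, so the bootstrap only runs when the target directory is usable.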
Deploy Command Failures
Error: kubectl: connection refused
Solution:
# Verify kubectl configuration
kubectl config current-context
kubectl cluster-info
# Check cluster connectivity
kubectl get nodes
# Switch context if needed
kubectl config use-context <correct-context>
Error: Helm not found
Solution:
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify installation
helm version
# Install helm-diff plugin
helm plugin install https://github.com/databus23/helm-diff
Error: Namespace already exists
Solution:
# Check existing namespaces
kubectl get namespaces
# Delete the existing namespace only if it is safe to do so (this removes every resource in it)
kubectl delete namespace tsg-ecosystem
Network and Connectivity Issues
Ingress Problems
Error: 502 Bad Gateway or 503 Service Unavailable
Diagnosis:
# Check ingress controller
kubectl get pods -n ingress-nginx
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
# Check backend services
kubectl get svc -n tsg-ecosystem
kubectl get endpoints -n tsg-ecosystem
Solution:
# Verify service endpoints
kubectl describe service control-plane -n tsg-ecosystem
# Check pod readiness
kubectl get pods -n tsg-ecosystem -o wide
# Restart ingress controller if needed
kubectl rollout restart deployment/ingress-nginx-controller -n ingress-nginx
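A 502 or 503 from the ingress often means a Service has no ready backends. The sketch below filters the `kubectl get endpoints` output for such Services; `services_without_endpoints` is a hypothetical helper name, and it relies on the default table output, in which an empty backend list is printed as `<none>`.

```shell
# List Services in a namespace whose Endpoints object has no addresses.
services_without_endpoints() {
  ns="${1:-tsg-ecosystem}"
  # Columns are NAME, ENDPOINTS, AGE; an empty backend list prints as <none>.
  kubectl get endpoints -n "$ns" --no-headers | awk '$2 == "<none>" { print $1 }'
}
```

Any Service name this prints is worth checking with `kubectl describe service <name> -n tsg-ecosystem`, since its selector may not match any ready pod.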
Error: 404 Not Found for TSG endpoints
Solution:
# Verify ingress configuration
kubectl describe ingress -n tsg-ecosystem
# Check ingress annotations
kubectl get ingress -n tsg-ecosystem -o yaml
# Verify DNS resolution
nslookup dataspace.example.com
TLS Certificate Issues
Error: TLS handshake failed
Diagnosis:
# Check certificate status
kubectl get certificates -A
kubectl describe certificate <cert-name> -n <namespace>
# Check cert-manager logs
kubectl logs -n cert-manager -l app=cert-manager
Solution:
# Delete and recreate certificate
kubectl delete certificate <cert-name> -n <namespace>
tsg deploy ecosystem # Will recreate certificate
# Check certificate issuer
kubectl describe clusterissuer letsencrypt-prod
Error: Certificate not ready
Solution:
# Wait for certificate provisioning (can take 5-10 minutes)
kubectl get certificate -n tsg-ecosystem -w
# Check ACME challenge
kubectl get challenges -A
kubectl describe challenge <challenge-name> -n <namespace>
# Verify DNS is pointing to ingress
dig dataspace.example.com
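Instead of watching the certificate by hand, `kubectl wait` can block until cert-manager marks it Ready. This is a sketch under the assumption that the Certificate lives in the `tsg-ecosystem` namespace; `wait_for_certificate` is a hypothetical helper name, and the 600-second default simply mirrors the upper end of the 5-10 minute range mentioned above.

```shell
# Block until the Certificate reports Ready, or fail after the timeout.
wait_for_certificate() {
  name="$1"
  ns="${2:-tsg-ecosystem}"
  max_wait="${3:-600s}"   # 10 minutes by default
  if kubectl wait --for=condition=Ready "certificate/${name}" -n "$ns" --timeout="$max_wait"; then
    echo "certificate ${name} is ready"
  else
    echo "certificate ${name} not ready after ${max_wait}; listing challenges" >&2
    # Pending ACME challenges usually explain why issuance is stuck.
    kubectl get challenges -n "$ns"
    return 1
  fi
}
```

For example, `wait_for_certificate <cert-name> tsg-ecosystem 300s` waits up to five minutes and lists the pending challenges if the certificate is still not ready.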
Application-Specific Issues
Control Plane Problems
Error: Database connection failed
Solution:
# Check database pod
kubectl get pods -l app.kubernetes.io/name=postgresql -n tsg-ecosystem
kubectl logs -l app.kubernetes.io/name=postgresql -n tsg-ecosystem
# Verify database credentials
kubectl get secrets -n tsg-ecosystem
kubectl describe secret control-plane-db -n tsg-ecosystem
Data Plane Problems
Error: Transfer request failed
Solution:
# Check data plane logs
kubectl logs -l app.kubernetes.io/name=tsg-http-data-plane-plane -n tsg-ecosystem
# Verify data plane registration
kubectl exec -it <control-plane-pod> -n tsg-ecosystem -- curl http://http-data-plane:8080/health
# Check data source connectivity
kubectl exec -it <data-plane-pod> -n tsg-ecosystem -- curl <your-data-source-url>
Wallet Problems
Error: DID resolution failed
Solution:
# Check wallet logs
kubectl logs -l app=wallet -n tsg-ecosystem
# Verify DID document accessibility
curl https://wallet.dataspace.example.com/.well-known/did.json
# Check ingress for wallet
kubectl describe ingress wallet -n tsg-ecosystem
Error: Credential verification failed
Solution:
# Check wallet configuration
kubectl describe configmap wallet-config -n tsg-ecosystem
# Verify key storage
kubectl get secrets -l app=wallet -n tsg-ecosystem
# Test credential endpoint
curl https://wallet.dataspace.example.com/credentials
SSO Bridge Problems
Error: OAuth token invalid
Solution:
# Check SSO Bridge logs
kubectl logs -l app=sso-bridge -n tsg-ecosystem
# Verify OAuth client configuration
kubectl describe configmap sso-bridge-clients -n tsg-ecosystem
# Test OAuth endpoints
curl https://auth.dataspace.example.com/.well-known/openid-configuration
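The public wallet and SSO Bridge endpoints used in this section can be probed in one pass. The sketch below only reports HTTP status codes and does not validate response bodies; `http_status` and `check_endpoints` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```shell
# Print the HTTP status code for a URL ("000" if the connection fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Check each URL and report anything that does not return 200.
check_endpoints() {
  failed=0
  for url in "$@"; do
    code="$(http_status "$url")" || true   # keep going if curl itself fails
    if [ "$code" = "200" ]; then
      echo "ok   ${code} ${url}"
    else
      echo "FAIL ${code} ${url}"
      failed=1
    fi
  done
  return "$failed"
}
```

A typical call with the example hostnames from this guide would be `check_endpoints https://wallet.dataspace.example.com/.well-known/did.json https://auth.dataspace.example.com/.well-known/openid-configuration`.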
Performance Issues
High Resource Usage
CPU/Memory limits reached:
Solution:
# Check resource usage
kubectl top pods -n tsg-ecosystem
kubectl top nodes
# Update resource limits in configuration
# Add in values.[APPLICATION].yaml:
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
# Redeploy with updated resources
tsg deploy ecosystem
Slow Response Times
Network latency issues:
Solution:
# Check pod-to-pod connectivity
kubectl exec -it <pod1> -n tsg-ecosystem -- ping <pod2-ip>
# Verify service discovery
kubectl exec -it <pod> -n tsg-ecosystem -- nslookup control-plane.tsg-ecosystem.svc.cluster.local
# Check ingress performance
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx | grep -i latency
Recovery Procedures
Complete System Recovery
If the entire deployment is broken:
# 1. Backup any important data
kubectl get secrets -n tsg-ecosystem -o yaml > secrets-backup.yaml
# 2. Clean deployment
kubectl delete namespace tsg-ecosystem
# 3. Redeploy from scratch
tsg deploy ecosystem
# 4. Restore data if needed
kubectl apply -f secrets-backup.yaml
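Because step 2 deletes an entire namespace, it is worth confirming that kubectl points at the intended cluster first. A minimal guard, with `require_context` as a hypothetical helper name and the expected context name supplied by you:

```shell
# Refuse to continue unless kubectl points at the expected context.
require_context() {
  expected="$1"
  current="$(kubectl config current-context)"
  if [ "$current" != "$expected" ]; then
    echo "refusing: current context is '${current}', expected '${expected}'" >&2
    return 1
  fi
}
```

Chaining it as `require_context <correct-context> && kubectl delete namespace tsg-ecosystem` ensures the delete only runs against the cluster you named.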
Partial Component Recovery
If a single component is failing:
# 1. Check which component is failing
kubectl get pods -n tsg-ecosystem
# 2. Get component-specific values
helm get values control-plane -n tsg-ecosystem > control-plane-values.yaml
# 3. Restart component
kubectl rollout restart deployment/control-plane -n tsg-ecosystem
# 4. Or redeploy component
helm upgrade control-plane ./output/control-plane -n tsg-ecosystem -f control-plane-values.yaml
Getting Additional Help
Diagnostic Information Collection
Before requesting support, collect diagnostic information:
# Create diagnostic bundle
mkdir -p tsg-diagnostics
cd tsg-diagnostics
# Collect logs
kubectl logs -l app.kubernetes.io/managed-by=tsg-cli -n tsg-ecosystem --all-containers=true --tail=-1 > tsg-logs.txt
# Collect resource status
kubectl get all -n tsg-ecosystem -o yaml > tsg-resources.yaml
# Collect events
kubectl get events -n tsg-ecosystem --sort-by=.metadata.creationTimestamp > tsg-events.txt
# Collect configuration
cp ../ecosystem.yaml ./
cp -r ../output ./
# Create archive
cd ..
tar -czf tsg-diagnostics.tar.gz tsg-diagnostics/
Support Channels
- Documentation: Review TSG Documentation
- Configuration Reference: CLI Configuration Guide
- GitLab Issues: Report bugs and feature requests
Useful External Resources
- Kubernetes Troubleshooting: Official K8s Troubleshooting Guide
- Helm Troubleshooting: Helm Documentation
- cert-manager Issues: cert-manager Troubleshooting
- NGINX Ingress: NGINX Ingress Troubleshooting