Troubleshoot

This guide is divided into two sections:

  • Application deployment issues: common errors you may encounter while deploying an application
  • Cluster issues: common errors you may encounter while managing your clusters

Application deployment issues

Liveness/Readiness failed, connect: connection refused

If you encounter this kind of error on the Liveness and/or Readiness probe during an application deployment phase:

Readiness probe failed: dial tcp 100.64.2.230:80: connect: connection refused
Liveness probe failed: dial tcp 100.64.2.230:80: connect: connection refused

That means your application may not be able to start, or it starts but takes too long to do so.

Here are the possible causes you should check:

  1. The port declared on Qovery (here 80) does not match the port your application listens on. Check which port your application opens, and set that port in your application configuration.

  2. Your application is listening on localhost (127.0.0.1) or on a specific IP. Make it listen on all interfaces (0.0.0.0) instead.

  3. Your application takes too long to start, and the liveness probe flags it as unhealthy. Try increasing the liveness_probe.initial_delay_seconds parameter so that Kubernetes waits longer before checking your application's availability. Set it to 120, for example.
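
To check points 1 and 2 from inside the container (for example through qovery shell), a small script can attempt the same kind of TCP connection that the failing probe reports. The sketch below is only an illustration: it assumes Python is available in your container, and can_connect is a hypothetical helper name, not part of Qovery or Kubernetes.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening on the given address and port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If can_connect("127.0.0.1", 80) returns True but the same call with the container's own IP returns False, the application is probably bound to localhost only (point 2). If both return False, either nothing is listening on that port (point 1) or the application has not finished starting (point 3).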

My app is crashing, how do I connect to investigate?

Goal: connect to your application's container to debug it.

First, try the qovery shell command from the Qovery CLI. It is a safe way to connect to your container and debug your application.

If your app crashes within the first few seconds, you will lose the connection to your container, which makes debugging almost impossible. In that case, continue reading.

If your app crashes very quickly, here is how to keep full control of your container:

  1. Temporarily delete the application port from your application configuration. This prevents Kubernetes from restarting the container when the port is not open.

  2. In your Dockerfile, comment out your CMD or ENTRYPOINT and add a way to make your container sleep. For example:

    #CMD ["npm", "run", "start"]
    CMD ["tail", "-f", "/dev/null"]

    Commit and push your changes to trigger a new deployment (trigger it manually from the Qovery console if it does not start automatically).

  3. Once the deployment is done, use the qovery shell command to connect to your container and debug.

0/x nodes are available: x insufficient cpu/ram

If you encounter this kind of error during an application deployment phase:

0/1 nodes are available: 1 Insufficient cpu (or ram).

That means we cannot reserve the resources needed to deploy your application or database on your cluster because there is not enough CPU or RAM. Moreover, the cluster autoscaler cannot be triggered since the cluster has already reached its maximum number of instances (valid only for Managed Kubernetes clusters).

Here are the possible solutions you can apply:

  • Reduce the resources (CPU/RAM) allocated to your existing or new services. Review the deployed services and see if you can free up resources by reducing their CPU/RAM settings. If you are using a K3S (EC2) cluster, stop your service before changing the settings. Remember to redeploy the applications after you edit their resources. Have a look at the resource section for more information.

  • Select a bigger instance type for your cluster (in terms of CPU/RAM). The added resources will unblock the deployment of your application. Check your cluster settings, and change the instance type of your cluster.

  • (Only for Managed Kubernetes clusters) Increase the maximum number of nodes of your cluster. This lets the cluster autoscaler add a new node, and the added resources allow your application to be deployed. Check your cluster settings, and increase the maximum number of nodes of your cluster.

Please note that application resource consumption and application resource allocation are not the same. Have a look at the resource section for more information.
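
The difference between allocation and consumption can be illustrated with a simplified model. Kubernetes schedules a pod based on the CPU/RAM it requests, not on what it currently uses, so a node can be "full" while being mostly idle. The sketch below is an illustration under stated assumptions, not the actual scheduler logic: fits_on_node and the numbers are hypothetical, and the model ignores the resources reserved for system components.

```python
from typing import Sequence


def fits_on_node(
    node_capacity_m: int,
    existing_requests_m: Sequence[int],
    new_request_m: int,
) -> bool:
    """Return True if a new CPU request (in millicores) fits on the node.

    Only the amounts already reserved by other services count here;
    their real CPU usage plays no role in the decision.
    """
    free_m = node_capacity_m - sum(existing_requests_m)
    return new_request_m <= free_m
```

With a 2000m node and services that reserve 500m and 1000m, a new service asking for 500m fits, while one asking for 1000m does not, whatever the real CPU usage of the running services.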

While deleting a managed database, I get this error: SnapshotQuotaExceeded

This error occurs because Qovery creates a snapshot before deleting the database. This protects you if a database is deleted by mistake.

To fix this issue, you have two options:

  1. You probably have snapshots you no longer need, for example from old databases. Delete them directly from your cloud provider's web interface. Here is an example on AWS:

    • Search for the database service (here RDS)
    • Select the Snapshots menu
    • Select the snapshots to delete

    Database snapshots

  2. Open a ticket with your cloud provider's support and ask them to raise this limit.

Can't get my SSL / TLS Certificate

When a custom domain is added to an application, it must be configured on your side according to the instructions displayed:

Custom Domain Configuration

You can check that your custom domain is correctly configured using the following command: dig A ${YOUR_CUSTOM_DOMAIN} +short

Custom Domain Verification

The output should contain the default URL configured by Qovery, e.g. zdf72de71-z709e1a85-gtw.bool.sh in our example.
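
If you want to script this check, the comparison can be sketched as follows. This is a minimal illustration that only parses text you already obtained from dig; points_to is a hypothetical helper, and the sample output (including the 203.0.113.10 address) is made up for the example.

```python
def points_to(dig_short_output: str, expected_target: str) -> bool:
    """Return True if one line of `dig ... +short` output matches the target.

    dig prints fully qualified names with a trailing dot, so the dot and
    the letter case are ignored when comparing.
    """
    wanted = expected_target.strip().rstrip(".").lower()
    lines = [
        line.strip().rstrip(".").lower()
        for line in dig_short_output.splitlines()
    ]
    return wanted in lines


# Made-up sample output: a CNAME target followed by an A record.
SAMPLE_OUTPUT = "zdf72de71-z709e1a85-gtw.bool.sh.\n203.0.113.10\n"
```

Calling points_to(SAMPLE_OUTPUT, "zdf72de71-z709e1a85-gtw.bool.sh") returns True, which corresponds to a correctly configured custom domain in this example.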

The SSL / TLS Certificate is generated for the whole group of custom domains you define:

  • if one custom domain is misconfigured, the certificate can't be generated
  • if the certificate was generated once but a custom domain's configuration is later changed and becomes misconfigured, the certificate can't be generated again

If you run into an invalid certificate, here is how to fix the issue:

  1. Identify the misconfigured custom domain(s) in your application settings.

  2. Fix or delete them.

  3. Redeploy your impacted application(s).

Cluster issues

I don't have Qovery access anymore, how can I delete the resources Qovery deployed on my AWS account?

Unfortunately, there is no automatic way to do it with Qovery once we no longer have access. However, AWS provides an easy way to find those resources so you can delete them manually. To do so, open the AWS web console and search for the "Resource Groups & Tag Editor" service, then:

Resource groups search by tag

  1. Click on "Create Resource Group".
  2. In Tags, enter: "ClusterLongId".
  3. In the "Optional Tag value", enter the Qovery cluster ID. If you don't have it, let AWS suggest it for you: if elements deployed by Qovery remain, AWS proposes the cluster long ID automatically.
  4. Click on "Add".
  5. You should see the filter with the information you just entered.
  6. Click on "Preview groups resources".
  7. You will see all the elements deployed by Qovery and can delete the ones you want.

My cloud account has been blocked, what should I do?

If you encounter this kind of error during an infrastructure deployment (including managed DBs):

This account is currently blocked by your cloud provider, please contact them directly.

Or

This AWS account is currently blocked and not recognized as a valid account.
Please contact [email protected] directly to get more details.
Maybe you are not allowed to use your free tier in this region?
Maybe you need to provide billing info?

This error is likely due to a billing issue or blocked free-tier usage in the given region.

Unfortunately, there is nothing Qovery can do. You need to reach out directly to your cloud provider to get more details and get your account unblocked.

More

Looking to troubleshoot your application with Qovery? Read this very short guide.