[BUG] - Problem installing nebari on Mac (Intel) #127

Closed
edublancas opened this issue Aug 15, 2022 · 14 comments
Labels: needs: follow-up 📥, status: in progress 🏗, type: bug 🐛

Comments

@edublancas

Describe the bug

I'm trying to install nebari locally on my Mac (Intel) but the installation is timing out:

[terraform]: module.kubernetes-ingress.kubernetes_service.main: Still creating... [9m50s elapsed]
[terraform]: ╷
[terraform]: │ Error: Waiting for default secret of "dev/qhub-traefik-ingress" to appear
[terraform]: │
[terraform]: │   with module.kubernetes-ingress.kubernetes_service_account.main,
[terraform]: │   on modules/kubernetes/ingress/main.tf line 1, in resource "kubernetes_service_account" "main":
[terraform]: │    1: resource "kubernetes_service_account" "main" {
[terraform]: │
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: context deadline exceeded
[terraform]: │
[terraform]: │   with module.kubernetes-ingress.kubernetes_service.main,
[terraform]: │   on modules/kubernetes/ingress/main.tf line 58, in resource "kubernetes_service" "main":
[terraform]: │   58: resource "kubernetes_service" "main" {
[terraform]: │
[terraform]: ╵

Problem encountered: Terraform error

It runs for 10 minutes, and then it crashes. I tried with minikube but Dharhas recommended switching to kind. I did but encountered the same problem.

Expected behavior

The deploy command should finish successfully.

OS system and architecture in which you are running Nebari

macOS Intel

How to Reproduce the problem?

These are the commands that I ran:

# works
qhub init local \
  --project projectname \
  --domain nebaritest.io \
  --auth-provider password \
  --terraform-state=local
# times out
qhub deploy -c qhub-config.yaml --disable-prompt

Command output

See above

Versions and dependencies used.

(nebari)  Edu@MBP Desktop/tmp » qhub --version
0.4.3
(nebari)  Edu@MBP Desktop/tmp » kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:17:57Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:39:43Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
(nebari)  Edu@MBP Desktop/tmp » kind --version
kind version 0.14.0

Compute environment

No response

Integrations

No response

Anything else?

No response

@edublancas edublancas added needs: triage 🚦 Someone needs to have a look at this issue and triage type: bug 🐛 Something isn't working labels Aug 15, 2022
@viniciusdc
Contributor

viniciusdc commented Aug 15, 2022

Hi @edublancas, thanks for opening this issue. Could you check at which stage the deployment fails? It should appear a little above the error message in your full logs.

@viniciusdc
Contributor

viniciusdc commented Aug 15, 2022

Also, could you provide the following:

  • the output of docker stats
  • the output of kubectl get all,cm,secret,ing -A (this prints a lot, so writing it to a file would help a lot :))
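
The two requests above can be captured in one pass. This is a minimal sketch, not something posted in the thread: the function name collect_diagnostics and the default directory nebari-debug are made up for illustration, and it assumes docker and kubectl are on the PATH.

```shell
# Hypothetical helper: save the requested diagnostics into one directory.
collect_diagnostics() {
  local outdir="${1:-nebari-debug}"   # assumed directory name, change freely
  mkdir -p "$outdir"

  # One-shot snapshot of container resource usage (no live refresh).
  docker stats --no-stream > "$outdir/docker-stats.log" 2>&1 || true

  # Workloads, ConfigMaps, Secrets, and Ingresses across all namespaces.
  # Errors (for example, no reachable cluster) are written to the log too.
  kubectl get all,cm,secret,ing -A --request-timeout=30s \
    > "$outdir/kubectl.log" 2>&1 || true

  # Print the directory so the caller knows where the logs landed.
  echo "$outdir"
}

# Usage:
#   collect_diagnostics            # writes into ./nebari-debug
#   collect_diagnostics /tmp/logs  # writes into /tmp/logs
```

Both files can then be attached to the issue as-is.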

@viniciusdc
Contributor

Also, could you post your current qhub-config.yaml file, specifically the certificate fields?

@viniciusdc
Contributor

viniciusdc commented Aug 15, 2022

Also, to better debug the ingress error, we can inspect the ingress pod's status and events with kubectl describe pod -n dev qhub-traefik-ingress

@edublancas
Author

edublancas commented Aug 15, 2022

docker stats:

CONTAINER ID   NAME                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
cea2e44b98aa   kind-control-plane   36.73%    833.4MiB / 1.939GiB   41.97%    314kB / 6.45MB    539MB / 1.08GB    259

qhub deploy log: nebari.log

kubectl.log

qhub-config.yaml.txt

(nebari)  Edu@MBP Desktop/tmp » kubectl describe pod -n dev qhub-traefik-ingress
Error from server (NotFound): pods "qhub-traefik-ingress" not found
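
The NotFound error above is what kubectl describe pod returns when no pod has exactly that name. If the ingress runs as a Deployment (an assumption; the thread does not confirm it), its pods would carry a generated suffix, so matching on a name fragment may be more forgiving. A minimal sketch, where describe_matching_pods is a hypothetical helper and not part of kubectl or qhub:

```shell
# Hypothetical helper: describe every pod in a namespace whose name
# contains the given substring.
describe_matching_pods() {
  local namespace="$1" needle="$2"
  # -o name prints one "pod/<name>" per line, a form kubectl describe accepts.
  kubectl get pods -n "$namespace" -o name \
    | grep -F -- "$needle" \
    | while read -r pod; do
        kubectl describe -n "$namespace" "$pod"
      done
}

# Usage (needs a reachable cluster):
#   describe_matching_pods dev qhub-traefik-ingress
```

If this prints nothing, no pod with that name fragment exists in the namespace, and kubectl get events -n dev may be a more useful next step.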

@iameskild
Member

Hi @edublancas, thank you for reaching out!

Deploying Nebari locally on Mac has been a bit of a challenge (mainly because Docker on Mac runs inside a Linux VM). Our old docs mention this here and offer some possible workarounds (though I can't guarantee they will work at the moment).

We have recently dropped Minikube in favor of Kind (for local deployments) and I am currently in the process of getting local Nebari deployments to work on Mac. As a Mac user myself, this is a high priority and I expect to get this working by the end of the week.

@edublancas
Author

Hi @iameskild, thanks for your response!

Any updates here? I'd like to get Nebari up and running. Would getting a Linux machine be a better option?

@iameskild
Member

Hi @edublancas, I've made some progress but it's not quite there yet for Mac. If this is urgent I would recommend using a Linux machine for the time being. This is still a high-priority issue for us but it's been a little tricky to get fully working.

As you can probably tell, we are in the middle of a name change (QHub->Nebari) so I opened this issue to track my progress for the time being: nebari-dev/nebari#1405

If you come across any issues with a local deployment on Linux, feel free to reach out as well :)

@trallard trallard added status: in progress 🏗 This task is currently being worked on needs: follow-up 📥 Someone needs to get back to this issue or PR and removed needs: triage 🚦 Someone needs to have a look at this issue and triage labels Aug 24, 2022
@edublancas
Author

Sure, I'll try to run this on a Linux machine. Most likely, an engineer on my team will follow up here, and we'll share our feedback. Thanks!

@edublancas
Author

Hi, update:

I tried installing it on a fresh Linux machine but still encountered the same error. Maybe I'm missing some steps?

I followed this

kind create cluster

python -m qhub init local --project=thisisatest  --domain github-actions.qhub.dev --auth-provider=password --terraform-state=local

python -m qhub deploy --config qhub-config.yaml --disable-prompt

full log: qhub.txt

@iameskild
Member

Hi @edublancas, thanks for the update! If you're using v0.4.3 I don't think this will work since kind was only integrated recently. Although not ideal, trying to install QHub/Nebari from this commit should work (v0.4.4 should be out soon).

Lastly, you shouldn't need to create the cluster first, just try running the init then deploy commands once the recommended version of QHub/Nebari is installed.

@edublancas
Author

OK, so I installed it from that commit and got this:

[terraform]:
[terraform]: No changes. Your infrastructure matches the configuration.
[terraform]:
[terraform]: Your configuration already matches the changes detected above. If you'd like
[terraform]: to update the Terraform state to match, create and apply a refresh-only plan:
[terraform]:   terraform apply -refresh-only
[terraform]:
[terraform]: Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
[terraform]:
[terraform]: Outputs:
[terraform]:
[terraform]: load_balancer_address = {
[terraform]:   "hostname" = ""
[terraform]:   "ip" = "172.18.1.100"
[terraform]: }
INFO:qhub.provider.terraform:terraform apply took 3.257 [s]
INFO:qhub.provider.terraform:terraform=/var/folders/3h/_lvh_w_x5g30rrjzb_xnn2j80000gq/T/terraform/1.0.5/terraform output directory=stages/04-kubernetes-ingress
INFO:qhub.provider.terraform:terraform output took 1.190 [s]
Attempt 1 failed to connect to tcp tcp://172.18.1.100:80
Attempt 2 failed to connect to tcp tcp://172.18.1.100:80
Attempt 3 failed to connect to tcp tcp://172.18.1.100:80
Attempt 4 failed to connect to tcp tcp://172.18.1.100:80
Attempt 5 failed to connect to tcp tcp://172.18.1.100:80
Attempt 6 failed to connect to tcp tcp://172.18.1.100:80
Attempt 7 failed to connect to tcp tcp://172.18.1.100:80
Attempt 8 failed to connect to tcp tcp://172.18.1.100:80
Attempt 9 failed to connect to tcp tcp://172.18.1.100:80
Attempt 10 failed to connect to tcp tcp://172.18.1.100:80
ERROR: After stage directory=stages/04-kubernetes-ingress unable to connect to ingress host=172.18.1.100 port=80

ideas?
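
To repeat the connectivity check by hand while debugging, a retry loop along these lines can help. It is only a sketch that mimics the log output above, not QHub's actual check: wait_for_ingress is a hypothetical name, it assumes curl is installed, and it tests for any HTTP response rather than a bare TCP connection.

```shell
# Hypothetical helper: poll an ingress address until it answers over HTTP.
# Arguments: host, port (default 80), attempts (default 10), delay in seconds (default 5).
wait_for_ingress() {
  local host="$1" port="${2:-80}" attempts="${3:-10}" delay="${4:-5}"
  local i
  for i in $(seq 1 "$attempts"); do
    # Any HTTP response (even a 404) counts as reachable; curl exits
    # non-zero only when it cannot connect or the request times out.
    if curl --silent --output /dev/null --noproxy '*' \
            --connect-timeout 3 --max-time 5 "http://${host}:${port}/"; then
      echo "Attempt $i connected to ${host}:${port}"
      return 0
    fi
    echo "Attempt $i failed to connect to ${host}:${port}"
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_ingress 172.18.1.100 80
```

If every attempt fails from the host but the same request succeeds from inside the kind container, the address is probably not routable from the host, which narrows the problem to networking rather than the ingress itself.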

@viniciusdc
Contributor

viniciusdc commented Aug 26, 2022

Hi @edublancas, try to execute the following:

echo "172.18.1.100  <domain>" | sudo tee -a /etc/hosts

Replace <domain> with github-actions.qhub.dev, or whatever value you currently have set in your qhub-config.yaml. Alternatively, add the line manually to your /etc/hosts file: 172.18.1.100 github-actions.qhub.dev
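
Running the append more than once leaves duplicate lines in /etc/hosts. A small guard avoids that. The sketch below is an illustration rather than anything from the thread: add_hosts_entry is a hypothetical helper, and the optional third argument exists only so the function can be tried against a scratch file first.

```shell
# Hypothetical helper: add "<ip> <domain>" to a hosts file only if missing.
add_hosts_entry() {
  local ip="$1" domain="$2" hosts_file="${3:-/etc/hosts}"

  # Look for a line whose first field is the IP and that lists the domain.
  if awk -v ip="$ip" -v d="$domain" \
       '$1 == ip { for (i = 2; i <= NF; i++) if ($i == d) found = 1 }
        END { exit !found }' "$hosts_file"; then
    echo "already present"
    return 0
  fi

  if [ -w "$hosts_file" ]; then
    printf '%s %s\n' "$ip" "$domain" >> "$hosts_file"
  else
    # /etc/hosts is normally root-owned, so fall back to sudo.
    printf '%s %s\n' "$ip" "$domain" | sudo tee -a "$hosts_file" > /dev/null
  fi
  echo "added"
}

# Usage:
#   add_hosts_entry 172.18.1.100 github-actions.qhub.dev
```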

@iameskild
Member

Closing in favor of: nebari-dev/nebari#1405

4 participants