
Troubleshooting

benliao1 edited this page Jul 29, 2023 · 7 revisions

Common Problems on the Raspberry Pis

If Runtime doesn't start up or exit properly, check that there are no residual objects (processes, files, or shared memory) left over from its previous run. After cleaning them up, start Runtime again, either by starting all the services:

service shm_start start
service net_handler start
service dev_handler start
service executor start

or by using the scripts/run.sh shell script with ./runtime run in the terminal from any directory.

Stopping Runtime / Process Cleanup

If running Runtime manually (i.e. using ./runtime run), try to stop it using ./runtime stop first. Then, check for residual processes by running the command ./runtime status, which just runs the following command:

ps -efH -u ubuntu

This command lists all processes owned by the ubuntu user on the Raspberry Pi. If you see anything Runtime-related, for example:

ubuntu       23623     1  0 11:41 pts/1    00:00:00 ./net_handler

kill the process by sending it SIGINT with kill -INT 23623 where you replace the 23623 with whatever the process ID of the residual process is.
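
The residual-process check above can be scripted. The following is a minimal sketch, not part of Runtime itself: find_runtime_pids is a hypothetical helper, and the process names it matches (net_handler, dev_handler, executor) are assumed from the service names on this page.

```shell
# Sketch: print the PIDs of leftover Runtime processes found in `ps -ef` style output.
# ASSUMPTION: residual processes are named net_handler, dev_handler, or executor;
# adjust the pattern if the names on your robot differ.
find_runtime_pids() {
    # In `ps -ef` output, column 2 is the PID and column 8 is the command.
    awk '$8 ~ /(^|\/)(net_handler|dev_handler|executor)$/ { print $2 }'
}

# Example (commented out so that nothing is killed by accident):
# for pid in $(ps -efH -u ubuntu | find_runtime_pids); do kill -INT "$pid"; done
```

Reading the ps output from standard input keeps the helper easy to try out on a saved listing before pointing it at live processes.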

If running Runtime using the systemd services, shutting down one of the main systemd services (dev_handler seems to be the most reliable) should stop all of Runtime:

sudo systemctl stop dev_handler

You can then check for residual processes and kill them in the same manner as described above.

Files / Object Cleanup

Check that the challenge socket and the log FIFO are reset by removing them if they exist. Do:

ls /tmp

and if you see log-fifo in the output, remove it with rm /tmp/log-fifo. Do the same for the challenge socket if it is listed.

If you are running tests, there might also be a bunch of unused virtual device sockets in the /var directory that look like ttyACM*, where * represents some number. Remove all of those too, with rm /var/ttyACM*.

Check that the shared memory is removed if it exists. Do:

ls /dev/shm

and if you see a whole bunch of files there, remove them with rm /dev/shm/*.
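
The file cleanup steps in this section could be collected into one function. This is only a sketch under assumptions: cleanup_runtime_files is a hypothetical name, and its optional argument is a root prefix added here so the function can be tried against a scratch directory first. Depending on file ownership, the rm commands may need sudo on the robot.

```shell
# Sketch: remove the leftover files named in this section.
# ASSUMPTION: the optional first argument is a hypothetical root prefix used
# only for dry runs against a scratch directory; leave it empty on the robot.
cleanup_runtime_files() {
    root="${1:-}"
    # -f makes rm succeed quietly when a file is already gone.
    rm -f "$root/tmp/log-fifo"     # log FIFO
    rm -f "$root"/var/ttyACM*      # virtual device sockets left by tests
    rm -f "$root"/dev/shm/*        # shared memory blocks
}
```

Note that, like rm /dev/shm/*, this removes every file in /dev/shm, not only the ones Runtime created.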

Debugging Runtime

The first step to debugging a seemingly working Runtime is always to get into the robot (i.e. get an ssh session running between your computer and the robot). Without a way to see the state of Runtime through an ssh session, Runtime is basically impossible to debug. From there, generally a good next step is to open the shared memory UI to get a real-time view of the shared memory of Runtime. After that, there aren't many more general rules that you can follow to diagnose the issue. We will simply write a laundry list of common commands that can be run to help diagnose the issue.

Opening an ssh session to the Raspberry Pi

If you know that the Raspberry Pi is connected to a certain network (for example, because raspberrypi is visible on the network router's admin page, along with its IP address), then the first step is to connect your computer to that network (try wirelessly at first). Test whether you can ping the Raspberry Pi in question by running

ping <IP address>

or

ping <hostname>.local

on your computer. You should get a 100% success rate of pings. If you get Received timeout seq=<x> repeatedly, then either the Raspberry Pi is not on the network, your computer is not on the network, or the Raspberry Pi's ping service isn't working. Check to make sure that your computer is on the network, or get someone else to connect to the network and ping the Raspberry Pi to make sure your computer isn't the problem. If nobody can ping it, try to connect via Ethernet. If the ping is successful, try to ssh into the Raspberry Pi using the command

ssh ubuntu@<IP address>

or

ssh ubuntu@<hostname>.local
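
The "100% success rate" check on the pings can be automated before attempting ssh. The sketch below is an illustration only: packet_loss_percent and pi_is_reachable are hypothetical helpers, and the parsing assumes that ping prints a summary containing "<number>% packet loss", as the ping shipped with common Linux distributions and macOS does.

```shell
# Sketch: extract the packet-loss percentage from ping's summary (read on stdin).
# ASSUMPTION: the summary line contains "<number>% packet loss".
packet_loss_percent() {
    sed -n 's/.*[ ,]\([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical helper: succeed only if every one of five pings was answered.
pi_is_reachable() {
    loss=$(ping -c 5 "$1" 2>/dev/null | packet_loss_percent)
    case "$loss" in
        0|0.0) return 0 ;;
        *) return 1 ;;
    esac
}
```

With these defined, pi_is_reachable can be called with either the IP address or the .local hostname, and ssh attempted only when it succeeds.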

To connect to the Raspberry Pi via Ethernet, first connect the network router to a network switch via an Ethernet cable. Then, connect the Ethernet port of the Raspberry Pi to one of the ports on the network switch with another Ethernet cable. Lastly, connect your computer to the network switch with a third Ethernet cable. You may want to turn off the Wi-Fi on your computer to ensure that your computer is using the Ethernet connection for all networking, as having two networking interfaces active on a computer can sometimes cause issues. After you do this, try checking the network's admin page. If all is well, you should see both your computer and the Raspberry Pi on the network, along with their associated IP addresses. Try to ping the device; if all is well, then try to ssh into the device using the same command as mentioned above.

If contact cannot be established between the Raspberry Pi and any computer, wired or wireless, then unfortunately the Raspberry Pi or the SD card (or both) may be bricked. In that case, either a new SD card will need to be flashed, or the old SD card will need to be transferred to a new Raspberry Pi. Try the latter option first, and if that doesn't work, try the former.

Running the Shared Memory UI

First, check to make sure that shared memory exists. If shared memory doesn't exist, try restarting Runtime. Usually this will remedy the problem.

To run the Shared Memory UI, navigate to the runtime/tests directory and run make shm_ui. The executable will be in runtime/tests/bin; run with ./shm_ui in that directory. This is probably the single most useful thing to do when trying to debug Runtime.
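
One way to sketch the "shared memory exists" check before building the UI is shown below. It assumes that any file in /dev/shm counts as Runtime shared memory, which may not hold if other programs also use that directory; shm_present is a hypothetical helper name.

```shell
# Sketch: succeed if the shared memory directory contains at least one entry.
# ASSUMPTION: any entry in /dev/shm belongs to Runtime.
shm_present() {
    dir="${1:-/dev/shm}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example (commented out; run from the directory that contains runtime/):
# shm_present && (cd runtime/tests && make shm_ui && cd bin && ./shm_ui)
```

If shm_present fails, restart Runtime first as described above.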

Other Steps To Take

Here is a list of other tools at your disposal to help diagnose the issue:

Switch to manual Runtime

To get more information about Runtime, sometimes it is useful to run Runtime manually, as doing so dumps all logs of all levels to stdout, i.e. your terminal screen. To do this, stop the Runtime systemd services with sudo systemctl stop dev_handler and then run Runtime manually with ./runtime run from the root directory.

Inspect the logger file

Sometimes students will report a problem that they can't reproduce, or one that we can't reproduce at the debug table. In that case, it can be useful to look at their Dawn console (if they still have it open), the logger file on the SD card of the Raspberry Pi, or both. This file is at runtime/logger/runtime.log. By default, the logger sends every log with log level WARN and up to this file. If something bad was indeed happening to the robot (lowcar device misbehaving, student code erroring / spamming causing executor to jam up, etc.), there should be a few logs in the file indicating the issue. Don't cat the file directly to the screen, though; it will likely be pretty long! Instead, use the command less runtime.log to view the file, and press the u and d keys to scroll up and down through it.
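
When the log file is long, it can help to pull out only the most recent entries at a given level. The following is a rough sketch: recent_log_lines is a hypothetical helper, and it assumes that each log line contains its level name as plain text (for example WARN). The actual line format of runtime.log is not described on this page.

```shell
# Sketch: print the last COUNT lines of FILE that mention LEVEL.
# ASSUMPTION: each log line contains its level name as plain text (e.g. "WARN").
recent_log_lines() {
    file="$1"
    level="$2"
    count="${3:-20}"    # default to the 20 most recent matches
    grep -F -- "$level" "$file" | tail -n "$count"
}

# Example (commented out):
# recent_log_lines runtime/logger/runtime.log WARN 50 | less
```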

Check Wi-Fi connection status

It is often useful to see the status of the network connections on the Raspberry Pi. These can be viewed with the command

ifconfig

The output will show up as two or three sections, labeled lo, eth0, and wlan0.

  • lo refers to the loopback interface at IP address 127.0.0.1, and it is the interface of the device back to itself (hence the name "loopback")
  • eth0 refers to the Ethernet interface, and it will appear automatically upon connecting the Raspberry Pi's Ethernet port to an active network router
  • wlan0 refers to the wireless interface, and it will appear automatically upon connecting the Raspberry Pi to a wireless network

Things to watch out for include whether there is an IP address on the wlan0 interface (e.g. inet 192.168.0.101), whether the wlan0 interface exists when you expect the Raspberry Pi to be connected to a wireless network, and whether the local hostname is reported correctly (this allows the students and PiE staff to refer to the device as <hostname>.local).
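
The wlan0 address check can be sketched as a small filter over the ifconfig output. iface_ipv4 is a hypothetical helper, and it assumes the output layout in which each section starts with the interface name followed by a colon in the first column, and the address appears as "inet <address>" on an indented line, matching the inet 192.168.0.101 example above.

```shell
# Sketch: print the IPv4 address of one interface from ifconfig output on stdin.
# ASSUMPTION: sections start with "<name>:" in column one, and the address
# appears as "inet <address>" on an indented line.
iface_ipv4() {
    awk -v want="$1" '
        /^[^ \t]/ { name = $1; sub(/:$/, "", name) }    # start of a new section
        name == want && $1 == "inet" { print $2; exit } # first IPv4 address only
    '
}

# Example (commented out):
# ifconfig | iface_ipv4 wlan0
```

Empty output means that the interface is missing or has no IPv4 address, which are the two wlan0 conditions to watch for.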
