Don't exit when a slave is lost, because it may not be our task's slave.
The exit was originally added because it was thought that, without it, the
program would hang and consume resources in Mesos. Later commits show this is
not the case: printLogs() was actually causing the process to block, and
ResourceOffers() was then consuming resources for reasons that are still unclear.

This change has been tested to be safe (it does not consume all Mesos
resources) when:

* Random slaves in the cluster are lost.
* The slave the task is running on is lost.
* The slave the task is running on becomes "unhealthy" (simulated with
  iptables).
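
The commit title points out that the lost slave may not be the one running our
task. As a rough illustration (not part of this commit), a scheduler could
distinguish the two cases by remembering which slave the task was launched on.
The exampleScheduler type and taskSlaveID field below are hypothetical, and the
import paths assume the pre-1.0 mesos-go and glog packages commonly used at the
time; the real mesos-runonce scheduler may track this differently.

    package example

    import (
        log "github.com/golang/glog"
        mesos "github.com/mesos/mesos-go/mesosproto"
        sched "github.com/mesos/mesos-go/scheduler"
    )

    // exampleScheduler is a stand-in for the real scheduler type. taskSlaveID is
    // a hypothetical field recording the slave our task was launched on (e.g.
    // saved when task status updates arrive).
    type exampleScheduler struct {
        taskSlaveID string
    }

    func (s *exampleScheduler) SlaveLost(_ sched.SchedulerDriver, sid *mesos.SlaveID) {
        if s.taskSlaveID != "" && sid.GetValue() == s.taskSlaveID {
            // The slave running our task is gone: log loudly, but keep the
            // driver running so the framework can react to the eventual
            // TASK_LOST status update instead of exiting immediately.
            log.Errorf("slave running our task lost: %v", sid)
            return
        }
        // Some other slave in the cluster was lost; it does not affect our task.
        log.V(1).Infof("slave lost: %v", sid)
    }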
hekaldama committed Jun 24, 2016
1 parent 73c3e44 commit f339629
Showing 1 changed file with 1 addition and 1 deletion.
main.go (1 addition, 1 deletion)

@@ -71,7 +71,7 @@ func (sched *MesosRunonceScheduler) FrameworkMessage(_ sched.SchedulerDriver, ei
     log.Errorf("framework message from executor %q slave %q: %q", eid, sid, msg)
 }
 func (sched *MesosRunonceScheduler) SlaveLost(_ sched.SchedulerDriver, sid *mesos.SlaveID) {
-    log.Exitf("slave lost: %v", sid)
+    log.V(1).Infof("slave lost: %v", sid)
 }
 func (sched *MesosRunonceScheduler) ExecutorLost(_ sched.SchedulerDriver, eid *mesos.ExecutorID, sid *mesos.SlaveID, code int) {
     log.Errorf("executor %q lost on slave %q code %d", eid, sid, code)
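
One practical consequence of the change, assuming the log alias in main.go is
glog (github.com/golang/glog) or a glog-compatible fork (the Exitf and
V(1).Infof calls match that API): log.Exitf logs the message and then calls
os.Exit(1), whereas log.V(1).Infof emits at INFO severity only when the
verbosity level is at least 1, so an unrelated lost slave is now silent by
default. A minimal sketch of that behavior:

    package main

    import (
        "flag"

        log "github.com/golang/glog"
    )

    func main() {
        // glog registers -v, -logtostderr, etc. on the standard flag set.
        flag.Parse()

        // Printed only when run with e.g. -logtostderr=true -v=1,
        // mirroring the guarded "slave lost" message in the diff above.
        log.V(1).Infof("slave lost: %v", "some-unrelated-slave")

        // log.Exitf("slave lost: %v", ...) would instead log the message and
        // then call os.Exit(1), which is what the old handler did.
    }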
