-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Register critical services in consul. #553
Register critical services in consul. #553
Conversation
jobs/jobs.go
Outdated
} | ||
|
||
// sendHeartbeat sends a heartbeat for this Job's service | ||
func (job *Job) sendHeartbeat() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured since the tests pass this isn't a problem but what were your thoughts on changing this from an exported method? Job.SendHeartbeat
=> Job.sendHeartbeat
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a hard time seeing how that method will be used outside of the job itself, but it's a small change so I can change it back if you'd like.
Overall I think this is a great change. I can definitely imagine services that never come up to be critical state yet still registered in a service catalog. One question in the review otherwise I think we can include this in a next minor bump. |
Actually have a few more thoughts I'll post in the issue. |
Thanks for the feedback, I will try to address those thoughts in the issue. |
@eliasbrange I'd love your feedback on @cheapRoc's suggestion about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some renaming otherwise the implementation details look sound. I'll test the full version before merging.
@@ -72,7 +101,7 @@ func (service *ServiceDefinition) registerService() error { | |||
EnableTagOverride: service.EnableTagOverride, | |||
Check: &api.AgentServiceCheck{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking up that this is using Consul's api.AgentServiceCheck
I believe the configuration option should be on the job level and not within the health check. Optionally, we could expose the same initial_status
option for all health checks independently... but I think that should be a future enhancement warranted from user requests for the feature.
jobs/config.go
Outdated
@@ -24,6 +24,7 @@ type Config struct { | |||
|
|||
// service discovery | |||
Port int `mapstructure:"port"` | |||
InitialStatus string `mapstructure:"initialStatus"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call it initial_status
(snake case), I don't believe we use lower camel case anywhere else. In Go we'll continue to use InitialStatus
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stopTimeout
uses lower camel case. But I'll switch to initial_status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, you're right.
jobs/jobs.go
Outdated
// SendHeartbeat sends a heartbeat for this Job's service | ||
func (job *Job) SendHeartbeat() { | ||
// sendHeartbeat sends a heartbeat for this Job's service | ||
func (job *Job) sendHeartbeat() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename sendHeartbeat
back to exported method SendHeartbeat
.
@@ -159,6 +166,9 @@ func (job *Job) Run(pctx context.Context, completedCh chan struct{}) { | |||
completedCh <- struct{}{} | |||
}() | |||
for { | |||
// Check if job's service has been registered. Doing it inside the event | |||
// loop to retry if consul registration fails. | |||
job.checkRegistration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks handy all by itself.
}, | ||
{ | ||
name: "serviceB", | ||
initialStatus: "invalid-value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to initial_status
here.
Thank you again @eliasbrange for your continued support and patience. |
Reverted |
I tested this PR with the following (rather convoluted) test rig using Fabio, {
consul: "localhost:8500",
logging: {
level: "DEBUG",
format: "text"
},
jobs: [
{
name: "foo",
exec: "http-echo -listen=:8484 -text=hello",
initial_status: "warning",
tags: ["urlprefix-/foo"],
port: 8484,
health: {
exec: "/usr/bin/curl -o /dev/null -sfI http://localhost:8484/",
interval: 60,
ttl: 60,
},
},
],
control: {
socket: "/tmp/cp.socket"
}
} I was able to verify that the Consul service check was initially set to With the health check set to Once again, I want to thank you for a wonderful addition to ContainerPilot @eliasbrange! |
Thanks for including it @cheapRoc. Will make it much easier for our monitoring! |
Fixes #552
This PR adds the functionality to register services in consul that never gets healthy to be able to filter on failed services in consul and see that they are misbehaving.
I've added a method for registering a service with the status set to critical. This is called in
onHealthChechFailed
, but only if the service hasn't already been registered. I've also made some minor refactoring inservices/services.go
, i.e. havingMarkForMaintenance
callDeregister
instead of the other way around.