Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upDon't forward _ping requests to the primary #2554
Conversation
GordonTheTurtle
added
the
dco/no
label
Nov 28, 2016
wsong
referenced this pull request
Nov 28, 2016
Merged
#2488 Update vendoring to remove engine-api #2492
GordonTheTurtle
removed
the
dco/no
label
Nov 28, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
wsong
Nov 28, 2016
Contributor
In #2492, we noticed that the /_ping
endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info
requests to get redirected straight to the primary. This PR ensures that /_ping
requests are still entirely handled by the local node while still checking the primary's status on replicas.
In #2492, we noticed that the |
nishanttotla
added this to the 1.2.6 milestone
Nov 28, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
dongluochen
Nov 29, 2016
Contributor
we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary
Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?
Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
wsong
Nov 29, 2016
Contributor
There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.
There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
michelvocks
Nov 30, 2016
Contributor
@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452
The client is reusing the same connection. That means it first sends the _ping
request and then uses the same connection to send the /info
request. The hijacking method in swarm looks like this:
cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
io.Copy(dst, src)
if conn, ok := dst.(interface {
CloseWrite() error
}); ok {
conn.CloseWrite()
}
close(chDone)
}
inDone := make(chan struct{})
outDone := make(chan struct{})
go cp(d, nc, inDone)
go cp(nc, d, outDone)
The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping
request as well as the /info
request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?
@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452 The client is reusing the same connection. That means it first sends the cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
io.Copy(dst, src)
if conn, ok := dst.(interface {
CloseWrite() error
}); ok {
conn.CloseWrite()
}
close(chDone)
}
inDone := make(chan struct{})
outDone := make(chan struct{})
go cp(d, nc, inDone)
go cp(nc, d, outDone) The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Rebased on top of #2492 |
dongluochen
self-assigned this
Dec 1, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
nishanttotla
Jan 17, 2017
Contributor
@wsong was there further discussion about this PR? Have you faced the issue lately?
@wsong was there further discussion about this PR? Have you faced the issue lately? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
wsong
Jan 17, 2017
Contributor
This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping
in a retry loop to determine when the new manager is ready.
This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit |
nishanttotla
modified the milestones:
1.2.7,
1.2.6
Jan 17, 2017
allencloud
added
area/API
area/replication
labels
Mar 29, 2017
nishanttotla
modified the milestones:
1.2.8,
1.2.7
May 4, 2017
wsong
closed this
Mar 19, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
wsong
Mar 19, 2018
Contributor
This is no longer relevant now that we block in api/replica.go
until we have an elected primary cluster manager.
This is no longer relevant now that we block in |
wsong commentedNov 28, 2016
No description provided.