Don't forward _ping requests to the primary #2554

wsong · Nov 28, 2016

No description provided.

In #2492, we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary. This PR ensures that /_ping requests are still entirely handled by the local node while still checking the primary's status on replicas.

wsong · Nov 28, 2016

In #2492, we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary. This PR ensures that /_ping requests are still entirely handled by the local node while still checking the primary's status on replicas.

we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary

Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?

dongluochen · Nov 29, 2016

we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary

Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?

There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.

wsong · Nov 29, 2016

There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.

@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452

The client is reusing the same connection. That means it first sends the _ping request and then uses the same connection to send the /info request. The hijacking method in swarm looks like this:

cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
		io.Copy(dst, src)
		if conn, ok := dst.(interface {
			CloseWrite() error
		}); ok {
			conn.CloseWrite()
		}
		close(chDone)
	}
	inDone := make(chan struct{})
	outDone := make(chan struct{})
	go cp(d, nc, inDone)
	go cp(nc, d, outDone)

The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping request as well as the /info request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?

michelvocks · Nov 30, 2016

@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452

The client is reusing the same connection. That means it first sends the _ping request and then uses the same connection to send the /info request. The hijacking method in swarm looks like this:

cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
		io.Copy(dst, src)
		if conn, ok := dst.(interface {
			CloseWrite() error
		}); ok {
			conn.CloseWrite()
		}
		close(chDone)
	}
	inDone := make(chan struct{})
	outDone := make(chan struct{})
	go cp(d, nc, inDone)
	go cp(nc, d, outDone)

The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping request as well as the /info request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?

Rebased on top of #2492

wsong · Nov 30, 2016

Rebased on top of #2492

@wsong was there further discussion about this PR? Have you faced the issue lately?

nishanttotla · Jan 17, 2017

@wsong was there further discussion about this PR? Have you faced the issue lately?

This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping in a retry loop to determine when the new manager is ready.

wsong · Jan 17, 2017

This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping in a retry loop to determine when the new manager is ready.

This is no longer relevant now that we block in api/replica.go until we have an elected primary cluster manager.

wsong · Mar 19, 2018

This is no longer relevant now that we block in api/replica.go until we have an elected primary cluster manager.

GordonTheTurtle added the dco/no label Nov 28, 2016

wsong referenced this pull request Nov 28, 2016
Merged
#2488 Update vendoring to remove engine-api #2492
2 of 2 tasks complete

GordonTheTurtle removed the dco/no label Nov 28, 2016

nishanttotla added this to the 1.2.6 milestone Nov 28, 2016

Don't forward _ping requests to the primary
Signed-off-by: Wayne Song <[email protected]>

Loading status checks…

e15bc28

dongluochen self-assigned this Dec 1, 2016

nishanttotla modified the milestones: 1.2.7, 1.2.6 Jan 17, 2017

allencloud added area/API area/replication labels Mar 29, 2017

nishanttotla modified the milestones: 1.2.8, 1.2.7 May 4, 2017

wsong closed this Mar 19, 2018

Aug	SEP	Oct
	23
2017	2018	2019

docker/swarm

Don't forward _ping requests to the primary #2554

wsong commented Nov 28, 2016

GordonTheTurtle added the dco/no label Nov 28, 2016

wsong referenced this pull request Nov 28, 2016

#2488 Update vendoring to remove engine-api #2492

GordonTheTurtle removed the dco/no label Nov 28, 2016

This comment has been minimized.

wsong commented Nov 28, 2016

nishanttotla added this to the 1.2.6 milestone Nov 28, 2016

This comment has been minimized.

dongluochen commented Nov 29, 2016

This comment has been minimized.

wsong commented Nov 29, 2016

This comment has been minimized.

michelvocks commented Nov 30, 2016 •

edited

This comment has been minimized.

wsong commented Nov 30, 2016

dongluochen self-assigned this Dec 1, 2016

This comment has been minimized.

nishanttotla commented Jan 17, 2017

This comment has been minimized.

wsong commented Jan 17, 2017

nishanttotla modified the milestones: 1.2.7, 1.2.6 Jan 17, 2017

allencloud added area/API area/replication labels Mar 29, 2017

nishanttotla modified the milestones: 1.2.8, 1.2.7 May 4, 2017

wsong closed this Mar 19, 2018

This comment has been minimized.

wsong commented Mar 19, 2018

docker/swarm

Join GitHub today

Don't forward _ping requests to the primary #2554

Conversation

wsong commented Nov 28, 2016

GordonTheTurtle added the dco/no label Nov 28, 2016

wsong referenced this pull request Nov 28, 2016

GordonTheTurtle removed the dco/no label Nov 28, 2016

This comment has been minimized.

wsong commented Nov 28, 2016

nishanttotla added this to the 1.2.6 milestone Nov 28, 2016

This comment has been minimized.

dongluochen commented Nov 29, 2016

This comment has been minimized.

wsong commented Nov 29, 2016

This comment has been minimized.

michelvocks commented Nov 30, 2016 • edited

This comment has been minimized.

wsong commented Nov 30, 2016

dongluochen self-assigned this Dec 1, 2016

This comment has been minimized.

nishanttotla commented Jan 17, 2017

This comment has been minimized.

wsong commented Jan 17, 2017

nishanttotla modified the milestones: 1.2.7, 1.2.6 Jan 17, 2017

allencloud added area/API area/replication labels Mar 29, 2017

nishanttotla modified the milestones: 1.2.8, 1.2.7 May 4, 2017

wsong closed this Mar 19, 2018

This comment has been minimized.

wsong commented Mar 19, 2018

michelvocks commented Nov 30, 2016 •

edited