New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't forward _ping requests to the primary #2554

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
6 participants
@wsong
Contributor

wsong commented Nov 28, 2016

No description provided.

@wsong wsong referenced this pull request Nov 28, 2016

Merged

#2488 Update vendoring to remove engine-api #2492

2 of 2 tasks complete

@GordonTheTurtle GordonTheTurtle removed the dco/no label Nov 28, 2016

@wsong

This comment has been minimized.

Show comment
Hide comment
@wsong

wsong Nov 28, 2016

Contributor

In #2492, we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary. This PR ensures that /_ping requests are still entirely handled by the local node while still checking the primary's status on replicas.

Contributor

wsong commented Nov 28, 2016

In #2492, we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary. This PR ensures that /_ping requests are still entirely handled by the local node while still checking the primary's status on replicas.

@nishanttotla nishanttotla added this to the 1.2.6 milestone Nov 28, 2016

@dongluochen

This comment has been minimized.

Show comment
Hide comment
@dongluochen

dongluochen Nov 29, 2016

Contributor

we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary

Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?

Contributor

dongluochen commented Nov 29, 2016

we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary

Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?

@wsong

This comment has been minimized.

Show comment
Hide comment
@wsong

wsong Nov 29, 2016

Contributor

There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.

Contributor

wsong commented Nov 29, 2016

There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.

@michelvocks

This comment has been minimized.

Show comment
Hide comment
@michelvocks

michelvocks Nov 30, 2016

Contributor

@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452

The client is reusing the same connection. That means it first sends the _ping request and then uses the same connection to send the /info request. The hijacking method in swarm looks like this:

cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
		io.Copy(dst, src)
		if conn, ok := dst.(interface {
			CloseWrite() error
		}); ok {
			conn.CloseWrite()
		}
		close(chDone)
	}
	inDone := make(chan struct{})
	outDone := make(chan struct{})
	go cp(d, nc, inDone)
	go cp(nc, d, outDone)

The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping request as well as the /info request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?

Contributor

michelvocks commented Nov 30, 2016

@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452

The client is reusing the same connection. That means it first sends the _ping request and then uses the same connection to send the /info request. The hijacking method in swarm looks like this:

cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
		io.Copy(dst, src)
		if conn, ok := dst.(interface {
			CloseWrite() error
		}); ok {
			conn.CloseWrite()
		}
		close(chDone)
	}
	inDone := make(chan struct{})
	outDone := make(chan struct{})
	go cp(d, nc, inDone)
	go cp(nc, d, outDone)

The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping request as well as the /info request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?

@wsong

This comment has been minimized.

Show comment
Hide comment
@wsong

wsong Nov 30, 2016

Contributor

Rebased on top of #2492

Contributor

wsong commented Nov 30, 2016

Rebased on top of #2492

@dongluochen dongluochen self-assigned this Dec 1, 2016

@nishanttotla

This comment has been minimized.

Show comment
Hide comment
@nishanttotla

nishanttotla Jan 17, 2017

Contributor

@wsong was there further discussion about this PR? Have you faced the issue lately?

Contributor

nishanttotla commented Jan 17, 2017

@wsong was there further discussion about this PR? Have you faced the issue lately?

@wsong

This comment has been minimized.

Show comment
Hide comment
@wsong

wsong Jan 17, 2017

Contributor

This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping in a retry loop to determine when the new manager is ready.

Contributor

wsong commented Jan 17, 2017

This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping in a retry loop to determine when the new manager is ready.

@nishanttotla nishanttotla modified the milestones: 1.2.7, 1.2.6 Jan 17, 2017

@nishanttotla nishanttotla modified the milestones: 1.2.8, 1.2.7 May 4, 2017

@wsong wsong closed this Mar 19, 2018

@wsong

This comment has been minimized.

Show comment
Hide comment
@wsong

wsong Mar 19, 2018

Contributor

This is no longer relevant now that we block in api/replica.go until we have an elected primary cluster manager.

Contributor

wsong commented Mar 19, 2018

This is no longer relevant now that we block in api/replica.go until we have an elected primary cluster manager.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment