Browse Source

clientv3/integration: sleep less in TestLeaseRenewLostQuorum

Server Stop+Restart sometimes takes more than 500ms, so with a
one second window the lease client may not get a chance to issue
a keepalive and get a lease extension before the lease client
timer elapses. Instead, sleep for a shorter period of time (while
still guaranteeing a keepalive resend during quorum loss) and
skip the test if server restart takes longer than the lease TTL.

Fixes #7346
Anthony Romano 8 years ago
parent
commit
c654370d6d
1 changed files with 11 additions and 4 deletions
  1. 11 4
      clientv3/integration/lease_test.go

+ 11 - 4
clientv3/integration/lease_test.go

@@ -586,17 +586,24 @@ func TestLeaseRenewLostQuorum(t *testing.T) {
 	}
 	// consume first keepalive so next message sends when cluster is down
 	<-ka
+	lastKa := time.Now()
 
 	// force keepalive stream message to timeout
 	clus.Members[1].Stop(t)
 	clus.Members[2].Stop(t)
-	// Use TTL-1 since the client closes the keepalive channel if no
-	// keepalive arrives before the lease deadline.
-	// The cluster has 1 second to recover and reply to the keepalive.
-	time.Sleep(time.Duration(r.TTL-1) * time.Second)
+	// Use TTL-2 since the client closes the keepalive channel if no
+	// keepalive arrives before the lease deadline; the client will
+	// try to resend a keepalive after TTL/3 seconds, so for a TTL of 4,
+	// sleeping for 2s should be sufficient time for issuing a retry.
+	// The cluster has two seconds to recover and reply to the keepalive.
+	time.Sleep(time.Duration(r.TTL-2) * time.Second)
 	clus.Members[1].Restart(t)
 	clus.Members[2].Restart(t)
 
+	if time.Since(lastKa) > time.Duration(r.TTL)*time.Second {
+		t.Skip("waited too long for server stop and restart")
+	}
+
 	select {
 	case _, ok := <-ka:
 		if !ok {