chore: SREP-969: Reuse Ocm Connection and handle closure of connection#785
Conversation
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@karanjitsingh01: This pull request references SREP-969 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #785 +/- ##
==========================================
+ Coverage 52.65% 52.79% +0.13%
==========================================
Files 80 80
Lines 6056 6065 +9
==========================================
+ Hits 3189 3202 +13
+ Misses 2442 2435 -7
- Partials 425 428 +3
🚀 New features to boost your workflow:
|
pkg/ocm/ocm.go
Outdated
| go func(ctx context.Context, cancel context.CancelFunc, conn *ocmsdk.Connection) { | ||
| fmt.Println("starting ocm connection timeout watcher", o.Timeout) | ||
| <-ctx.Done() | ||
| o.mu.Lock() | ||
| defer o.mu.Unlock() | ||
| if o.connection != nil { | ||
| logger.Debugln(fmt.Sprintf("closing ocm connection after %v", o.Timeout)) | ||
| o.connection.Close() | ||
| o.connection = nil | ||
| } | ||
| // cancel the context to release resources | ||
| cancel() | ||
|
|
||
| }(ctx, cancel, connection) | ||
|
|
There was a problem hiding this comment.
What if the move go routine function as a new private method on DefaultOCMInterface stuct?
There was a problem hiding this comment.
yes, added as a private method for DefaultOCMInterface
pkg/ocm/ocm.go
Outdated
| func createConnection() (*ocmsdk.Connection, error) { | ||
|
|
||
| envURL := os.Getenv("OCM_URL") | ||
| if envURL != "" { | ||
| // Fetch the real ocm url from the alias and set it back to the ENV | ||
| ocmURL, err := ocmurls.ResolveGatewayURL(envURL, nil) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| os.Setenv("OCM_URL", ocmURL) | ||
| logger.Debugf("reset the OCM_URL to %s", ocmURL) | ||
| } | ||
|
|
||
| // Setup connection at the first try | ||
| connection, err := ocmocm.NewConnection().Build() | ||
| if err != nil { | ||
| if strings.Contains(err.Error(), ocmNotLoggedInMessage) { | ||
| return nil, fmt.Errorf("please ensure you are logged into OCM by using the command " + | ||
| "\"ocm login --url $ENV\"") | ||
| } else { | ||
| return nil, err | ||
| } | ||
| } | ||
| fmt.Println("OCM connection established") | ||
| return connection, nil | ||
| } |
There was a problem hiding this comment.
What if move this as a private function of DefaultOCMInterface ?
There was a problem hiding this comment.
yes, added as a private method for DefaultOCMInterface
pkg/ocm/ocm.go
Outdated
| // important to set a timeout to avoid leaking connections | ||
| // new connections are created if the previous one has been closed after the timeout | ||
| if o.Timeout == 0 { | ||
| o.Timeout = 30 * time.Second |
There was a problem hiding this comment.
may be move connection timeout as const , which may be easy to change via config in future
There was a problem hiding this comment.
added constant
pkg/ocm/ocm.go
Outdated
|
|
||
| type DefaultOCMInterfaceImpl struct { | ||
| // connection *ocmsdk.Connection | ||
| Timeout time.Duration // Timeout for the OCM connection can be set |
There was a problem hiding this comment.
maybe ocmConnectionTimeout ?
There was a problem hiding this comment.
renamed to ocmConnectionTimeout
pkg/ocm/ocm.go
Outdated
| type DefaultOCMInterfaceImpl struct { | ||
| // connection *ocmsdk.Connection | ||
| Timeout time.Duration // Timeout for the OCM connection can be set | ||
| mu sync.Mutex |
There was a problem hiding this comment.
what if add the readble var name ?
There was a problem hiding this comment.
renamed to ocmConnectionMutex
pkg/ocm/ocmsetup_test.go
Outdated
|
|
||
| ocmsdk "github.com/openshift-online/ocm-sdk-go" | ||
| "github.com/openshift/backplane-cli/pkg/ocm" | ||
| "github.com/stretchr/testify/assert" |
There was a problem hiding this comment.
shouldn't use the new test lib for asserting, pls reuse ginkgo and gomega
There was a problem hiding this comment.
Thanks for informing me, I wasn't aware that we have to strictly use ginkgo and gomega.
I have updated the tests accordingly.
pkg/ocm/ocmsetup_test.go
Outdated
| @@ -0,0 +1,141 @@ | |||
| package ocm_test | |||
There was a problem hiding this comment.
You may be able to move tests to https://github.com/openshift/backplane-cli/blob/main/pkg/ocm/ocm_test.go#L1 rather than adding a new file
There was a problem hiding this comment.
reused existing test file and added new tests
go.mod
Outdated
| github.com/spf13/cobra v1.10.1 | ||
| github.com/spf13/pflag v1.0.10 | ||
| github.com/spf13/viper v1.21.0 | ||
| github.com/stretchr/testify v1.11.1 |
There was a problem hiding this comment.
May not need new test ib , pls reuse existing test libs
There was a problem hiding this comment.
reused ginkgo, removed dependency in go mod as well
pkg/ocm/ocmsetup_test.go
Outdated
| conn1, _ := impl.SetupOCMConnection() | ||
| conn2, _ := impl.SetupOCMConnection() | ||
| assert.Equal(t, conn1, conn2, "connections should be reused before timeout") | ||
| assert.False(t, conn1.closed, "connection should be active initially") | ||
| assert.NotNil(t, conn1.ocmConnection, "real connection should not be nil") | ||
|
|
There was a problem hiding this comment.
Move all tests to the function with clear BDD patterns
ex : https://github.com/openshift/backplane-cli/blob/main/pkg/ocm/ocm_test.go#L30
There was a problem hiding this comment.
Updated tests to follow BDD pattern
pkg/ocm/ocmsetup_test.go
Outdated
| assert.NotNil(t, conn3.ocmConnection, "real connection should not be nil") | ||
|
|
||
| // wait for new connection to auto-close | ||
| time.Sleep(250 * time.Millisecond) |
There was a problem hiding this comment.
can't we inject small timeout for tests, otherwise all tests wait for 250 Millisecond
There was a problem hiding this comment.
reduced timeout for "fake ocm client" test to 50ms, total test is taking around 140-150ms.
reduced timeout for "real ocm connection" test to 100ms, I tested it with shorter duration, but real ocm connection is taking time to establish and close. overall time for test is around 450-500ms. we are creating 2 real connections and reusing them 3 times. after that we wait for closure routine to close ocm connection properly.
pkg/ocm/ocm.go
Outdated
| } | ||
|
|
||
| func (o *DefaultOCMInterfaceImpl) initiateCloseConnection(ctx context.Context, cancel context.CancelFunc, conn *ocmsdk.Connection) { | ||
| fmt.Println("starting ocm connection timeout watcher", o.Timeout) |
There was a problem hiding this comment.
can you put this to logger Debug?
There was a problem hiding this comment.
updated !!
Just for thought, Should we have creation and closure of connection logs to info log level.
I am assuming that application should be running with info loglevel (default), and we should able to see when connection was established successfully and when it was closed in the logs. Reusing of connection can be with debug log level.
There was a problem hiding this comment.
The default log level is warning, that means anything else below that would not be shown without additional parameters to the cli.
There was a problem hiding this comment.
Thanks, then we only see WARN/ERROR/PANIC logs.
Those above logs have been added to debug level 👍
pkg/ocm/ocm.go
Outdated
| return nil, err | ||
| } | ||
| } | ||
| fmt.Println("OCM connection established") |
There was a problem hiding this comment.
can you put this to logger Debug?
pkg/ocm/ocm.go
Outdated
| // connection *ocmsdk.Connection | ||
| Timeout time.Duration // Timeout for the OCM connection can be set | ||
| ocmConnectionMutex sync.Mutex | ||
| ocmConnectionTimeout *ocmsdk.Connection |
There was a problem hiding this comment.
Does this mean ocmConnectionWithTimeout?
The name is timeout but the type is connection which is a bit confusing.
There was a problem hiding this comment.
yes, it was a miss while refactoring test, I have corrected the namings now, for connection -> ocmConnection and for timeout -> OcmConnectionTimeout
|
@karanjitsingh01: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| type DefaultOCMInterfaceImpl struct { | ||
| // connection *ocmsdk.Connection | ||
| OcmConnectionTimeout time.Duration // Timeout for the OCM connection can be set |
There was a problem hiding this comment.
Any reason to capitalize this one?
There was a problem hiding this comment.
Yes, It's made public, DefaultOCMInterfaceImpl timeout can be set while initializing,
something like - impl = &ocm.DefaultOCMInterfaceImpl{OcmConnectionTimeout: 100 * time.Millisecond}, by default the timeout is 30s if not initialized.
|
Please add "chore:" in the PR title so it can be grouped under |
something like this "chore: SREP-969: Reuse Ocm Connection and handle closure of connection" ? |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bmeng, karanjitsingh01 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |

What type of PR is this?
What this PR does / Why we need it?
This PR Focus on restructuring SetupOCMConnection for reusing existing connections rather than creating new one always.
SetupOCMConnection now reduce dependency of calling close in callers methods as cleanup is handled be a cleaner thread which starts after timeout interval (default 30s). If a connection is requested after a timeout interval, a new connection is created, again cleaner thread is also spawned to take care of resource cleanup.
Which Jira/Github issue(s) does this PR fix?
Special notes for your reviewer
Unit Test Coverage
Guidelines
Test coverage checks
Unit Test can't be run as setup initiates actual ocm connection.
I have tested the updated code by running locally and making sure no leak is there and connections being utilized within timout interval and new connection being created after a timeout interval of previous connection. Adding a snipp of testcase and result.
Pre-checks (if applicable)
/label tide/merge-method-squash