HDDS-14103. Create an option to suppress/unsuppress containers from report #9719
sarvekshayr wants to merge 11 commits into apache:master
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerInfo.java
Outdated
I am not sure about this idea. Surely, if the container is missing and all efforts have been made to ensure there are no copies that can be recovered, the correct thing to do is to remove the container from the system?
@sodonnel I agree that safely removing the container would be the best long-term solution. However, implementing that robustly is more complicated. Even if all the keys are deleted from OM, SCM won't have any DNs to send the block delete requests to, and those DNs cannot tell SCM that their replicas are empty and safe to be deleted. We therefore need a check for orphan containers between SCM and OM that handles the cleanup. I don't think we want to allow admins to manually remove containers from the system based on their own investigation.
priyeshkaratha left a comment
Thanks @sarvekshayr for the patch. I left a few comments related to the admin check and other good to go changes. Please have a look at those.
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerInfo.java
Outdated
...e/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/container/AckMissingSubcommand.java
Outdated
sumitagrawl left a comment
@sarvekshayr I have given some comments.
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerInfo.java
Outdated
...apache/hadoop/hdds/scm/container/replication/health/AcknowledgedMissingContainerHandler.java
Outdated
hadoop-hdds/interface-admin/src/main/proto/ScmAdminProtocol.proto
Outdated
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
Outdated
errose28 left a comment
Thanks for working on this @sarvekshayr. I think we should try to avoid coupling the ack/suppression concept to missing containers within the code itself, even though that is the current use case. There may be other cases now or in the future where we may want to suppress containers, like having all unhealthy replicas or quasi-closed stuck, for example. Suppressed containers would just be filtered out of container report generation, regardless of their replica states.
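The decoupling suggested above can be sketched as follows. This is only an illustration under stated assumptions, not code from the patch: `SuppressionSketch`, `Container`, `Health`, and `buildReport` are hypothetical names, and the health states are placeholders rather than the Replication Manager's real ones. The point is that the suppression check runs before any health-specific logic, so it applies equally to missing, unhealthy, or quasi-closed stuck containers.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: none of these types are real Ozone classes. */
final class SuppressionSketch {

  /** Illustrative health states, not the real Replication Manager states. */
  enum Health { HEALTHY, MISSING, UNHEALTHY, QUASI_CLOSED_STUCK }

  /** Stand-in for a container record carrying a generic suppression flag. */
  static final class Container {
    final long id;
    final Health health;
    final boolean suppressed;

    Container(long id, Health health, boolean suppressed) {
      this.id = id;
      this.health = health;
      this.suppressed = suppressed;
    }
  }

  /** Counts containers per health state, skipping suppressed ones. */
  static Map<Health, Integer> buildReport(List<Container> containers) {
    Map<Health, Integer> counts = new EnumMap<>(Health.class);
    for (Container c : containers) {
      if (c.suppressed) {
        continue; // filtered out before any health-specific logic runs
      }
      counts.merge(c.health, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<Container> containers = List.of(
        new Container(1, Health.MISSING, true),
        new Container(2, Health.MISSING, false),
        new Container(3, Health.UNHEALTHY, true),
        new Container(4, Health.HEALTHY, false));
    // Containers 1 and 3 are suppressed and do not appear in the counts.
    System.out.println(buildReport(containers));
  }
}
```

With a flag like this, no handler needs to know why a container was suppressed; a missing-specific acknowledgement would instead have to be repeated for every new health state.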
...e/cli-admin/src/main/java/org/apache/hadoop/hdds/scm/cli/container/AckMissingSubcommand.java
Outdated
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
Thank you for your contribution. This PR is being closed due to inactivity. Please contact a maintainer if you would like to reopen it.
sreejasahithi left a comment
Thanks @sarvekshayr for working on this.
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
Outdated
sumitagrawl left a comment
@sarvekshayr I have given a few comments.
What changes were proposed in this pull request?
Implemented `--suppress` and `--unsuppress` flags in `ozone admin container report` (mutually exclusive), supporting multiple container IDs from the command line, stdin, or files. Once the command is executed, the container report will be updated after the next Replication Manager cycle.
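The flag handling described here can be sketched in plain Java. This is a rough illustration under stated assumptions, not the patch's actual CLI code: `ReportFlagsSketch`, `Action`, `resolveAction`, and `readContainerIds` are hypothetical names, and the whitespace-separated input format is an assumption. A `Scanner` is used because the same reading logic then works for stdin, a file, or an in-memory string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

/** Hypothetical sketch: not the CLI classes from the patch. */
final class ReportFlagsSketch {

  /** What the report command should do for a given flag combination. */
  enum Action { SHOW, SUPPRESS, UNSUPPRESS }

  /** Rejects the case where both mutually exclusive flags are set. */
  static Action resolveAction(boolean suppress, boolean unsuppress) {
    if (suppress && unsuppress) {
      throw new IllegalArgumentException(
          "--suppress and --unsuppress are mutually exclusive");
    }
    if (suppress) {
      return Action.SUPPRESS;
    }
    return unsuppress ? Action.UNSUPPRESS : Action.SHOW;
  }

  /**
   * Reads whitespace-separated container IDs (assumed format).
   * Long.parseLong throws NumberFormatException on a malformed ID.
   */
  static List<Long> readContainerIds(Scanner in) {
    List<Long> ids = new ArrayList<>();
    while (in.hasNext()) {
      ids.add(Long.parseLong(in.next()));
    }
    return ids;
  }

  public static void main(String[] args) {
    // An in-memory string stands in for stdin or a file here.
    List<Long> ids = readContainerIds(new Scanner("1 2\n3\n"));
    System.out.println(resolveAction(true, false) + " " + ids);
  }
}
```

To read real stdin in this sketch, pass `new Scanner(System.in)`; for a file, `Scanner` also has constructors that take a `File` or `Path`.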
Added `--suppressed` filtering option to `ozone admin container list` to show suppressed/unsuppressed containers.
What is the link to the Apache JIRA
HDDS-14103
How was this patch tested?
Initial container report
Suppress 2 containers from report
List suppressed containers
List unsuppressed containers
Unsuppress 2 containers from report