Make fate clients available on all managers #6232

Draft
keith-turner wants to merge 6 commits into apache:main from keith-turner:fate-clients-on-all-managers

@keith-turner

This is a draft because it includes uncommitted changes from #6224 and #6227. This change makes a fate client available on all manager processes (previously one was only available on the primary manager). This is needed by #6217 so that coordinators running on any manager can seed a fate operation to commit a compaction.

The set of shutting-down tservers forced system fate operations to run
on the primary manager because it was an in-memory set.  This gave fate
different code paths for user and system fate, which in turn caused
problems when trying to distribute compaction coordination.

To fix this problem, moved the set from an in-memory set to a set in
zookeeper.  The set is managed by fate operations, which simplifies the
existing code.  Only fate operations add to and remove from the set, and
fate keys are used to ensure only one fate operation runs at a time for
a tserver instance.  The previous in-memory set had a lot of code to try
to keep it in sync with reality; that is all gone now.  There were many
bugs with this code in the past.
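
The keyed-operation idea above can be sketched roughly as follows. This is
not the code from this PR: `ShuttingDownServers`, `trySeed`, and `complete`
are hypothetical names, and a `ConcurrentHashMap` stands in for both
zookeeper and the fate key mechanism.  It only illustrates the invariant
that one operation at a time owns a tserver instance and that only that
operation removes the entry.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in, not an Accumulo class.
final class ShuttingDownServers {
  // tserver instance -> id of the operation currently shutting it down
  private final Map<String, String> owners = new ConcurrentHashMap<>();

  // Seed an operation keyed by tserver instance. Returns false when another
  // operation already holds the key, so at most one runs per instance.
  boolean trySeed(String tserverInstance, String opId) {
    return owners.putIfAbsent(tserverInstance, opId) == null;
  }

  // Only the owning operation removes the entry when it finishes.
  boolean complete(String tserverInstance, String opId) {
    return owners.remove(tserverInstance, opId);
  }

  // Read-only view, e.g. for a monitor that wants the current set.
  Set<String> snapshot() {
    return Set.copyOf(owners.keySet());
  }
}
```

Because the operations are the only writers, there is no separate code
path trying to reconcile the set with the live tservers.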

After this change is made, fate can be simplified in a follow-on commit
to remove all specialization for the primary manager.  Also, the monitor
can now directly access this set instead of making an RPC to the
manager; will open a follow-on issue for this.

After this change, meta fate and user fate are both treated mostly the
same in the managers.  One difference is in assignment: the entire meta
fate range is assigned to a single manager, while user fate is spread
across all managers.  Both are now assigned by the primary manager
using the same RPCs.  The primary manager used to directly start a meta
fate instance.
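
A minimal sketch of the assignment difference described above, under
stated assumptions: `FateAssigner`, the `"META:all"` and `"USER:n"`
labels, the choice of the first manager for meta fate, and round-robin
spreading of user partitions are all hypothetical.  The PR text does not
say how user fate ranges are divided or which manager receives the meta
range.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not an Accumulo class.
final class FateAssigner {
  // Returns manager -> assigned ranges. The whole meta range goes to one
  // manager; user partitions are spread across all managers.
  static Map<String, List<String>> assign(List<String> managers, int userPartitions) {
    if (managers.isEmpty()) {
      throw new IllegalArgumentException("no managers");
    }
    Map<String, List<String>> out = new LinkedHashMap<>();
    for (String m : managers) {
      out.put(m, new ArrayList<>());
    }
    // Assumption: the first manager in the list takes the entire meta range.
    out.get(managers.get(0)).add("META:all");
    // Assumption: user partitions are handed out round-robin.
    for (int p = 0; p < userPartitions; p++) {
      out.get(managers.get(p % managers.size())).add("USER:" + p);
    }
    return out;
  }
}
```

In this sketch both kinds of range come out of the same `assign` call,
mirroring the point that the primary manager now hands out both through
the same path.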

Was able to remove the extension of FateEnv from the manager class in
this change, which caused a ripple of test changes.  But now there are
no longer two different implementations of FateEnv.

Before this change a fate client was only available on the primary
manager.  Now fate clients are available on all managers.  The primary
manager publishes fate assignment locations in zookeeper.  These
locations are used by managers to send notifications to other managers
when they seed a fate operation.
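
The publish-then-notify flow above could look roughly like this.  It is a
sketch only: `SeedNotifier`, `publish`, and `seed` are hypothetical names,
an in-memory map stands in for the locations published in zookeeper, a
`BiConsumer` stands in for the notification RPC, and hashing the fate id
to a partition is an assumption made purely for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical illustration, not an Accumulo class.
final class SeedNotifier {
  // partition -> address of the manager assigned to it (as published)
  private final Map<Integer, String> published = new ConcurrentHashMap<>();
  private final int partitions;
  // Stand-in for the notification RPC: (managerAddress, fateId)
  private final BiConsumer<String, String> rpc;

  SeedNotifier(int partitions, BiConsumer<String, String> rpc) {
    this.partitions = partitions;
    this.rpc = rpc;
  }

  // Called on behalf of the primary manager when it assigns a partition.
  void publish(int partition, String managerAddress) {
    published.put(partition, managerAddress);
  }

  // Called by any manager after it seeds an operation: look up the manager
  // assigned to that operation and notify it, if a location is published.
  Optional<String> seed(String fateId) {
    int partition = Math.floorMod(fateId.hashCode(), partitions);
    String owner = published.get(partition);
    if (owner != null) {
      rpc.accept(owner, fateId);
    }
    return Optional.ofNullable(owner);
  }
}
```

If no location has been published yet, the sketch simply skips the
notification; how the real code handles that case is not stated here.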