RE: [ogfs-dev]RE: [Opendlm-devel] Making progress on mounting with ODLM lockmodule
Just adding a few more comments, see below ...
-- Ben --
> >
> > > Stanley's removal of a call to release_mount_lock() yesterday got me
> > > thinking about the internal (to lock module) MOUNT lock. I'm thinking
> > > right now that we don't need it; memexp used an internal MOUNT lock to
> > > determine whether a node was first-to-mount (see ogfs-memexp doc).
>
> > As I understand it, the mount lock is used to make sure that only
> > one node is doing the mount work at a time. It serializes all mount
> > requests. Please check.
> >
> > Best Regards,
> > Stan
>
> Yes, I've taken another look ... I think I understand things better now.
> Here is my current analysis (comments welcome), with apologies for the
> length of the discussion:
>
> Summary:
>
> This topic is confusing because we need to separate the overall "mount
> work" into several different aspects/operations.
>
> One of the *most* confusing aspects of this is that memexp's "MOUNT"
> lock does not map directly to the "MOUNT" lock in the opendlm lock
> module. Memexp's "MOUNT" lock record was not just a simple lock; it
> also contained status about "first-to-mount" and "others-may-mount".
> "others-may-mount" status keeps non-first-to-mount nodes from mounting
> the filesystem until the first-to-mount node has recovered *all*
> journals ....
>
> ... The opendlm lock module uses the deadman lock mechanism as a
> replacement for determining first-to-mount ("YES", if we can grab all
> deadman locks immediately). But the deadman mechanism does not, by
> itself, handle "others-may-mount". This requires a separate lock. We
> need to be told by the filesystem code (via "others_may_mount()") when
> to release that lock. I think that this is the specific reason that we
> need the opendlm "MOUNT" lock.
>
> Details:
>
> There are two separate "mounts" going on, and two separate MOUNT locks,
> when using OpenDLM with OpenGFS:
>
> -- (first) for OpenDLM, which grabs lock #0, type LM_TYPE_MOUNT, when
> starting setup of the deadman locks (deadman.c,
> start_deadman_lock()). This keeps multiple nodes from simultaneously
> attempting the *initial deadman setup* (do we need this protection?
YES, we do ... If two nodes tried the initial deadman setup at exactly
the same time, each would be able to grab its own deadman EX, but not
the other's deadman EX. Both would therefore conclude that they are
*not* first-to-mount.
> Or, is this what the lock was really designed to do? See
> discussion below).
>
> -- (second) for OpenGFS filesystem, which grabs lock #0
> (OGFS_MOUNT_LOCK), type LM_TYPE_NONDISK. (super_linux.c,
> ogfs_read_super(), call to ogfs_glock_num()). This keeps multiple nodes
> from simultaneously *mounting the filesystem*.
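To make the first-to-mount race above concrete, here is a rough sketch in
C. It is a toy model only, not the OpenDLM API: struct cluster, try_grab(),
grab_own(), probe_others() and the two setup_*() helpers are all made up
for illustration, and it assumes that probed deadman locks are dropped
again after the check.

```c
#include <assert.h>
#include <stdbool.h>

#define NODES 2
#define FREE  (-1)

/* Toy model, not the OpenDLM API: deadman[i] is the id of the node
 * holding deadman lock i in EX mode, or FREE. */
struct cluster {
    int deadman[NODES];
};

static void cluster_init(struct cluster *c)
{
    for (int i = 0; i < NODES; i++)
        c->deadman[i] = FREE;
}

/* Non-blocking EX request: succeeds only if the lock is free or ours. */
static bool try_grab(struct cluster *c, int lock, int node)
{
    assert(lock >= 0 && lock < NODES);
    if (c->deadman[lock] != FREE && c->deadman[lock] != node)
        return false;
    c->deadman[lock] = node;
    return true;
}

/* Step 1 of the model: take our own deadman lock. */
static bool grab_own(struct cluster *c, int node)
{
    return try_grab(c, node, node);
}

/* Step 2 of the model: first-to-mount only if every other deadman lock
 * can be grabbed immediately. Probed locks are dropped again afterwards
 * (an assumption of this sketch). */
static bool probe_others(struct cluster *c, int node)
{
    bool first = true;

    for (int i = 0; i < NODES; i++)
        if (i != node && !try_grab(c, i, node))
            first = false;
    for (int i = 0; i < NODES; i++)
        if (i != node && c->deadman[i] == node)
            c->deadman[i] = FREE;
    return first;
}

/* No MOUNT lock: both nodes take their own lock before either probes. */
static void setup_interleaved(bool first[NODES])
{
    struct cluster c;

    cluster_init(&c);
    grab_own(&c, 0);
    grab_own(&c, 1);
    first[0] = probe_others(&c, 0);
    first[1] = probe_others(&c, 1);
}

/* MOUNT lock held around setup: node 0 finishes before node 1 starts. */
static void setup_serialized(bool first[NODES])
{
    struct cluster c;

    cluster_init(&c);
    grab_own(&c, 0);
    first[0] = probe_others(&c, 0);
    grab_own(&c, 1);
    first[1] = probe_others(&c, 1);
}
```

In this model, setup_interleaved() leaves both nodes believing they are
not first-to-mount, while setup_serialized() lets node 0 see itself as
first and node 1 as non-first.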
>
> Note that these are separate and distinct locks. And, of course, the
> deadman setup must happen before the filesystem can grab any locks at
> all; opendlm must be successfully set up before OGFS can use it.
>
> So far, these locks could be viewed as pretty darn independent, the
> OpenDLM lock protecting the setup of the deadman locks (this protection
> is what I was thinking was not necessary), and the OpenGFS lock
> protecting the filesystem mount.
>
> However, *in addition*, there is a consideration about supporting the
> first-to-mount filesystem node. We need to keep other nodes from
> mounting until the first-to-mount has recovered *all* journals.
> Otherwise, another node might get its filesystem mounted before *all*
> journals have been recovered. The OpenGFS filesystem "MOUNT" lock is
> not sufficient for this ... OGFS grabs it too late, significantly after
> the deadman setup has determined whether we're first-to-mount. This
> would allow another node to:
> would allow another node to:
>
> 1) Do OpenDLM deadman setup
> 2) Determine that it is not first-to-mount
> 3) Recover (only) its own journal
> 4) Mount the filesystem before we complete all-journal recovery.
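The "others-may-mount" gate that prevents steps 1) to 4) above can be
sketched roughly as follows. This is only a model under my own
assumptions: struct mount_state and the model_*() functions are
hypothetical names, and the real lock module simply releases the MOUNT
lock when the filesystem calls others_may_mount(); the journal counting
here stands in for the filesystem's own bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state guarded by the opendlm MOUNT lock. */
struct mount_state {
    int  journals_total;
    int  journals_recovered;
    bool mount_lock_held;    /* held by the first-to-mount node */
};

/* First-to-mount node keeps the MOUNT lock after deadman setup. */
static void model_first_start(struct mount_state *s, int journals)
{
    s->journals_total = journals;
    s->journals_recovered = 0;
    s->mount_lock_held = true;
}

/* First-to-mount node recovers one more journal. */
static void model_recover_one(struct mount_state *s)
{
    assert(s->journals_recovered < s->journals_total);
    s->journals_recovered++;
}

/* Stand-in for the others_may_mount() notification: the MOUNT lock is
 * dropped only once *all* journals have been recovered. */
static bool model_others_may_mount(struct mount_state *s)
{
    if (s->journals_recovered < s->journals_total)
        return false;
    s->mount_lock_held = false;
    return true;
}

/* A non-first node may proceed only when the MOUNT lock is free. */
static bool model_nonfirst_can_mount(const struct mount_state *s)
{
    return !s->mount_lock_held;
}
```

With three journals, model_nonfirst_can_mount() stays false until the
third model_recover_one() call has been followed by
model_others_may_mount().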
>
> So, I think the OpenDLM mount lock has been doing double-duty, both as
> the deadman setup protection, *and* the first-to-mount protection.
> Actually, based on the presence of "release_mount_lock()" under
> OGFS_DLM_NONFIRST, as well as in "opendlm_others_may_mount()", I think
> it has really been doing the first-to-mount protection, not the deadman
> setup protection (the second pass of deadman setup happens *without* the
> lock being held).
>
> The first-to-mount functionality is a requirement. So, I'm going to
> re-instate the MOUNT lock functionality, but add some comments to make
> it clear what it's really doing.
Including the first-to-mount detection.
>
> Question: Do we need to add another MOUNT lock for protecting deadman
> setup all the way through the second pass? Or, in the case of
> non-first-to-mount, simply wait to release the MOUNT lock until after
> the second pass?
Still not sure if we need the mount lock all the way through the second
pass, but it would make code a little cleaner if we grabbed and released
the lock (in cases of error or *not* first-to-mount) in opendlm_mount(),
just before/after the call to start_deadman_lock(). Doing so would
protect the second pass, and put all grab/release calls in the same
dlm.c file, close to one another:
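    error = grab_mount_lock();
    if (error) { ... }

    error = start_deadman_lock(dlm, first);
    if (error < 0) {
            /* deadman setup failed: don't leave the lock held */
            release_mount_lock();
            ...
    }
    if (!(*first)) {
            /* not first-to-mount: nothing more to protect */
            release_mount_lock();
    }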
Any danger in wrapping the second pass with the mount lock??
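For anyone who wants to play with the grab/release pairing outside the
lock module, here is a compilable sketch of the same flow. Everything in
it is a stand-in: struct dlm_stub, mount_lock_depth, mount_sketch() and
the function signatures are my assumptions, not the real dlm.c or
deadman.c code.

```c
#include <assert.h>

/* Hypothetical stub: lets a caller script what deadman setup reports. */
struct dlm_stub {
    int deadman_result;   /* < 0 simulates a deadman setup error */
    int is_first;         /* nonzero simulates first-to-mount */
};

/* Counts outstanding grabs so the pairing can be checked. */
static int mount_lock_depth;

static int grab_mount_lock(void)
{
    mount_lock_depth++;
    return 0;
}

static void release_mount_lock(void)
{
    assert(mount_lock_depth > 0);
    mount_lock_depth--;
}

/* Stand-in for both deadman passes; the real one lives in deadman.c. */
static int start_deadman_lock(struct dlm_stub *dlm, int *first)
{
    if (dlm->deadman_result < 0)
        return dlm->deadman_result;
    *first = dlm->is_first;
    return 0;
}

/* Mount lock wraps the whole deadman setup. On success it stays held
 * only for the first-to-mount node (until others-may-mount). */
static int mount_sketch(struct dlm_stub *dlm, int *first)
{
    int error;

    error = grab_mount_lock();
    if (error)
        return error;

    error = start_deadman_lock(dlm, first);
    if (error < 0) {
        release_mount_lock();
        return error;
    }
    if (!(*first))
        release_mount_lock();
    return 0;
}
```

After mount_sketch() returns, mount_lock_depth should be 1 only in the
first-to-mount case and 0 for both the error and non-first cases.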
>
> -- Ben --
>
>
>
> >
> > > But
> > > we're using deadman locks to do the same. I don't think we need the
> > > (redundant/useless?) MOUNT lock, so I commented out the
> > > grab_mount_lock() and release_mount_lock() implementations. We can
> > > think about that a little more before totally removing those functions.
> >
> > > -- Ben --
> > >
> > >
> > >
>
>
> -------------------------------------------------------
> SF.Net is sponsored by: Speed Start Your Linux Apps Now.
> Build and deploy apps & Web services for Linux with
> a free DVD software kit from IBM. Click Now!
> http://ads.osdn.com/?ad_id56&alloc_id438&op=ick
> _______________________________________________
> Opengfs-devel mailing list
> Opengfs-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/opengfs-devel
>