Re: Deadlock when remounting nested direct maps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Sun, May 27, 2012 at 11:02 PM, Ian Kent <raven@xxxxxxxxxx> wrote:
> On Wed, 2012-02-29 at 14:10 -0400, Leonardo Chiquitto wrote:
>> On Wed, Dec 14, 2011 at 2:05 AM, Ian Kent <raven@xxxxxxxxxx> wrote:
>> > On Mon, 2011-12-12 at 19:08 -0300, Leonardo Chiquitto wrote:
>> >> On Mon, Dec 12, 2011 at 3:39 AM, Ian Kent <raven@xxxxxxxxxx> wrote:
>> >> > On Mon, 2011-12-05 at 12:52 -0300, Leonardo Chiquitto wrote:
>> >> >> Hello,
>> >> >>
>> >> >> There is a problem in AutoFS that makes the daemon hang forever when
>> >> >> starting up. It can be reproduced reliably with the latest git checkout,
>> >> >> if you use the following configure options: --disable-mount-locking
>> >> >> --enable-forced-shutdown --enable-ignore-busy.
>> >> >>
>> >> >> You'll also need to setup a nested direct map:
>> >> >>
>> >> >> # cat /etc/auto.master
>> >> >> /-    /etc/auto.direct
>> >> >>
>> >> >> # cat /etc/auto.direct
>> >> >> /test \
>> >> >>       / -nfsvers=3 10.1.1.65:/nfs \
>> >> >>       /var -nfsvers=3 10.1.1.65:/nfs/var
>> >> >>
>> >> >> And then run these steps:
>> >> >>
>> >> >>   1. Start AutoFS
>> >> >>   2. cd /test
>> >> >>   3. Stop AutoFS
>> >> >>   4. Start AutoFS
>> >> >>
>> >> >> The automounter will deadlock when starting up because it calls
>> >> >> cache_writelock() in a lock that is already "readlocked" by the
>> >> >> same thread. Here's the call traces (Thread 3 is the one that
>> >> >> deadlocks):
>> >> >>
>> >> >> Thread 4 (Thread 0x7f63a0b18700 (LWP 6923)):
>> >> >> #0  0x00007f63a0702e6c in pthread_cond_wait@@GLIBC_2.3.2 ()
>> >> >>    from /lib64/libpthread.so.0
>> >> >> #1  0x00007f63a0b665c8 in master_do_mount (entry=0x7f63a0d9dd30)
>> >> >>     at master.c:1065
>> >> >> #2  0x00007f63a0b66a38 in master_mount_mounts (master=0x7f63a0d82310,
>> >> >>     age=1323091131, readall=0) at master.c:1206
>> >> >> #3  0x00007f63a0b65c03 in master_read_master (master=0x7f63a0d82310,
>> >> >>     age=1323091131, readall=0) at master.c:860
>> >> >> #4  0x00007f63a0b45d45 in main (argc=0, argv=0x7fff20d087a0)
>> >> >>     at automount.c:2202
>> >> >>
>> >> >> Thread 3 (Thread 0x7f639f564700 (LWP 6928)):
>> >> >> #0  0x00007f63a0702610 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
>> >> >> #1  0x00007f63a0b56d48 in cache_writelock (mc=0x7f63a0d9e010) at cache.c:72
>> >> >> #2  0x00007f639e706c23 in parse_mount (ap=0x7f63a0d9de50,
>> >> >>     name=0x7f639f563400 "/nfs", name_len=4,
>> >> >>     mapent=0x7f639f562400 "/ -nfsvers=3 10.1.1.65:/nfs \t/var -nfsvers=3 10.1.1.65:/nfs/var", context=0x7f63a0d83c90) at parse_sun.c:1550
>> >> >> #3  0x00007f639eb498ea in lookup_mount (ap=0x7f63a0d9de50,
>> >> >>     name=0x7f63a0da4460 "/nfs", name_len=4, context=0x7f63a0d899a0)
>> >> >>     at lookup_file.c:1074
>> >> >> #4  0x00007f63a0b523d5 in do_lookup_mount (ap=0x7f63a0d9de50,
>> >> >>     map=0x7f63a0da40d0, name=0x7f63a0da4460 "/nfs", name_len=4) at lookup.c:678
>> >> >> #5  0x00007f63a0b5258c in lookup_name_file_source_instance (ap=0x7f63a0d9de50,
>> >> >>     map=0x7f63a0d9df60, name=0x7f63a0da4460 "/nfs", name_len=4) at lookup.c:718
>> >> >> #6  0x00007f63a0b52cc3 in lookup_nss_mount (ap=0x7f63a0d9de50, source=0x0,
>> >> >>     name=0x7f63a0da4460 "/nfs", name_len=4) at lookup.c:909
>> >> >> #7  0x00007f63a0b5c678 in do_remount_direct (ap=0x7f63a0d9de50, fd=11,
>> >> >>     path=0x7f63a0da4460 "/nfs") at mounts.c:1159
>> >> >> #8  0x00007f63a0b5d003 in remount_active_mount (ap=0x7f63a0d9de50,
>> >> >>     mc=0x7f63a0d9e010, path=0x7f63a0da4460 "/nfs", devid=31, type=2,
>> >> >>     ioctlfd=0x7f639f563a30) at mounts.c:1329
>> >> >> #9  0x00007f63a0b5d0f3 in try_remount (ap=0x7f63a0d9de50, me=0x7f63a0da4390,
>> >> >>     type=2) at mounts.c:1357
>> >> >> #10 0x00007f63a0b49dee in do_mount_autofs_direct (ap=0x7f63a0d9de50,
>> >> >>     mnts=0x7f63a0da44d0, me=0x7f63a0da4390) at direct.c:339
>> >> >> #11 0x00007f63a0b4a811 in mount_autofs_direct (ap=0x7f63a0d9de50)
>> >> >>     at direct.c:511
>> >> >> #12 0x00007f63a0b42f2a in mount_autofs (ap=0x7f63a0d9de50,
>> >> >>     root=0x7f63a0d83500 "/-") at automount.c:1003
>> >> >> #13 0x00007f63a0b44a80 in handle_mounts (arg=0x7fff20d08440)
>> >> >>     at automount.c:1561
>> >> >> #14 0x00007f63a06fef05 in start_thread () from /lib64/libpthread.so.0
>> >> >> #15 0x00007f639f84853d in clone () from /lib64/libc.so.6
>> >> >>
>> >> >> Thread 2 (Thread 0x7f63a0b16700 (LWP 6925)):
>> >> >> #0  0x00007f63a0702e6c in pthread_cond_wait@@GLIBC_2.3.2 ()
>> >> >>    from /lib64/libpthread.so.0
>> >> >> #1  0x00007f63a0b5625a in st_queue_handler (arg=0x0) at state.c:1073
>> >> >> #2  0x00007f63a06fef05 in start_thread () from /lib64/libpthread.so.0
>> >> >> #3  0x00007f639f84853d in clone () from /lib64/libc.so.6
>> >> >>
>> >> >> Thread 1 (Thread 0x7f63a0b32700 (LWP 6924)):
>> >> >> #0  0x00007f63a0702e6c in pthread_cond_wait@@GLIBC_2.3.2 ()
>> >> >>    from /lib64/libpthread.so.0
>> >> >> #1  0x00007f63a0b629f8 in alarm_handler (arg=0x0) at alarm.c:186
>> >> >> #2  0x00007f63a06fef05 in start_thread () from /lib64/libpthread.so.0
>> >> >> #3  0x00007f639f84853d in clone () from /lib64/libc.so.6
>> >> >>
>> >> >> And the debug logs:
>> >> >>
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: Starting automounter version 5.0.6, master map auto.master
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: using kernel protocol version 5.02
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: lookup_nss_read_master: reading master files auto.master
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: parse_init: parse(sun): init gathered global options: (null)
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: lookup_read_master: lookup(file): read entry /-
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: master_do_mount: mounting /-
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: automount_path_to_fifo: fifo name /run/autofs.fifo--
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: lookup_nss_read_map: reading map file /etc/auto.direct
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: parse_init: parse(sun): init gathered global options: (null)
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: remount_active_mount: trying to re-connect to mount /nfs
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: mounted direct on /nfs with timeout 300, freq 75 seconds
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: lookup_mount: lookup(file): looking up /nfs
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: lookup_mount: lookup(file): /nfs -> / -nfsvers=3 10.1.1.65:/nfs #011/var -nfsvers=3 10.1.1.65:/nfs/var
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: parse_mount: parse(sun): expanded entry: / -nfsvers=3 10.1.1.65:/nfs #011/var -nfsvers=3 10.1.1.65:/nfs/var
>> >> >> Dec  5 10:18:51 linux-2alv automount[6923]: parse_mount: parse(sun): gathered options:
>> >> >>
>> >> >> Here's an initial analysis of the Thread 3 behavior:
>> >> >>
>> >> >> #15 clone ()
>> >> >> #14 start_thread ()
>> >> >> #13 handle_mounts at automount.c:1561
>> >> >> #12 mount_autofs at automount.c:1003
>> >> >> #11 mount_autofs_direct at direct.c:511
>> >> >>
>> >> >>   In mount_autofs_direct() (daemon/direct.c:486), cache_readlock()
>> >> >>   acquires the lock that will be unlocked by pthread_cleanup_pop(1)
>> >> >>   (daemon/direct.c:515). However, before the unlock, the remount
>> >> >>   code path goes on like this:
>> >> >>
>> >> >> #10 do_mount_autofs_direct at direct.c:339
>> >> >> #9  try_remount at mounts.c:1357
>> >> >> #8  remount_active_mount at mounts.c:1329
>> >> >> #7  do_remount_direct at mounts.c:1159
>> >> >> #6  lookup_nss_mount at lookup.c:909
>> >> >> #5  lookup_name_file_source_instance at lookup.c:718
>> >> >> #4  do_lookup_mount at lookup.c:678
>> >> >> #3  lookup_mount at lookup_file.c:1074
>> >> >> #2  parse_mount at parse_sun.c:1550
>> >> >>
>> >> >>   ... calling cache_writelock() in the same mapent_cache that was
>> >> >>   locked above.
>> >> >>
>> >> >> #1  cache_writelock at cache.c:72
>> >> >>
>> >> >> Any ideas on how this could be fixed?
>> >> >
>> >> > Give this patch a try please.
>> >> > I still need to test using the submount test since there may be a
>> >> > problem with them, based on the comment in the code that's been
>> >> > removed. I'll report back on how that goes.
>> >> >
>> >> > autofs-5.0.6 - fix remount deadlock
>> >> >
>> >> > From: Ian Kent <raven@xxxxxxxxxx>
>> >> >
>> >> > When reconstructing the mount tree upon restart a writelock to the map
>> >> > entry cache cannot be taken when parsing an entry because a readlock
>> >> > already is held higher up in the call tree.
>> >> >
>> >> > In the place this is done it shouldn't be necessary to alter the map
>> >> > entries in the cache and it shouldn't be necessary to delete an existing
>> >> > multi-mount cache entries to avoid a duplicate multi-mount entry error
>> >> > return. The check for a duplicate should be able to be done in the cache
>> >> > handling functions.
>> >> >
>> >> > I don't remember how this code came to be in the parsing function but
>> >> > it is likely it was due to a problem observed when testing using the
>> >> > autofs connectathon tests. With the change here the tests still pass
>> >> > and attempts to add a duplicate multi-mount entry are still detected.
>> >> >
>> >> > But the deadlock reported by Leonardo Chiquitto also appears to be
>> >> > resolved.
>> >> > ---
>> >> >
>> >> >  lib/cache.c         |    8 +++++---
>> >> >  modules/parse_sun.c |   38 ++++++--------------------------------
>> >> >  2 files changed, 11 insertions(+), 35 deletions(-)
>> >> >
>> >> >
>> >> > diff --git a/lib/cache.c b/lib/cache.c
>> >> > index 9acf736..3464e7d 100644
>> >> > --- a/lib/cache.c
>> >> > +++ b/lib/cache.c
>> >> > @@ -658,10 +658,12 @@ int cache_add_offset(struct mapent_cache *mc, const char *mkey, const char *key,
>> >> >                return CHE_FAIL;
>> >> >
>> >> >        me = cache_lookup_distinct(mc, key);
>> >> > -       if (me && me != owner)
>> >> > -               return CHE_DUPLICATE;
>> >> > +       if (me && me->age == age) {
>> >> > +               if (me != owner)
>> >> > +                       return CHE_DUPLICATE;
>> >> > +       }
>> >> >
>> >> > -       ret = cache_add(mc, owner->source, key, mapent, age);
>> >> > +       ret = cache_update(mc, owner->source, key, mapent, age);
>> >> >        if (ret == CHE_FAIL) {
>> >> >                warn(logopt, "failed to add key %s to cache", key);
>> >> >                return CHE_FAIL;
>> >> > diff --git a/modules/parse_sun.c b/modules/parse_sun.c
>> >> > index 13b8af8..0da1802 100644
>> >> > --- a/modules/parse_sun.c
>> >> > +++ b/modules/parse_sun.c
>> >> > @@ -843,12 +843,14 @@ add_offset_entry(struct autofs_point *ap, const char *name,
>> >> >                strcpy(m_mapent, loc);
>> >> >
>> >> >        ret = cache_add_offset(mc, name, m_key, m_mapent, age);
>> >> > -       if (ret == CHE_OK)
>> >> > -               debug(ap->logopt, MODPREFIX
>> >> > -                     "added multi-mount offset %s -> %s", path, m_mapent);
>> >> > -       else
>> >> > +       if (ret != CHE_UPDATED)
>> >> >                warn(ap->logopt, MODPREFIX
>> >> >                      "syntax error or duplicate offset %s -> %s", path, loc);
>> >> > +       else {
>> >> > +               debug(ap->logopt, MODPREFIX
>> >> > +                     "added multi-mount offset %s -> %s", path, m_mapent);
>> >> > +               ret = CHE_OK;
>> >> > +       }
>> >> >
>> >> >        return ret;
>> >> >  }
>> >> > @@ -1547,34 +1549,6 @@ int parse_mount(struct autofs_point *ap, const char *name,
>> >> >                        strcat(m_root, name);
>> >> >                }
>> >> >
>> >> > -               cache_writelock(mc);
>> >> > -               me = cache_lookup_distinct(mc, name);
>> >> > -               if (!me) {
>> >> > -                       int ret;
>> >> > -                       /*
>> >> > -                        * Not in the cache, perhaps it's a program map
>> >> > -                        * or one that doesn't support enumeration
>> >> > -                        */
>> >> > -                       ret = cache_add(mc, source, name, mapent, time(NULL));
>> >> > -                       if (ret == CHE_FAIL) {
>> >> > -                               cache_unlock(mc);
>> >> > -                               free(options);
>> >> > -                               return 1;
>> >> > -                       }
>> >> > -               } else {
>> >> > -                       /*
>> >> > -                        * If the entry exists it must not have any existing
>> >> > -                        * multi-mount subordinate entries since we are
>> >> > -                        * mounting this afresh. We need to do this to allow
>> >> > -                        * us to fail on the check for duplicate offsets in
>> >> > -                        * we don't know when submounts go away.
>> >> > -                        */
>> >> > -                       cache_multi_writelock(me);
>> >> > -                       cache_delete_offset_list(mc, name);
>> >> > -                       cache_multi_unlock(me);
>> >> > -               }
>> >> > -               cache_unlock(mc);
>> >> > -
>> >> >                cache_readlock(mc);
>> >> >                me = cache_lookup_distinct(mc, name);
>> >> >                if (me) {
>> >>
>> >> Hello Ian,
>> >>
>> >> Thanks for the patch! I confirm the deadlock is gone. I'm investigating a
>> >> new problem in the same test case (restarting AutoFS when a mount
>> >> with nested mounts is busy), but it's probably unrelated to the patch.
>> >> I'll gather more information and report that in a new thread to avoid
>> >> confusion.
>> >
>> > But testing using the submount test indicates there may be a problem
>> > with submounts using this patch or a new problem introduced with recent
>> > changes. I'm still working on it.
>> >
>> > The handling of nested mounts is difficult, especially upon restart in
>> > the presence of in use mounts. I look forward to your feedback on the
>> > problem.
>>
>> Hello Ian,
>>
>> Did you have time to revisit this problem? The "submounts problem" you
>> mentioned above was really introduced by the patch (fix remount deadlock)
>> you posted here?
>
> The problem I saw might not have been introduced by the patch above but
> that code most likely there for a reason that's still relevant.
>
> This patch I think is the way to do it, can you try it please.
>
> autofs-5.0.6 - fix remount deadlock
>
> From: Ian Kent <raven@xxxxxxxxxx>
>
> When reconstructing the mount tree upon restart a writelock to the map
> entry cache cannot be taken when parsing a direct map entry because a
> readlock is already held higher up in the call tree.
>
> In the place this is done it isn't be necessary to alter the direct map
> entries in the cache. Also, it shouldn't be necessary to delete existing
> multi-mount cache entries to avoid a duplicate multi-mount entry error
> return. The check for a duplicate can be done in the cache handling
> functions.
> ---
>
>  lib/cache.c         |    8 ++++---
>  modules/parse_sun.c |   60 ++++++++++++++++++++++++++-------------------------
>  2 files changed, 35 insertions(+), 33 deletions(-)
>
>
> diff --git a/lib/cache.c b/lib/cache.c
> index 9acf736..3464e7d 100644
> --- a/lib/cache.c
> +++ b/lib/cache.c
> @@ -658,10 +658,12 @@ int cache_add_offset(struct mapent_cache *mc, const char *mkey, const char *key,
>                return CHE_FAIL;
>
>        me = cache_lookup_distinct(mc, key);
> -       if (me && me != owner)
> -               return CHE_DUPLICATE;
> +       if (me && me->age == age) {
> +               if (me != owner)
> +                       return CHE_DUPLICATE;
> +       }
>
> -       ret = cache_add(mc, owner->source, key, mapent, age);
> +       ret = cache_update(mc, owner->source, key, mapent, age);
>        if (ret == CHE_FAIL) {
>                warn(logopt, "failed to add key %s to cache", key);
>                return CHE_FAIL;
> diff --git a/modules/parse_sun.c b/modules/parse_sun.c
> index d7e74e2..8b0f8b6 100644
> --- a/modules/parse_sun.c
> +++ b/modules/parse_sun.c
> @@ -843,12 +843,17 @@ add_offset_entry(struct autofs_point *ap, const char *name,
>                strcpy(m_mapent, loc);
>
>        ret = cache_add_offset(mc, name, m_key, m_mapent, age);
> -       if (ret == CHE_OK)
> +       if (ret == CHE_DUPLICATE)
> +               warn(ap->logopt, MODPREFIX
> +                    "syntax error or duplicate offset %s -> %s", path, loc);
> +       else if (ret == CHE_FAIL)
> +               debug(ap->logopt, MODPREFIX
> +                     "failed to add multi-mount offset %s -> %s", path, m_mapent);
> +       else {
> +               ret = CHE_OK;
>                debug(ap->logopt, MODPREFIX
>                      "added multi-mount offset %s -> %s", path, m_mapent);
> -       else
> -               warn(ap->logopt, MODPREFIX
> -                     "syntax error or duplicate offset %s -> %s", path, loc);
> +       }
>
>        return ret;
>  }
> @@ -1410,7 +1415,7 @@ int parse_mount(struct autofs_point *ap, const char *name,
>        char buf[MAX_ERR_BUF];
>        struct map_source *source;
>        struct mapent_cache *mc;
> -       struct mapent *me = NULL;
> +       struct mapent *me;
>        char *pmapent, *options;
>        const char *p;
>        int mapent_len, rv = 0;
> @@ -1561,33 +1566,28 @@ int parse_mount(struct autofs_point *ap, const char *name,
>                        strcat(m_root, name);
>                }
>
> -               cache_writelock(mc);
> -               me = cache_lookup_distinct(mc, name);
> -               if (!me) {
> -                       int ret;
> -                       /*
> -                        * Not in the cache, perhaps it's a program map
> -                        * or one that doesn't support enumeration
> -                        */
> -                       ret = cache_add(mc, source, name, mapent, time(NULL));
> -                       if (ret == CHE_FAIL) {
> -                               cache_unlock(mc);
> -                               free(options);
> -                               return 1;
> +               /*
> +                * Can't take the write lock for direct mount entries here
> +                * but they should always be present in the map entry cache.
> +                */
> +               if (ap->type == LKP_INDIRECT) {
> +                       cache_writelock(mc);
> +                       me = cache_lookup_distinct(mc, name);
> +                       if (!me) {
> +                               int ret;
> +                               /*
> +                                * Not in the cache, perhaps it's a program map
> +                                * or one that doesn't support enumeration.
> +                                */
> +                               ret = cache_add(mc, source, name, mapent, age);
> +                               if (ret == CHE_FAIL) {
> +                                       cache_unlock(mc);
> +                                       free(options);
> +                                       return 1;
> +                               }
>                        }
> -               } else {
> -                       /*
> -                        * If the entry exists it must not have any existing
> -                        * multi-mount subordinate entries since we are
> -                        * mounting this afresh. We need to do this to allow
> -                        * us to fail on the check for duplicate offsets in
> -                        * we don't know when submounts go away.
> -                        */
> -                       cache_multi_writelock(me);
> -                       cache_delete_offset_list(mc, name);
> -                       cache_multi_unlock(me);
> +                       cache_unlock(mc);
>                }
> -               cache_unlock(mc);
>
>                cache_readlock(mc);
>                me = cache_lookup_distinct(mc, name);

Hello Ian,

Thanks for getting back to this issue. I've tested the patch now using the test
case reported earlier on this thread and I confirm that it resolves the deadlock
on startup.

The problem of not reconnecting to the nested mounts remains, but I don't
want to mix up the threads.

Thanks again,
Leonardo
--
To unsubscribe from this list: send the line "unsubscribe autofs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Tools]     [DDR & Rambus]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

Add to Google