RE: [Opendlm-devel] Re: [ogfs-users]opengfs + opendlm question

Hello Stanley,

We actually found and fixed this problem months ago.  In fact, we were able to work through a bunch of issues and successfully start a 4-node cluster with working failover (the one small remaining issue is that two nodes cannot fail at the same time).  Unfortunately, we were waiting to wrap up our cccp testing before releasing the patches back to the community, and that has taken longer than expected.  I think the patch below will fix this specific problem, but beware: there are others.  I will see what I can do about releasing an updated patch.

Best Regards,
Don

--- /home/dzickus/work/opendlm/src/user/dlmdu/dlm_daemon.c      2002-09-27 04:52:43.000000000 -0700
+++ dlm_daemon.c        2004-05-03 10:05:14.000000000 -0700
@@ -318,7 +318,7 @@
        entry++) {

         useaddr->nodeid = entry->dlm_node_number;
-        useaddr->useaddr.s_addr = entry->dlm_node_address.s_addr;
+        useaddr->useaddr.s_addr = 0;
        useaddr->dlm_major = 0;
        useaddr->dlm_minor = 0;

@@ -804,6 +804,8 @@
           need_local = 0;
         }
       } else {
+        count++;
+      }
        /* Find message slot and adjust liveness status (in major/minor and
         * IP address).
         */
@@ -812,7 +814,9 @@
             useaddr++) {

          if (useaddr->nodeid == qnode->dlm_node_number) {
-           if (qnode->dlm_node_state == haDLM_state_up) {
+           if ((qnode->dlm_node_state == haDLM_state_up) ||
+               (qnode->dlm_node_state == haDLM_state_local))
+           {
              useaddr->dlm_major = DLM_MAJOR_VERSION;
              useaddr->dlm_minor = DLM_MINOR_VERSION;
              useaddr->useaddr.s_addr = qnode->dlm_node_address.s_addr;
@@ -824,8 +828,6 @@
            break;
          }
        }
-        count++;
-      }
     } else {
       /*
        * Reported node not found in config file, not a problem...
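
For readers without the surrounding dlm_daemon.c handy, here is a minimal,
self-contained sketch of what the reporting logic looks like once all three
hunks are applied.  The struct layouts, the enum values, and the is_local
test are simplified assumptions made for illustration; only the member,
macro, and state names that appear in the diff come from the real code.
The first hunk zeroes the stored address at table-initialization time,
presumably so a node only carries an address once it has actually reported
in; the other two hunks move count++ so that the slot-update loop also runs
for the local node, and treat haDLM_state_local as alive.

#include <netinet/in.h>          /* struct in_addr */

#define DLM_MAJOR_VERSION 1      /* assumed values; the real ones come */
#define DLM_MINOR_VERSION 0      /* from the OpenDLM headers           */

/* Assumed state values; only the names appear in the diff. */
enum node_state { haDLM_state_down, haDLM_state_up, haDLM_state_local };

struct slot {                    /* simplified stand-in for the slot type */
    int            nodeid;
    struct in_addr useaddr;
    int            dlm_major;
    int            dlm_minor;
};

struct reported_node {           /* simplified stand-in for qnode's type */
    int             dlm_node_number;
    enum node_state dlm_node_state;
    struct in_addr  dlm_node_address;
};

static void handle_reported_node(struct slot *slots, int nslots,
                                 const struct reported_node *qnode,
                                 int is_local, int *need_local, int *count)
{
    struct slot *useaddr;

    if (is_local) {
        *need_local = 0;
    } else {
        (*count)++;              /* patch: count is bumped here, inside the
                                  * else branch, not after the loop below */
    }

    /* Patch: this loop is no longer inside the else branch, so the
     * local node's message slot gets its liveness adjusted too. */
    for (useaddr = slots; useaddr < slots + nslots; useaddr++) {
        if (useaddr->nodeid == qnode->dlm_node_number) {
            /* Patch: haDLM_state_local now also counts as alive. */
            if ((qnode->dlm_node_state == haDLM_state_up) ||
                (qnode->dlm_node_state == haDLM_state_local)) {
                useaddr->dlm_major = DLM_MAJOR_VERSION;
                useaddr->dlm_minor = DLM_MINOR_VERSION;
                useaddr->useaddr.s_addr = qnode->dlm_node_address.s_addr;
            }
            break;
        }
    }
}
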
-----Original Message-----
From:   opendlm-devel-admin@xxxxxxxxxxxxxxxxxxxxx on behalf of Stanley Wang
Sent:   Thu 6/24/2004 2:41 AM
To:     opengfs-users@xxxxxxxxxxxxxxxxxxxxx
Cc:     Arnaud Gauthier; OpenDLM Dev Mail List
Subject:        [Opendlm-devel] Re: [ogfs-users]opengfs + opendlm question

Arnaud,

It looks like a bug in OpenDLM. Actually, I haven't tried to run OpenDLM
on three or more nodes yet; I will try to address this bug ASAP.
I am also forwarding your mail to the opendlm-devel list, hoping to get
help from others who have experience running OpenDLM on three or more
nodes.

Thanks!

Best Regards,
Stan

On Wed, 2004-06-23 at 16:29, Arnaud Gauthier wrote:
> Hi all,
>
> I couldn't find info about my current problem, so I am posting here.
>
> I have created a 3-node opendlm + opengfs cluster, and I am planning
> to add more nodes later. Everything works fine with the first 2 nodes.
>
> Configuration:
> RH 7.3 patched,
> Standard 2.4.22 kernel with "big" patch + qla2300 and adaptec drivers
> patches
> heartbeat 1.2.2 compiled on each node
> opendlm (from CVS) with ccm enabled compiled on each node
> opengfs (from CVS) compiled locally
>
> I have prepared a partition with 32 internal journals, just in case :-))
>
> Everything works fine for the first 2 nodes, but the third machine stays
> with "DLM recovery state: RC_DIR_INIT"
>
> Heartbeat has a connection to all servers (broadcast on eth1) and
> works fine. Whichever machine I start third becomes the one that does
> not start.
>
> Here is my /etc/haDLM file:
> NODECOUNT 3
> 1  server01   192.168.0.1
> 2  server02   192.168.0.2
> 3  server03   192.168.0.3
> DLMNAME haDLM
> DLMMAJOR 250
> DLMCMGR ccm
> DLMADMIN admin 0
> DLMLOCKS locks 1
>
> Can anybody help?
>
> Regards,
> Arnaud
--
Opinions expressed are those of the author and do not represent Intel
Corporation
"gpg --recv-keys --keyserver wwwkeys.pgp.net E1390A7F"
{E1390A7F:3AD1 1B0C 2019 E183 0CFF  55E8 369A 8B75 E139 0A7F}



