
[ogfs-dev]RE: [ogfs-users]RE: Problems with opengfs + opendlm on RHEL 3



 

> -----Original Message-----
> From: opengfs-users-admin@xxxxxxxxxxxxxxxxxxxxx
> [mailto:opengfs-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Marc Swanson
> Sent: Tuesday, July 13, 2004 3:11 PM
> To: opengfs-users@xxxxxxxxxxxxxxxxxxxxx
> Subject: [ogfs-users]RE: Problems with opengfs + opendlm on RHEL 3
> 
> Solved the problems I was having before by removing all existing
> ultramonkey heartbeat rpms (including libnet) and compiling from src.
> I also needed to use LD_ASSUME_KERNEL=2.4 on top of all of that.

Congratulations!  And thanks for sharing.  I'm going to cross-post to
OpenDLM list to let that crew know about your experience.

> 
> So now that I got it working I'm running into further problems, some
> of which I can work around and others I'm struggling to understand how
> to fix.
> 
> Problem 1:  Mounting more than one ogfs filesystem with opendlm does
> not seem very stable at all, true?

Yes, true ... there are a bunch of global/static variables for recovery
in current CVS, and sharing between filesystems is not healthy right now
... I'm working on some changes that will "instancize" recovery stuff.
Was hoping to check in today, but found something else that needs
attention.
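
To illustrate the point about global/static recovery variables, here is a minimal sketch of what "instancizing" such state could look like. It is not the OpenGFS code: `struct recovery_ctx`, its fields, and the `recovery_*` functions are hypothetical names made up for this example. The idea is only that each mounted filesystem carries its own recovery state rather than sharing one file-scope copy.

```c
#include <stdbool.h>

/* Hypothetical per-mount recovery state. Field names are illustrative
 * and are NOT taken from the OpenGFS source. */
struct recovery_ctx {
    int  mount_id;     /* which filesystem instance this state belongs to */
    bool in_progress;  /* true while a recovery pass is running */
    int  failed_node;  /* node being recovered, or -1 if none */
};

/* Before (the problematic pattern): one file-scope copy shared by every
 * mounted filesystem, e.g.
 *     static bool recovery_in_progress;
 *     static int  recovery_failed_node;
 * After: every mount owns a context and passes it explicitly. */

void recovery_init(struct recovery_ctx *ctx, int mount_id)
{
    ctx->mount_id = mount_id;
    ctx->in_progress = false;
    ctx->failed_node = -1;
}

/* Returns 0 on success, -1 if this mount is already recovering. */
int recovery_begin(struct recovery_ctx *ctx, int failed_node)
{
    if (ctx->in_progress)
        return -1;
    ctx->in_progress = true;
    ctx->failed_node = failed_node;
    return 0;
}

void recovery_end(struct recovery_ctx *ctx)
{
    ctx->in_progress = false;
    ctx->failed_node = -1;
}
```

With two contexts, starting recovery on one mount leaves the other mount's state untouched, which is the property the shared globals cannot provide.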


> Not a biggie for me.. I can run just one big filesystem if necessary.

For the moment, use just the one big one.

> 
> Problem 2:  Recovery is not very well documented.. how do you do it!?!
> I set up my 2 nodes with an opendlm mount.  I then unmount one of the
> nodes and stop the dlm and heartbeat.  Any attempt to have that node
> rejoin results in failure.  The dlms seem to resume communications ok
> after restarting things in order (and I even tried a full reboot too),
> but the mount command just hangs.  During all of this the mount on the
> second node stays up, which is good.. but oddly I can't seem to unmount
> that node cleanly?
> 
> What is the correct procedure for recovering from this scenario while
> maintaining high availability?
> 
> Am I doing something wrong?

No, the code just isn't working right yet.  Everything *should* be
automatic (i.e. no docs required!).  The recovery support in the opendlm
lock module (OpenGFS component) is really new, and not very well tested.

And, since RedHat recently released the Sistina GFS, I'm not sure how
much more we're going to be working on this project (OpenGFS), although
it would be nice to get it into a clean state.

Have you tried the RedHat GFS?  See:

http://sources.redhat.com/cluster

http://sources.redhat.com/cluster/gfs/

On the GFS page, there is a link to download source RPMs for GFS for
RHEL 3.


-- Ben --

Opinions are mine, not Intel's

> 
> Thanks!
> 
> -Marc Swanson-
> 
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email sponsored by Black Hat Briefings & Training.
> Attend Black Hat Briefings & Training, Las Vegas July 24-29 - 
> digital self defense, top technical experts, no vendor pitches, 
> unmatched networking opportunities. Visit www.blackhat.com
> _______________________________________________
> Opengfs-users mailing list
> Opengfs-users@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/opengfs-users
> 
> 


_______________________________________________
Opengfs-devel mailing list
Opengfs-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/opengfs-devel

