RE: [Linux-HA] UDP / DHCP / LDIRECTORD



Apparently this is related to some sort of race condition, possibly a problem with my ldirectord start script, which edits the ipvsadm configuration after ldirectord has started. If ldirectord starts to receive traffic on port 67/68 before the following commands are run:

        ipvsadm -E -u 10.10.10.10:67 -o -s rr
        ipvsadm -E -u 10.10.10.10:68 -o -s rr

then it will be stuck sending traffic to the first server in the list.
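
To confirm whether the flag is in place at a given moment, the listing can be checked for UDP services that lack "ops". This is only a rough sketch: the function name udp_services_missing_ops is made up for illustration, and it assumes the listing format shown elsewhere in this thread (protocol, address, scheduler, then flags).

```shell
#!/bin/sh
# Hypothetical helper (not part of ipvsadm or ldirectord).
# Reads `ipvsadm -L -n` output on stdin and prints the address of every
# UDP virtual service whose flag columns do not include "ops".
udp_services_missing_ops() {
    awk '$1 == "UDP" {
        has_ops = 0
        # Fields: 1 = protocol, 2 = address:port, 3 = scheduler, 4.. = flags
        for (i = 4; i <= NF; i++) if ($i == "ops") has_ops = 1
        if (!has_ops) print $2
    }'
}
```

It would be run as "ipvsadm -L -n | udp_services_missing_ops"; empty output means every UDP service currently carries the flag. It says nothing about packets that arrived before the flag was set.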



Brian Carpio 
Senior Systems Engineer

Office: +1.303.962.7242
Mobile: +1.720.319.8617
Email: bcarpio@xxxxxxxxxxxx


-----Original Message-----
From: linux-ha-bounces@xxxxxxxxxxxxxxxxxx [mailto:linux-ha-bounces@xxxxxxxxxxxxxxxxxx] On Behalf Of Brian Carpio
Sent: Thursday, February 24, 2011 3:47 PM
To: 'Simon Horman'
Cc: 'lvs-devel'; 'Julian Anastasov'; 'linux-ha@xxxxxxxxxxxxxxxxxx'
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

All,

So this patch has been working for us flawlessly for the last 5 months or so. 

Our infrastructure is 100% virtualized. The other day our loadbalancer01 had a memory leak and crashed. Since we use ldirectord with heartbeat, loadbalancer02 took over; however, ever since then the single-packet UDP scheduling seems to have stopped working. Even if I fail back over to the loadbalancer01 VM, I still see all the DHCP traffic going to only one backend server.

If I run ipvsadm -L -n, I can see that ipvsadm thinks both of the backend servers are up, since the weight is set to 1 for each server. If I reboot the second backend server (the one which is not receiving any traffic) and then run ipvsadm -L -n, I can see its weight go to 0, and in the ldirectord log I can see that it is marked dead.

I have exported one of the loadbalancers and one of the backend servers (using VMware) and imported them into another ESXi server. Once I boot up the loadbalancer there, it works perfectly. I'm very stumped as to why this would happen. Is there any additional logging you can think of that I might want to enable to see where the exact problem is?

Here are my configs:

 
/etc/ha.d/ldirectord.conf

checktimeout=10
checkinterval=2
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=10.10.10.10:67
        real=backend_server01:67 masq
        real=backend_server02:67 masq
        protocol=udp
        checktype=ping
        scheduler=rr
virtual=10.10.10.10:68
        real=backend_server01:68 masq
        real=backend_server02:68 masq
        protocol=udp
        checktype=ping
        scheduler=rr


I had to rewrite the ldirectord start script and added the following lines in the start and restart sections:

        ipvsadm -E -u 10.10.10.10:67 -o -s rr
        ipvsadm -E -u 10.10.10.10:68 -o -s rr
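
One way the start-script edit could be made less timing-dependent is to wait until ldirectord has actually created the virtual service before editing it. The following is a minimal sketch, not the script used here: the function name enable_ops_when_ready and the IPVSADM override variable are assumptions for illustration, and -o is only understood by an ipvsadm built with the OPS patch. It narrows the window but does not guarantee that no packet arrives before the flag is set.

```shell
#!/bin/sh
# IPVSADM can be overridden (for example with a stub) when testing.
IPVSADM=${IPVSADM:-ipvsadm}

# Hypothetical helper: wait up to $2 seconds (default 10) for the UDP
# virtual service $1 to appear in the listing, then edit it to enable
# one-packet scheduling with the rr scheduler.
enable_ops_when_ready() {
    vip=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # awk exits 0 only if a UDP service line with this address exists
        if $IPVSADM -L -n | awk -v vip="$vip" \
            '$1 == "UDP" && $2 == vip { found = 1 } END { exit !found }'; then
            $IPVSADM -E -u "$vip" -o -s rr
            return $?
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "virtual service $vip never appeared" >&2
    return 1
}
```

In the start and restart sections this would be called once per service, e.g. "enable_ops_when_ready 10.10.10.10:67" and "enable_ops_when_ready 10.10.10.10:68".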


Here is the output of ipvsadm -L -n when both backend servers are up (working environment):


IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67            Masq    1      0          16731     
  -> backend_server02:67            Masq    1      0          17447     
UDP  192.168.181.67:68 rr ops
  -> backend_server01:68            Masq    1      0          0         
  -> backend_server02:68            Masq    1      0          0         

Here is the output of ipvsadm -L -n when both backend servers are up (non-working environment):

[root@lb01 log]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67                 Masq    1      0          1         
  -> backend_server02:67                 Masq    1      0          0         
UDP  10.10.10.10:68 rr ops
  -> backend_server01:68                 Masq    1      0          0         
  -> backend_server02:68                 Masq    1      0          0         


The only difference I see is that in my "Working" environment the InActConn number increases as I send load through it, whereas in my "Non-Working" environment InActConn stays at 1 the entire time. Another difference is that in the "Working" environment I am using a DHCP load testing tool one of my developers wrote, whereas in the "Non-Working" environment we are actually getting DHCP traffic from another network device.
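
To make that comparison easier to repeat, the InActConn column can be pulled out per real server and sampled a few seconds apart. This is a rough sketch assuming the column layout shown in the listings above (InActConn is the last field); the function name inactconn_by_real is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ipvsadm -L -n` output on stdin and prints
# one line per real server: "<virtual service> <real server> <InActConn>".
inactconn_by_real() {
    awk '
        # A service line starts with the protocol; remember its address.
        $1 == "UDP" || $1 == "TCP" { svc = $2; next }
        # Real-server lines start with "->"; skip the column header line.
        $1 == "->" && $2 != "RemoteAddress:Port" { print svc, $2, $NF }
    '
}
```

Running "ipvsadm -L -n | inactconn_by_real" twice under load shows whether the counters move on both real servers or stay pinned on one. The connection table itself can also be listed with "ipvsadm -L -n -c", which may show whether a single long-lived UDP entry is absorbing the traffic.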





Brian Carpio
Senior Systems Engineer

Office: +1.303.962.7242
Mobile: +1.720.319.8617
Email: bcarpio@xxxxxxxxxxxx


-----Original Message-----
From: Brian Carpio
Sent: Thursday, April 15, 2010 1:57 PM
To: Simon Horman
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: RE: [Linux-HA] UDP / DHCP / LDIRECTORD

Simon,

Thanks again for all of your hard work. I have sent over a million UDP DHCP packets at the new kernel/ipvsadm with the patches applied, and currently the only issue (which you know about already) is that ldirectord doesn't know about the -o option, which causes a slight issue with heartbeat (but I just put in a cheap fix in my ldirectord start script to edit the services created by ldirectord).

So not only have I sent over 1,000,000 packets to this setup, but I have also sent them as fast as 10 packets every 3 milliseconds. I plan to do a long-term, week-long test, but I don't foresee any issues.

Let me know if there is any other testing you would like us to do, or if you would like me to send out the kernel-2.6.18-128 with the patch and the ipvsadm-1.24-10 rpm with the patch.

Thanks again, Simon, you are the man!!

Brian Carpio



-----Original Message-----
From: Simon Horman [mailto:horms@xxxxxxxxxxxx]
Sent: Monday, April 12, 2010 8:56 PM
To: Brian Carpio
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

Hi Brian,

here are some patches to test.
I have only lightly tested them to the extent that they compile and appear to configure a valid service.

You can enable one packet scheduling (OPS) by passing the -o option to ipvsadm when creating a virtual service.

	e.g.

	# ipvsadm -A -u 172.17.60.211:80 -o
	# ipvsadm -L -n
	IP Virtual Server version 1.2.1 (size=4096)
	Prot LocalAddress:Port Scheduler Flags
	  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
	UDP  172.17.60.211:80 wlc ops

There are three patches:

ops-kernel-2.6.18-128.el5.patch: Patch against CentOS-5.3's 2.6.18-128 kernel.
ops-ipvsadm-1.24-10: Patch against CentOS-5.3's ipvsadm 1.24-10.
ops-ipvsadm-1.24: Patch against upstream ipvsadm 1.24

I have not up-ported the code to the 2.6.33 kernel and ipvsadm 1.25 yet.


_______________________________________________
Linux-HA mailing list
Linux-HA@xxxxxxxxxxxxxxxxxx
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems



