gratuitous arp and bad mac





I'm troubleshooting a problem on an Active/Active Win2k SQL 2000
cluster. We have two instances of SQL installed, each on a separate
node. Let's call them node1, node2, sql-a, and sql-b. In this example
sql-a is on node1 and sql-b is on node2. The servers are connected to
a Cisco switch.

A client PC accesses data regularly through an ODBC connection to
sql-a. This works fine for a while, but after 20 minutes or so it
stops. I can no longer ping sql-a. I can ping node1 and node2 just fine.

I looked at the ARP table (arp -a) and found that the MAC address for
sql-a now matched the MAC for node2. I flushed the ARP cache
(arp -d *) and pinged sql-a again, and it worked! The sql-a MAC was
correctly set to node1's again, and the ODBC app was back up and running.
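
To catch the moment the entry flips without checking by hand, the arp -a check could be scripted. Below is a minimal Python sketch, assuming the usual Windows arp -a layout (IP address, dash-separated MAC, type). The function names, IP addresses, and MAC addresses are made up for illustration and are not from the real cluster.

```python
import re

# One table row of Windows "arp -a" output: IP, dash-separated MAC, type.
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def parse_arp_table(text):
    """Return {ip: (mac, type)} for every table row found in the text."""
    entries = {}
    for line in text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            ip, mac, kind = match.groups()
            entries[ip] = (mac.lower(), kind.lower())
    return entries


def find_mismatches(entries, expected):
    """Return {ip: (expected_mac, seen_mac)} where the cache disagrees."""
    bad = {}
    for ip, want in expected.items():
        got = entries.get(ip)
        if got is not None and got[0] != want.lower():
            bad[ip] = (want.lower(), got[0])
    return bad


if __name__ == "__main__":
    # Hypothetical snapshot: sql-a (10.0.0.21) shows node2's MAC.
    sample = """
Interface: 10.0.0.50 --- 0x2
  Internet Address      Physical Address      Type
  10.0.0.11             02-00-00-00-00-01     dynamic
  10.0.0.12             02-00-00-00-00-02     dynamic
  10.0.0.21             02-00-00-00-00-02     dynamic
"""
    expected = {"10.0.0.21": "02-00-00-00-00-01"}  # sql-a should sit on node1
    print(find_mismatches(parse_arp_table(sample), expected))
```

On the client, the sample text could be replaced with the live output of arp -a (for example captured through subprocess.run) and the check run every minute or so, logging a timestamp whenever a mismatch appears.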

About 15 minutes later, the problem happened again. Same symptoms:
wrong MAC in the ARP table. I checked, and the sql-a and sql-b
instances are running on the correct nodes; there have not been any
failovers.

So from what I can tell, something is updating the client PC's ARP
table with the wrong MAC address. I suspect that the failover from a
few days ago may be contributing.

I'm guessing that this is caused by a bogus gratuitous ARP message.
I understand the cluster service sends these out upon failover,
but it looks like it is happening now without a failover event.
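
If a packet capture on the client is possible, the gratuitous ARP theory could be tested directly. Below is a minimal Python sketch that decodes the 28-byte Ethernet/IPv4 ARP payload and flags a packet as gratuitous when the sender and target IP addresses are equal, which is the commonly used definition. It assumes the raw ARP payload (after the Ethernet header) has already been obtained from a capture tool. The function names and the addresses in the demo are hypothetical.

```python
import struct

# ARP over Ethernet/IPv4: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes total).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LENGTH = struct.calcsize(ARP_FORMAT)


def parse_arp(payload):
    """Decode an ARP payload (Ethernet header already stripped)."""
    if len(payload) < ARP_LENGTH:
        raise ValueError("ARP payload too short")
    _htype, _ptype, _hlen, _plen, oper, sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:ARP_LENGTH]
    )
    return {
        "operation": oper,  # 1 = request, 2 = reply
        "sender_mac": "-".join(f"{b:02x}" for b in sha),
        "sender_ip": ".".join(str(b) for b in spa),
        "target_ip": ".".join(str(b) for b in tpa),
    }


def is_gratuitous(arp):
    """A gratuitous ARP announces its own address: sender IP == target IP."""
    return arp["sender_ip"] == arp["target_ip"]


if __name__ == "__main__":
    # Hypothetical packet: node2's MAC announcing sql-a's IP address.
    packet = struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, 1,
        bytes.fromhex("020000000002"), bytes([10, 0, 0, 21]),
        b"\x00" * 6, bytes([10, 0, 0, 21]),
    )
    arp = parse_arp(packet)
    print(is_gratuitous(arp), arp["sender_mac"], arp["sender_ip"])
```

A gratuitous ARP for sql-a's address carrying node2's MAC would point at node2 (or its teaming driver) as the source. If no such packet shows up before the cache goes bad, the wrong entry is coming from somewhere else.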

The other reason I believe this is a bad GARP is that we added the
MAC as a static ARP entry, and even the static value was overwritten
with the bad MAC.

I also wonder if the MAC address logic is confused because the server
is using teamed NICs. The teaming driver (Dell) requires a MAC
address to be created and put into the config settings. I wonder if
the cluster is reading the wrong info here.

I have checked that we have unique MAC addresses, and all the other
settings, so I suspect I have found some combination of events that
triggers an unexpected 'windows cluster feature'.


Anyway - I'll take any suggestions from IP & cluster gurus out
there... Thanks.




