Re: NLB Cluster - Ping fails or long time to reply from outside local subnet



Something to keep in mind....if the switch is Layer 3, the ports the NLB nodes plug into must function at layer 2 or it won't work properly.

I'm still interested in the behavior with only one NIC in the picture.

With NLB, it is pretty common practice to break the problem down into its simplest components and see if that works; if it does, build up from there.

--
Chuck Timon, Jr.
Microsoft Corporation
Windows Server 2008 Readiness Team
This posting is provided "AS IS" with no warranties, and confers no rights.

"David Morgan" <microsoft_newsgroups@xxxxxxxxxxxxxxxxxxxxxxx> wrote in message news:ejPPAjI7HHA.5772@xxxxxxxxxxxxxxxxxxxxxxx
Yes, NIC1 and NIC2 (in each machine) are dual port Broadcom cards. They are definitely not being teamed, and I have verified that the locally administered address is being set correctly on the cluster adapter. ARP reports everything as expected.

I will set up the cluster using the DLink card, NIC3, which was added afterwards.

Regardless, none of this would suggest why it works locally but not from outside the LAN. It is as if the router is intermittently not accepting the packets to/from the LAA address or something.
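
For anyone checking the locally administered address against ARP output: as far as I know, NLB's default scheme derives the cluster MAC from the cluster IP, using the prefix 02-BF in unicast mode and 03-BF in multicast mode, followed by the four IP octets in hex. The sketch below assumes that default scheme (a manually overridden cluster MAC will not match), and the address 192.0.2.125 is a made-up documentation address, not the real cluster IP from this thread.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str = "unicast") -> str:
    """Return the MAC that NLB's default scheme derives from the cluster IP.

    Assumption: default addressing only (02-BF-w-x-y-z for unicast,
    03-BF-w-x-y-z for multicast). IGMP multicast is not covered here.
    """
    prefixes = {"unicast": (0x02, 0xBF), "multicast": (0x03, 0xBF)}
    if mode not in prefixes:
        raise ValueError(f"unsupported mode: {mode!r}")
    # .packed gives the four raw bytes of the IPv4 address
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return "-".join(f"{b:02X}" for b in (*prefixes[mode], *octets))


# Hypothetical cluster IP, used only for illustration
EXAMPLE_CLUSTER_IP = "192.0.2.125"
print(nlb_cluster_mac(EXAMPLE_CLUSTER_IP))               # unicast form
print(nlb_cluster_mac(EXAMPLE_CLUSTER_IP, "multicast"))  # multicast form
```

Comparing the result with the MAC shown for the cluster IP in the router's ARP table (or in "arp -a" on a client) is one way to see whether the router has learned the cluster MAC or a node's permanent MAC.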


"Chuck [MSFT]" <ctimon@xxxxxxxxxxxxxxxxxxxx> wrote in message news:ewVUlnC7HHA.1900@xxxxxxxxxxxxxxxxxxxxxxx
Let's simplify this.....disable all but one NIC in each member of the NLB cluster, set up NLB on it, plug the NICs into a hub (remember those? :{>), then uplink the hub to a switch port and test.

If you have NLB in your environment....keep a spare dumb hub around....never know when you might need it.

--
Chuck Timon, Jr.
Microsoft Corporation
Windows Server 2008 Readiness Team
This posting is provided "AS IS" with no warranties, and confers no rights.

"Bookham Measures" <bookham_measures_no_spam@xxxxxxxxx> wrote in message news:OlGc7c96HHA.5404@xxxxxxxxxxxxxxxxxxxxxxx
Hello

We have set up an NLB cluster with two servers, ultimately for an IIS application.

Everything functions perfectly when the clients are on the same subnet as defined by their IP address.

Hosts outside the subnet, i.e. those that come via a router, cannot get reliable ping responses. They get a mixture of "request timed out" and delayed replies, between 54 ms and 1510 ms. The ping summary always reports more than 60% packet loss. This applies to the cluster IP address and to the dedicated IP on the cluster NIC in each server. The timeouts and replies come at different times when running simultaneous pings to the three IP addresses.
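
To compare the three addresses side by side, the ping output can be reduced to a loss percentage and a latency range. The following is a minimal sketch that assumes English-language Windows ping output; the sample text and the address 192.0.2.125 are invented for illustration and are not captured from the servers described here.

```python
import re

# Matches "time=54ms" as well as "time<1ms" in a reply line
TIME_RE = re.compile(r"time[=<](\d+)\s*ms", re.IGNORECASE)


def summarize_ping(output: str) -> dict:
    """Count replies and timeouts in ping output and report loss and latency."""
    times = []
    lost = 0
    for raw in output.splitlines():
        line = raw.strip().lower()
        if line.startswith("request timed out"):
            lost += 1
        elif line.startswith("reply from"):
            match = TIME_RE.search(line)
            if match:
                times.append(int(match.group(1)))
            else:
                # e.g. "Reply from ...: Destination host unreachable."
                lost += 1
    sent = lost + len(times)
    return {
        "sent": sent,
        "received": len(times),
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "min_ms": min(times) if times else None,
        "max_ms": max(times) if times else None,
    }


# Invented sample output, for illustration only
SAMPLE = """\
Request timed out.
Reply from 192.0.2.125: bytes=32 time=54ms TTL=126
Request timed out.
Request timed out.
Reply from 192.0.2.125: bytes=32 time=1510ms TTL=126
"""

print(summarize_ping(SAMPLE))
```

Running the same summary against output saved for the cluster IP and for each dedicated IP makes it easier to see whether all three degrade together or independently.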

All configuration has been performed via the Network Load Balancing Manager.

As we have plenty of adapters, the cluster has been configured in Unicast mode. We have set up a VLAN on our switch and plugged the dedicated cluster NICs into those ports. The configuration is as follows.


Server 1
---------
NIC1: x.x.16.121 DG x.x.16.1 in VLAN
NIC2: x.x.16.123 DG x.x.16.1
NIC3: 192.168.1.21 DG not set.

Server 2
---------
NIC1: x.x.16.122 DG x.x.16.1 in VLAN
NIC2: x.x.16.124 DG x.x.16.1
NIC3: 192.168.1.22 DG not set.

Cluster
-------
Server 1 NIC1 Priority 1
Server 2 NIC1 Priority 2
Cluster IP x.x.16.125
Equal

NIC3 on each server is numbered so as to communicate with the database cluster. I have tried removing the default gateway from NIC1/2 to see if I get different results, but I do not. Interestingly, when one of the servers is offline the problem persists. Remember, this problem only occurs from outside the x.x.16.0 subnet. Hosts on the same subnet (not in the VLAN) have no problems communicating with the cluster or NIC1 IP addresses.

When the cluster is deleted via the Manager, the IPs on NIC1 in each machine start responding to pings normally with good times.

NIC2 in both servers responds to pings from anywhere satisfactorily, the whole time (they are not in the VLAN used by the NIC1s).

NIC1 and NIC2 in each machine are a "Broadcom NetXtreme Gigabit Ethernet" dual port adapter and have the latest driver from the IBM website. Both servers are IBM System-X 3850 M2s running Windows 2003 Server R2 SP2 (32-bit). Quad Xeon 2.5 GHz with 3 GB RAM.

The switch is a Foundry EdgeIron 48G.

Where can I go next to troubleshoot this problem? The fact that the IPs respond normally from everywhere when there is no cluster configured, must mean that there is something wrong at the NLB driver level.

Many thanks in advance.

David






