[svlug] Cannot TCP-connect to several hosts (including Google) After a While (Solved by Reboot)

Shlomi Fish shlomif at gmail.com
Thu Feb 21 05:58:22 PST 2008


Hi all!

[ This is my first post to the list. ]

I've recently started encountering a strange and annoying networking problem
that forces me to reboot my computer often. I blogged about it here:

http://community.livejournal.com/shlomif_tech/7034.html

Here is a copy of it. Any help in resolving it would be appreciated.

Regards,

       Shlomi Fish

As a follow-up to [1]my previous post about a networking problem
I've encountered on my Linux box, I'd like to re-summarise all the
currently available information.

  [1] http://community.livejournal.com/shlomif_tech/3938.html

Someone suggested the problem might be caused by a bad Ethernet
card, so I borrowed an Intel Ethernet card, replaced my Ethernet
card with it, and tried it for a few days. After the computer had
been on for a while, it started exhibiting problems similar to those
with the old card: no connectivity to the host of www.shlomifish.org,
bad connectivity to Google, etc. This is solvable only by a reboot.

I've uploaded [2]Wireshark pcap dumps of a "lynx
http://www.shlomifish.org/" command from both cards before and after
the problem is exhibited.

  [2] http://freehackers.org/~shlomif/files/files/www.sf.org-conn-problem/

The Linux system that exhibits the problem is a relatively old Pentium 4
2.4GHz with 2.5 GB of RAM and no swap, an Nvidia GeForce 4 MX card,
using Mandriva Cooker with the latest Linus -rc kernel, and the Nouveau
drivers (although I think the problem is also exhibited with "nv").

So here's what I know now:
1. The symptom is that after the computer is on for a while (two
   days or so), one cannot connect using TCP to some hosts, and the
   connection times out.
2. The IP that causes the problems is 212.143.218.31, but it also
   affects www.google.com and possibly other hosts.
3. It is exhibited by kernels 2.6.23, 2.6.24-rc1, 2.6.24-rc2,
   2.6.24-rc8, 2.6.24.2, 2.6.25-rc2 and 2.6.24.2-desktop-2mdv (at least).
4. A different computer on the same Home LAN connected via a
   NAT/router has no problem with that IP, at the same time the
   machine running Linux exhibits the problem.
5. On one occasion when this happened, I could eventually connect
   using telnet to port 80, but it took an awfully long time.
6. I have problems with HTTP (port 80), POP and SSH.
7. Restarting the network ("/etc/init.d/network restart") does not
   help - only a reboot.
8. Some other hosts on the network, like Yahoo, work fine.
9. Doing "echo 0 > /proc/sys/net/ipv4/tcp_window_scaling" after the
   problem appeared didn't solve it, even after waiting at least 30
   minutes.
10. The problem is exhibited by both a RealTek card I have and an
   Intel Ethernet card. (both 100 Mbps).
11. [3]pcap dumps of an HTTP connection to the offending site before
   and after the problem occurs are available for both cards.

  [3] http://freehackers.org/~shlomif/files/files/www.sf.org-conn-problem/

12. I can't ping www.shlomifish.org, because it doesn't answer
   pings at all, but when the problem occurs again, I can try
   pinging www.google.com and see what happens.
13. iptables is completely off:
[root@telaviv1 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
14. My household is connected to the Internet through a Sweex
   NAT/Router [4]that doesn't have any updates yet.

  [4] http://www.sweex.com/producten.php?sectie=&subsectie=&item=80&artikel=892&detail=d

15. [5]Quoting srlamb:

  [5] http://community.livejournal.com/shlomif_tech/3938.html?thread=3170#t3170

 The difference in these two packet captures seems to be that in
 the "bad" case, the www.shlomifish.org->you packets have a bad
 checksum and are ignored (see it retrying the SYN after it's
 already gotten a SYN/ACK?). I don't know why this would be, but it
 seems worthy of mention to the mailing lists you are asking for
 help.
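
For anyone who wants to double-check the "bad checksum" observation
against the pcap dumps, here is a minimal Python sketch of how a TCP
checksum can be recomputed: it is the 16-bit ones'-complement sum
(RFC 1071) over an IPv4 pseudo-header plus the TCP segment. The
function names are made up for illustration, and extracting the raw
segment bytes and the two IP addresses from the capture is left out
and assumed to be done elsewhere.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Return the 16-bit ones'-complement sum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum_ok(src_ip: str, dst_ip: str, tcp_segment: bytes) -> bool:
    """Check the checksum of a raw IPv4 TCP segment (header plus payload).

    Hypothetical helper: src_ip and dst_ip are dotted-quad strings taken
    from the IP header of the captured packet.
    """
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp_segment))
    )
    # With the checksum field left in place, a valid segment sums to 0xFFFF.
    return ones_complement_sum(pseudo_header + tcp_segment) == 0xFFFF
```

If this returns False for the SYN/ACK segments in the "bad" capture
and True for those in the "good" one, that would agree with what
srlamb describes.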

16. Pinging www.google.com seems to work after the problem is exhibited:

http://community.livejournal.com/shlomif_tech/7034.html?thread=7802#t7802
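
To make the symptoms from items 1, 5 and 8 easier to log over time,
something like the following Python sketch could be run periodically.
It only times a plain TCP connect to each host and reports success or
failure. The function name, the 10-second timeout and the hostname
www.yahoo.com are assumptions made for the example; the IP address and
www.google.com are the ones mentioned above.

```python
import socket
import time


def time_tcp_connect(host: str, port: int, timeout: float = 10.0):
    """Return (seconds_elapsed, error); error is None if the connect succeeded."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            error = None
    except OSError as exc:  # covers timeouts, refusals and DNS failures
        error = exc
    return time.monotonic() - start, error


if __name__ == "__main__":
    # Hosts to probe: the problematic IP, Google, and (assumed name) Yahoo.
    targets = [("212.143.218.31", 80), ("www.google.com", 80), ("www.yahoo.com", 80)]
    for host, port in targets:
        elapsed, error = time_tcp_connect(host, port)
        status = "ok" if error is None else f"failed ({error})"
        print(f"{host}:{port} {status} after {elapsed:.2f}s")
```

Running it from both the affected machine and the other computer on
the LAN at the same time would give a side-by-side record of which
hosts time out and how long a slow connect takes.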

---------------

Some extra discussion can be found at [6]the previous entry.

  [6] http://community.livejournal.com/shlomif_tech/3938.html



-- 
------------------------------------------
Shlomi Fish http://www.shlomifish.org/

Electrical Engineering studies. In the Technion. Been there. Done
that. Forgot a lot. Remember too much.


