There is a machine. Ubuntu 8.04 Desktop.
In the last week, since a power failure, it has started acting oddly - the internet connection will come and go, it will stop talking to the outside world suddenly and without warning, etc.
The catch: The machine is *always* rock-solid on the internal network. I can SSH into it and, over the one single solitary network connection, watch *internet* access come and go. It can always reach internal addresses - just anything outside 192.168.1.0/24 is simply not there.
The router is not blocking any traffic. DNS is working - it's correctly resolving all the host names, and attempting to connect directly by IP gives the same results.
The router's logging capabilities are utter crap, but the logs *do* show the attempt by the machine to connect out - if I tell it to wget www.google.com, the router's outgoing connection log will show a connection from this machine to something in Google's IP range.
No other machine on the network is having a problem.
Changing this machine to have a different static IP (like the other, working server), or to run via DHCP (like the other, working desktops) doesn't change the behavior.
So: My latest thought was a routing problem - no default route! A static route to somewhere! Right? Well, no. The routing table looked like crap, so I cleared all the routes, restarted, and now this is what "route" gives me:
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
    link-local      *               255.255.0.0     U     1000   0        0 eth0
    default         192.168.1.1     0.0.0.0         UG    100    0        0 eth0
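As a sanity check on that table, here is a minimal sketch of the lookup it implies: longest prefix wins, lowest metric breaks ties. The `ROUTES` list is transcribed by hand from the output above, `pick_route` and `next_hop` are illustrative names rather than any real tool's API, and "link-local" is assumed to mean 169.254.0.0/16. It only models what the table says should happen; it does not inspect the live kernel.

```python
import ipaddress

# Routes transcribed by hand from the `route` output above.
# Assumption: "link-local" stands for 169.254.0.0/16.
ROUTES = [
    # (destination network, gateway or None for on-link, metric)
    (ipaddress.ip_network("192.168.1.0/24"), None, 0),
    (ipaddress.ip_network("169.254.0.0/16"), None, 1000),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1", 100),
]


def pick_route(destination, routes=ROUTES):
    """Return the (network, gateway, metric) entry selected for destination.

    The longest matching prefix wins; the lowest metric breaks ties.
    Returns None when no route matches at all.
    """
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in r[0]]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r[0].prefixlen, -r[2]))


def next_hop(destination, routes=ROUTES):
    """Return where the packet is handed: the gateway, or the destination if on-link."""
    route = pick_route(destination, routes)
    if route is None:
        return None
    return route[1] or destination
```

Calling `next_hop("192.168.1.4")` gives the destination itself (delivered on-link), while `next_hop("203.0.113.7")` (a documentation-range placeholder for "somewhere outside") gives `192.168.1.1`. In other words, the table as printed should send external traffic to the router, which is why it looks good to me.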
Looks good to me! netstat -rn gives the same results! Unfortunately, I still can't get a single packet out to anything beyond 192.168.1.1.
So: Those are my symptoms, and they do not make sense.
/etc/network/interfaces:
    # The primary network interface
    auto eth0
    iface eth0 inet static
        address 192.168.1.15
        netmask 255.255.255.0
        gateway 192.168.1.1
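One thing worth ruling out with a static config like this is a gateway that is not actually inside the interface's own subnet. A minimal sketch of that check, using the values from the stanza above (`gateway_on_link` is a hypothetical helper name, not part of ifupdown):

```python
import ipaddress


def gateway_on_link(address, netmask, gateway):
    """Return True if gateway lies inside the subnet defined by address/netmask."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network
```

With the values above, `gateway_on_link("192.168.1.15", "255.255.255.0", "192.168.1.1")` is true, so the stanza itself is at least self-consistent.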
Traceroute from the machine to the router:
    traceroute to 192.168.1.1 (192.168.1.1), 30 hops max, 40 byte packets
     1  192.168.1.1 (192.168.1.1)  0.456 ms  0.757 ms  1.068 ms
Traceroute from the machine to another machine on the network:
    traceroute to 192.168.1.4 (192.168.1.4), 30 hops max, 40 byte packets
     1  oasis.local (192.168.1.4)  0.163 ms  0.153 ms  0.152 ms
Traceroute to the external IP, on the far side of the router:
    traceroute to XXX.XXX.XXX.XXX (XXX.XXX.XXX.XXX), 30 hops max, 40 byte packets
     1  * * *
     2  * * *
     3  * * *
Traceroute to the external IP from another machine on the network, run at the same time:
    traceroute to XXX.XXX.XXX.XXX, 30 hops max, 40 byte packets
     1  XXX.XXX.XXX.XXX  1.518 ms  2.108 ms  2.715 ms
... this HAS to be a routing problem, right? The symptoms don't match *anything* else. I just can't figure out *why* the routing is busted.
Help me, interwebs. What am I doing wrong?
EDIT: And now the bastard thing is working. And I don't know why. And I can't break it again.
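Since the failure is intermittent and has currently gone away, one option is to leave a small watcher running so there is a timestamped record the next time it happens. The following is only a sketch under assumptions: it assumes Python 3 and the Linux iputils `ping` (with its `-c` and `-W` flags) are available on the affected machine, and `ping_once`, `classify`, and `monitor` are names made up for illustration.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo to host gets a reply.

    Assumes the Linux iputils `ping`: -c is the packet count,
    -W is the reply timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def classify(gateway_ok, external_ok):
    """Map the two probe results onto a short state label."""
    if gateway_ok and external_ok:
        return "ok"
    if gateway_ok:
        return "lan-only"        # router answers, nothing beyond it does
    if external_ok:
        return "gateway-silent"  # router ignores pings but still forwards
    return "down"


def monitor(gateway, external, interval_s=10, samples=None,
            probe=ping_once, log=print):
    """Log a timestamped line on every state change.

    Runs forever unless `samples` is given; returns the last state seen.
    """
    last = None
    count = 0
    while samples is None or count < samples:
        state = classify(probe(gateway), probe(external))
        if state != last:
            log(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {last} -> {state}")
            last = state
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_s)
    return last
```

Running `monitor("192.168.1.1", external_ip)` with `external_ip` set to whatever outside address is being tested would print a line each time the state flips. A run of "lan-only" entries would match the symptoms described here, and the timestamps could then be compared against the router's connection log.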

