
Jeremy,

For the record, we at Internap do take connectivity issues seriously. I'd suggest having your provider reach out to our NOC, so that we may investigate comprehensively. You (and others on this list) are, of course, welcome to mail me privately in addition. (FWIW: a quick look into 76.96.0.0/11 in Dallas shows we've not been routing to it over any of Comcast's congested paths.)

Regards,
-a

On Sat, Mar 8, 2014 at 10:52 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
1. Thanks -- the problem is that, in my experience, Company X will blame Company Y for the device even though the device is owned/maintained by Company X, and this nonsense goes on for about a week before someone finally owns up to something (by which time the problem is usually gone). It's a depressing and sad modus operandi; sometimes I think it's done intentionally as a stall tactic.
2. No, because I don't think it's necessary -- when I can clearly "feel" the slowdown via SSH (which is TCP-based), the issue isn't related to ICMP prio. Plus, showing a network provider hping results doesn't necessarily convince them of anything if they're unfamiliar with the tool; that's been my experience anyway. It'd be akin to giving them packet captures and a 10-page write-up showing how a TCP packet with PSH+ACK and seq no 123456789 wasn't seen by the remote end until 2-3 retries. (For anyone curious what a TCP-based probe would even look like, there's a rough sketch after the list below.)
3. No, I haven't, because the process would be significantly more convoluted than that. This is what would have to happen, starting with the forward path:

- I'd have to open a ticket with Comcast through the standard 800-COMCAST means, i.e. complaining to someone in the Philippines about packet loss (read: 99% likelihood of someone screwing this up)
- Comcast would have to hand it off to the Comcast NOC
- The Comcast NOC would have to care enough to open a ticket with AT&T
For the reverse path:
- I'd have to open a ticket with my VPS provider, RootBSD
- RootBSD would have to open a ticket with InterNAP (assuming they have a relationship with them directly; it may be more convoluted -- for example, they may have to open a ticket with their co-lo provider, who then opens a ticket with InterNAP)
- InterNAP would have to care enough to open a ticket with Qwest/CenturyLink
- Qwest/CenturyLink would have to care enough to open a ticket with Comcast
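(Re: #2 above -- for anyone who does want a TCP-based probe without reaching for hping, here's a rough sketch using plain Python sockets: it times repeated TCP connects and counts failures as loss. The port, probe count, timeout, and interval are illustrative placeholders, not values used anywhere in this thread.)

#!/usr/bin/env python3
# Rough TCP-based loss/latency probe: times repeated TCP connects to a
# host:port instead of relying on ICMP, which routers may deprioritize.
# Port, count, timeout, and interval are placeholders for illustration.
import socket
import time

HOST, PORT = "204.109.61.174", 22     # e.g. the SSH port on the far end
COUNT, TIMEOUT, INTERVAL = 30, 2.0, 1.0

ok, rtts = 0, []
for _ in range(COUNT):
    start = time.monotonic()
    try:
        with socket.create_connection((HOST, PORT), timeout=TIMEOUT):
            rtts.append((time.monotonic() - start) * 1000.0)
            ok += 1
    except OSError:
        pass                          # timeout/refusal counted as loss
    time.sleep(INTERVAL)

loss = 100.0 * (COUNT - ok) / COUNT
print(f"{COUNT} probes, {loss:.1f}% failed")
if rtts:
    print(f"rtt ms: min {min(rtts):.1f}  avg {sum(rtts)/len(rtts):.1f}  max {max(rtts):.1f}")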
Historically I've mailed things of this nature to outages@outages.org because there are lurkers on the list who quietly go behind the scenes and start trying to fix/rectify things. Other times it's purely about bringing to light something that's happening on the Internet in hopes that one or more of the involved peers are, in a roundabout way, publicly shamed for not having better monitoring.
P.S. -- Issue is still ongoing and appears worse than before (at least now there aren't sporadic times of 0% loss at intermediary hops).
--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
On Sat, Mar 08, 2014 at 06:27:15PM -0800, Michael Smith wrote:
A couple of things:

- Hop 8's IP belongs to AT&T, so it's likely an interface on an AT&T router, since you're headed towards it in your traceroute (next hop).
- Have you tried something like hping, which will allow you to use TCP for your test?
- Have you contacted InterNAP and asked them to open a ticket with AT&T using the data you have?
Mike
On Mar 8, 2014, at 4:53 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:
Since roughly Friday, I've been seeing what appears to be packet loss somewhere within the Comcast/AT&T network mesh. Source and destination IPs are provided below, along with some mtrs from src->dst and dst->src. I have periodic mtrs (both directions) going all the way back to 03/04, and I can make all of those logs available if asked.
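(For context, the periodic mtrs are nothing exotic -- conceptually something like the loop below, which appends a timestamped mtr report to a log every few minutes. The destination, cycle count, interval, and log path shown are placeholders, not the exact setup used here.)

#!/usr/bin/env python3
# Sketch of periodic mtr logging: run an mtr report on a fixed schedule
# and append it, timestamped, to a log file. Values are placeholders.
import subprocess
import time

DEST = "204.109.61.174"
LOGFILE = "mtr-forward.log"
CYCLES = 30            # matches the Snt column in the reports below
SLEEP = 300            # seconds between reports

while True:
    stamp = time.strftime("=== %a %b %d %H:%M:%S %Z %Y")
    report = subprocess.run(
        ["mtr", "--report", "--report-cycles", str(CYCLES), DEST],
        capture_output=True, text=True,
    )
    with open(LOGFILE, "a") as fh:
        fh.write(f"{stamp}\n{report.stdout}=== END\n")
    time.sleep(SLEEP)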
The issue started suddenly on 03/07 @ 21:33 PST -- not a "gradual" increase -- and lasted until an undetermined time (very hard to tell from mtrs), but I'd estimate ~02:00 PST on 03/08 (today).

The issue then appeared to start back up again around 07:00 PST, though it's hard to give an exact time (it seems to have been a gradual increase, thus hard to pinpoint). It's been ongoing since.
The loss varies from 3% to 20%, but you can definitely "feel" it across an SSH session, so it's not ICMP prio.
I will make myself clear: it's very hard to "show" someone the way this problem manifests itself, because the packet loss will vary all over the place between different hops. It *definitely* starts at a particular point and "trickles down", but because the loss percentage is small, there are times when a hop will suddenly show 0%. TL;DR -- you'd really have to see a longer log (say, an hour's worth) to be able to say "ah yes, this really is a problem" and not blow it off as ICMP prio.
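(To make that judgment call easier, a small summarizer along these lines -- written for this post, not something used to generate the data below -- can average per-hop loss across all the mtr reports in a log file, so sustained loss stands out from one-off ICMP deprioritization.)

#!/usr/bin/env python3
# Sketch: average per-hop loss across many mtr --report runs stored in
# one log file (filename given on the command line; lines assumed to
# look like " 8.|-- 192.205.37.1  70.0%  30  9  ...").
import re
import sys
from collections import defaultdict

HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+).*?\s(\d+\.\d)%")

loss = defaultdict(list)    # (hop number, host) -> list of loss% values
with open(sys.argv[1]) as fh:
    for line in fh:
        m = HOP_RE.search(line)
        if m:
            hop, host, pct = int(m.group(1)), m.group(2), float(m.group(3))
            loss[(hop, host)].append(pct)

for (hop, host), samples in sorted(loss.items()):
    avg = sum(samples) / len(samples)
    print(f"{hop:3d}  {host:55s}  avg loss {avg:5.1f}%  ({len(samples)} reports)")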
And as usual, there's one of those "mystery routers" (hop #8 in the first example) that peering providers looooooove to use as a scapegoat when it comes to shifting blame, e.g. provider A says "that's a device owned by provider B", provider B says "that device is provider A's responsibility", and neither side does anything about the issue. However, I should note that the "mystery router" usually shows some degree of loss even when this issue isn't occurring (likely ICMP prio on the device), which makes it even more difficult to determine where the issue begins.
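(One low-effort data point on the "mystery router": a whois lookup on the hop's address at least shows who registered the block, which narrows down whose NOC to start with. The sketch below assumes a whois(1) client is installed and is illustrative only.)

#!/usr/bin/env python3
# Sketch: query whois for the "mystery" hop address and print only the
# fields that usually name the registrant. Assumes whois(1) is installed.
import subprocess

ip = "192.205.37.1"    # hop #8 in the first mtr report below
out = subprocess.run(["whois", ip], capture_output=True, text=True).stdout
for line in out.splitlines():
    if line.split(":")[0].strip() in ("OrgName", "NetName", "NetRange", "CIDR"):
        print(line.strip())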
src IP: 76.102.14.35 (Comcast; Mountain View, CA)
dst IP: 204.109.61.174 (RootBSD; Dallas, TX)
=== Sat Mar 8 16:22:00 PST 2014 (1394324520)
Start: Sat Mar 8 16:22:00 2014
HOST: icarus.home.lan  Loss%  Snt  Rcv  Last  Avg  Best  Wrst
  1.|-- gw.home.lan (192.168.1.1)  0.0%  30  30  0.4  0.3  0.2  0.4
  2.|-- 76.102.12.1  0.0%  30  30  8.0  8.8  8.0  12.2
  3.|-- te-0-2-0-5-ur06.santaclara.ca.sfba.comcast.net (68.86.249.253)  0.0%  30  30  8.2  9.0  8.2  16.5
  4.|-- te-1-1-0-1-ar01.oakland.ca.sfba.comcast.net (69.139.198.94)  0.0%  30  30  11.9  12.2  10.1  15.0
  5.|-- be-90-ar01.sfsutro.ca.sfba.comcast.net (68.85.155.14)  0.0%  30  30  12.0  12.5  10.1  15.1
  6.|-- he-3-8-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.94.85)  0.0%  30  30  13.0  14.0  11.8  18.0
  7.|-- pos-0-3-0-0-pe01.11greatoaks.ca.ibone.comcast.net (68.86.87.18)  0.0%  30  30  15.6  17.2  15.3  19.8
  8.|-- 192.205.37.1  70.0%  30  9  54.8  67.7  53.6  102.2
  9.|-- cr2.sffca.ip.att.net (12.122.86.202)  13.3%  30  26  65.2  63.5  61.0  65.7
 10.|-- cr2.la2ca.ip.att.net (12.122.31.133)  6.7%  30  28  63.3  63.5  60.9  75.2
 11.|-- cr2.dlstx.ip.att.net (12.122.28.177)  3.3%  30  29  65.2  63.7  61.1  69.7
 12.|-- ggr6.dlstx.ip.att.net (12.122.138.113)  6.7%  30  28  60.3  64.5  59.9  153.6
 13.|-- 12.90.228.14  6.7%  30  28  60.3  60.7  60.2  62.5
 14.|-- border1.pc1-bbnet1.dal004.pnap.net (216.52.191.19)  3.3%  30  29  60.3  60.2  59.8  60.5
 15.|-- giglinx-60.border1.dal004.pnap.net (216.52.189.46)  3.3%  30  29  59.9  60.2  59.8  61.4
 16.|-- 204.109.62.46  6.7%  30  28  60.1  60.5  60.1  62.7
 17.|-- mambo.koitsu.org (204.109.61.174)  3.3%  30  29  60.7  60.9  60.1  63.4
=== END
src IP: 204.109.61.174 (RootBSD; Dallas, TX)
dst IP: 76.102.14.35 (Comcast; Mountain View, CA)
=== Sat Mar 8 16:22:00 PST 2014 (1394324520)
Start: Sat Mar 8 16:22:00 2014
HOST: mambo.koitsu.org  Loss%  Snt  Rcv  Last  Avg  Best  Wrst
  1.|-- 204.109.61.173  0.0%  30  30  0.5  1.3  0.4  15.0
  2.|-- 204.109.62.45  0.0%  30  30  0.5  0.5  0.3  1.2
  3.|-- border1.ge1-6.giglinx-60.dal004.pnap.net (216.52.189.45)  0.0%  30  30  0.5  0.6  0.4  4.7
  4.|-- core3.pc1-bbnet1.ext1a.dal.pnap.net (216.52.191.41)  0.0%  30  30  0.9  1.0  0.9  1.2
  5.|-- dax-edge-03.inet.qwest.net (67.133.189.93)  0.0%  30  30  0.6  2.0  0.5  22.8
  6.|-- 63-235-82-234.dia.static.qwest.net (63.235.82.234)  0.0%  30  30  1.4  1.3  1.0  1.7
  7.|-- be-13-cr01.dallas.tx.ibone.comcast.net (68.86.82.141)  0.0%  30  30  1.3  2.7  1.0  4.9
  8.|-- he-0-14-0-0-cr01.losangeles.ca.ibone.comcast.net (68.86.85.141)  0.0%  30  30  35.6  33.6  31.8  35.7
  9.|-- he-1-8-0-0-ar01.oakland.ca.sfba.comcast.net (68.86.89.54)  3.3%  30  29  52.8  53.5  51.5  55.5
 10.|-- te-0-4-0-5-ur06.santaclara.ca.sfba.comcast.net (68.86.143.97)  0.0%  30  30  52.1  52.2  51.9  52.3
 11.|-- te-6-0-acr03.santaclara.ca.sfba.comcast.net (68.86.249.66)  6.7%  30  28  53.0  53.0  52.8  53.8
 12.|-- c-76-102-14-35.hsd1.ca.comcast.net (76.102.14.35)  3.3%  30  29  60.4  60.5  59.9  63.8
=== END
--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
_______________________________________________
Outages mailing list
Outages@outages.org
https://puck.nether.net/mailman/listinfo/outages