flapping issues reaching East Coast AWS

We have seen flapping all morning reaching our AWS East Coast servers from our Thousand Oaks location. They drop off for a minute or so and then come back.
It seems to occur at 3-5 minute intervals for several cycles, then everything is fine for 30 minutes or so. Then the alarms start to come in again, so wash, rinse and repeat.
The issue is definitely not local, as our local monitoring of our peer connections is fine, as are our connections to AWS West Coast servers. The only complaint I have gotten so far was from a customer on the East Coast trying to VPN to a server here and getting booted out a lot.
Traceroutes imply there may be an issue with Frontier: the few times I have caught the problem while logged into the remote East Coast server, the path outbound from AWS East goes to Dallas and then seems to have issues reaching Los Angeles. I also show those traceroute pages moving around a bit between those locations.
The connection from here to the East Coast shows weirdness involving Charter, so maybe the issue is on the reverse trip.
This is pretty typical:
 8  lag-401.dllstx976iw-bcr00.netops.charter.com (66.109.5.229) 36.002 ms 36.316 ms lag-16.dllstx976iw-bcr00.netops.charter.com (66.109.6.1) 36.451 ms
 9  lag-0.pr3.dfw10.netops.charter.com (66.109.5.121) 39.187 ms lag-302.pr3.dfw10.netops.charter.com (209.18.43.77) 42.279 ms 45.338 ms
10  99.83.71.242 (99.83.71.242) 35.427 ms 99.83.71.240 (99.83.71.240) 36.389 ms 99.82.176.170 (99.82.176.170) 35.339 ms
11  * 150.222.206.169 (150.222.206.169) 34.071 ms *
12  * * 15.230.48.42 (15.230.48.42) 33.717 ms
Finally, it's not within AWS, as our West Coast servers can maintain a steady ping with the affected East Coast sites.
So is anyone else seeing weirdness on cross-country trips?
William Kern
PixelGate
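For anyone comparing their own traces, the hop excerpt in the post above can be summarized per hop with a short script. This is only a sketch: the `summarize_hops` helper and the one-hop-per-line layout are assumptions made for illustration, and the hop data is copied from the post. Keep in mind that a `*` on an intermediate hop can also come from a router rate-limiting its ICMP replies, so it is not proof of loss on the forwarding path.

```python
import re

# Hops 8-12 as quoted in the post, laid out one hop per line (assumed layout).
TRACE = """\
8 lag-401.dllstx976iw-bcr00.netops.charter.com (66.109.5.229) 36.002 ms 36.316 ms lag-16.dllstx976iw-bcr00.netops.charter.com (66.109.6.1) 36.451 ms
9 lag-0.pr3.dfw10.netops.charter.com (66.109.5.121) 39.187 ms lag-302.pr3.dfw10.netops.charter.com (209.18.43.77) 42.279 ms 45.338 ms
10 99.83.71.242 (99.83.71.242) 35.427 ms 99.83.71.240 (99.83.71.240) 36.389 ms 99.82.176.170 (99.82.176.170) 35.339 ms
11 * 150.222.206.169 (150.222.206.169) 34.071 ms *
12 * * 15.230.48.42 (15.230.48.42) 33.717 ms
"""


def summarize_hops(text):
    """Return {hop: (replies, timeouts, distinct_responder_ips)}."""
    summary = {}
    for line in text.strip().splitlines():
        hop, rest = line.split(None, 1)
        timeouts = rest.split().count("*")               # probes with no answer
        replies = len(re.findall(r"\d+\.\d+ ms", rest))  # probes with an RTT
        ips = set(re.findall(r"\((\d+\.\d+\.\d+\.\d+)\)", rest))
        summary[int(hop)] = (replies, timeouts, len(ips))
    return summary


for hop, (replies, timeouts, ips) in sorted(summarize_hops(TRACE).items()):
    print(f"hop {hop}: {replies} replies, {timeouts} timeouts, {ips} responder(s)")
```

More than one responder on a hop points to load-balanced parallel paths, which is common and not a fault by itself; the unanswered probes only show up at hops 11 and 12.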

Also, I forgot to mention that DownDetector is showing an increase in Frontier complaints.
On 7/31/23 12:22 PM, William Kern via Outages wrote:
We have seen flapping all morning reaching our AWS East Coast servers from our Thousand Oaks location. They drop off for a minute or so and then come back.
_______________________________________________ Outages mailing list Outages@outages.org https://puck.nether.net/mailman/listinfo/outages

This should be resolved on the Frontier side at this time. Would you mind checking again and let me know how things look?
Regards, Jeff
On Jul 31, 2023, at 12:24 PM, William Kern via Outages <outages@outages.org> wrote:
Also, I forgot to mention that DownDetector is showing an increase in Frontier complaints.

The last monit failure/success pair was 14:34/36 Pacific.
I would have expected more warnings around 15:00.
Since it's 15:16 now, we seem to have at least gone longer than we did all morning. So that is promising!
I'll respond back if I see another failure.
-bill
On 7/31/23 14:55, Jeff Richmond wrote:
This should be resolved on the Frontier side at this time. Would you mind checking again and let me know how things look?
Regards, Jeff
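The timing argument in the reply above can be checked with a few lines of arithmetic. This is a rough sketch: the `minutes_between` helper and the 30-minute threshold constant are assumptions for illustration, with the threshold taken from the "fine for 30 minutes or so" pattern described in the original post, and the three timestamps taken from the reply.

```python
from datetime import datetime


def minutes_between(start, end, fmt="%H:%M"):
    """Whole minutes from start to end, both same-day HH:MM strings."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Values reported in the reply (Pacific time).
last_failure, last_success, now = "14:34", "14:36", "15:16"

# Assumed threshold: the longest quiet period seen during the morning flapping.
QUIET_WINDOW_MIN = 30

outage = minutes_between(last_failure, last_success)
quiet = minutes_between(last_success, now)

print(f"last outage lasted {outage} min, quiet for {quiet} min")
print("longer than the morning pattern:", quiet > QUIET_WINDOW_MIN)
```

A quiet period longer than the morning's window is encouraging but not conclusive, which matches the cautious wording of the reply.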

Ok great, thanks for the update. I am away at the moment so I don’t have all the details but am happy to ping the ops folks if you see any further degradation.
Thanks, Jeff
On Jul 31, 2023, at 3:18 PM, William Kern <wkern@pixelgate.net> wrote:
The last monit failure/success pair was 14:34/36 Pacific.
I would have expected more warnings around 15:00.
Since it's 15:16 now, we seem to have at least gone longer than we did all morning. So that is promising!
I'll respond back if I see another failure.
-bill

On Mon, 31 Jul 2023 15:23:59 -0700 Jeff Richmond via Outages <outages@outages.org> wrote:
Ok great, thanks for the update. I am away at the moment so I don’t have all the details but am happy to ping the ops folks if you see any further degradation.
Frontier.com has an AAAA RR, but it does not appear to be reachable and functioning. Not sure if this is indicative of anything bigger than that, but it would be nice if it either worked or the RR was removed.
John
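A check like the one described above (an AAAA record exists, but the address does not answer) can be reproduced roughly as follows. This is a minimal sketch, not the method actually used in the message: the `check_ipv6` helper, the choice of TCP port 443, and the 5-second timeout are all assumptions, and the `resolve` and `connect` parameters exist only so the function can be exercised without network access.

```python
import socket


def check_ipv6(host, port=443, timeout=5.0,
               resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Resolve host over IPv6 only and try a TCP connect to each address.

    Returns {address: True/False}. An empty dict means the lookup returned
    no IPv6 address at all.
    """
    try:
        infos = resolve(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return {}
    results = {}
    for _family, _type, _proto, _canonname, sockaddr in infos:
        addr = sockaddr[0]
        try:
            # Connect to the literal IPv6 address so no IPv4 fallback occurs.
            with connect((addr, port), timeout=timeout):
                results[addr] = True
        except OSError:
            results[addr] = False
    return results
```

Calling `check_ipv6("frontier.com")` from a host with working IPv6 would return a mapping of each resolved address to whether the TCP handshake completed. A `False` from a host without IPv6 connectivity says nothing about the remote side, so the result is only meaningful when other IPv6 destinations are reachable from the same machine.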

We are seeing similar issues from Santa Monica trying to reach our sites in the DC area today. I made a ticket with our East Coast ISP, Allied, and they mentioned that AT&T was having peering issues with some ISPs in California.
We are BGP'd with AT&T and Frontier. Usually our outbound traffic rides on AT&T, and inbound traffic picks whichever route is better between the two ISPs. I moved traffic from AT&T to Frontier and then our other sites started to drop, so I backed off. I'll try removing the Frontier routes later and see if things improve.
Jose Gomez
-----Original Message-----
From: Outages <outages-bounces@outages.org> On Behalf Of William Kern via Outages
Sent: Monday, July 31, 2023 12:23 PM
To: outages@outages.org
Subject: [outages] flapping issues reaching East Coast AWS
We have seen flapping all morning reaching our AWS East Coast servers from our Thousand Oaks location. They drop off for a minute or so and then come back.
participants (4)
- Jeff Richmond
- John Kristoff
- Jose Gomez - IT (jgomez@execproinc.com)
- William Kern