
There *is* no A record for star.facebook.com -- there's only a CNAME, which in turn induces per-record NS lookups. So this is normal.

As for your 2nd question -- this opens a very large Pandora's box when it comes to DNS engineering and operational deployments. I can think of quite a few reasons there should be separate load balancers for the "main" facebook.com domain and ns.facebook.com bits, and the actual LBs which return A records for the webservers that return web content.

Effectively what you're proposing is just to have lots of LBs handling the DNS queries for everything, rather than segregate/delineate things a bit more. I can assure you there are justifications for this, but since I don't work at Facebook, I can't provide those.

What I've seen today is not that complex of a setup, though depending on what LBs they use, I might not particularly enjoy looking at those configurations. But what's shown here, at least if using a Citrix NetScaler, isn't that complex.

All that said -- it is absolutely possible for them to remove use of the star.facebook.com CNAME and just have a series of LBs respond with appropriate A records when faced with a www.facebook.com A record query. However, given that there's quite a lot of crapola-nonsense that falls under the facebook.com domain (I'm talking about the equivalent of a content delivery network, their image hosting stuff, and God knows what else today -- I haven't used Facebook since January 2011), due to "Web 2.0" nutballs always wanting to make a mess of things ( ;-) ), I'm not too surprised by the above.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Mon, Dec 10, 2012 at 03:58:31PM -0800, Terry wrote:
Yeah, I think my result was a red herring. a.ns.facebook.com and b.ns.facebook.com still can't resolve the A record for star.facebook.com, despite things seemingly being back to normal now. The NS record is what's key and by the time I looked at it, it was fixed.
Why some people feel the need to get so clever with DNS is beyond me. How about just resolving the A records directly from the facebook.com NS servers, instead of via a CNAME to another group of DNS servers? Would that be so difficult? Then you're shocked when there's an outage.
________________________________
From: Jeremy Chadwick <jdc@koitsu.org>
To: Terry <t0psecret@yahoo.com>
Cc: Richard Mahoney <richard.mahoney@tracesmart.co.uk>; Corey Quinn <corey@sequestered.net>; "outages@outages.org" <outages@outages.org>
Sent: Monday, December 10, 2012 6:40 PM
Subject: Re: [outages] Facebook
I could have provided dig +trace output but this is shorter and reads easier.
It looks like records get looked up as follows (and I'm excluding the root server lookups, i.e. . --> .com --> facebook.com):
facebook.com.           147814  IN      NS      b.ns.facebook.com.
facebook.com.           147814  IN      NS      a.ns.facebook.com.
And the A records:
a.ns.facebook.com.      172573  IN      A       69.171.239.12
b.ns.facebook.com.      172573  IN      A       69.171.255.12
The SOA for facebook.com (domain itself) hasn't been changed since 2012/12/07 (if the SOA serial is truly kept in line with the YYYYMMDD model).
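As an aside, the YYYYMMDD reading of an SOA serial can be checked mechanically. A minimal Python sketch, assuming the common (but not mandatory) YYYYMMDDnn convention; the function name `serial_to_date` and the example serial 2012120700 are made up for illustration, since the actual serial is not quoted in this thread:

```python
from datetime import date


def serial_to_date(serial):
    """Interpret an SOA serial as YYYYMMDDnn.

    Returns (date, revision) if the serial fits the convention,
    otherwise None (e.g. a plain counter or a Unix timestamp).
    """
    text = str(serial)
    if len(text) != 10 or not text.isdigit():
        return None
    try:
        day = date(int(text[0:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        # Month or day out of range: not a date-style serial.
        return None
    return day, int(text[8:10])


# Hypothetical serial, chosen only to match the 2012/12/07 date above.
print(serial_to_date(2012120700))
```

A serial that does not parse this way says nothing about when the zone last changed, which is why the statement above is conditional.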
69.171.239.12 when queried for any records for www.facebook.com results in a CNAME response to star.facebook.com.  It's probably named "star" to indicate asterisk (*):

www.facebook.com.       338     IN      CNAME   star.facebook.com.
star.facebook.com.      1238    IN      NS      glb2.facebook.com.
star.facebook.com.      1238    IN      NS      glb1.facebook.com.

And the A records:

glb1.facebook.com.      3038    IN      A       69.171.239.10
glb2.facebook.com.      3038    IN      A       69.171.255.10
glb obviously stands for "global load balancer", though I have no idea what device they use (F5s, Citrix Netscalers, Alteons (god forbid), or something home-grown).
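The lookup chain above can be sketched as a small static model. This is only an illustration in Python: the record table is copied from the dig output quoted in this message (TTLs dropped), while `ZONE`, `trace` and the step labels are hypothetical names, not how any real resolver is implemented. It performs no network queries.

```python
# Records copied from the dig output quoted in this thread (TTLs omitted).
ZONE = {
    ("facebook.com.", "NS"): ["a.ns.facebook.com.", "b.ns.facebook.com."],
    ("a.ns.facebook.com.", "A"): ["69.171.239.12"],
    ("b.ns.facebook.com.", "A"): ["69.171.255.12"],
    ("www.facebook.com.", "CNAME"): ["star.facebook.com."],
    ("star.facebook.com.", "NS"): ["glb1.facebook.com.", "glb2.facebook.com."],
    ("glb1.facebook.com.", "A"): ["69.171.239.10"],
    ("glb2.facebook.com.", "A"): ["69.171.255.10"],
}


def trace(name, zone=ZONE, max_hops=8):
    """List the steps for an A query on `name`, using only the static table.

    Each step is (kind, name, values) where kind is one of
    "answer", "cname", "referral", "nodata" or "loop".
    """
    steps = []
    for _ in range(max_hops):
        if (name, "A") in zone:
            steps.append(("answer", name, zone[(name, "A")]))
            return steps
        if (name, "CNAME") in zone:
            target = zone[(name, "CNAME")][0]
            steps.append(("cname", name, [target]))
            name = target  # restart the lookup at the alias target
            continue
        if (name, "NS") in zone:
            # Delegation: the answer lives on another set of servers.
            servers = zone[(name, "NS")]
            addrs = [ip for ns in servers for ip in zone.get((ns, "A"), [])]
            steps.append(("referral", name, addrs))
            return steps
        steps.append(("nodata", name, []))
        return steps
    steps.append(("loop", name, []))
    return steps


for step in trace("www.facebook.com."):
    print(step)
```

For www.facebook.com. the model yields a CNAME step followed by a referral to the two glb addresses, so the final A records depend entirely on the glb side answering.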
Given the below analysis from Terry, it looks to me like:
a) one or both of their load balancers may have been overloaded briefly
   and did not respond to DNS queries (or possibly something at layer 2
   or layer 3 was affecting this)
b) one or more of the nameservers *behind* glb[12].facebook.com were
   overloaded or broken in some way, or layer 2/3 was responsible for
   breakage (between glbs and nameservers)
The only people who know for certain are -- yup -- the Facebook folks.
And naturally this is me doing my testing from a single source, so it's possible they use anycast to distribute some of their load, in which case the above analysis (though speculative) is still correct, except that the actual devices/networks involved would be different.
You're welcome.  :-)
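To see from your own vantage point whether a given server is answering at all, one could send a single query to each address and record a timeout versus a reply code. A rough stdlib-only Python sketch follows; `build_query` and `probe` are hypothetical helper names, the query ID is arbitrary, and the reply handling reads only the RCODE from the header. A timeout seen from one source proves little on its own, for the anycast reason given above.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: one question, class IN, no recursion requested."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".")
        if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server_ip, name, timeout=2.0):
    """Send one UDP query; return "timeout", "error", "short" or the RCODE.

    RCODE 0 is NOERROR, 2 is SERVFAIL, 3 is NXDOMAIN.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError:
            return "error"
    if len(reply) < 12:
        return "short"
    flags = struct.unpack("!H", reply[2:4])[0]
    return flags & 0x000F  # low four bits of the flags word are the RCODE


if __name__ == "__main__":
    # Addresses taken from the glb A records quoted above.
    for ip in ("69.171.239.10", "69.171.255.10"):
        print(ip, probe(ip, "star.facebook.com."))
```

Running the probe against both glb addresses, and ideally from more than one network, gives a slightly firmer basis for choosing between (a) and (b) than a single nslookup.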
-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
On Mon, Dec 10, 2012 at 03:24:28PM -0800, Terry wrote:
Still broke here. Silly CNAMEs.
~ > nslookup
> server a.ns.facebook.com
Default server: a.ns.facebook.com
Address: 69.171.239.12#53
> www.facebook.com
Server:         a.ns.facebook.com
Address:        69.171.239.12#53

www.facebook.com        canonical name = star.facebook.com.
> star.facebook.com
Server:         a.ns.facebook.com
Address:        69.171.239.12#53

Non-authoritative answer:
*** Can't find star.facebook.com: No answer
________________________________
From: Richard Mahoney <richard.mahoney@tracesmart.co.uk>
To: Corey Quinn <corey@sequestered.net>; "outages@outages.org" <outages@outages.org>
Sent: Monday, December 10, 2012 6:21 PM
Subject: Re: [outages] Facebook
Seems to be resolving again now on Virgin Media (UK). Guess it was just a hiccup.

PS C:\Windows\system32> nslookup www.facebook.com
Server:  (removed)
Address:  (removed)

Non-authoritative answer:
Name:    star.facebook.com
Addresses:  2a03:2880:2110:9f02:face:b00c:0:4
          69.171.247.20
Aliases:  www.facebook.com

Kind regards

Richard Mahoney, CEH
Systems Administrator
Tracesmart
T 029 2067 8534    M 07714 486543    E richard.mahoney@tracesmart.co.uk
www.tracesmartcorporate.co.uk    www.traceiq.co.uk
Global Reach, Dunleavy Drive, Cardiff, CF11 0SN
ISO/IEC 27001 CERTIFICATE: GB 10/81945
We are proud to sponsor missingpeople.org.uk

This email and any attachments are confidential to Tracesmart Ltd and are solely for use by the intended recipient. If you are not the intended recipient you must not disclose, copy or distribute its contents to any other person nor make use of its contents in any way. If you have received this email in error please forward a copy to info@tracesmart.co.uk and remove it from your system. This email and any attachments have been scanned for the presence of computer viruses. Neither Tracesmart Ltd nor the sender accepts any responsibility for computer viruses once this email has been transmitted. The content of this message may contain personal views, which are not the views of Tracesmart Ltd, unless specifically stated. Tracesmart may monitor email traffic data and also the content of email for the purposes of security and staff training. Tracesmart Ltd is a company registered in England & Wales with company registration number 3827062 whose registered office is at Global Reach, Dunleavy Drive, Cardiff CF11 0SN. Our Data Protection Number is Z708281X and our Consumer Credit Licence Number is 565961.

From: outages-bounces@outages.org [mailto:outages-bounces@outages.org] On Behalf Of Corey Quinn
Sent: 10 December 2012 23:15
To: outages@outages.org
Subject: Re: [outages] Facebook
Can you be a bit more specific?  "Works for me."

cquinn@quinntel ~ % dig facebook.com               5344 15:14:37 Mon 12-10-2012

; <<>> DiG 9.9.1-P2 <<>> facebook.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63691
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 2, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;facebook.com.                  IN      A

;; ANSWER SECTION:
facebook.com.           7200    IN      A       66.220.152.16
facebook.com.           7200    IN      A       69.171.224.32
facebook.com.           7200    IN      A       173.252.100.16
facebook.com.           7200    IN      A       69.171.229.16
facebook.com.           7200    IN      A       173.252.101.16
facebook.com.           7200    IN      A       66.220.158.16

;; AUTHORITY SECTION:
facebook.com.           139086  IN      NS      a.ns.facebook.com.
facebook.com.           139086  IN      NS      b.ns.facebook.com.

;; ADDITIONAL SECTION:
a.ns.facebook.com.      139086  IN      A       69.171.239.12
b.ns.facebook.com.      139086  IN      A       69.171.255.12

;; Query time: 50 msec
;; SERVER: 10.201.1.103#53(10.201.1.103)
;; WHEN: Mon Dec 10 15:14:40 2012
;; MSG SIZE  rcvd: 204

On Dec 10, 2012, at 3:12 PM, Richard Mahoney <richard.mahoney@tracesmart.co.uk> wrote:
Seeing DNS issues for Facebook here. Anyone else?

Kind regards

Richard Mahoney, CEH
Systems Administrator
Tracesmart

_______________________________________________
Outages mailing list
Outages@outages.org
https://puck.nether.net/mailman/listinfo/outages