As for the issue with Battle.net, unfortunately this one may have been specific to the path between your connection and the gaming servers. On my end, there are at least 15 hops (each a potential point of failure) before I hit AT&T’s enterprise hosting network (where most of Blizzard’s key servers are hosted). The only outage I logged for the login/chat server (TCP port 1119) was a brief one at around 6AM PDT. My monitor had no connectivity issues on the same port for the ingress point to the gaming servers (us.actual.battle.net:1119) throughout the day. The only packet loss I logged was a single dropped packet at around 7PM PDT, to one of the few Blizzard gateways on AT&T’s network that does not drop ICMP packets.

This is why I monitor a key service port as well as certain hops between myself and AT&T enterprise hosting (att-acdn.net): I mainly want a better picture of where my recent latency spikes are coming from. That said, the information is still NOT granular enough, because there isn’t much visibility past their networks’ edge routers. It tells me nothing about actual server loads, internal load balancing issues, latency between the gaming servers and their database/file storage systems, etc. (All of these are monitored extensively by the AT&T NOC and should not normally present sizable problems. It isn’t perfect though: there are times when everyone in a game experiences the same lag, meaning it is server side, or everyone gets kicked from the gaming server simultaneously.)
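For anyone who wants to try the same kind of service-port check, here is a minimal sketch in Python. It is not the monitor I actually run; the function name `check_tcp_port` and the 3 second timeout are just assumptions for illustration. It only tells you whether a TCP handshake to the port completes and how long that took, which is exactly the limitation described above: nothing past the edge is visible.

```python
import socket
import time
from typing import Optional, Tuple


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, Optional[float]]:
    """Try a TCP connect and return (reachable, connect_time_ms).

    check_tcp_port and its default timeout are hypothetical names/values
    chosen for this sketch, not part of any existing tool.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - start) * 1000.0
            return True, elapsed_ms
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False, None
```

Calling `check_tcp_port("us.actual.battle.net", 1119)` on a schedule and logging the result would give a rough outage/latency history for that port, assuming the host still answers on it.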
“Is it that tough for them to implement a pause feature for the game?”
It depends. The code itself isn’t necessarily the issue. Sometimes decisions like these come down to a matter of prioritizing resources, and some of those decisions are made at a higher level. I will say that I myself am not pleased with the way Blizzard handles latency and disconnect issues when it comes to Diablo III. If one is going to make the game online only, systems should be in place that allow more graceful handling of such issues. But I am also fully aware that management has differing priorities/visions. I used to work FT in the IT industry (networking), so I’m familiar with the issues.
BTW, here is what Bashiok mentioned about the in-game latency indicator:
“The latency indicator in-game is not a simple ping like most games, and is actually a full process of the game sending an action to the service, the service processing it, and returning it to the client. This means that the latency indicator actually gives a more accurate account of what the experience is for game data, but in comparison to other games will seem high. We’re more interested in issues where there is actual performance degradation, and certainly there may be, but the in-game latency number is not a simple ping and therefore shouldn’t be used literally as a measurement against performance elsewhere.”
Because this is actually a system service (and not just a simple ping), what this basically means is that the foundation is already in place for them to hook into in order to gracefully handle situations where latency crosses certain thresholds. A lot of complaints have already been made, for example, about the time it takes the server to actually acknowledge that the client has disconnected (at which point it begins the 10 second timer to remove the character from the game, and the character is usually already dead by then). The above service should already know the client state in a much shorter time frame, especially if it is already using TCP (if an ACK is not received within the timeout period, retransmission occurs; if the ACK is still not received after that, perform some action).
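To make the threshold idea concrete, here is a rough Python sketch of the logic. To be clear, this is NOT how Blizzard’s service works; the class name `LatencyWatchdog`, the 400 ms threshold, and the limit of 3 missed replies are all made-up values for illustration. Each round trip (or missing reply) is fed in, and the watchdog reports whether things look fine, degraded, or disconnected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencyWatchdog:
    """Hypothetical sketch: classify connection state from round-trip samples."""

    degraded_ms: float = 400.0  # assumed threshold, not a real Blizzard value
    max_missed: int = 3         # assumed number of missed replies tolerated
    missed: int = 0             # consecutive replies missed so far

    def observe(self, round_trip_ms: Optional[float]) -> str:
        """Feed one sample; None means no reply arrived before the timeout."""
        if round_trip_ms is None:
            self.missed += 1
            if self.missed >= self.max_missed:
                # Point at which a server could act, e.g. protect the character.
                return "disconnected"
            return "degraded"
        # Any reply resets the consecutive-miss counter.
        self.missed = 0
        if round_trip_ms >= self.degraded_ms:
            return "degraded"
        return "ok"
```

With these assumed numbers, the server side could react after three missed replies instead of waiting for a much longer disconnect timeout, which is the kind of graceful handling I mean.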