The National Internet Segment Reliability Research explains how the outage of a single Autonomous System might affect the connectivity of the impacted region with the rest of the world. Generally, the most critical AS in the region is the dominant ISP on the market, but not always.
As the number of alternate routes between ASes increases (the "Internet" stands for "interconnected networks" - and each network is an AS), so does the fault-tolerance and stability of the Internet across the globe. Although some paths are more important than others from the beginning, establishing as many alternate routes as possible is the only viable way to ensure an adequately robust network.
The global connectivity of any given AS, whether an international giant or a regional player, depends on the quantity and quality of its paths to Tier-1 ISPs.
Usually, Tier-1 implies an international company offering global IP transit service over connections with other Tier-1 providers. Nevertheless, there is no guarantee that such connectivity will always be maintained. For many ISPs at all "tiers", losing connection to even one Tier-1 peer would likely render them unreachable from some parts of the world.
The Methodology of Internet Reliability Measurement
Examining a case when an AS experiences network degradation, we want to answer the following question: "How many ASes in the same region would lose connectivity with Tier-1 operators and their global availability along with it?"
Throughout the years, we have modelled such a situation because, at the dawn of BGP and interdomain routing design, its creators assumed that every non-transit AS would have at least two upstream providers to guarantee fault tolerance in case one of them goes down.
However, the current reality is different: a sizable share of all ISPs globally, albeit less than half, have only one connection to an upstream transit provider. A range of unconventional relationships among transit ISPs further reduces availability.
Have transit ISPs ever failed? The answer is yes, and it happens with increasing frequency. The more appropriate question is - under what conditions would a particular ISP experience severe service degradation we would call an outage? If such problems seem unlikely, Murphy's Law may be worth considering: "Anything that can go wrong, will".
This is the seventh year in which we have applied the same model to such a scenario. Again, we did not merely repeat previous calculations: the research keeps expanding from year to year.
The following steps were taken to rate AS reliability:
- For every AS in the world, we examined all alternate paths to Tier-1 operators with the help of an AS relationship model, the core of Qrator.Radar;
- Using the MaxMind GeoIP database, we matched countries to every IP address of every AS;
- For every AS, we calculated the share of its address space corresponding to the relevant region. ISPs that reside at an Internet Exchange Point in a region where they do not have a significant presence were filtered out. The example we are using here is Hong Kong, where traffic is exchanged among hundreds of members of HKIX, the biggest Asian Internet Exchange, most of whom have zero presence in the local internet segment;
- After isolating regional ASes, we analyzed the potential impact of each one's outage on other ASes as well as their respective countries;
- In the end, for each country, we identified the AS with the greatest impact on other ASes in its region. Foreign ASes were not considered;
- We took that AS's impact value as the reliability score for the country and used that score to rate the reliability of countries. The lower the score, the better the reliability.
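The steps above can be illustrated with a minimal Python sketch. It is not the Qrator.Radar relationship model: it assumes a simplified graph with customer-to-provider links only (no peering), it assumes the set of regional ASes has already been isolated, and it measures impact as the unweighted share of the other regional ASes that lose every path to a Tier-1. The function names and the AS numbers (taken from the range reserved for documentation) are hypothetical.

```python
from collections import deque


def reachable(providers, tier1, failed=None):
    """Return the ASes that still have a provider path to some Tier-1.

    `providers` maps each AS to the set of its upstream providers.
    `failed` is an AS assumed to be completely down (or None).
    """
    # Invert the customer -> provider map so we can walk down from the Tier-1s.
    customers = {}
    for asn, upstreams in providers.items():
        for upstream in upstreams:
            customers.setdefault(upstream, set()).add(asn)

    seen = {t for t in tier1 if t != failed}
    queue = deque(seen)
    while queue:
        current = queue.popleft()
        for customer in customers.get(current, ()):
            if customer != failed and customer not in seen:
                seen.add(customer)
                queue.append(customer)
    return seen


def country_score(providers, tier1, regional):
    """Return (critical AS, impact) for one country; lower impact is better."""
    baseline = reachable(providers, tier1)
    critical_as, score = None, 0.0
    for candidate in sorted(regional):
        alive = reachable(providers, tier1, failed=candidate)
        # Only count regional ASes that were reachable before the outage.
        others = [a for a in regional if a != candidate and a in baseline]
        if not others:
            continue
        impact = sum(a not in alive for a in others) / len(others)
        if impact > score:
            critical_as, score = candidate, impact
    return critical_as, score


# Hypothetical topology: 64496 and 64497 act as Tier-1 operators.
tier1 = {64496, 64497}
providers = {
    64500: {64496, 64497},  # large regional ISP, two Tier-1 upstreams
    64501: {64496},         # second regional ISP
    64502: {64500},         # single-homed customer
    64503: {64500, 64501},  # multihomed customer
    64504: {64500},         # single-homed customer
}
regional = {64500, 64501, 64502, 64503, 64504}

result = country_score(providers, tier1, regional)
print(result)  # (64500, 0.5)
```

In this toy topology, an outage of AS64500 cuts off the two single-homed customers, while the multihomed AS64503 stays reachable through AS64501, so the country's score is 50%.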
IPv4 Reliability
Highlights
Compared with 2021, in 2022:
- Four segments left the top-20 of the IPv4 reliability rating: Thailand, Taiwan, Spain and, surprisingly, the United States, which now resides at position 28 with a score of 7.45% and the same critical AS with which it lost nine positions last year: Level3's (Lumen's) AS3356;
- Switzerland lost 8 positions in the IPv4 reliability rating, with the change of the critical AS from AS3303 in 2021 to AS6830 in 2022;
- Japan lost 7 positions, critical AS unchanged;
- Singapore gained 7 positions, critical AS unchanged;
- Luxembourg gained 5 positions, critical AS unchanged;
- Ireland is back in the top-20 of the IPv4 reliability rating after leaving in 2019, critical AS unchanged.
Every year exciting movements happen in the reliability rating, often corresponding to what is happening with the telecommunications industry inside the respective regions.
First things first - the overall global reliability, counted as an average and a median. This time we are looking at seven years of continuous research.
As you can see, the year 2022 brought the most significant changes in both median and average reliability compared with all previous years. Average reliability globally now equals 26.7% - an almost 9% improvement since 2021, and median reliability improved by 6.1%.
That is the most significant year-over-year improvement in general reliability we have ever seen, and its causes are yet to be determined.
In 2022, the number of countries that brought their reliability score under 10%, indicating high fault tolerance, is up by one compared with last year, at 42 national segments.
IPv6 Reliability
As usual, we should start from the Google-provided graphic of IPv6 adoption measured in % of all sessions that are using IPv6 to connect to Google servers:
As of September 2022, approximately 37% of Google users connect over native IPv6, which effectively means that their ISPs support IPv6.
However, the main issue with IPv6 persists: partial connectivity. Due to peering wars, non-universal IPv6 adoption and other matters, IPv6 still has limited network visibility. To better understand this, look at the IPv6 reliability versus the partial connectivity rate.
It is evident from this IPv6 Top-20 Reliability to Partial Connectivity Comparison chart that there are several countries where the partial connectivity in IPv6 exceeds 10%: Indonesia, Ireland, Poland, Canada, Italy, Thailand and Taiwan, as well as South Africa, Kenya and the United States. Liechtenstein is precisely at 10% partial connectivity and 10% IPv6 reliability.
China entered the IPv6 reliability rating only in 2021, right away in seventh place but with an astounding 90% partial connectivity. It has now left the IPv6 top-20 and sits at the 111th position with 44.03% reliability but only 18% partial connectivity, after the critical IPv6 autonomous system changed from last year's AS4538 to this year's AS4134.
Looking at the partial connectivity combined with the "classic" reliability percentage, showing the share of (at least partially) unavailable resources in case of an outage, we could state that, excluding African countries, the worst numbers among top-20 IPv6 belong to Indonesia (18.32%), Ireland (18.92%), Poland (22.33%), Canada (20.67%), Italy (22.86%) and Taiwan (21.37%). In the US, the combined percentage is also high - 23.23%.
In Brazil, first in the IPv4 rating and second in IPv6, combined unavailability is 7.26%, which is the best result among the IPv6 top-20.
According to Google's data on per-country IPv6 adoption, in September 2022, the leaders are France - with a 73.12% adoption rate, India (now the most populated country in the world) - with a 69.16% adoption rate, and Germany, with a 64.29% adoption rate. Every other country, according to Google, is below 60% IPv6 adoption.
And while France and Germany are visible in our IPv6 data at the top, India this year is still relatively low in terms of reliability - placed 83rd, with 23.80% reliability for AS6453 and 7.38% partial connectivity, a slight improvement over the year.
The average IPv4 reliability score in 2022 is 26.7%. For IPv6, the same metric is at 27.4%. Since we measure the outage impact, the lower the metric, the better. In IPv6, the global average has risen almost 2% since last year, indicating poorer reliability.
Broadband Internet and PTR records
"Does a country's leading ISP always influence regional reliability more than everyone else?" - that is the question we are trying to answer with the help of additional information and investigation. We suggest that the most significant (by user base or customer base) ISP in a region is not necessarily the most critical for the region's network connectivity.
Three years ago, we started to analyze the PTR records. Generally, PTR records are used for Reverse DNS lookup: using the IP address to identify the associated hostname or domain name.
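As a small illustration of a reverse DNS lookup, the Python sketch below uses only the standard library. `ipaddress` builds the name under which a PTR record is stored, and `socket.gethostbyaddr` asks the system resolver for the hostname. The helper names are hypothetical and the addresses come from the ranges reserved for documentation, so the live lookup is defined but not called here.

```python
import ipaddress
import socket


def ptr_query_name(address):
    """Return the DNS name that holds the PTR record for an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


def lookup_ptr(address):
    """Return the hostname the system resolver reports for an address.

    Returns None when the lookup fails, for example because the address
    has no PTR record. This performs a real query when called.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except OSError:
        return None


print(ptr_query_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
print(ptr_query_name("2001:db8::1"))  # a nibble-reversed name under ip6.arpa
```

Note that `socket.gethostbyaddr` goes through the operating system's resolver, which may also consult local sources such as a hosts file, so a bulk survey of PTR records would more likely query DNS directly.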
Since we already know the critical AS for every country in the world, we could count the PTR records within their network and determine their share of overall PTR records for the corresponding region. We counted only PTR records and did not calculate the ratio of IP addresses without PTR records to IP addresses with them.
So, we are speaking strictly of IP addresses with PTR records present. The practice of adding those is not universal; some providers do this, and others do not.
In the PTR-based rating, we look at what part of the PTR-enabled IP addresses would go offline with an outage of each country's critical AS, and what percentage of the relevant region that part represents.
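The share described above is a simple ratio, which the following Python sketch spells out. It assumes that the PTR-enabled addresses of a region have already been collected and mapped to their origin AS; the function name, the input layout and the sample values (documentation addresses and AS numbers) are hypothetical.

```python
from collections import Counter


def ptr_share_by_as(ptr_addresses):
    """Map each AS to its share of a region's PTR-enabled IP addresses.

    `ptr_addresses` is an iterable of (ip_address, asn) pairs, one pair per
    address that has a PTR record. Addresses without PTR records are simply
    not part of the input.
    """
    counts = Counter(asn for _ip, asn in ptr_addresses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {asn: count / total for asn, count in counts.items()}


# Hypothetical region with four PTR-enabled addresses in two ASes.
ptr_addresses = [
    ("192.0.2.1", 64500),
    ("192.0.2.2", 64500),
    ("192.0.2.3", 64500),
    ("198.51.100.7", 64501),
]
shares = ptr_share_by_as(ptr_addresses)
print(shares[64500])  # 0.75
```

Here an outage of AS64500 would take 75% of the region's PTR-enabled addresses offline.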
Such an approach that considers PTR records yields very different results. In most cases, not only does the primary regional AS change, but the percentage is entirely different. In all of the generally reliable (from the global availability point of view) regions, the number of PTR-enabled IP addresses that shut down following an outage of one autonomous system is dozens of times higher. That could mean that the leading national ISP always handles end-users at one point or another.
Thus, we should assume that this percentage represents the part of the ISP's user base and customer base that would go offline (if switching to a second internet service provider were not possible) in the event of an outage. From this perspective, countries appear less reliable than they look from the transit point of view. We leave possible conclusions from this PTR-enabled rating to the reader.
ISPs With Only One Upstream (Stub networks) and Their Reliability
We found a peculiar detail in ten of the top twenty IPv4 Reliability Rating countries. If we look for the critical provider for "stub networks", which are essentially networks with only one upstream provider, we find an AS and ISP different from the one responsible for the classical reliability metric of the corresponding national segment.
Interestingly, in the past, the critical AS for stub networks would usually also be the classical global critical AS. That changed in 2022. So let us look at the most visible differences between the critical AS in global transit versus the primary upstream choice in a specific region.
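A rough way to find the critical provider for stub networks is to keep only the ASes with exactly one upstream and count how many of them depend on each provider. The Python sketch below does that under simplifying assumptions: relationships are reduced to a customer-to-provider map, every stub counts equally, and all names and AS numbers are hypothetical.

```python
from collections import Counter


def critical_upstream_for_stubs(providers):
    """Return (provider AS, number of stubs) for the most relied-upon upstream.

    `providers` maps each AS to the set of its upstream providers. A stub is
    taken to be any AS with exactly one upstream. Returns None if there are
    no stubs.
    """
    sole_upstreams = [
        next(iter(upstreams))
        for upstreams in providers.values()
        if len(upstreams) == 1
    ]
    if not sole_upstreams:
        return None
    return Counter(sole_upstreams).most_common(1)[0]


# Hypothetical region: AS64503 is multihomed, the rest are stubs.
providers = {
    64502: {64500},
    64503: {64500, 64501},
    64504: {64501},
    64505: {64501},
    64506: {64501},
}
stub_critical = critical_upstream_for_stubs(providers)
print(stub_critical)  # (64501, 3)
```

In this example AS64501 serves three stubs, so it is the critical AS for stub networks even if another AS, such as AS64500, carries more transit overall.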
This year, almost all IPv6 critical ASNs for stub networks differ from the classical ones, and in IPv4, the contrast is increasing, too. Last year we wrote about the three "whales": AS174 Cogent, AS6939 Hurricane Electric, and AS3356 Level3 (Lumen).
Now you can see how in 2022, Cogent's AS174 is still the critical AS in many national segments (previous graphics with the IPv4 vs PTR comparison), but not so much for the clear stubs in IPv4. In IPv6, AS174 is no longer the critical AS for clear stubs in Belgium (where it is AS8368 now). Notably, the critical AS for clear stubs in the US changed from last year's AS174 Cogent to AS6939 Hurricane Electric.
The AS3356 Level3 (Lumen) in 2022 is almost invisible in the clear stubs statistics. Still, it is a classic (transit) critical AS in IPv6 for Great Britain, Poland and the US, same as last year, and newly for Russia, Cyprus, Argentina, Peru and Tunisia.
Those are big fish, but not too big to fail in terms of reliability. Moreover, continuing centralization of networks in the larger economies poses a particular challenge to the region's reliability score, which we see especially in the case of the United States.
Regarding the US national segment leaving the top-20 of the IPv4 rating - it's probably not Lumen's fault, as the increasing reliance on big telecommunications companies is typical for almost every country in the world. We think that the US segment drop happened because other countries within the top-20 and the critical autonomous systems related to them improved their connectivity faster and, maybe, more effectively. This thesis is further supported by the average and median reliability dynamics in 2022.
AS20473 Choopa (The Constant Company) is a clear-stub critical AS in IPv6 for four countries: Great Britain, The Netherlands, Switzerland and Cyprus - a nice addition compared to last year, when it was (all the way) critical only in Switzerland. In 2022, AS20473 is also an IPv6 transit critical AS for Great Britain.
As always, our closing note would be the same - for any national segment (country), city, business and even end-user to have an acceptable level of reliability - it is necessary to have at least two upstream providers.
Thank you for reading the Reliability Research! If you have any questions, feel free to contact us at radar@qrator.net.