Our latest research has found that clickable links on websites can often be redirected to malicious destinations. We call these hijackable hyperlinks, and we’ve found millions of them across the web, including on trusted websites.
Our paper, presented at the 2024 Web Conference, suggests that cybersecurity threats on the web can be exploited on a much larger scale than previously thought.
In fact, we have found these hijackable hyperlinks on the websites of large corporations, religious organizations, financial companies, and even governments. The hyperlinks on these websites can be hijacked without triggering any alarms, and only cautious (some might say paranoid) users would avoid falling into these traps. If we could find these vulnerabilities across the web, others could too. Here’s what you need to know.
What are hijackable hyperlinks?
If you mistype your bank’s web address, you may accidentally land on a phishing site: a site that impersonates, or “spoofs”, your bank’s website to steal your personal information.
If you are in a hurry and do not check the website carefully, you may enter sensitive personal information and pay a high price for your mistake. This could include identity theft, account takeover, or financial loss.
Something even more dangerous happens when programmers misspell web addresses in their code. Their mistake can send users to an internet domain that nobody has ever purchased. We call these ghost domains.
For example, a programmer linking to theconversation.com might accidentally link to tehconversation.com (note the typo). If the misspelled domain has never been purchased, someone could buy that ghost domain for around AUD$10 and capture the incoming traffic. In these cases, users pay the price for the programmer’s mistakes.
These developer link errors don’t just send users to phishing or fraudulent sites. The hijacked traffic can be directed to a range of traps, including malicious scripts, disinformation, offensive content, viruses, and hacking techniques that have yet to emerge.
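As an illustration of how a typo like this could be caught, here is a minimal Python sketch that flags link hostnames lying within a small edit distance of a domain the developer meant to link to. The function names, the list of intended domains and the distance threshold of 2 are assumptions made for this example; it is not the method used in our research.

```python
from urllib.parse import urlparse


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def suspicious_links(urls, intended_domains, max_distance=2):
    """Return (url, intended_domain) pairs where the host is a near miss of an intended domain."""
    flagged = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        for domain in intended_domains:
            distance = edit_distance(host, domain)
            # Distance 0 is an exact match, so only small non-zero distances are flagged.
            if 0 < distance <= max_distance:
                flagged.append((url, domain))
    return flagged


# Hypothetical list of links found in a site's source code.
links = [
    "https://theconversation.com/au",
    "https://tehconversation.com/au",
    "https://example.org/about",
]
for url, domain in suspicious_links(links, ["theconversation.com"]):
    print(f"{url} looks like a typo of {domain}")
```

A swapped pair of letters, as in "teh" for "the", counts as two substitutions here, which is why the threshold is set to 2. A stricter or looser threshold trades missed typos against false alarms.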
More than half a million ghost domains
Using high-performance computing clusters, we scanned the entire browsable web to identify these vulnerabilities. In total, we analyzed more than 10,000 hard drives’ worth of data, a scale never seen before in research.
We identified over 572,000 ghost domains this way. Many of the websites we examined contained hijackable hyperlinks that send users to these domains. Ironically, this even included web software designed to enforce privacy laws on websites.
We investigated which mistakes caused these vulnerabilities and categorized them. Most were due to typos in hyperlinks, but we also discovered another type of developer-created vulnerability: placeholder domains.
When developers build a website that does not yet have its own domain name, they often link to a made-up domain name, expecting the links to be fixed later.
We’ve found that this is typical of website design templates, where the aesthetic components of a website are purchased from another developer rather than built in-house. When a design template is later installed on a website, the placeholder links are often not updated, which leaves the ghost domains they point to open to hijacking.
To determine whether hijackable hyperlinks could be exploited in practice, we purchased 51 of the ghost domains they pointed to and passively observed the incoming traffic. We found a significant amount of traffic arriving through the hijacked links. Compared to similar new domains without broken links, 88% of our ghost domains received ten times more traffic.
What can be done?
For the average internet user, awareness is essential. Links cannot be trusted. Be vigilant.
For companies and those responsible for their websites, we offer a variety of technical countermeasures. The simplest is for website operators to “scan” their websites for broken links. There are many free tools available for this. If broken links are found, fix them before they are hijacked.
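A scan of this kind can be sketched in a few lines of Python using only the standard library. The class and function names below are assumptions made for this example, not one of the free tools mentioned above. The sketch collects the hostnames a page links to and reports those that fail a DNS lookup. A failed lookup does not prove that a domain is unregistered, and a successful one does not prove that a link is safe, so the output is best treated as a list of links to check by hand.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_hosts(html, base_url):
    """Return the set of hostnames that the page's http(s) links point to."""
    collector = LinkCollector()
    collector.feed(html)
    hosts = set()
    for href in collector.hrefs:
        # Resolve relative links against the page's own address.
        parsed = urlparse(urljoin(base_url, href))
        if parsed.scheme in ("http", "https") and parsed.hostname:
            hosts.add(parsed.hostname.lower())
    return hosts


def resolves(host):
    """True if a DNS lookup returns at least one address for the host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def unresolvable_hosts(html, base_url, resolver=resolves):
    """Linked hosts that fail the resolver check and deserve a manual look."""
    return sorted(h for h in link_hosts(html, base_url) if not resolver(h))


# Hypothetical page fragment; only link extraction is run here (no network access).
page = '<a href="/about">About</a> <a href="https://tehconversation.com/au">News</a>'
print(sorted(link_hosts(page, "https://example.org/")))
```

Calling `unresolvable_hosts(page, "https://example.org/")` performs live DNS lookups. The `resolver` parameter exists so that the check can be replaced, for example with a domain registration lookup or with a stub during testing.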
We are the Internet
British scientist Sir Tim Berners-Lee first proposed the web at CERN in 1989. His original proposal (still available online as a testament) includes a section on security called “non-essentials.” This section includes the following dire statement:
[Data security is] of secondary importance at CERN, where information exchange is still more important.
While this may have been true at CERN in 1989, the web is now the primary means of information exchange. We have come to treat it as an extension of our own brains. This is evidenced by the popularity of large language models such as ChatGPT, which learn from data on the web.
As our dependency deepens, it may be time to mentally reclassify web data security from “non-essential” to “critical.”
Kevin Sarich, Computer Scientist and Mechatronic Engineer, CSIRO