In 2018, investigative journalist Brian Krebs warned about the nuances of internationalized domain names (IDNs). These domains, which contain non-Latin characters but appear not to, can create visual confusion that is particularly useful for running credible Punycode phishing campaigns.
In 2020, hundreds of IDNs continue to be registered, but they can be discovered using a new, soon-to-be-released version of our Typosquatting data feed. We have started to take a close look at these domain names and want to highlight some common cases that we have found, along with the associated cybersecurity best practices.
What are IDN or Punycode phishing attacks?
IDNs introduced the use of characters outside the American Standard Code for Information Interchange (ASCII) set in domain names. They help non-English speakers create domain names in their local language using their own alphabet.
Registrants in countries like Japan, China, Germany, and Poland, to name a few, can register domain names using their local non-English alphabet. However, since the Domain Name System (DNS) cannot process such characters, these domain names are converted to Punycode, an ASCII-only encoding. As such, they carry the standard “xn--” prefix. Here are some examples:
- ọffice365[.]com (xn--ffice365-x80d[.]com)
- offĭce365[.]com (xn--offce365-ujb[.]com)
- offìce365[.]com (xn--offce365-41a[.]com)
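The conversion between the two forms can be reproduced with Python's built-in "punycode" codec. The sketch below is a simplification: it handles each dot-separated label on its own and skips the normalization steps that full IDNA processing applies. The helper names `to_punycode` and `to_unicode` are ours, and the dots are left undefanged because the functions need real label separators.

```python
def to_punycode(domain: str) -> str:
    """Convert each non-ASCII label of a domain to its xn-- (Punycode) form."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            # The built-in "punycode" codec implements the RFC 3492 encoding.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


def to_unicode(domain: str) -> str:
    """Convert xn-- labels back to the form an end user would see."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            # Drop the "xn--" prefix, then decode the remaining ASCII payload.
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)


print(to_punycode("off\u00ecce365.com"))   # xn--offce365-41a.com
print(to_unicode("xn--offce365-41a.com"))  # offìce365.com
```

Decoding suspicious "xn--" names this way shows what a user would actually see in the address bar, which is often the quickest way to spot the imitated brand.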
End users, however, see these domain names in their IDN format. And since some non-ASCII characters look a lot like ASCII characters, it’s easy to misjudge such domains as legitimate. For this reason, IDNs can be used effectively in Punycode phishing attacks and business email compromise (BEC) scams.
Commonly used non-ASCII characters
Alternatives to the letter “o”
It is easy to distinguish between microsoft[.]com and micr0soft[.]com. In the latter, an “o” has obviously been replaced by a zero (0). But when the following similar characters are used, you can hardly notice the difference:
- ? (? office365[.]com, microsoft[.]com)
- ? (micro? ft[.]com)
- ö (microsöft[.]com)
Alternatives to the letter “i”
We have detected several domains similar to Instagram. Among them are those that use non-ASCII characters that closely resemble the letter “i”. Microsoft, Office 365, and Instagram typosquatting domains also use a few alternate “i” characters.
- ? (? Instagram[.]com)
- Ã (instagram[.]com)
- ? (m? crosoft[.]net, ? nstagram[.]xyz)
- ? (m? crosoft[.]com)
- ? (off? ce365[.]com)
Alternatives to the letter “a”
Below are four variations of the letter “a” that malicious actors could use to register typosquatting domains:
- ? (instagr? m[.]com)
- à (instagram[.]com)
- ? (inst? gram[.]com, inst? gram[.]com)
- ? (lloydsb? nk[.]com)
Alternatives to the letter “m”
Two non-ASCII characters that could replace the letter “m” were also detected. They have been used to emulate Microsoft and Instagram.
- ? (? icrosoft[.]com, instagra?[.]com)
- ? (? icrosoft[.]com, instagra?[.]com)
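When a label looks almost right, it can help to list exactly which characters are not ASCII. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `describe_non_ascii` is our own, and the sample label is the “ö” example from the list of “o” alternatives.

```python
import unicodedata


def describe_non_ascii(label: str) -> list:
    """Return (position, character, Unicode name) for every non-ASCII character."""
    return [
        (pos, ch, unicodedata.name(ch, "UNKNOWN"))
        for pos, ch in enumerate(label)
        if ord(ch) > 127
    ]


# "\u00f6" is the precomposed character ö.
for pos, ch, name in describe_non_ascii("micros\u00f6ft"):
    print(pos, ch, name)
# prints: 6 ö LATIN SMALL LETTER O WITH DIAERESIS
```

An empty result means the label is pure ASCII. Any entry in the list marks a character worth a second look.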
Protection against IDN or Punycode Typosquatting
These are only four of the 26 letters of the alphabet that can be imitated in IDN-based attacks. Any brand whose name contains the letters “m”, “a”, “i”, or “o” can easily be imitated using these special characters.
Most instances of typosquatting can go unnoticed by even the most vigilant eyes. For example, you won’t notice anything suspicious about the domain offìce365[.]com until you look carefully at the “ì” with a grave accent.
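Accented lookalikes of this kind can be caught programmatically by stripping the accents and comparing what remains against a list of brand names. The sketch below assumes a small, hypothetical watch list (`BRANDS`) and only handles accented Latin letters that Unicode can decompose. Lookalikes from other scripts, such as Cyrillic or Greek, would need a confusables table, which this example does not include.

```python
import unicodedata

# Hypothetical watch list; replace with the brand names you want to monitor.
BRANDS = {"microsoft", "office365", "instagram"}


def strip_marks(label: str) -> str:
    """Decompose accented letters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def imitated_brand(label: str):
    """Return the brand a non-ASCII label collapses to, or None."""
    if label.isascii():
        return None  # plain ASCII labels are out of scope for this check
    skeleton = strip_marks(label).lower()
    return skeleton if skeleton in BRANDS else None


print(imitated_brand("off\u00ecce365"))  # office365 (ì collapses to i)
print(imitated_brand("micros\u00f6ft"))  # microsoft (ö collapses to o)
print(imitated_brand("m\u00fcnchen"))    # None (not on the watch list)
```

A match does not prove abuse on its own; it only flags a label for the WHOIS and DNS checks described next.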
Here are some ways organizations can protect themselves against such abuse.
Detect typosquatting early with a typosquatting data feed – Our typosquatting data feed will soon be able to detect typosquatting IDNs, especially those that are registered on the same day as other similar domains. Suspicious domains are flagged a day after they are registered in the DNS, allowing security teams to act immediately. Brand owners can also use the database to see how others are using their brand.
Compare WHOIS records with those of legitimate websites – Confirming whether a domain is indeed a typosquatting domain is easy with the help of WHOIS Lookup. Once a suspected typosquatting domain is detected, it can be run through a WHOIS search to obtain its registration details.
Take, for example, offìce365[.]com. The legitimate domain office[.]com has these registrant details:
On the other hand, the typosquatting domain was registered in Japan, while all of its other registrant details have been redacted for privacy reasons.
Map the domain infrastructure – For maximum security, it is best to investigate typosquatting domains and see other related domains. This can be done by mapping the infrastructure of the domain using DNS records.
To illustrate, let’s go back to the WHOIS record of offìce365[.]com. WHOIS Lookup reveals that the domain uses these name servers:
- gdns2[.]interlink.or[.]jp
Running these name servers through the Reverse NS API returns all the domain names that use them. For each name server, the tool detected more than 300 associated domains. While we haven’t seen any other domains emulating famous brands among them, some might be owned by legitimate small businesses that share their infrastructure with potential phishers.
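The idea behind a reverse name server lookup can be illustrated locally. The sketch below does not call the Reverse NS API; it builds a small reverse index from made-up DNS data (`NS_RECORDS`, with placeholder domains under the reserved `.example` TLD) to show how domains sharing a name server are grouped. The function names are our own.

```python
from collections import defaultdict

# Hypothetical DNS data: domain -> name servers. In practice this would come
# from DNS lookups or a reverse NS data source; none of these names are real.
NS_RECORDS = {
    "suspect-a.example": ["ns1.shared-host.example", "ns2.shared-host.example"],
    "suspect-b.example": ["ns1.shared-host.example"],
    "unrelated.example": ["ns1.other-host.example"],
}


def build_reverse_ns_index(ns_records: dict) -> dict:
    """Map each name server to the sorted list of domains that use it."""
    index = defaultdict(set)
    for domain, name_servers in ns_records.items():
        for ns in name_servers:
            # Normalize case and any trailing root dot before indexing.
            index[ns.lower().rstrip(".")].add(domain.lower())
    return {ns: sorted(domains) for ns, domains in index.items()}


def related_domains(domain: str, ns_records: dict) -> list:
    """Return other domains sharing at least one name server with `domain`."""
    index = build_reverse_ns_index(ns_records)
    related = set()
    for ns in ns_records.get(domain, []):
        related.update(index[ns.lower().rstrip(".")])
    related.discard(domain.lower())
    return sorted(related)


print(related_domains("suspect-a.example", NS_RECORDS))
# prints: ['suspect-b.example']
```

Sharing a name server is weak evidence by itself, since hosting providers serve many unrelated customers, so the resulting list is a starting point for review and not a verdict.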
IDN-based typosquatting can make BEC scams, Punycode phishing attacks, and other cybercrimes more effective because users can barely distinguish them from legitimate domains. Therefore, organizations can step up their cybersecurity efforts by detecting typosquatting domains as early as possible. They can even get more domain intelligence on each typosquatting domain using the WHOIS lookup, the Reverse NS API, and other domain intelligence tools.