Recently, we got our hands dirty with some URL data from bitly, consisting of suspicious URLs that were clicked by Internet users in October. We thank bitly, and particularly Brian David Eoff (senior data scientist) and Mark Josephson (CEO), for sharing this data with us. We analysed 269,973 URLs marked “suspicious” by bitly to understand how these links are posted and clicked. Figure 1 shows some of the most common domains for which multiple suspicious URLs were shortened using bitly. Domains like coupons4more.net, 123direct.info, lucinda34drejka.com and apprillss.info had more than 1,000 URLs each that were marked suspicious by bitly.
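For readers who want to try a similar per-domain count on their own data, here is a minimal sketch in Python. It assumes the destination (long) URLs are available as a plain list of strings; the function names `domain_of` and `domains_over_threshold` are ours for illustration and are not part of any bitly tooling or of the exact pipeline we used.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    # urlparse only recognises the host when a scheme or '//' is present,
    # so bare entries like 'example.com/page' get a '//' prefix first.
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def domains_over_threshold(long_urls, threshold=1000):
    """List (domain, count) pairs whose count exceeds the threshold,
    most frequent first."""
    counts = Counter(domain_of(u) for u in long_urls)
    return [(d, n) for d, n in counts.most_common() if n > threshold]
```

Calling `domains_over_threshold(urls, threshold=1000)` on a list of suspicious URLs would return the kind of domain list plotted in a chart like Figure 1. Grouping by full host name is a simplification: it treats subdomains as separate domains, which may or may not be what you want.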
bitly uses real-time spam detection services like Google Safe Browsing and SURBL. However, there do not seem to be many measures in place to catch bitly users who post spam. Many registered bitly users shorten spam links regularly. From the 269,973 suspicious URLs, we extracted 4,469 registered bitly users who have posted one or more of these links. After some data crunching, we found that 4,457 bitly users have posted at least 113 suspicious bitly URLs (Figure 2). Analysing the full history of URLs shortened by these users may reveal more spam links in their profiles; we plan to do this in future. These users are allowed to stay on bitly even though they regularly post spam links, which are also heavily clicked by other Internet users through various media such as emails, blogs and online social networks.
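The per-user counting described above can be sketched roughly as follows. This is an illustration under assumptions, not our actual analysis code: it supposes each record is a `(user, url)` pair, with an empty or `None` user for links shortened without a registered account, and the helper names are hypothetical.

```python
from collections import defaultdict


def suspicious_urls_per_user(records):
    """Map each registered user to the number of distinct suspicious
    URLs they shortened. `records` is an iterable of (user, url) pairs."""
    urls_by_user = defaultdict(set)
    for user, url in records:
        if user:  # skip links with no registered user attached
            urls_by_user[user].add(url)
    return {user: len(urls) for user, urls in urls_by_user.items()}


def users_with_at_least(counts, k):
    """Return users with k or more suspicious URLs, heaviest posters
    first (ties broken alphabetically)."""
    qualifying = [u for u, n in counts.items() if n >= k]
    return sorted(qualifying, key=lambda u: (-counts[u], u))
```

Using a set per user means a URL shortened several times by the same account is counted once; counting every shorten event instead would only require replacing the set with a running total.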
We looked closely at the top 20 users who posted the most suspicious URLs in our dataset and observed that the highest number of suspicious links posted by a single user was as large as 500 URLs (Figure 3). Shortened URLs constitute a large fraction of spam on the Internet: sixty-five percent of URLs targeting social media users are shortened URLs.
We believe that if bitly suspended registered users who constantly spread spam, or publicly marked them as malicious, it would discourage the use of bitly as a spamming service and deter malicious URLs from being shortened and spread on the Internet. We hope to see more features from bitly in future that would help curb spam and malicious links to a greater extent.
We are investigating this data in more detail to develop more insights. One of our students is pursuing her Master's thesis on this topic. If you are interested in knowing more or want to give suggestions, please write to email@example.com