PhishAri: Real-Time Phishing Detection on Twitter

At PreCog, we not only do research but also try to build products for end users based on our work. More often than not, developing a scalable, real-world system is a much more challenging task than developing the underlying algorithm. It feels good to be part of a research group that has given me the perspective to understand the need for a bridge between research and real-world solutions. Here goes my first PreCog blog entry, on one such product that we are developing (and for which I am the lead), which aims to detect phishing on Twitter.

There has been a lot of research, and many publications, on spam detection in online social media, but few real-world products use these intelligent solutions. When we started working on phishing detection on Twitter, we decided to build a real-time system for Internet users based on our research, which we named PhishAri. Before we move on to how we built PhishAri, any guesses as to what the name means? Well, it's a combination of two words: Phish + Ari. “Phish” is short for “phishing” and “Ari” means “enemy” in Sanskrit; PhishAri combats phishing by detecting phishing URLs spread through Twitter.

From our previous studies and some prior work in this area, we identified various features that we could use for phishing detection on Twitter. These include attributes of the URL, properties of the tweet, and properties of the Twitter user who posts the tweet. We thought that the best way to reach most Internet users would be a browser extension. So, once someone installs the PhishAri browser extension, whenever they log on to Twitter they see a small color-coded indicator next to every URL in the tweets in their timeline or Twitter search results; green indicates that the URL is safe and red indicates a phishing URL. Since this solution is built into the browser, it is hassle-free and requires no additional software or packages beyond the browser you use and the PhishAri extension. Currently, the PhishAri extension is available only for the Chrome browser, but we’ll soon launch it for Firefox and other browsers too.

Now, let’s dive into the nitty-gritty of PhishAri. The browser extension (written in JavaScript) is the front end of the entire system; it does very little processing and only shows the appropriate indicator beside every URL. Now comes the meat of the solution: a web application hosted on a separate server, which the extension relies on to decide which indicator to show next to each URL. The web application is written in Python using a web framework and is hosted on an Apache server. The extension takes the URL from the tweet, together with the tweet id, and sends both to the web application as a GET request. The web application uses this URL and tweet id to create a feature vector from the attributes of the URL and the tweet that are used for phishing detection. It then uses machine learning classification to classify the URL as phishing or legitimate. The extension then makes another GET request to the web application and receives a JSON string indicating the class of the URL; accordingly, the extension shows a red indicator if the class is ‘phishing’ and a green indicator if it is ‘legitimate’.
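To make the server-side flow a little more concrete, here is a minimal sketch in Python of the steps described above: parse the GET parameters, build a feature vector, classify, and return a JSON string. This is not the actual PhishAri code. The parameter names (`url`, `tweet_id`), the particular features, the helper `fetch_tweet_text`, and the hand-written scoring rule that stands in for the trained classifier are all assumptions made purely for illustration.

```python
import json
from urllib.parse import parse_qs, urlparse


def build_feature_vector(url, tweet_text):
    """Turn a URL and its tweet text into a small dict of numeric features.

    The features below are hypothetical examples of URL and tweet attributes.
    """
    host = urlparse(url).hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        # 1 if the host looks like a raw IPv4 address, else 0
        "host_is_ip": int(host.replace(".", "").isdigit()),
        "num_hashtags": tweet_text.count("#"),
        "num_mentions": tweet_text.count("@"),
    }


def classify(features):
    """Placeholder rule standing in for a trained machine learning classifier.

    The thresholds are made up; a real system would load a trained model.
    """
    score = 0
    if features["host_is_ip"]:
        score += 2
    if features["num_dots_in_host"] >= 4:
        score += 1
    if features["url_length"] > 75:
        score += 1
    return "phishing" if score >= 2 else "legitimate"


def handle_get(query_string, fetch_tweet_text):
    """Handle the query string of a GET request and return a JSON string.

    fetch_tweet_text is a hypothetical callable mapping a tweet id to its text.
    """
    params = parse_qs(query_string)
    url = params["url"][0]
    tweet_id = params["tweet_id"][0]
    features = build_feature_vector(url, fetch_tweet_text(tweet_id))
    return json.dumps(classify(features))
```

On the extension side, the returned string would simply be mapped to a color: red for ‘phishing’ and green for ‘legitimate’. Passing `fetch_tweet_text` in as an argument keeps the sketch runnable without any network access; a real deployment would look the tweet up through the Twitter API instead.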

Currently, PhishAri works with an accuracy of 87.2%, and we are still in the process of making it stronger and more effective. The extension can be downloaded from the Chrome Web Store. We are trying to add more features and strengthen the underlying classifier to make PhishAri more efficient. Any feedback is warmly welcomed. If you use Twitter, do give it a try!


Our presence and experiences at Research Showcase @ IIIT-Delhi

As part of the IIIT-Delhi tradition of showcasing research done at the institute, a Research Showcase (RS) is held every year in the Spring semester. This year RS’12 was held on March 23 & 24. The following four posters from the group were showcased this year:

– PhishAri by Anupama Aggarwal and Ashwin Rajadesingan (Research category)

– Privacy in Open Government Data by Swetank Kumar Saha, Daksha Yadav, Sudip Mittal and Mayank Gupta (Research category). Stay tuned for more information on this work!

– W.Y.S.W.Y.E: Secure Authentication in Front of Prying Eyes by Rohit Khot, Ponnurangam Kumaraguru and Kannan Srinathan (Research category)

– TrustGuru by Komal Sachdeva and Claudio Marforio. Stay tuned for more information on this work!

The poster from our group, ‘Privacy in Open Government Data’, was judged the best poster in the research category by the reviewers. Congrats to Swetank, Daksha, Sudip, and Mayank. They won Google T-shirts, Microsoft caps, and INR 5,000 as prize money!

Anupama, Komal, Mayank, Rohit, and Swetank did a fabulous job of attracting visitors to their posters! Congrats to all members of the group who participated and encouraged us at RS’12.

In addition to these posters, which are part of the work done by the students in the group, Prachi Jain, a member of our group, presented a poster that was done as part of her coursework.

I also attended both the invited talks, one by Natwar Modani from IRL and the other by Prasad Naldurg from Microsoft Research India. Natwar discussed community detection and social network analysis in mobile Call Detail Records (CDR). Prasad covered zero-day vulnerabilities, JavaScript attacks, and how to discover new vulnerabilities. I found a couple of commonalities between the two talks: (1) the research output presented in both talks is now part of real-world products and services, which was very cool to know; I strongly believe that this is one of the key ways to measure the success of one’s research; (2) neither speaker said (understandably!) anything about the details of the data, demographics, or current status beyond what was already stated in the papers.

The quality of the posters has definitely increased since RS’11. Students seemed to be more prepared, posters were well designed, and there was a lot of energy and enthusiasm among the students. Thanks to the student organizers of RS’12 for putting on such a great show!

I look forward to RS’13!