The Great Precog Expedition

It all began with a search for opportunities to work in the summer of 2016. I had heard endlessly about the work culture at IIIT Delhi, the research groups and the out-of-this-world faculty. While browsing the institute’s site, I stumbled upon Professor PK’s profile. The more I read about him, the more awestruck I was, and I joined the many who want to work with PK.

My first meeting with him lasted for roughly 15 minutes but I went back home with a bag full of riveting information about what it takes to be a Precog-er. This was also the first time I got to know about Randy Pausch. At home, I watched ‘The Last Lecture’ and understood why the walls of the Precog area are adorned with his quotes.

Soon after, I took part in the OSMpalooza Hackathon and witnessed firsthand how quickly students here make progress. My team came up with the best solution we could think of for the given problem statements. Sadly, we didn’t win a position, but I saw some amazing solutions from other teams and, most importantly, I found myself serious and engrossed in a Social Media Analysis project. This made me even more certain that I wanted to work at Precog, since the majority of the work there involves analysing social media content. This incident would be incomplete without quoting the following:

“Experience is what you get when you didn’t get what you wanted. And experience is often the most valuable thing you have to offer.” –Randy Pausch

Very soon, I applied for the internship. After an intricate interview process, I received my offer letter. My first day at Precog was a Brainstorming session (which is another bonus point of this internship). Before the internship, my way of going through research papers was basic skimming. In the very first session, I witnessed a paper being dissected: the group not only derived the entire methodology but also discussed elaborate ideas for extending the paper and implementing those extensions. This is just one example of how working at Precog means legit serious work.

I was lucky to have Prateek Dewan as my mentor during the internship period. I started working closely with Prateek, and soon I had learnt a series of things that I apply to this day. Before the internship, the only language I worked in was Java; by the end of it, I had added another language, Python, to my skill set. He cleared every little doubt about my project and replied promptly to any query I had, at any odd hour. I was a little apprehensive in the beginning since progress at Precog is super quick, but I learned it all in my own time.

The most incredible characteristics of this group are the sincerity and passion each Precog-er brings to work. Apart from the projects carried out by each group, the regular Brainstorming sessions covered the latest research topics extensively. Several new ideas and news from the tech world were discussed on the mailing list, and very soon I got the hang of it. In one particular email, PK discussed his latest choice of book to read, “Eat That Frog!” by Brian Tracy. Being an avid reader, I bought it the very next day, and the book has had a phenomenal influence on my life (amazing book suggestions are another bonus of the internship!). Striking a balance between working and having fun is another takeaway. The binding force of Precog is PK, and the smart-working researchers, known as Precog-ers, make this group what it is.

I chose such a heavy-sounding title for this post because Precog can’t be defined by anything less. It is indeed a great expedition and I am fortunate to have experienced it.

I would like to end by quoting my favourite Randy Pausch saying that has now adorned my room’s wall as well:

“The brick walls are there for a reason. The brick walls are not there to keep us out. The brick walls are there to give us a chance to show how badly we want something. Because the brick walls are there to stop the people who don’t want it badly enough. They’re there to stop the other people.” 

Below is a picture from one of the group photo sessions! (Missing in the picture: PK)

The Precog Journey

I have had an inclination towards research ever since I joined IIIT-D. I admired my seniors who got admits from the best CS universities in the world, which felt like the next milestone to achieve. Looking at the profiles of several of them, I realised that there was an eminent underlying force behind them: PK! I was awestruck by PK’s posts on social media about the achievements of his students.

I had done research in the Security domain during my first 2 years, and Security in Social Media looked like a fascinating field to explore. I joined the DHCS course at the beginning of 2016, and that’s when I got to interact with PK. His way of teaching is unique, and he creates an aura of enthusiasm and interest as he delivers lectures. From him I gained a new perspective for looking at and analysing everyday interfaces.

I decided to apply for an internship with Precog for the summer of 2016. The interview was grilling and I kept my fingers crossed till the result was announced. After receiving acceptance from Precog, I also received an internship offer from IBM Research Labs. I faced a dilemma, as both were great research opportunities. I talked to PK and he offered me a part-time internship with Precog.

I was excited to start working with the group. Anupama introduced me to the problem statement we were going to be working on, “Privacy Leaks through Browser Extensions”. I was thrilled to know that we were going to collaborate with Bimal Viswanath from Bell Labs. I had read his research papers before and was intrigued by his findings. Working with Precog feels like being blessed with a network of top researchers across the globe with whom you can discuss, learn and work.

I moved to Bangalore to join IBM Research and worked with Precog remotely. I used to stay in the office till late at night reading papers, and also Precoged during the weekends. PK was in constant touch as my guide and mentor. Precog felt like a family and working on the research project was fun. Whenever I was stuck, I was offered full support; I could message anytime and Anupama was lightning fast with her replies. I learnt the skills of collaboration, time management and, most importantly, smart-working rather than hard-working.

Coming back from Bangalore, I was in full swing to join the group with vigour. PK always has suggestions to improve efficiency and quality; his vision and ideas are clearly reflected in the group. Attending regular meetings, ‘whatsup’ and BM sessions, I have learned far too many things to enumerate. Earlier I was intimidated by the rapid progress of the group; now I take pride in every new accomplishment of every member, since every other day a new achievement is discussed on the group’s mailing list. My association with Precog has been a roller-coaster ride, where we have worked as a team and partied as friends. I have been really lucky to be associated with the Coolest Research Group of IIIT-Delhi.

Below is the picture taken on Jan 4, 2016, Precog’s birthday!

Preventing KillFie: A crowdsourced approach

Selfies have become a prominent medium for self-portrayal on social media. Certain social media users go to extreme lengths to click dangerous selfies, putting their lives at risk. Between March 2014 and December 2016, 127 people died while trying to click selfies; 76 of these deaths were in India alone.

This disturbing trend can be traced to users taking selfies in “dangerous” locations, which in turn can be linked to the concept of self-representation on online social media. A user may be perceived as bolder and more adventurous if they post from a clifftop or in front of a moving train. The urge to portray oneself as a daredevil leads people to go to life-threatening lengths to get the perfect selfie. The engrossing hunt for the ultimate display picture momentarily distracts the selfie taker from their surroundings, which can result in tragedy.

Our goal is to build a dataset of locations around the globe that are popular selfie spots but where taking a selfie inadvertently puts people’s lives at risk. We collect this data through an Android app and a Facebook chatbot that we built. Users can report locations along with the type of risk (e.g., height related, water related, vehicle related) associated with taking a selfie at that location. Using this data, our app nudges the user (through a notification) whenever the user goes near such a location.
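The nudge described above boils down to a proximity check against the reported locations. Here is a minimal sketch in Python of how such a check could look. It assumes each report is stored as a latitude, a longitude and a risk type, and it assumes a 100 m alert radius; the function names, the sample coordinates and the radius are all illustrative and not taken from the actual app.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_risks(user_lat, user_lon, reports, radius_m=100.0):
    """Return the reported risky spots within radius_m of the user (radius is an assumption)."""
    return [
        r for r in reports
        if haversine_m(user_lat, user_lon, r["lat"], r["lon"]) <= radius_m
    ]


# Hypothetical crowdsourced reports; coordinates are invented for illustration.
reports = [
    {"lat": 28.5450, "lon": 77.2730, "risk": "height related"},
    {"lat": 19.0760, "lon": 72.8777, "risk": "water related"},
]

hits = nearby_risks(28.5452, 77.2731, reports)
for spot in hits:
    print(f"Warning: {spot['risk']} selfie risk reported nearby")
```

A real app would more likely rely on the platform's geofencing service than on polling distances by hand, but the distance test is the same idea.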

As we become more engrossed in our mobile phones and the digital world, we may lose our sense of physical surroundings. For instance, there were several incidents in which people were injured while using Pokemon Go (an immersive augmented reality application). Tagging dangerous locations might help such apps retrieve relevant warnings from the database.

If you know of a selfie lover or are one yourself, please download this app. Even a single location which you report may prevent many unfortunate deaths. And since what goes around comes around, one day the app might save you from unknowingly putting yourself in danger.

To interact with the chatbot, go to the Facebook page and click on send message. Send a location by clicking on the location icon (this can be done only through the Messenger app) and just follow the instructions given by the bot.

Click here to download the app.

Click here to interact with the chatbot.


Call for Service : Characterizing and Modeling Police Response to Serviceable Requests on Facebook

India is going digital in a big way; from banking to manufacturing to agriculture, every field is seeing the penetration of technology. Police organizations have also started using technology for effective policing. Most police organizations now have an official website, a Facebook page and a Twitter handle. Police use these new media services not only to showcase their organization but also to interact with citizens regularly. Police posts on Facebook and tweets on Twitter cover a variety of topics, ranging from traffic advisories to awareness creation to bragging about their achievements. Similarly, the growing technology-savvy population of India is using these media to share grievances, concerns, etc. with the police. With a handful of police officers serving 1.25 billion people, it is no surprise that a lot of posts/tweets by citizens go unnoticed by the police. Even features like tagging police commissioners and police accounts do not always yield the expected response, causing a sense of resentment. The police, too, find themselves helpless given the multitude of demands on them.

With our continued interest in empowering police organizations with technology that can help them in their day-to-day activities, we have been working in the space of online social media and policing for some time now. For our research publications in this space, please visit here. For effective communication between citizens and police, the police need to make sense of the vast amount of content generated on their social media accounts. In this direction, we started thinking about how to break up the content into important versus unimportant, urgent versus non-urgent, etc. Our main aim in this research was to help police identify ‘serviceable’ content which can be served quickly and efficiently. Serviceable requests are requests that the police should respond to, evaluate or act on.

We analyzed 85 official Facebook pages of police organizations in India and studied the nature of posts that citizens share on them. Not all posts require the same amount of attention from the police; some cases need immediate action while others can wait. Based on this analysis, we came up with six categories of textual attributes that can identify serviceable posts, i.e., posts that need some kind of police response. We find that such posts are marked by high negative emotion and by more factual, objective content such as the location and time of incidents.

We identify four types of response that citizens may get on their posts:

(a) Forward: Posts which had enough information and could be forwarded to appropriate authorities for action. For instance, a resident posted, Date : 4/11/2015 (Wednesday), Time : 10:17 pm, Number : [withheld], Location : [withheld], Violations : Crossing line by way too much obstructing the vehicles which were coming from [withheld] entrance later he jumped the signal ……..

(b) Give Solution: Posts that mostly included queries from residents to the police that could be answered without any further details; for instance, a resident asks, Admin !! Can U Explain to Me How Two Challans On Same Date Same Time in Just 5 Minutes Gap !! How Its Possible ?? Any Thing Wrong ??

(c) Acknowledge with thanks: Posts to which the police wrote “thanks for sharing the information” or “thanks for the appreciation.” For instance, resident remarks, Chennai City Traffic Police a humble salute from a fellow Chennaiite for the commendable job in such rains!!

(d) Need more details: Posts for which the police asked for more details so that action could be taken, e.g., a resident asks, Cops driving wrong side [of road] near XXX hotel .. what action will be taken against them ? This post lacks information such as the time and date when the incident happened.

To enhance response to serviceable posts, we propose a request – response identification framework. The approach followed in the paper is shown below:

 

Understanding Requests from Citizens:

Residents often use different language styles in posts when expressing their concerns and asking the police questions. Our approach includes the following six categories of features to characterize serviceable posts: Emotional Attributes, Cognitive and Interpersonal Attributes, Linguistic Attributes, Question Asking Attributes, Entity-Based Attributes, and Topical Attributes. These include both handcrafted features and LDA / NMF based features that help automatically discover the latent dimensions and induce semantic features in our data.

Our analysis shows some intriguing results:

Serviceable requests show significantly higher values of negative emotional states, i.e., “anger” (+15.38%), “disgust” (+47.8%), “fear” (+60%), and “sadness” (+10%), in comparison to non-serviceable requests. The most frequent topic includes queries / questions posed to police (the Complaints topic represents complaints against cops’ incorrect decisions).

Comparing serviceable sub-types, we observe that 93.10% of posts in the Thanks sub-type did not receive a response from police. Posts in the Forward sub-type received the maximum number of responses from police (63.6%, 182 posts). Table 1 below summarizes the number of posts that did not receive police responses.

Table 1: Number of posts that received responses (N of Events) and censored event showing posts that did not get response from the police.

Automated Classifier for Serviceability:

Our work explores a series of statistical models to predict serviceable posts and their different types. The models make use of content-based measures: emotional, cognitive, linguistic, question-asking, entity-based and topical attributes. We explore five different classification algorithms, Random Forest (RF), Logistic Regression (LR), Decision Trees (DT), Adaptive Boosted Decision Trees (ADT), and Gradient Boosting Classifier (GBC), using balanced class weights. Table 2 below reports the performance of the different algorithms in correctly identifying serviceable posts.

Table 2: Mean Performance after 10-fold CV of different algorithms to correctly identify serviceable posts.
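For readers unfamiliar with balanced class weights, here is a small, hedged illustration in plain Python. This post does not spell out the exact weighting used in the paper, so the sketch assumes the common heuristic (the one behind scikit-learn's class_weight="balanced" option): each class gets the weight n_samples / (n_classes * class_count), so the rarer class counts for more during training. The label counts below are made up for illustration and are not the paper's data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * count_of_that_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately imbalanced label set (not the paper's data).
labels = ["serviceable"] * 20 + ["non-serviceable"] * 80

weights = balanced_class_weights(labels)
for cls, weight in weights.items():
    print(f"{cls}: {weight}")
```

With these weights, a mistake on a post from the minority class costs the classifier proportionally more than a mistake on a post from the majority class.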

Through our work, we believe technological interventions can help increase the interactions between police and citizens and thereby increase the trust people have in the police. The police, too, may gain a more directed, cost- and labour-efficient mechanism for dealing with any law and order situation reported on their Facebook page. This will increase the overall well-being and safety of society.

Link to the analysis portal

Link to the accounts portal:

Full citation & link to the paper: Sachdeva, N., and Kumaraguru, P. Call for Service: Characterizing and Modeling Police Response to Serviceable Requests on Facebook. Accepted at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), 2017. PDF


What’s your MWI? : A social media based study of Mental Well-Being in college campuses

College students’ mental health concerns are a persistent issue; psychological distress in the form of depression, anxiety and other mental health challenges among college students is a growing health concern. However, very few university students actually seek help for mental illness. This is due to barriers such as limited knowledge about available psychiatric services and social stigma. Further, there is a dearth of accurate, continuous and multi-campus data on mental well-being, which presents significant challenges to intervention and mitigation strategies on college campuses.

Recent advances in HCI and social computing show that content shared on social media can enable accurate inference, tracking and understanding of the mental health concerns of users. There has also been work showing that college students appropriate social media for self-disclosure, support seeking and social connectedness. These facts, coupled with the pervasiveness of social media among college students, motivated us to examine the potential of social media as a “measure” for quantifying the mental well-being in a college population. Specifically, we focused on the following research goals:

  • Building and validating a machine learning model to identify mental health expressions of students in online communities
  • Analysing the linguistic and temporal characteristics of the identified mental health content
  • Developing an index for the collective mental well-being in a campus, and examining its relationship with university attributes like academic prestige, enrollment size and student body demographics

We obtained a list of 150 ranked major universities in the US by crawling the US News website. We also obtained university metadata like gender distribution, tuition/fee during this crawl. Next, we crawled the Wikipedia pages for these 150 universities for extracting the student enrollment, type of university (public/private) and the setting (city/urban/suburban/rural) at each institute. Lastly, we obtained information on the racial diversity at each university from an article on Priceonomics. We study these universities in our work and use the metadata in our analysis.

Geographical distribution of the 150 ranked universities we studied

For social media data, we focus on Reddit. Reddit is known to be a widely used online forum and social media site among the college student demographic. Its forum structure allows the creation of public online communities (known as “subreddits”), including many dedicated to specific college campuses. This allowed us to collect a large sample of posts shared by university students in one place. Although Facebook is likely more popular/widespread among students, the content shared there is largely private, which makes it challenging to obtain data at this scale. Further, the semi-anonymous nature of Reddit enables candid self-disclosure around stigmatized topics like mental health.

After a manual search for subreddits for each university, we were able to identify public subreddit pages for 146 of the 150 universities. Next, we focused on correcting the “under-adoption” bias in subreddits. Subreddits with a small fraction of Reddit users (compared to university enrollment) were filtered out as under-represented. This left us with 109 universities with adequate Reddit representation. We leveraged the data on Google BigQuery (combined with some additional data collection) to get all posts ranging from June 2011 to February 2016. The final dataset used for our analysis included 446,897 posts from 152,834 unique users.
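To make the under-adoption filter concrete, here is a rough Python sketch. It assumes the filter compares the number of Reddit users in a subreddit against the university's enrollment and drops subreddits whose ratio falls below a cutoff. The cutoff value, the field names and the sample numbers are all hypothetical; this post does not state the threshold actually used.

```python
def filter_under_adopted(subreddits, min_ratio):
    """Keep subreddits whose Reddit-users-to-enrollment ratio is at least min_ratio."""
    kept = {}
    for name, info in subreddits.items():
        ratio = info["reddit_users"] / info["enrollment"]
        if ratio >= min_ratio:
            kept[name] = ratio
    return kept


# Hypothetical universities; the numbers are invented for illustration only.
subreddits = {
    "r/univ_a": {"reddit_users": 1200, "enrollment": 30000},
    "r/univ_b": {"reddit_users": 90, "enrollment": 45000},
}

# min_ratio=0.01 is an assumed cutoff, not the value used in the study.
kept = filter_under_adopted(subreddits, min_ratio=0.01)
print(kept)
```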

Since Reddit data does not contain any gold standard information on whether a post in a university subreddit is a mental health expression, our first goal was to use an inductive transfer learning approach to build a model that identifies such content in a university subreddit. First, we include (as ground truth data) Reddit posts made on various mental health support communities. Prior work has established that, in these communities, individuals explicitly self-disclose a variety of mental health challenges. We use these posts as the “positive” posts and, in parallel, we use another set of Reddit posts, made on generic subreddits unrelated to mental health, as “negative” posts. We obtain 21,734 posts for each category, which we use as the positive and negative classes for building a classifier. We observed a validation accuracy of 93% and an accuracy of 97% on a test set of 500 unseen, expert-annotated posts from our university subreddit data. We then used this classifier to label the 446,397 other posts across the 109 university subreddits. Our classifier identified 13,914 posts (3.1%) as mental health expressions, whereas the remaining 432,483 posts were marked as not about the topic. These posts came from 9,010 unique users out of a total of 152,834.

Next, we looked at the linguistic characteristics of the posts identified as mental health expressions by conducting a qualitative examination of the top n-grams occurring uniquely in these posts. We found that students appropriate the Reddit communities to converse on a number of college, academic, relationship, and personal life challenges that relate to their mental well-being (“go into debt”, “doing poorly in”, “only one homework”, “up late”, “the jobs i”). The n-grams also indicated that certain posts contained explicit mentions of mental health challenges (“psychiatric”, “depression”, “killing myself”, “suicidal thoughts”), as well as the difficulties students face in their lives due to these experiences (“life isnt”, “issues with depression”, “was doing great”, “ruin”, “cheated”). Some of the top n-grams were also used in the context of seeking support (“need help”, “i really need”, “could help me”).

For the temporal analysis of mental health content, we first study the proportion of posts with mental health expression across the years. The figure below shows the content per year (along with a least squares line fit). We observed that the proportion of posts with mental health expressions has been on the rise — there is a 16% increase in 2015, compared to that in 2011.
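For readers who want to see what a least squares line fit over yearly proportions involves, here is a small self-contained Python sketch. It uses the standard closed-form solution for a straight line; the yearly percentages are placeholder values invented for illustration and are not the study's numbers.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder data: percentage of posts flagged per year (made up, not from the study).
years = [2011, 2012, 2013, 2014, 2015]
flagged_pct = [2.0, 2.2, 2.1, 2.4, 2.5]

slope, intercept = least_squares_line(years, flagged_pct)
print(f"Estimated change per year: {slope:.3f} percentage points")
```

A positive slope indicates that the proportion is rising over the years, which is the kind of trend the fitted line in the figure is meant to convey.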

We then looked at how this trend varies over the course of an academic year. The plots below show the trend separately for universities following the semester system and the quarter system. Between August and April, for the universities in the semester system, we observed an 18.5% increase in mental health expression; for those in the quarter system, the increase between September and May was much higher, at 78%. On the other hand, we observed a reverse trend in mental health content during the summer months, for both semester and quarter system universities.

Trend in the semester months
Trend in the quarter months
Trend in the summer months for quarter system universities


Trend in the summer months for semester system universities


Lastly, as a part of our third research goal, we formulated an index we refer to as the Mental Well-Being Index (MWI), a measure of the collective mental well-being in a university subreddit based on the posts labelled as mental health related by the classifier. We then computed the MWI metric for all 109 subreddits and examined its relationship with the university attributes.


By visualising these relationships (as above), we gleaned several interesting observations. We found:

  • Universities with larger student bodies (enrollment), as well as a greater proportion of undergraduates in their student bodies, tend to be associated with lower MWI
  • MWI of the 66 public universities we consider is lower than that of the 43 private universities by 332%
  • MWI is lower in the 7 rural and 33 suburban universities by 40-266% compared to others, while it is the highest in the 31 universities categorized to be in cities (by 29-77%)
  • Universities with higher academic prestige (or low absolute value rank) and higher tuition tend to be associated with higher MWI
  • MWI tends to be lower in universities with more females (or sex ratio, male to female <= 1) by 850%

Further, although our data shows a marginally lower MWI in universities with greater racial diversity, we did not find statistical significance to support this claim.

Our work here (the complete paper accepted at CHI 2017) further details our analysis in depth. Below is an infographic for our work.


I have been Precog-ed (for life): Part 4

Hola! It’s the first day of 2017. All of us have just finished looking back at the past year, trying to fathom how time flies and life metamorphoses. My life has taken a leap too, and this is my last blog as a part of the ‘I have been Precog-ed’ series. Earlier, I wrote about my first stint at research (Part 1), a wonderful summer at the Information Sciences Institute in Marina del Rey, Los Angeles (Part 2), my first paper presentation at ICWSM 2016 in Germany (Part 3), and my time at Precog. This post is about the last 6 months of my journey and an attempt to express what being a Precog-er is all about (for more on this, please read the first three parts too). Having been a Precog-er for more than 3 years, I have more thoughts than I can ever pen down; from being an undergrad who joined Precog as a noob to a grad student at Carnegie Mellon University, my path has always been illuminated by the light of learning and hope.

April 2016 – I was struggling with end-sem preparations, document processing and Visa applications for my trip to ICWSM and my masters in the States, and the humdrum undergrad life when an unexpected email got an unexpected reaction from me –

“Dear Megha,

We are pleased to inform you that you have been selected as one of the 40 CERN Openlab Summer Students 2016 (out of 1461 applicants)! For nine weeks, CERN will be your host for what we hope is going to be an interesting, fun and active summer…”

I have been an amateur astronomer for 9 years, and getting to work at the ‘Mecca of Particle Physics’ would be a dream come true. But I thought I wouldn’t be able to make it. I was applying for my Schengen Visa for Germany (which would take another 2 weeks), and then I had to start my application for the US visa. I would need another Schengen Visa for Switzerland within a span of one week. On top of that, the only dates I could select for the internship overlapped with my initial orientation schedule at CMU. I almost disrupted a meeting in PK’s office to break the news to him. I was sad. The Pillars (Ph.D. students at Precog) and PK were convinced that I should try, and if it didn’t work out, so be it. That’s a Precog trait: not giving up until you have given it your best shot! After cutting short the duration of my summer at CERN, pushing CMU to allow me to skip the orientations (convincing them that I would manage when I wasn’t sure of it myself), and getting my Schengen Visa for Switzerland in a day (thanks to CERN’s administrative staff, who made a special request for me to the embassy), I was ready for a summer at CERN.

I worked for 2 months at CERN’s data center on a storage system of ~125PB (one of the largest in the world). The CERN openlab program includes a lecture series that helps CS students understand the Physics needed for some of the projects, trips to ETH Zürich and EPFL Lausanne, hackathons, and several other ways to help the students gain insights into the revolutionary projects spanning 100 hectares in Switzerland and more than 450 hectares in France! It was a humbling experience, which entailed learning something new every day. Europeans have nailed the work-life balance too. Along with finishing my project on time, I managed to check Geneva, Lausanne, Lyon, Zürich, Paris, Montreux, Bern, Engelberg, Chamonix and many more off my list!

Delhi for 2 days, and then Pittsburgh was my next destination, my home for the next 16 months. I am an MSCS student at CMU now. Though I was the last to arrive and one of the youngest of the lot, thanks to PK I had ample background knowledge about life as a student here and the city of Pittsburgh. The experience I have gained at Precog comes in handy when I have to identify research gaps and solve hard problems. I feel more equipped and confident to take up the challenges that come along with grad life at a school like CMU.

Throughout these 6 months (Jul – Dec 2016), I have been working with a few Precog-ers on what we now call the Killfie project. It has turned out to be one of the most exciting projects I have worked on as a part of the group. It is the inclination to work on interesting problems with some brilliant people, which gives me the motivation to find time for this amongst courses and projects at CMU.

I cannot finish this blog without revisiting these lines from my first blog: “…PK, the heart and brain of Precog. He is the coolest adviser I have ever met and his skills and dexterity at work are almost mind-boggling. I came to know him as my Probability and Statistics professor, the role changed to being my adviser working at Precog and now I see him as a mentor for life..”. I owe a lot of what I have been able to achieve in the last 3 years to PK’s unconditional support. Thank you, PK, for always illuminating my path and for proving what good mentorship can accomplish!

My time at Precog has taught me how to help people, make friends, eliminate distractions and focus, improve daily, think big, fail often and give nothing short of my very best effort! I have had last-minute unscheduled video calls in the middle of the night from the other end of the world with Precog-ers when I needed help. Pillars, interns, RAs: thank you, each one of you, for this experience. Even though I live in a different time zone now and my attendance at the 4th floor Ph.D. lab has been at an all-time low, I know my association with the group will last forever. As has been rightly put, ‘Once a Precog-er, always a Precog-er!’

PS – Some pictures…

Just another day at Precog…
“It’s all about the people!”
The room where Tim Berners-Lee developed the World Wide Web at CERN!
This one doesn’t need a caption… 🙂
The Aiguille du Midi Skywalk, “Step into the Void” at Chamonix (altitude – 3842m)
CERN Openlab Summer Students 2016


#ProfGiri #LovingMyFacultyLife An amazing year of my faculty life!

A few days back, when I logged into Facebook, it asked me if I wanted to make a video of the year gone by. That was the trigger for this blog. Just a peek at the year gone by… a collection of the #ProfGiri that I did this year, 2016. It has been one roller coaster ride!

Since it is about ProfGiri, let me begin with students, my lifeline in every sense of the term. Students who took courses with me, at both the undergraduate and Masters levels, have joined amazing institutes for their further studies, like Carnegie Mellon University (my alma mater), GaTech, UIUC, USC, ASU, and other brilliant places in the world. Some students also started working at places like Apple, GoPro and MasterCard as UX Designers, or moved on to better avenues like Microsoft in the US. Through the year, I was also able to host some bright students from outside IIITD; they came from IIT Guwahati, College of Engineering Guindy, NIT Trichy, IIIT Sri City, and other institutes in India.

Traveling for work and meeting IIITD alums is something I always enjoy doing; this year has been splendid in this regard. Looking back, I have met many brilliant and successful alums around the world. I have written earlier blogs on meeting alums from Europe and alums from the US. I also got a chance to meet many other alums in different cities within the country.

I graduated one PhD student, Paridhi Jain, this year; she is now working at Accenture Research. This year’s graduation at IIITD was a treat for me. It rained awards for students who have been working with me! Megha Arora received the chancellor’s award, Mansi Panwar & Shashank Gautham received the award for best BTP thesis, and Sarthak Ahuja received the Best All-Rounder Student award. I couldn’t have asked for more. An exciting period in the year is when admits from graduate schools around the world start coming in. That’s when I know where all my bright students are headed for higher studies. I had written a blog, “This is Why I Love My Job: Students are the backbone of Faculty life!”, dedicated to all those students who have made me and my institute proud. Most of these students, and other students living in the US and Europe, usually come back home for a winter break, and so it is alum-visiting time around Christmas/New Year. This year we have already had 4 of our alums come by, and many more are scheduled to visit in the next 2 – 3 weeks.

Last but not least, I had a good crop of research papers published / accepted this year; to name a few venues: CHI 2017 (camera-ready version being prepared), CSCW 2017, BHCI 2016, and SocInfo 2016. The rest of the papers can be seen on our publications page.

Another integral part of #ProfGiri is Teaching. The Privacy and Security in Online Social Media course that I taught on NPTEL had 5,250 students signed up. It was a very different experience, which I have described in a separate blog. At IIITD, I taught Designing Human Centered Systems in Spring 2016. It was highly appreciated by the students, which was reflected in the course feedback. A big thank you to all those students. The course always ends with a Building Better Interface (BBI). This year’s edition of BBI was a particularly grand success, with a variety of projects and students displaying their creative side at its best. The Foundation to Computer Security course in Fall 2016 also received high ratings from the students. It is not just about high ratings; this feedback is a healthy platform where students can communicate what was good and what went wrong in a course. It gives a sense of satisfaction and motivates faculty to improve the course content and delivery mechanism.

ProfGiri does not end at IIITD; I do it outside too. This year I was invited to some prestigious institutes in and outside the country to give talks and lectures on various topics associated with my research. Some of the Indian academic institutes I visited this year were IIIT Hyderabad, IIT KGP, IIT Guwahati, and LNMIIT. I visited the Berkman Center for Internet & Society at Harvard University, Massachusetts Institute of Technology (MIT), GaTech, Northeastern University (NEU) in Boston, Carnegie Mellon University, and University of Maryland, Baltimore County in a span of one week. It was a one-of-a-kind US visit, full of talks and meetings with alums in multiple cities. It was on this trip that I met my advisor after seven long years! It was really nice to relive some old memories and update each other on current activities. In Europe, I visited ETH Zurich; GESIS in Cologne, Germany; and Bern, Switzerland for an APWG workshop. I also visited Singapore to give a talk in a workshop, and visited NUS. Thanks to my alums in these places, who spend time with me when I am visiting their city / campus; this is definitely a high point of ProfGiri! One continent I hadn’t visited until this year was Australia; a week-long visit to the University of New South Wales (UNSW) gave me an opportunity to see that part of the world too. I met some very accomplished faculty and hardworking students there. I also had the opportunity this year to be part of the Microsoft Faculty Summit in Pune. Another major highlight of this year was the TEDx talk. I was invited to give a TEDx talk by TEDx Juhu, Mumbai. I am not sure how the talk went or if I enjoyed the whole process, but it definitely was an amazing experience.

This year was one of the best years for research funds that I have raised as a faculty member: I received 1,68,70,000 INR from the Government of India and some other funds from different industry organizations. This year was also exciting for two very different reasons: I was added to the ACM Distinguished Speakers list, and I got an offer to be an Affiliate Faculty at IIIT Hyderabad (I also spent 2003 / 2004 at IIITH, so it was a different feeling!).

This year was very productive in terms of kick starting interesting projects, some ideas getting translated into technology, used by several users and appreciated by the media.

  • KillFie – the most talked-about project that I have had in recent years. I had written a blog just about our experience with news media. Never before had any of my projects garnered so much media attention. Many friends of mine came across it on different media without knowing that I was part of the team doing this work, and were surprised when they found out. Among many others (100+ popular venues), this work was on MIT Tech Review, front page of Economic Times in all editions in India, front page of Pittsburgh Post Gazette, Home page news of CMU & IIIT Delhi, many Radio channels and some TV channels. Stay tuned for the app that we are building! Paper can be found here. Below is an image which captures the different news media services where KillFie was covered, and the languages in which it was covered.
  • News Bugle, a Free Basics news RSS feed app. This is an RSS aggregator service that gives live updates of news from top sources under different categories. It has 800+ active users every day.
  • Google Chrome Spying Extensions. We analysed 43,000 browser extensions, and found 218 spying on users’ sensitive information and sharing it with the creator of the extension. Anu’s blog on the results & Paper.
  • Helix – We developed this tool to help identify the tag in an image uploaded on Facebook. Chrome extension & Firefox extension.

An annual event I have been conducting for the past four years is the Security and Privacy Symposium. This year’s edition was held at IIITD campus and was attended by 160+ people, including students, faculty, government and industry. Pictures from the event.

To conclude, I can definitely say I had a very satisfying year, lots of lessons learned, lots of brick walls faced, many scaled, some in progress … Hope to continue #ProfGiri in the coming year and many more years to come 🙂

Killed it with a #Killfie: Journey from an Idea to a Global Media Phenomenon

31,000+ likes, 34,000+ shares, 1,000+ Tweets!

Most research goes through some natural phases: formulating the problem statement, collecting and analyzing data, submitting a research paper to a conference, writing a technical report, and then hoping the paper will get accepted at the conference and the work will be appreciated/acknowledged by the community (happily ever after!). I had never imagined that one such research topic, which went through some initial natural phases, would take such an interesting turn at some point and receive such an overwhelming amount of attention!

A lot has been said and written about our recent work (you can infer that from the title, and see ‘Who is talking about this research’) both in the technical community and press. I want to share my behind-the-scenes experience of going through this amazing phase of research – when it gets hard to count the number of mentions of your work returned by a quick Google search! A news article about someone dying just after taking a selfie was posted on the Precog mailing list on June 2, 2016. Definitely not a conventional cause of death, this disturbing news prompted some members of the group to dig into the what, how and why of selfie deaths around the world. It was just a small idea that we started working on; discussions trickled in, and some compelling observations followed. It all culminated in a well-written paper submitted to a conference, and the technical report went online on arXiv on 7th November.

The report was first picked up by Sun UK news and some Twitter handles like VickiTurk on 9th November, and what followed was a whirlwind of news articles and technology blogs across the globe, and across all media. It had become a sensation! It seemed to have touched all time zones from California (GMT-8) to New Zealand (GMT+13). The news buzz peaked on 18 November when three of us, Hemank Lamba, Megha Arora and I, went on a spree of giving interviews. We had news reporters wanting answers over email, phone and Skype, following up with us through the day. Here is what 18th November entailed for all three of us:

  • 0700 hrs IST: Call with CBC Canada, me sitting in IIT Kharagpur guest house and Hemank and Megha taking the call from Pittsburgh
  • 1000 hrs IST: Call with BBC UK Radio, I was taking it alone from IIT Kharagpur, CSE department
  • 1800 hrs EST: BBC World TV News, Hemank took this alone from Pittsburgh
  • 2130 hrs IST: NBC US, I took the call during my transit from Delhi Airport to IIITD
  • 2330 hrs IST: CMU, Hemank and Megha sitting in Gates building in CMU and I was at home in Delhi

This does not include the 25+ unique emails that we probably sent out answering questions or fixing timeslots for more interviews. While the 3 of us were engrossed in this craziness, Mayank Vachher, Varun Bharadhwaj, and Divyansh Agarwal had their hands full providing backend support, collating the hits that we were getting in news and social media, and getting more specific insights from the data which the reporters were interested in. The three of them played an integral role in ensuring that we had a smooth run. In the meantime, we also had our group meetings to discuss the feedback that we were getting from people around the world.

The news about this research has been spreading across many newspapers, and online social networks like Facebook, Twitter, and Youtube. As of this moment, the following numbers summarize the traction garnered by this research:

  • Total Articles written (unique ones): 160
  • Total Facebook posts: 100+
  • Total Facebook likes: 32,108
  • Total Facebook shares (shares of the articles + posts): 33,937
  • Total Facebook comments: 2,795
  • Total Twitter tweets: 1000+
  • Total Twitter RTs (of all the above tweets): 1075
  • Total videos created on the project: 15
  • Radio interviews: 11
  • TV interviews: 2
  • Total number of requests for the dataset: 6

Below is a tag cloud capturing all the major news agencies which featured our work; the work was covered in 17 different languages.

Lessons learned through this media frenzy:

  • For research to get popular, the topic has to be relevant to ‘people’
  • Reporters ask interesting ‘research’ questions, be prepared
  • Sociological/psychological studies around ‘who’ and ‘why’ of the research are important
  • Feedback from people is helpful in identifying potential issues in the research
  • Having captivating titles for the paper helps

Below is an infographic capturing the research work.

For those interested in knowing more about this research, here are some useful links:

There’s misinformation on Facebook. Here’s how you deal with it.

I’ll keep this short and to the point. There’s a sudden backlash on Facebook for hosting misinformation [1], and polar politics [2] after the recent elections in the USA. Is this new? NO.

Let me take you back in time, to March 2014. In the deeply tragic incident of Malaysia Airlines Flight MH370, an entire aircraft and all on board were lost [3]. A sea of prayers and solidarity followed on all social networks including Facebook. What also followed was a series of fake, misinformative posts, links, and videos claiming to show you the footage of the aircraft crashing [4], and rumors claiming that the plane had been found in the Bermuda triangle (see image of one such post below). Such footage never existed.

http://www.hoax-slayer.com/images/malaysia-airlines-MH370-scam-1.jpg

Following this incident, there have been a series of events where miscreants have exploited the context of a popular event to spread hoaxes, misinformation, rumors, fake news, etc. From the rumor of the death of comic actor Rowan Atkinson (a.k.a. Mr. Bean) to the suicide video by late legendary actor Robin Williams, misinformation has plagued Facebook for years, and is continuing to do so. While Facebook has recently acknowledged misinformation to be a serious problem, we at Precog had already started working on it when we first came across instances of misinformation. So how do you really deal with misinformation and rumors and hoaxes and fake news on Facebook?

There have been a few attempts to solve this problem. Facebook posted a series of blogs vowing to improve their algorithms to reduce misinformation, hoaxes, rumors, clickbaiting, etc. [8, 9, 10, 11, 12]. A hackathon recently conducted by Princeton University also witnessed a group of 4 students attempting to fix this problem [13]. Well, as it turns out, we took a stab at this problem over 2 years ago, and came up with a robust solution of our own. In August 2015, we publicly launched Facebook Inspector, a free, easy-to-use browser extension that identifies malicious content (including the type we just discussed above) in real time. At this moment, Facebook Inspector has over 200 daily active users, and has just crossed 5,000,000 hits (it’s 5 million; but it’s just fun to write it with so many zeros xD). We leveraged multiple crowdsourcing mechanisms to gather a pool of misinformative and other types of malicious posts, and harnessed them to generate a model to automatically identify misinformative posts, hoaxes, rumors, scams, etc.

Give it a try. Download the Chrome version at https://chrome.google.com/webstore/detail/facebook-inspector/jlhjfkmldnokgkhbhgbnmiejokohmlfc

Firefox users, download at https://addons.mozilla.org/en-US/firefox/addon/fbi-facebook-inspector/

To read the entire story behind the inception of the idea, and incarnation of Facebook Inspector, read the detailed technical report here.

So we spotted a problem a couple of years ago, took a stab at solving it (and I’d like to believe we succeeded), and apparently, the entire world is after Facebook for the same problem today. But misinformation, hoaxes, and rumors aren’t the only big problems that Facebook is surrounded by. Let’s talk some more about the US elections. Facebook’s algorithms have been accused of reinforcing “political polarization” by Professor Filippo Menczer in a popular news article [2]. Apparently, Facebook is home to a big bunch of political groups which post polarized content to influence users towards / against certain political beliefs. Whether such content should be allowed on social networking websites is debatable. After all, free speech is a thing! But the question that demands attention here is, did these politically polarized entities suddenly appear on Facebook around the election time? I mean, if they had been around for long, Facebook would’ve known, right? And the effects of social network content on elections are well known and studied [5, 6, 7]. So Facebook would’ve definitely done something to at least nudge users when getting exposed to polarized political content. But polarized political content was never a point of concern for Facebook. So it probably didn’t exist until right before the elections. Right? Wrong!

Well, this is a literal “I told you so” moment. Last year, we conducted a large scale study of malicious Facebook pages, and one of our main findings was the dominant presence of politically polarized entities on Facebook among malicious pages. We analyzed the content posted by these politically polarized pages, and found that negative sentiment, anger, and religion dominated within such content. We reported our findings in the form of a technical report: https://arxiv.org/abs/1510.05828v1

It is good to know that what you work on, as part of research, connects closely to relevant, present day, real world problems, but it isn’t really a good feeling to realize that something you already knew could happen, happens anyway. We at Precog always push towards trying to make a difference and making the online world better and safer. We try our best, but we can only do so much.

To conclude, not bragging here (well, it’s not bragging if it’s true!), but we saw not one, but two real problems coming, more than a year before Facebook did.

You see, we’re called “Precog” for a reason. *mic drop*

References

[1] https://techcrunch.com/2016/11/10/facebook-admits-it-must-do-more-to-stop-the-spread-of-misinformation-on-its-platform/

[2] https://www.theguardian.com/technology/2016/nov/10/facebook-fake-news-election-conspiracy-theories

[3] https://en.wikipedia.org/wiki/Malaysia_Airlines_Flight_370

[4] https://www.scamwatch.gov.au/news/scammers-using-videos-of-malaysian-airlines-flight-mh370-to-spread-malware

[5] Williams, Christine B., and Girish J. Gulati. “Social networks in political campaigns: Facebook and the 2006 midterm elections.” annual meeting of the American Political Science Association. Vol. 1. No. 11. 2007.

[6] Williams, Christine B., and Girish J. Gulati. “Social networks in political campaigns: Facebook and the congressional elections of 2006 and 2008.” New Media & Society (2012): 1461444812457332.

[7] Douglas, Sara, et al. “Politics and young adults: the effects of Facebook on candidate evaluation.” Proceedings of the 15th Annual International Conference on Digital Government Research. ACM, 2014.

[8] https://newsroom.fb.com/news/2015/01/news-feed-fyi-showing-fewer-hoaxes/

[9] http://newsroom.fb.com/news/2016/08/news-feed-fyi-further-reducing-clickbait-in-feed/

[10] http://newsroom.fb.com/news/2014/11/news-feed-fyi-reducing-overly-promotional-page-posts-in-news-feed/

[11] http://newsroom.fb.com/news/2014/08/news-feed-fyi-click-baiting/

[12] http://newsroom.fb.com/news/2014/04/news-feed-fyi-cleaning-up-news-feed-spam/

[13] http://www.businessinsider.in/It-only-took-36-hours-for-these-students-to-solve-Facebooks-fake-news-problem/articleshow/55426656.cms

Teaching #PSOSMonNPTEL in a country of a billion: Experiences and take aways

I recently finished teaching my first course on NPTEL (National Program on Technology Enhanced Learning). NPTEL is like a Coursera of India. It is a joint initiative of the Indian Institute of Science (IISc) and the Indian Institutes of Technology (IITs) and is managed by faculty from IIT Madras.

I taught my signature course Privacy and Security in Online Social Media (PSOSM). The course was assigned the number noc16-cs07. I have taught this course previously at IIITD (CSE648, 4 times) and at Federal University of Minas Gerais (UFMG), Brazil (2 times). Below is the flier and here is the teaser video we created and used for the promotion of the course. The registration started on May 1 and went till July 15. By this deadline, I had about 2200 registrations, but that number went up manifold when the registration date was extended by a couple of days. All the efforts in promoting the course paid off: I had 5250+ students signed up for the course.

I had four amazing TAs assisting me on this course, all of them my own Ph.D. students: Anupama Aggarwal, Prateek Dewan, Srishti Gupta and Niharika Sachdeva. They not only helped with tutorials, quizzes and tests but also functioned as tech support throughout the course. Special thanks to Prateek, who took care of editing the videos and responding to the mailing list (there were even emails to Prateek referring to him as faculty of the course!), and to Niharika for managing the entire NPTEL portal.

I was mentally prepared to spend extra time preparing for this course, but it took way more time than I had foreseen. It was my first time using Camtasia for recording lectures. Previously I have had my lectures recorded while I taught physically in a class. It feels very natural teaching a class full of curious students, interacting with them, asking / answering questions, but it is quite a different feeling teaching a class consisting of only one laptop, and that too in your own office!

After some initial teething problems with recording and uploading, the course went on smoothly. As of writing this blog, I have 23,000 views on all the lecture videos that we uploaded as part of the course. Apart from videos, I also had one AMA (Ask Me Anything) session and one physical meeting at IIITD, where students could ask questions, clarify doubts or share their concerns directly with me and the TAs.

What the students felt, I will share later in this blog, but personally it was a very satisfying experience. Many students all over India got to know me. Students from many smaller towns have taken this course. I received emails from college principals from tier II and tier III cities saying they had made this course a part of their curriculum and they have their best students taking this course. In my opinion, this is the biggest advantage of such online courses: they break geographical barriers and make quality education and knowledge accessible to a larger audience. Out of the 5250 students, 152 registered for and appeared in the final exam; students have to pay a nominal fee to take this exam. I was super excited to have so many students pay for the course and take the exam.

NPTEL maintains a mailing list of all students registered for the course, and that acts as a good medium for all of us (faculty, TAs and students) to interact on a regular basis. This is where I was told that I have an American accent when I speak L or that in some videos my voice was very feeble. Also, as a practice, NPTEL requests students to fill in a feedback form and shares the feedback with the faculty teaching the course; students also sent some feedback through the mailing list. It feels very heartening to see some comments from the students, and I take this opportunity to thank and congratulate them for their time and effort in finishing the course and giving constructive feedback on the course. Some comments:

  • “Thanks for giving me a sense of satisfaction of doing a course.”
  • “Thanks a million to the whole Team. One of the best online course I ever had. There were days when I started posting queries at 10PM in the forum and TA’s helped me till I get what I wanted, some of the discussions went on till 1AM too. This shows how dedicated the team is!.”
  • “Feedback for an awesome course like this is really worth. Thank you PK sir for opening up such a treasure of knowledge. The best part of the course and it actually made the course different was the meet up at IIITD and also the hangouts session. The tutorials are really nicely presented and challenging for us.”
  • “I have gone through couple of other NPTEL certifications in recent years but this one was the best I would say…. Special thanks to Dr. PK. He was very interactive and an enthusiast.”
  • “firstly i am happy for taking this course, i did well in exam and very very thanks to all.. teaching faculty.. all teaching faculty did beyond the expectations..now i realise what are the skills  i have..  and thank you PK sir..and  lastly i say thank u NPTEL team.”
  • “5/5… Thank you IIIT-D, PK sir and the awesome TAs.”

Below is the certificate that my TAs got for helping with the course.

Lessons learned / suggestions for doing a good job with teaching on NPTEL:

  • Prepare the lectures and record them beforehand (well before the date of uploading)
  • Have wonderful TAs, they are the secret for success!
  • Try to have Ask Me Anything or physical meeting sessions at least a couple of times
  • Keep the mailing list very active
  • If you are teaching a course that you otherwise teach on campus, please be aware that the students taking the online course may not be as well equipped as the students in your class on campus.

I would definitely love to teach a course on NPTEL again! Until then goodbye to the NPTEL community!