Call for Service: Characterizing and Modeling Police Response to Serviceable Requests on Facebook

India is going digital in a big way; from banking to manufacturing to agriculture, every field is seeing the penetration of technology. Police organizations have also started using technology for effective policing. Most police organizations now have an official website, a Facebook page and a Twitter handle. Police use these new media services not only to showcase their organization but also to interact with citizens regularly. Police posts on Facebook and tweets on Twitter cover a variety of topics, ranging from traffic advisories to awareness creation to bragging about their achievements. Similarly, the growing technology-savvy population of India is using these media to share grievances, concerns, etc. with the police. With a handful of police officers serving 1.25 billion people, it is no surprise that many posts/tweets by citizens go unnoticed by the police. Even features like tagging police commissioners and police accounts do not always yield the expected response, causing a sense of resentment. The police, too, find themselves helpless given the multitude of demands on their attention.

With our continued interest in empowering police organizations with technology that can help them in their day-to-day activities, we have been working in the space of online social media and policing for some time now. For our research publications in this space, please visit here. For effective communication between citizens and police, the police need to make sense of the vast amount of content generated on their social media accounts. In this direction, we started thinking about how to break up the content into important versus unimportant, urgent versus non-urgent, etc. Our main aim in this research was to help police identify ‘serviceable’ content which can be served quickly and efficiently. Requests that the police should respond to, evaluate, or act on are considered serviceable requests.

We analyzed 85 official Facebook pages of police organizations in India and studied the nature of posts that citizens share on them. Not all posts require the same amount of attention from the police: some call for immediate action, while others can wait. Based on this analysis, we came up with six textual attributes that can identify serviceable posts, i.e., posts that need some kind of police response. We find that such posts are marked by high negative emotions and by more factual, objective content such as the location and time of incidents.

We identify four types of response that citizens may get on their posts:

(a) Forward: Posts which had enough information and could be forwarded to appropriate authorities for action. For instance, a resident posted, Date : 4/11/2015 (Wednesday), Time : 10:17 pm, Number : [withheld], Location : [withheld], Violations : Crossing line by way too much obstructing the vehicles which were coming from [withheld] entrance later he jumped the signal ……..

(b) Give Solution: Posts that mostly included queries by residents to police that could be answered without any further details. For instance, a resident asks, Admin !! Can U Explain to Me How Two Challans On Same Date Same Time in Just 5 Minutes Gap !! How Its Possible ?? Any Thing Wrong ??

(c) Acknowledge with thanks: Posts to which the police wrote “thanks for sharing the information” or “thanks for the appreciation.” For instance, resident remarks, Chennai City Traffic Police a humble salute from a fellow Chennaiite for the commendable job in such rains!!

(d) Need more details: Posts for which police asked the resident for more details so that action could be taken, e.g., a resident asks, Cops driving wrong side [of road] near XXX hotel .. what action will be taken against them ? This post lacks information such as the time and date when the incident happened.

To enhance response to serviceable posts, we propose a request-response identification framework. The approach followed in the paper is shown below:

 

Understanding Requests from Citizens:

Residents often use different language styles in posts while expressing their concerns and asking queries to police. Our approach includes the following six categories of features to characterize serviceable posts: Emotional Attributes, Cognitive and Interpersonal Attributes, Linguistic Attributes, Question Asking Attributes, Entity-Based Attributes, and Topical Attributes. These include both handcrafted features and LDA / NMF based features that help automatically discover the latent dimensions and induce semantic features in our data.
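To make the idea of handcrafted features concrete, here is a minimal sketch of a few question-asking and entity-style signals computed with regular expressions. The feature names, the regular expressions and the question-word list are illustrative assumptions, not the feature set used in the paper; the two sample posts are shortened versions of the examples quoted above.

```python
import re

# Hypothetical patterns: a rough stand-in for entity-based attributes
# (date and time mentions). The paper's actual features are richer.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}\s*(?:am|pm)?", re.IGNORECASE)

# Hypothetical word list standing in for question-asking attributes.
QUESTION_WORDS = {"what", "why", "how", "when", "where", "who", "can", "will"}


def handcrafted_features(post: str) -> dict:
    """Return a small dictionary of illustrative features for one post."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return {
        "has_date": bool(DATE_RE.search(post)),
        "has_time": bool(TIME_RE.search(post)),
        "n_question_marks": post.count("?"),
        "n_question_words": sum(t in QUESTION_WORDS for t in tokens),
    }


forward_post = (
    "Date : 4/11/2015 (Wednesday), Time : 10:17 pm, "
    "Violations : jumped the signal"
)
details_post = (
    "Cops driving wrong side near XXX hotel .. "
    "what action will be taken against them ?"
)

print(handcrafted_features(forward_post))
print(handcrafted_features(details_post))
```

In this toy example the first post carries a date and a time, while the second carries neither and reads as a question, which mirrors the difference between the Forward and Need more details sub-types.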

Our analysis shows some intriguing results:

Serviceable requests show significantly higher values of negative emotional states, i.e., “anger” (+15.38%), “disgust” (+47.8%), “fear” (+60%), and “sadness” (+10%), in comparison to non-serviceable requests. The most frequent topic includes queries / questions posed to police (the Complaints topic represents complaints against cops’ incorrect decisions).

Comparing serviceable sub-types, we observe that 93.10% of posts in the Thanks sub-type did not receive a response from police. Posts in the Forward sub-type received the maximum number of responses from police (63.6%, 182 posts). Table 1 below summarizes the number of posts that did not receive police responses.

Table 1: Number of posts that received responses (N of Events) and censored events, i.e., posts that did not get a response from the police.

Automated Classifier for Serviceability:

Our work explores a series of statistical models to predict serviceable posts and their different types. The models make use of content-based measures: emotions, cognitive attributes, linguistic attributes, questions posed, entities and topical attributes. We explore five different classification algorithms, namely Random Forest (RF), Logistic Regression (LR), Decision Trees (DT), Adaptive Boosted Decision Trees (ADT), and Gradient Boosting Classifier (GBC), using balanced class weights. Table 2 below reports the performance of different algorithms to correctly identify serviceable posts.

Table 2: Mean Performance after 10-fold CV of different algorithms to correctly identify serviceable posts.
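As a rough illustration of what "balanced class weights" can mean, the sketch below computes inverse-frequency weights, n_samples / (n_classes * class_count), which is the heuristic scikit-learn applies for class_weight="balanced". Whether the paper used exactly this weighting is an assumption, and the label counts are made up for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


# Hypothetical label distribution: 20 serviceable vs 80 non-serviceable posts.
labels = ["serviceable"] * 20 + ["non-serviceable"] * 80
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, an error on the rarer class costs more during training, so a classifier is not rewarded for simply predicting the majority class.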

Through our work, we believe technological interventions can help increase the interactions between police and citizens and thereby increase the trust people have in the police. The police, too, may gain a more directed, cost- and labour-efficient mechanism for dealing with any law and order situation reported on their Facebook page. This will increase the overall well-being and safety of society.

Link to the analysis portal

Link to the accounts portal:

Full citation & link to the paper: Sachdeva, N., and Kumaraguru, P. Call for Service: Characterizing and Modeling Police Response to Serviceable Requests on Facebook. Accepted at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), 2017. PDF

 

 

 

 

What’s your MWI? : A social media based study of Mental Well-Being in college campuses

College students’ mental health concerns are a persistent issue; psychological distress in the form of depression, anxiety and other mental health challenges among college students is a growing health concern. However, very few university students actually seek help related to mental illness. This is due to various barriers, such as limited knowledge about available psychiatric services and social stigma. Further, there is a dearth of accurate, continuous and multi-campus data on mental well-being, which presents significant challenges to intervention and mitigation strategies in college campuses.

Recent advances in HCI and social computing show that content shared on social media can enable accurate inference, tracking and understanding of the mental health concerns of users. There has also been work showing that college students appropriate social media for self-disclosure, support seeking and social connectedness. These facts, coupled with the pervasiveness of social media among college students, motivated us to examine the potential of social media as a “measure” for quantifying the mental well-being in a college population. Specifically, we focused on the following research goals:

  • Building and validating a machine learning model to identify mental health expressions of students in online communities
  • Analysing the linguistic and temporal characteristics of the identified mental health content
  • Developing an index for the collective mental well-being in a campus, and examining its relationship with university attributes like academic prestige, enrollment size and student body demographics

We obtained a list of 150 ranked major universities in the US by crawling the US News website. We also obtained university metadata like gender distribution, tuition/fee during this crawl. Next, we crawled the Wikipedia pages for these 150 universities for extracting the student enrollment, type of university (public/private) and the setting (city/urban/suburban/rural) at each institute. Lastly, we obtained information on the racial diversity at each university from an article on Priceonomics. We study these universities in our work and use the metadata in our analysis.

Geographical distribution of the 150 ranked universities we studied

For social media data, we focus on Reddit. Reddit is known to be a widely used online forum and social media site among the college student demographic. Its forum structure allows creation of public online communities (known as “subreddits”), including many dedicated to specific college campuses. This allowed us to collect a large sample of posts shared by university students in one place. Although Facebook is likely more popular/widespread among students, the content shared there is largely private, which makes it difficult to obtain data at this scale. Further, the semi-anonymous nature of Reddit enables candid self-disclosure around stigmatized topics like mental health.

After a manual search for subreddits for each university, we were able to identify public subreddit pages for 146 of the 150 universities. Next, we focused on correcting the “under-adoption” bias in subreddits. Subreddits whose Reddit users made up only a small fraction of the university enrollment were filtered out as under-represented. This left us with 109 universities with adequate Reddit representation. We leveraged the data on Google BigQuery (combined with some additional data collection) to get all posts ranging from June 2011 to February 2016. The final dataset used for our analysis included 446,897 posts from 152,834 unique users.
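The under-adoption filter described above can be sketched as a simple ratio check. The cutoff value, the field names and the three universities below are hypothetical; the post does not state the threshold that was actually used.

```python
def filter_under_adopted(universities, min_ratio):
    """Keep universities whose subreddit user count is at least
    min_ratio of their enrollment (min_ratio is a hypothetical cutoff)."""
    kept = []
    for u in universities:
        ratio = u["reddit_users"] / u["enrollment"]
        if ratio >= min_ratio:
            kept.append({**u, "adoption_ratio": ratio})
    return kept


# Made-up numbers, for illustration only.
universities = [
    {"name": "University A", "enrollment": 30000, "reddit_users": 4500},
    {"name": "University B", "enrollment": 45000, "reddit_users": 90},
    {"name": "University C", "enrollment": 12000, "reddit_users": 1800},
]

kept = filter_under_adopted(universities, min_ratio=0.05)
print([u["name"] for u in kept])
```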

Since Reddit data does not contain any gold standard information on whether a post in a university subreddit is a mental health expression, our first goal was to use an inductive transfer learning approach to build a model that identifies such content in a university subreddit. First, we included (as ground truth data) Reddit posts made on various mental health support communities. Prior work has established that, in these communities, individuals explicitly self-disclose a variety of mental health challenges. We used these posts as the “positive” posts and, in parallel, used another set of Reddit posts, made on generic subreddits unrelated to mental health, as “negative” posts. We obtained 21,734 posts for each category, which we used as the positive and negative classes for building a classifier. We observed a validation accuracy of 93% and an accuracy of 97% on a test set of 500 unseen, expert-annotated posts from our university subreddit data. We then used this classifier to label the 446,397 other posts across the 109 university subreddits. Our classifier identified 13,914 posts (3.1%) as mental health expressions, whereas the remaining 432,483 posts were marked as not about the topic. The identified posts came from 9,010 unique users out of a total of 152,834.
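The train-on-proxy-labels, apply-elsewhere idea can be sketched with a tiny multinomial Naive Bayes classifier. This is a stand-in chosen for brevity: the post does not say which model was used, and the training and target posts below are invented examples, not data from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, texts, labels):
        self.word_counts = {}  # label -> Counter of token counts
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts.setdefault(label, Counter()).update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of known tokens.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Source domain: posts labelled by the community they came from (invented).
source_texts = [
    "i feel depressed and anxious all the time",
    "struggling with depression and need help",
    "i cannot sleep and feel hopeless",
    "best pizza recipe for the weekend",
    "new graphics card benchmark results",
    "looking for a good hiking trail",
]
source_labels = ["mh", "mh", "mh", "other", "other", "other"]

model = NaiveBayes().fit(source_texts, source_labels)

# Target domain: unlabelled university subreddit posts (invented).
print(model.predict("feeling hopeless about finals and need help"))
print(model.predict("anyone selling a parking pass for the weekend"))
```

The key design point is that the labels come for free from where a post was made, so no manual annotation is needed for training; a small expert-annotated sample from the target domain is then used only to check that the model transfers.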

Next, we looked at the linguistic characteristics of the posts identified to be mental health expressions by conducting a qualitative examination of the top n-grams uniquely occurring in these posts. We found that students appropriate the Reddit communities to converse on a number of college, academic, relationship, and personal life challenges that relate to their mental well-being (“go into debt”, “doing poorly in”, “only one homework”, “up late”, “the jobs i”). The n-grams also indicated that certain posts contained explicit mentions of mental health challenges (“psychiatric”, “depression”, “killing myself”, “suicidal thoughts”), as well as the difficulties students face in their lives due to these experiences (“life isnt”, “issues with depression”, “was doing great”, “ruin”, “cheated”). Some of the top n-grams were also used in the context of seeking support (“need help”, “i really need”, “could help me”).
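One simple way to surface n-grams that occur only in one set of posts is sketched below. The tokenization, the choice of bigrams and the example posts are assumptions made for illustration; the exact procedure used in the paper may differ.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of a post as space-joined strings."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_unique_ngrams(positive_posts, other_posts, n=2, k=5):
    """Most frequent n-grams in positive_posts that never occur in other_posts."""
    positive = Counter(g for p in positive_posts for g in ngrams(p, n))
    seen_elsewhere = {g for p in other_posts for g in ngrams(p, n)}
    unique = Counter(
        {g: c for g, c in positive.items() if g not in seen_elsewhere}
    )
    return unique.most_common(k)


# Invented posts, for illustration only.
positive_posts = ["i really need help", "i need help with depression"]
other_posts = ["i really like pizza"]

print(top_unique_ngrams(positive_posts, other_posts, n=2, k=1))
```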

For the temporal analysis of mental health content, we first study the proportion of posts with mental health expression across the years. The figure below shows the content per year (along with a least squares line fit). We observed that the proportion of posts with mental health expressions has been on the rise: it was 16% higher in 2015 than in 2011.
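For readers unfamiliar with the two quantities mentioned here, the sketch below fits an ordinary least squares line to yearly proportions and computes a relative change between two years. The yearly values are hypothetical, chosen only so that the 2011 to 2015 change works out to 16%; they are not the study's data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


years = [2011, 2012, 2013, 2014, 2015]
share = [2.5, 2.6, 2.7, 2.8, 2.9]  # hypothetical % of posts per year

slope, intercept = least_squares_line(years, share)
change = percent_change(share[0], share[-1])
print(f"slope per year: {slope:.3f}, change 2011 to 2015: {change:.1f}%")
```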

We then looked at how this trend varies over the course of an academic year. The plots below show the trend separately for universities following the semester system and the quarter system. Between August and April, for the universities in the semester system, we observed an 18.5% increase in mental health expression; the increase was much higher, 78%, for those in the quarter system, when compared between September and May. On the other hand, we observed a reverse trend in mental health content during the summer months, for both semester and quarter system universities.

Trend in the semester months
Trend in the quarter months
Trend in the summer months for quarter system universities
Trend in the summer months for semester system universities

Lastly, as a part of our third research goal, we formulated an index we refer to as the Mental Well-Being Index (MWI), as a measure of the collective mental well-being in a university subreddit, based on the posts labelled as mental health related by the classifier. We then computed the MWI metric for all 109 subreddits and examined its relationship with the university attributes.

 

 

 

 

 

By visualising these relationships (as above), we gleaned several interesting observations. We found:

  • Universities with larger student bodies (enrollment), as well as a greater proportion of undergraduates in their student bodies, tend to be associated with lower MWI
  • MWI of the 66 public universities we consider is lower, relative to that in the 43 private universities, by 332%
  • MWI is lower in the 7 rural and 33 suburban universities by 40-266% compared to others, while it is the highest in the 31 universities categorized to be in cities (by 29-77%)
  • Universities with higher academic prestige (or low absolute value rank) and higher tuition tend to be associated with higher MWI
  • MWI tends to be lower in universities with more females (or sex ratio, male to female <= 1) by 850%

Further, although our data shows a marginally lower MWI in universities with greater racial diversity, we did not find statistical significance to support this claim.

Our work here (the complete paper accepted at CHI 2017) further details our analysis in depth. Below is an infographic for our work.

 

 

I have been Precog-ed (for life): Part 4

Holà! It’s the first day of 2017. All of us just got done with looking back at the past year, trying to fathom how time flies and life metamorphoses. My life has taken a leap too and this is my last blog as a part of the ‘I have been Precog-ed’ series. Earlier, I have written about my first stint at research (Part 1), a wonderful summer at the Information Sciences Institute at Marina del Rey, Los Angeles (Part 2), my first paper presentation at ICWSM 2016 in Germany (Part 3), and my time at Precog. This post is about the last 6 months of my journey and an attempt to express what being a Precog-er is all about (for more on this, please read the first three parts too). Being a Precog-er for more than 3 years, I have more thoughts than I can ever pen down; from being an undergrad who joined Precog as a noob to a grad student at Carnegie Mellon University, my path has always been illuminated by the light of learning and hope.

April 2016 – I was struggling with end-sem preparations, document processing and Visa applications for my trip to ICWSM and my masters in the States, and the humdrum undergrad life when an unexpected email got an unexpected reaction from me –

“Dear Megha,

We are pleased to inform you that you have been selected as an one of the 40 CERN Openlab Summer Students 2016 (out of 1461 applicants)! For nine weeks, CERN will be your host for what we hope is going to be an interesting, fun and active summer…”

I have been an amateur astronomer for 9 years, and getting to work at the ‘Mecca of Particle Physics’ would have been a dream come true. I knew I wouldn’t be able to make it. I was applying for my Schengen Visa for Germany (which would take another 2 weeks), and then I had to start my application for the US visa. I needed another Schengen Visa for Switzerland in a span of one week. On top of that, the only dates I could select for the internship overlapped with my initial orientation schedule at CMU. I almost disrupted a meeting in PK’s office to break the news to him. I was sad. Pillars (Ph.D. students at Precog) and PK were convinced that I should try, and if it didn’t work out, so be it. That’s a Precog trait: not giving up until you have given your best shot! After cutting short the duration of my summer at CERN, pushing CMU to allow me to skip the orientations (convincing them that I would manage when I wasn’t sure myself that I would), and getting my Schengen Visa for Switzerland in a day (thanks to CERN’s administrative staff who made a special request for me to the embassy), I was ready for a summer at CERN.

I worked for 2 months at CERN’s data center on a storage system of ~125PB (one of the largest in the world). The CERN openlab program includes a lecture series to help CS students understand the Physics needed for some of the projects, trips to ETH Zürich and EPFL Lausanne, hackathons, and several means to help the students gain insights about the revolutionary projects spanning across 100 hectares in Switzerland and more than 450 hectares in France! It was a humbling experience, which entailed learning something new every day. Europeans have nailed the work-life balance too. Along with finishing my project on time, I managed to check Geneva, Lausanne, Lyon, Zürich, Paris, Montreux, Bern, Engelberg, Chamonix and many more off my list!

Delhi for 2 days, and Pittsburgh was my next destination, my home for the next 16 months. I am an MSCS student at CMU now. Although I was the last to arrive and one of the youngest of the lot, thanks to PK I had ample background knowledge about life as a student here and the city of Pittsburgh. The experience I have gained at Precog comes in handy when I have to identify research gaps and solve hard problems. I feel more equipped and confident to take up the challenges that come along with grad life at a school like CMU.

Throughout these 6 months (Jul – Dec 2016), I have been working with a few Precog-ers on what we now call the Killfie project. It has turned out to be one of the most exciting projects I have worked on as a part of the group. It is the inclination to work on interesting problems with some brilliant people, which gives me the motivation to find time for this amongst courses and projects at CMU.

I cannot finish this blog without revisiting these lines from my first blog: “…PK, the heart and brain of Precog. He is the coolest adviser I have ever met and his skills and dexterity at work are almost mind-boggling. I came to know him as my Probability and Statistics professor, the role changed to being my adviser working at Precog and now I see him as a mentor for life..”. A lot of what I have been able to achieve in the last 3 years, I owe to PK’s unconditional support. Thank you PK for illuminating my path always and for proving what good mentorship can accomplish!

My time at Precog has taught me how to help people, make friends, eliminate distractions and focus, improve daily, think big, fail often and give nothing short of my very best effort! I have had last-minute unscheduled video calls in the middle of the night from the other end of the world with Precog-ers when I needed help. Pillars, interns, RAs: thank you, each one of you, for this experience. Even though I live in a different time zone now and my attendance at the 4th floor Ph.D. lab has been at an all-time low, I know my association with the group will last forever. As has been rightly put: ‘Once a Precog-er, always a Precog-er!’.

PS – Some pictures…

Just another day at Precog…
“It’s all about the people!”
The room where Tim Berners-Lee developed the World Wide Web at CERN!
This one doesn’t need a caption… 🙂
The Aiguille du Midi Skywalk, “Step into the Void” at Chamonix (altitude – 3842m)
CERN Openlab Summer Students 2016

 

 

#ProfGiri #LovingMyFacultyLife An amazing year of my faculty life!

A few days back, when I logged into Facebook, it asked me if I wanted to make a video of the year gone by. That was the trigger for this blog. Just a peek at the year gone by… a collection of the #ProfGiri that I did this year, 2016. It has been one roller coaster ride!

Since it is about ProfGiri, let me begin with students, my lifeline, in every sense of the term. Students who took courses with me, at both the undergraduate and Masters levels, have joined amazing institutes for their further studies, like Carnegie Mellon University (my alma mater), GaTech, UIUC, USC, ASU, and other brilliant places in the world. Some students also started working at places like Apple, GoPro, MasterCard as UX Designers, or moved from one place to a better avenue like Microsoft in the US, etc. Through the year, I was also able to host some bright students from outside IIITD; students came from IIT Guwahati, College of Engineering Guindy, NIT Trichy, IIIT Sri City, and other institutes in India.

Traveling for work and meeting IIITD alums is something I always enjoy doing; this year has been splendid in this regard. Looking back, I have met many brilliant and successful alums around the world; see my earlier blogs on meeting alums from Europe and alums from the US. I also got a chance to meet many other alums in different cities within the country. I graduated one PhD student this year, Paridhi Jain, who is now working at Accenture Research. This year’s graduation at IIITD was a treat for me. It rained awards for students who have been working with me! Megha Arora received the chancellor’s award, Mansi Panwar & Shashank Gautham received the award for best BTP thesis, and Sarthak Ahuja received the Best All-Rounder Student award. I couldn’t have asked for more. An exciting period in the year is when admits from graduate schools around the world start coming in. That’s when I know where all my bright students are headed for higher studies. I had written a blog, “This is Why I Love My Job: Students are the backbone of Faculty life!”, dedicated to all those students who have made me and my institute proud. Most of these students, and other students living in the US and Europe, are usually back home for a winter break, so it is alum-visiting time around Christmas/New Year. This year we have already had 4 of our alums come by, and many more are scheduled in the next 2 to 3 weeks. Last but not least, I had a good crop of research papers published / accepted this year; to name a few venues, CHI 2017 (camera-ready version being prepared), CSCW 2017, BHCI 2016, SocInfo 2016, etc. The rest of the papers can be seen on our publications page.

Another integral part of #ProfGiri is Teaching. The Privacy and Security in Online Social Media course that I taught on NPTEL had 5,250 students signed up. It was a very different experience; see my blog on the experience. At IIITD, I taught Designing Human Centered Systems in Spring 2016. It was highly appreciated by the students, which was reflected in the course feedback. A big thank you to all those students. The course always ends with a Building Better Interface (BBI). This year’s edition of BBI was a particularly grand success, with a variety of projects and students displaying their creative side at its best. The Foundation to Computer Security course in Fall 2016 also received a high rating from the students. It is probably not just about the high rating; this feedback is a healthy platform where students can communicate what was good and what went wrong in a course. It gives a sense of satisfaction and motivates a faculty member to improve the course content and delivery mechanism.

ProfGiri does not end at IIITD; I do it outside too. This year I was invited to some prestigious institutes in and outside the country to give talks and lectures on various topics associated with my research. Some Indian academic institutes I visited this year were IIIT Hyderabad, IIT KGP, IIT Guwahati, and LNMIIT. I visited the Berkman Center for Internet & Society at Harvard University, Massachusetts Institute of Technology (MIT), GaTech, Northeastern University (NEU) in Boston, Carnegie Mellon University and University of Maryland, Baltimore County in a span of one week. It was a one-of-a-kind US visit, full of talks and meetings with alums in multiple cities. It was on this trip that I met my advisor after seven long years! It was really nice to relive some old memories and update each other on current activities. In Europe, I visited ETH Zurich; GESIS in Cologne, Germany; and Bern, Switzerland, for an APWG workshop. I also visited Singapore to give a talk in a workshop, and visited NUS; thanks to my alums who are in these places, who spend time with me when I am visiting their city / campus. This is definitely a high point of profgiri! One continent I hadn’t visited until this year was Australia; a week-long visit to University of New South Wales (UNSW) gave me an opportunity to see that part of the world too. I met some very accomplished faculty and hardworking students there. I also had the opportunity this year to be part of the Microsoft Faculty Summit in Pune. Another major highlight of this year was the TEDx talk. I was invited to give a TEDx talk by TEDx Juhu, Mumbai. I am not sure how the talk went or whether I enjoyed the whole process, but it definitely was an amazing experience.

This year was one of my best years as a faculty member for raising research funds: I received 1,68,70,000 INR from the Government of India and some other funds from different industry organizations. This year was also exciting for two very different reasons: I was added to the ACM Distinguished Speakers of the world, and I got an offer to be an Affiliate Faculty at IIIT Hyderabad (I also spent 2003 / 2004 at IIITH, so it was a different feeling!).

This year was very productive in terms of kick starting interesting projects, some ideas getting translated into technology, used by several users and appreciated by the media.

  • KillFie – was the most talked about project that I have had in recent years. I had written a blog just about our experience with news media. Never had any of my projects garnered so much media attention. Many friends of mine came across it themselves on different media without knowing that I was part of the team doing this work, and were surprised when they found out. Among many others (100+ popular venues), this work was on MIT Tech Review, front page of Economic Times in all editions in India, front page of Pittsburgh Post Gazette, Home page news of CMU & IIIT Delhi, many Radio channels and some TV channels. Stay tuned for the app that we are building! Paper can be found here. Below is an image which captures the different new media services where KillFie was covered, and the languages in which it was covered.
  • News Bugle, a Free Basics news RSS feed app. This is an RSS aggregator service that gives a live update of news from top sources under different categories. This has 800+ active users every day.
  • Google Chrome Spying Extensions. We analysed 43,000 browser extensions, and found 218 spying on users’ sensitive information and sharing it with the creator of the extension. Anu’s blog on the results & Paper.
  • Helix – We developed this tool to help identify the tag in an image uploaded on Facebook. Chrome extension & Firefox extension.

An annual event I have been conducting for the past four years is the Security and Privacy Symposium. This year’s edition was held at IIITD campus and was attended by 160+ people, including students, faculty, government and industry. Pictures from the event.

To conclude, I can definitely say I had a very satisfying year, lots of lessons learned, lots of brick walls faced, many scaled, some in progress … Hope to continue #ProfGiri in the coming year and many more years to come 🙂

Killed it with a #Killfie: Journey from an Idea to a Global Media Phenomenon

31,000+ likes, 34,000+ shares, 1,000+ Tweets!

Most research goes through some natural phases: formulating the problem statement, collecting and analyzing data, submitting a research paper to a conference, writing a technical report, and then hoping the paper will get accepted at the conference and the work will be appreciated/acknowledged by the community (happily ever after!). I had never imagined that one such research topic, which went through some initial natural phases, would take such an interesting turn at some point and receive such an overwhelming amount of attention!

A lot has been said and written about our recent work (you can infer that from the title, and see ‘Who is talking about this research’) both in the technical community and press. I want to share my behind-the-scenes experience of going through this amazing phase of research, when it gets hard to count the number of mentions of your work returned by a quick Google search! A news article about someone dying just after taking a selfie was posted on the Precog mailing list on June 2, 2016. Definitely not a conventional cause of death, this disturbing news prompted some members of the group to dig into the what, how and why of selfie deaths around the world. It was just a small idea that we started working on; discussions trickled in, and some compelling observations followed. It all culminated in a well-written paper submitted to a conference, and the technical report going online on arXiv on 7th November.

The report was first picked up by Sun UK news and some Twitter handles like VickiTurk on 9th November, and what followed was a whirlwind of news articles and technology blogs across the globe, and across all media. It had become a sensation! It seemed to have touched all time zones from California (GMT-8) to New Zealand (GMT+13). The news buzz peaked on 18 November when three of us, Hemank Lamba, Megha Arora and myself, went on a spree of giving interviews. We had news reporters wanting answers over email, phone and Skype, following up with us through the day. Here is what 18th November entailed for all three of us:

  • 0700 hrs IST: Call with CBC Canada, me sitting in IIT Kharagpur guest house and Hemank and Megha taking the call from Pittsburgh
  • 1000 hrs IST: Call with BBC UK Radio, I was taking it alone from IIT Kharagpur, CSE department
  • 1800 hrs EST: BBC World TV News, Hemank took this alone from Pittsburgh
  • 2130 hrs IST: NBC US, I took the call during my transit from Delhi Airport to IIITD
  • 2330 hrs IST: CMU, Hemank and Megha sitting in Gates building in CMU and I was at home in Delhi

This does not include the 25+ unique emails that we probably sent out answering questions or fixing timeslots for more interviews. While the three of us were engrossed in this craziness, Mayank Vachher, Varun Bharadhwaj, and Divyansh Agarwal had their hands full providing backend support, collating the hits that we were getting in news and social media, and getting more specific insights from the data which the reporters were interested in. The three of them played an integral role in ensuring that we had a smooth run. In the meantime, we also had our group meetings to discuss the feedback that we were getting from people around the world.

The news about this research has been spreading across many newspapers and online social networks like Facebook, Twitter, and YouTube. As of this moment, the following numbers summarize the traction garnered by this research:

  • Total Articles written (unique ones): 160
  • Total Facebook posts: 100+
  • Total Facebook likes: 32,108
  • Total Facebook shares (shares of the articles + posts): 33,937
  • Total Facebook comments: 2,795
  • Total Twitter tweets: 1000+
  • Total Twitter RTs (of all the above tweets): 1075
  • Total videos created on the project: 15
  • Radio interviews: 11
  • TV interviews: 2
  • Total number of requests for the dataset: 6

Below is a tag cloud capturing all the major news agencies which featured our work. The work was featured in 17 different languages.

Lessons learned through this media frenzy:

  • For research to get popular, the topic has to be relevant to ‘people’
  • Reporters ask interesting ‘research’ questions; be prepared
  • Sociological/psychological studies around ‘who’ and ‘why’ of the research are important
  • Feedback from people is helpful in identifying potential issues in the research
  • Having captivating titles for the paper helps

Below is an infographic capturing the research work.

For those interested in knowing more about this research, here are some useful links:

There’s misinformation on Facebook. Here’s how you deal with it.

I’ll keep this short and to the point. There’s a sudden backlash on Facebook for hosting misinformation [1], and polar politics [2] after the recent elections in the USA. Is this new? NO.

Let me take you back in time, to March 2014. The deeply tragic incident of Malaysia Airlines Flight MH370 claimed an entire aircraft and all on board [3]. A sea of prayers and solidarity followed on all social networks including Facebook. What also followed was a series of fake, misinformative posts, links, and videos claiming to show you the footage of the aircraft crashing [4], and rumors claiming that the plane had been found in the Bermuda Triangle (see image of one such post below). Such footage never existed.

http://www.hoax-slayer.com/images/malaysia-airlines-MH370-scam-1.jpg

Following this incident, there have been a series of events where miscreants have exploited the context of a popular event to spread hoaxes, misinformation, rumors, fake news, etc. From the rumor of the death of comic actor Rowan Atkinson (a.k.a. Mr. Bean) to the suicide video by late legendary actor Robin Williams, misinformation has plagued Facebook for years, and is continuing to do so. While Facebook has recently acknowledged misinformation to be a serious problem, we at Precog had already started working on it when we first came across instances of misinformation. So how do you really deal with misinformation and rumors and hoaxes and fake news on Facebook?

There have been a few attempts to solve this problem. Facebook posted a series of blogs vowing to improve their algorithms to reduce misinformation, hoaxes, rumors, clickbaiting, etc. [8, 9, 10, 11, 12]. A hackathon recently conducted at Princeton University also witnessed a group of 4 students attempting to fix this problem [13]. Well, as it turns out, we took a stab at this problem over 2 years ago, and came up with a robust solution of our own. In August 2015, we publicly launched Facebook Inspector, a free, easy-to-use browser extension that identifies malicious content (including the type we just discussed above) in real time. At this moment, Facebook Inspector has over 200 daily active users, and has just crossed 5,000,000 hits (it’s 5 million; but it’s just fun to write it with so many zeros xD). We leveraged multiple crowdsourcing mechanisms to gather a pool of misinformative and other types of malicious posts, and harnessed them to generate a model to automatically identify misinformative posts, hoaxes, rumors, scams, etc.

Give it a try. Download the Chrome version at https://chrome.google.com/webstore/detail/facebook-inspector/jlhjfkmldnokgkhbhgbnmiejokohmlfc

Firefox users, download at https://addons.mozilla.org/en-US/firefox/addon/fbi-facebook-inspector/

To read the entire story behind the inception of the idea, and incarnation of Facebook Inspector, read the detailed technical report here.

So we spotted a problem a couple of years ago, took a stab at solving it (and I’d like to believe we succeeded), and apparently, the entire world is after Facebook for the same problem today. But misinformation, hoaxes, and rumors aren’t the only big problems that Facebook is surrounded by. Let’s talk some more about the US elections. Facebook’s algorithms have been accused of reinforcing “political polarization” by Professor Filippo Menczer in a popular news article [2]. Apparently, Facebook is home to a big bunch of political groups which post polarized content to influence users towards / against certain political beliefs. Whether such content should be allowed on social networking websites is debatable. After all, free speech is a thing! But the question that demands attention here is, did these politically polarized entities suddenly appear on Facebook around election time? I mean, if they had been around for long, Facebook would’ve known, right? And the effects of social network content on elections are well known and studied [5, 6, 7]. So Facebook would’ve definitely done something to at least nudge users when they were exposed to polarized political content. But polarized political content was never a point of concern for Facebook. So it probably didn’t exist until right before the elections. Right? Wrong!

Well, this is a literal “I told you so moment.” Last year, we conducted a large scale study of malicious Facebook pages, and one of our main findings was the dominant presence of politically polarized entities on Facebook among malicious pages. We analyzed the content posted by these politically polarized pages, and found that negative sentiment, anger, and religion dominated within such content. We reported our findings in the form of a technical report: https://arxiv.org/abs/1510.05828v1

It is good to know that what you work on, as part of research, connects closely to relevant, present day, real world problems, but it isn’t really a good feeling to realize that something you already knew could happen, happens anyway. We at Precog always push towards trying to make a difference and making the online world better and safer. We try our best, but we can only do so much.

To conclude, not bragging here (well, it’s not bragging if it’s true!), but we saw not one, but two real problems coming, more than a year before Facebook did.

You see, we’re called “Precog” for a reason. *mic drop*

References

[1] https://techcrunch.com/2016/11/10/facebook-admits-it-must-do-more-to-stop-the-spread-of-misinformation-on-its-platform/

[2] https://www.theguardian.com/technology/2016/nov/10/facebook-fake-news-election-conspiracy-theories

[3] https://en.wikipedia.org/wiki/Malaysia_Airlines_Flight_370

[4] https://www.scamwatch.gov.au/news/scammers-using-videos-of-malaysian-airlines-flight-mh370-to-spread-malware

[5] Williams, Christine B., and Girish J. Gulati. “Social networks in political campaigns: Facebook and the 2006 midterm elections.” annual meeting of the American Political Science Association. Vol. 1. No. 11. 2007.

[6] Williams, Christine B., and Girish J. Gulati. “Social networks in political campaigns: Facebook and the congressional elections of 2006 and 2008.” New Media & Society (2012): 1461444812457332.

[7] Douglas, Sara, et al. “Politics and young adults: the effects of Facebook on candidate evaluation.” Proceedings of the 15th Annual International Conference on Digital Government Research. ACM, 2014.

[8] https://newsroom.fb.com/news/2015/01/news-feed-fyi-showing-fewer-hoaxes/

[9] http://newsroom.fb.com/news/2016/08/news-feed-fyi-further-reducing-clickbait-in-feed/

[10] http://newsroom.fb.com/news/2014/11/news-feed-fyi-reducing-overly-promotional-page-posts-in-news-feed/

[11] http://newsroom.fb.com/news/2014/08/news-feed-fyi-click-baiting/

[12] http://newsroom.fb.com/news/2014/04/news-feed-fyi-cleaning-up-news-feed-spam/

[13] http://www.businessinsider.in/It-only-took-36-hours-for-these-students-to-solve-Facebooks-fake-news-problem/articleshow/55426656.cms

Teaching #PSOSMonNPTEL in a country of a billion: Experiences and take aways

I recently finished teaching my first course on NPTEL (National Program on Technology Enhanced Learning). NPTEL is like a Coursera of India. It is a joint initiative of the Indian Institute of Science (IISc) and the Indian Institutes of Technology (IITs) and is managed by faculty from IIT Madras.

I taught my signature course, Privacy and Security in Online Social Media (PSOSM). The course was assigned the number noc16-cs07. I have taught this course previously at IIITD (CSE648, 4 times) and at the Federal University of Minas Gerais (UFMG), Brazil (2 times). Below is the flier and here is the teaser video we created and used for the promotion of the course. Registration started on May 1 and ran till July 15. By that deadline I had about 2200 registrations, but the number went up manifold when the registration date was extended by a couple of days. All the effort in promoting the course paid off: I had 5250+ students signed up for the course.

I had four amazing TAs assisting me on this course, all of them my own Ph.D. students: Anupama Aggarwal, Prateek Dewan, Srishti Gupta and Niharika Sachdeva. They not only helped with tutorials, quizzes and tests but also functioned as tech support throughout the course. Special thanks to Prateek, who took care of editing the videos and responding to the mailing list (there were even emails to Prateek, referring to him as the faculty of the course!), and to Niharika for managing the entire NPTEL portal.

I was mentally prepared to spend extra time preparing for this course, but it took way more time than I had foreseen. It was my first time using Camtasia for recording lectures. Previously, I have had my lectures recorded while I taught physically in a class. It feels very natural teaching a class full of curious students, interacting with them, asking / answering questions, but it is quite a different feeling teaching a class consisting of only one laptop, and that too in your own office!

After some initial teething problems with recording and uploading, the course went on smoothly. As of writing this blog, I have 23,000 views on all the lecture videos that we uploaded as part of the course. Apart from videos, I also had one AMA (Ask Me Anything) session and one physical meeting at IIITD, where students could ask questions, clarify doubts or share their concerns directly with me and the TAs.

What the students felt, I will share later in this blog, but personally it was a very satisfying experience. Many students all over India got to know me. Students from many smaller towns have taken this course. I received emails from college principals from tier II and tier III cities saying they had made this course a part of their curriculum and had their best students taking it. In my opinion, this is the biggest advantage of such online courses: they break geographical barriers and make quality education and knowledge accessible to a larger audience. Out of the 5250 students, 152 registered and appeared for the final exam; students have to pay a nominal fee to take this exam. I was super excited to have so many students pay for the course and take the exam.

NPTEL maintains a mailing list of all students registered for the course, and that acts as a good medium for all of us (faculty, TAs and students) to interact on a regular basis. This is where I was told that I have an American accent when I speak L, or that in some videos my voice was very feeble. Also, as a practice, NPTEL requests students to fill in a feedback form and shares the feedback with the faculty teaching the course. Students also sent some feedback through the mailing list. It feels very heartening to see some comments from the students, and I take this opportunity to thank and congratulate them for their time and effort in finishing the course and giving constructive feedback on it. Some comments:

  • “Thanks for giving me a sense of satisfaction of doing a course.”
  • “Thanks a million to the whole Team. One of the best online course I ever had. There were days when I started posting queries at 10PM in the forum and TA’s helped me till I get what I wanted, some of the discussions went on till 1AM too. This shows how dedicated the team is!.”
  • “Feedback for an awesome course like this is really worth. Thank you PK sir for opening up such a treasure of knowledge. The best part of the course and it actually made the course different was the meet up at IIITD and also the hangouts session. The tutorials are really nicely presented and challenging for us.”
  • “I have gone through couple of other NPTEL certifications in recent years but this one was the best I would say…. Special thanks to Dr. PK. He was very interactive and an enthusiast.”
  • “firstly i am happy for taking this course, i did well in exam and very very thanks to all.. teaching faculty.. all teaching faculty did beyond the expectations..now i realise what are the skills  i have..  and thank you PK sir..and  lastly i say thank u NPTEL team.”
  • “5/5… Thank you IIIT-D, PK sir and the awesome TAs.”

Below is the certificate that my TAs got for helping with the course.

Lessons learned / suggestions for doing a good job with teaching on NPTEL:

  • Prepare the lectures and record them beforehand (well before the date of uploading)
  • Have wonderful TAs, they are the secret for success!
  • Try to have Ask Me Anything or physical meeting sessions at least a couple of times
  • Keep the mailing list very active
  • If you are teaching a course that you otherwise teach on campus, please be aware that the students taking the online course may not be as well equipped as the students in your on-campus class.

I would definitely love to teach a course on NPTEL again! Until then goodbye to the NPTEL community!

Me, Myself and My Killfie: Characterizing and Preventing Selfie Deaths

Authors: Hemank Lamba, Varun Bharadhwaj, Mayank Vachher, Divyansh Agarwal, Megha Arora, Ponnurangam Kumaraguru

Our world is becoming smaller with time, bringing us closer and bestowing upon us a number of avenues to easily showcase ourselves in any manner we want. Perhaps the biggest facilitating agent in this regard, is Online Social Media (OSM). In a way, OSM replicates our world, with friends, interactions and constant information exchange. The world of OSM seems to have developed an interesting currency of its own too – LIKES and COMMENTS, the dollars and cents of the virtual realm; something which everyone aspires to have in abundance.

We are also familiar with the popular “selfie” phenomenon. Recognized as the “word of the year” by Oxford Dictionaries in 2013, the “selfie” is defined as a “photograph taken of oneself, and uploaded to a social media website.” In recent years, there has been a sharp increase in the number of selfies posted on OSM. However, one particularly disturbing trend that has emerged lately is that of clicking dangerous selfies, which has proved so disastrous that during 2015 alone, more deaths were caused by selfies than by shark attacks all over the world [1]. Figure 1 shows examples of such selfies taken moments before the fatal incident. A selfie-related death can be defined as the death of an individual or group of people that could have been avoided had the individual(s) not been taking a selfie.

The level of threat that adventurous selfie taking behaviour exposes people to, is being acknowledged slowly by governments as well. Russian authorities came up with a public awareness campaign to enlighten citizens of the hazardous implications of taking selfies [2]. Similarly, Mumbai police recently classified 16 zones across the city as No-Selfie zones, after a rise in the number of selfie casualties [3].

The reason for this outrageous trend of dangerous selfies becomes clear when we combine the thoughts above. Since the advent of online social networks, people have developed an insatiable urge to be the most “popular” in their community. In medical terms, this has long been compared to forms of narcissism and, in relation to selfies, termed Selfitis [4,5,6]. This becomes the prime reason why people resort to performing risky feats while taking a selfie: to garner more appreciation in the form of likes and comments from their friends online.

We, at Precog@IIITD chose to analyse the issue from a technical perspective and to dive deeper into what characterizes a selfie casualty/death, what kind of information we can extract from selfie images and how selfie casualties can be prevented.

Over the past two years, we found that a total of 127 deaths were reported to have been caused by selfies, of which a whopping 76 occurred in India alone! [7] Table 1 shows the country-wise distribution of selfie casualties across the world. The reasons for these selfie casualties were found to broadly belong to the following categories (Figure 2):

  • Height Related – Selfie casualties caused due to people falling from an elevated location. [8]
  • Water Related – Selfie casualties caused due to drowning. [9]
  • Height and Water Related – Selfie casualties involving falling from elevated locations into a water body. [10]
  • Vehicle/Road Related – Selfie casualties caused due to vehicle accidents. [11]
  • Train Related – Selfie casualties caused due to being hit by a train. [12]
  • Weapons Related – Selfie casualties caused due to accidental firing of a weapon. [13]
  • Animal Related – Selfie casualties caused due to attack by an animal while taking the selfie with or near the animal. [14]
  • Electricity Related – Selfie casualties caused due to electrocution from live wires. [15]

Figure 2: (a) Number of Deaths and (b) Number of Incidents due to various reasons

Using a collective dataset of 138,496 tweets collected between August and September 2016, we implemented a three-fold architecture based on Image features, Location features, and Text features to quantify the danger level of selfies in our dataset.  Our machine learning model takes into account a variety of features to identify dangerous selfies along with their potential risks, and analyses common characteristics in these images. These features are supplied to four different classifiers with similar parameters to avoid bias in the results. Table 2 shows the sets of features we used for each feature type.

Table 2: Location-Based, Image-Based and Text-Based features used for classification of selfies

After thorough analysis, we found that the image-based features are the best indicators that accurately capture the dangerous nature of a selfie, in comparison to other feature-types. This seems logical as image features attempt to infer meaning directly out of the image, in a sense replicating our visual senses. Our model resulted in an accuracy of 73.6% for the task of identifying a dangerous selfie.
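The comparison of feature types described above can be sketched in a few lines. This is only an illustration under loud assumptions: the classifier here is a simple nearest-centroid stand-in (not one of the four classifiers used in the study), the helper names (`select`, `fit_centroids`, `accuracy`) are hypothetical, and the feature values are toy numbers, not taken from the dataset. It shows the general idea of training on one feature group at a time, or on a concatenation of groups, and comparing accuracies.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_centroids(X, y):
    """Return {label: centroid of the training vectors with that label}."""
    by_label = {}
    for vector, label in zip(X, y):
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(centroids, vector):
    """Pick the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[label]))
    return min(centroids, key=dist)


def accuracy(centroids, X, y):
    """Fraction of vectors in X whose predicted label matches y."""
    hits = sum(predict(centroids, v) == label for v, label in zip(X, y))
    return hits / len(y)


def select(samples, groups):
    """Concatenate the chosen feature groups of each sample into one vector."""
    return [sum((s[g] for g in groups), []) for s in samples]


# Toy, made-up samples (1 = dangerous selfie, 0 = not dangerous).
train = [
    {"image": [0.9, 0.8], "location": [0.7], "text": [0.5]},
    {"image": [0.8, 0.9], "location": [0.2], "text": [0.4]},
    {"image": [0.1, 0.2], "location": [0.6], "text": [0.5]},
    {"image": [0.2, 0.1], "location": [0.1], "text": [0.4]},
]
train_labels = [1, 1, 0, 0]
test = [
    {"image": [0.85, 0.7], "location": [0.3], "text": [0.45]},
    {"image": [0.15, 0.3], "location": [0.65], "text": [0.45]},
]
test_labels = [1, 0]

for groups in (["image"], ["location"], ["text"], ["image", "location", "text"]):
    model = fit_centroids(select(train, groups), train_labels)
    score = accuracy(model, select(test, groups), test_labels)
    print(groups, score)
```

Swapping the stand-in classifier for a real one would leave the comparison loop unchanged; only `fit_centroids` and `predict` would need replacing.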

To further capture the risk type of a dangerous selfie, we used specific features that were relevant only to a particular risk type and supplied the data to our classifier. In particular, we concentrated on singling out dangerous selfies that belonged to height, water and vehicle related risks. We found that the set of features performing the best for this task was a combination of all 3 feature types – Image, Location and Text based features, and the best accuracy was obtained on the Water-related features. With remarkable accuracy, we have been able to establish a method to identify and capture the “danger level” of a selfie along with its risk type.

With the growing trend of dangerous selfies, it becomes important to spread awareness of the inherent hazards associated with people risking their lives simply for the sake of recognition on a virtual forum. As Shakespeare put it, this type of “Bubble Reputation” induced by a dangerous selfie posted on OSM has claimed multiple lives lately. This work is a small contribution towards making the world safer by making people aware.

Our full report / paper on this work. You can access the portal and our dataset here.

References:

[1] http://www.telegraph.co.uk/technology/11881900/More-people-have-died-by-taking-selfies-this-year-than-by-shark-attacks.html

[2] https://www.theguardian.com/world/2015/jul/07/a-selfie-with-a-weapon-kills-russia-launches-safe-selfie-campaign

[3] http://metro.co.uk/2016/02/25/mumbai-orders-selfie-ban-after-19-people-die-5716731/

[4] S. Bhogesha, J. R. John, and S. Tripathy. Death in a flash: selfie and the lack of self-awareness. Journal of Travel Medicine, 23(4):taw033, 2016

[5] B. Subrahmanyam, K. S. Rao, R. Sivakumar, and G. C. Sekhar. Selfie related deaths perils of newer technologies. Narayana Medical Journal, 5(1):52–56, 2016.

[6] A. LAKSHMI. The selfie culture: Narcissism or counter hegemony? Journal of Communication and media Studies (JCMS), 5:2278–4942, 2015

[7] http://labs.precog.iiitd.edu.in/killfie/analysis

[8] http://www.telegraph.co.uk/news/2016/07/01/german-tourist-plunges-to-his-death-while-posing-for-picture-at/

[9] http://www.thenewsminute.com/article/selfie-deaths-two-men-drown-karnataka-couple-washed-away-tn-46735

[10] http://www.ndtv.com/cities/teenager-drowns-while-clicking-selfie-friend-dies-trying-to-save-him-1277217

[11] http://www.independent.co.uk/news/world/americas/selfie-crash-death-woman-dies-in-head-on-collision-seconds-after-uploading-pictures-of-herself-and-9293694.html

[12] http://timesofindia.indiatimes.com/city/varanasi/2-killed-while-taking-selfie-on-railway-tracks/articleshow/51850194.cms

[13] http://www.aljazeera.com/news/2015/07/russia-launches-safe-selfie-guide-light-deaths-150707132204704.html

[14] http://www.radar.ng/2016/04/elephant-tramples-boy-to-death-while.html?utm_source=nnd&utm_medium=twitter&utm_campaign=nnd

[15] http://www.thelocal.es/20140318/young-man-dies-in-train-selfie-fail

The complete picture: Visual Themes and Sentiment on Social Media for First Responders.



Researchers and academicians all over the world have conducted numerous studies and established that social media plays a vital role during crisis events. From citizens helping police to capture suspected terrorists during the Boston Marathon bombings [5], to vigilant users spreading situational awareness [6], OSNs have proved their mettle as a powerful platform for information dissemination during crises.

Most of the aforementioned work has relied on textual content posted on OSNs to extract knowledge and make inferences. However, online media is rapidly moving from text to visual media. With the prevalence of 3G, 4G technologies and high-bandwidth connectivity in most Internet enabled countries, images and videos are gaining much more traction than text. This is also natural, since the human brain is hardwired to recognize and make sense of visual information more efficiently [1]. Just using text to draw inferences from social media data is no longer enough. As we discussed in our previous blog, there is a significant percentage of social media posts which do not contain any text. Moreover, there’s also a large percentage of posts which contain both text and images. The point to keep in mind here is that images and text may contradict each other, even if they’re part of the same post. While the text in Figure 1 inspires support and positive sentiment, the image (or more precisely, the text in the image) is pretty negative. This is what current research methodology is missing out on.


Figure 1. Example of a Facebook post with contradicting text and image sentiment.

Continuing our work on images and online social media, we decided to dig further into images posted on social networks, and see if images could aid first responders to get a more complete picture of the situation during a crisis event. We collected Facebook posts published during the attacks in Paris in November 2015, and performed large scale mining on the image content we captured. Typically, monitoring the popular topics and sentiment among the citizens can be of help to first responders. Timely identification of misinformation, sensitive topics, negative sentiment, etc. online can be really helpful in predicting and averting any potential implications in the real world.

We were able to gather over 57,000 images using the #ParisAttacks and #PrayForParis hashtags put together, of which 15,123 were unique. Analyzing such a large number of images manually is time consuming and not scalable. So we utilized state-of-the-art techniques from the computer vision domain to automatically analyze images on a large scale. These techniques include Optical Character Recognition (OCR) [2], image classification, and image sentiment identification using Convolutional Neural Networks (CNNs). Figure 2 shows how a typical CNN model processes and classifies images [4].

Figure 2. Typical CNN model for object identification in images. Image taken from http://cs231n.github.io/convolutional-networks/
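The post does not say how unique images were singled out from the full collection. One simple way to do it, sketched here purely as an assumption, is to hash the bytes of every downloaded file and keep one file per distinct digest. The function names `file_digest` and `unique_images` are hypothetical, and this approach only catches byte-identical copies; resized or re-encoded versions of the same picture would still count as different images.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_images(image_dir):
    """Map each distinct content digest to the first file seen with it.

    Assumption: "unique" means byte-identical content; the study may
    have used a different notion of uniqueness.
    """
    first_seen = {}
    for path in sorted(Path(image_dir).iterdir()):
        if path.is_file():
            first_seen.setdefault(file_digest(path), path)
    return first_seen
```

Calling `len(unique_images("some_directory"))` on a folder of downloaded images would give the number of distinct files under this definition.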

With all these “weapons”, we set out to mine the sea of images and see if we could discover something useful. And we struck gold right away. We used Google’s Inception-v3 model [3] for generating tags for images automatically, and looked at a few of the most popular tags. Interestingly, we found numerous instances of misinformative images, images containing potentially sensitive themes, and images promoting conspiracy theories among popular images. By the time we identified them, these images had gathered millions of likes, and hundreds of thousands of comments and shares. Some of these examples are listed below (Figures 3 to 6).

Figure 3. Eiffel Tower turns off its lights for the first time in 63 years. This information was incorrect. Eiffel Tower’s lights are turned off every night between 1 am and 6 am following a ruling by the French Government.

Figure 4. Image incorrectly quoting the cause of death of Diesel, a police dog that helped the police during the attacks. The French Police later clarified that the actual cause of death was gunshot wounds from the French Police fleet itself, and not the suicide bomber.

Figure 5. Donald Trump’s insensitive tweet just after the Paris attacks. As the time stamp of the tweet suggests, this tweet was posted months ago, but resurfaced just after the attacks to defame the politician.

Figure 6. Picture claiming that a Muslim guard named Zouheir stopped a suicide bomber from entering the Stade de France football stadium and saved thousands of lives. As later clarified by the security guard himself, such an incident never took place. Zouheir, the security guard, was stationed at a different spot.

Applying OCR on the images in our dataset, we were able to extract text from about 55% of the images (31,869 out of 57,748 images). We wondered if this text embedded in images would be any different than the text that users post otherwise, in the orthodox manner. Upon analyzing and comparing the sentiment of image text and post text, we found that image text (extracted through OCR) was much more negative than post text (the orthodox text). In fact, not only was image text more negative, it was also different from post text in terms of topics being talked about. Table 1 shows a mutually exclusive subset of the most common words appearing in image text and post text. While post text was full of generic text offering prayers, support and solidarity, image text was found to mention some sensitive issues like “refugees”, “syria”, etc.

| S. No. | Top words in posts | Normalized frequency | Top words in images | Normalized frequency |
|---|---|---|---|---|
| 1. | retweeted | 0.005572571 | house | 0.00452941 |
| 2. | time | 0.005208351 | safety | 0.004481122 |
| 3. | prayers | 0.005001407 | washington | 0.004297628 |
| 4. | news | 0.004713342 | sisters | 0.003940297 |
| 5. | prayfortheworld | 0.004431899 | learned | 0.003863036 |
| 6. | life | 0.004393821 | mouth | 0.003853378 |
| 7. | let | 0.004249789 | stacy | 0.003751974 |
| 8. | support | 0.004249789 | passport | 0.003708515 |
| 9. | god | 0.00401139 | americans | 0.003694028 |
| 10. | war | 0.003986557 | refugee | 0.00352502 |
| 11. | thoughts | 0.003882258 | japan | 0.002887619 |
| 12. | need | 0.003878946 | texas | 0.002781386 |
| 13. | last | 0.003797825 | born | 0.002689639 |
| 14. | lives | 0.003734914 | dear | 0.002689639 |
| 15. | said | 0.003468371 | syrians | 0.002607549 |
| 16. | place | 0.003468371 | similar | 0.002573748 |
| 17. | country | 0.003319372 | deadly | 0.002568919 |
| 18. | city | 0.003291227 | services | 0.002554433 |
| 19. | everyone | 0.003281294 | accept | 0.002554433 |
| 20. | live | 0.003274672 | necessary | 0.002549604 |

Table 1. Mutually exclusive set of 20 most frequently occurring relevant keywords in post and image text, with their normalized frequency. We identified some potentially sensitive topics among image text, which were not present in post text. Word frequencies are normalized independently by the total sum of frequencies of the top 500 words in each class.

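
The normalization described in the Table 1 caption can be illustrated with a short sketch. The function names are hypothetical, and two things are assumptions on our part: that "mutually exclusive" means a word appears in one class's top-500 list but not in the other's, and that tokenization, stop-word removal and the manual judgment of which keywords are "relevant" happen elsewhere. The toy tokens below are placeholders, not data from the study.

```python
from collections import Counter


def normalized_top_words(tokens, top_k=500):
    """Return {word: count / sum of the counts of the top_k words}."""
    top = Counter(tokens).most_common(top_k)
    total = sum(count for _, count in top)
    return {word: count / total for word, count in top}


def exclusive_top_words(own, other, n=20):
    """Top-n words of `own` (by normalized frequency) that are absent from `other`."""
    exclusive = [(word, freq) for word, freq in own.items() if word not in other]
    exclusive.sort(key=lambda pair: pair[1], reverse=True)
    return exclusive[:n]


# Toy token lists standing in for post text and OCR-extracted image text.
post_tokens = "prayers support prayers paris paris paris".split()
image_tokens = "refugee passport paris paris refugee refugee".split()

post_freq = normalized_top_words(post_tokens)
image_freq = normalized_top_words(image_tokens)

print(exclusive_top_words(post_freq, image_freq))
print(exclusive_top_words(image_freq, post_freq))
```

Because each class is normalized by its own total, the frequencies in the two columns are comparable as shares within a class, not as absolute counts across classes.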
We also uncovered a popular conspiracy theory surrounding the Syrian “passports” that were found by French police near the bodies of terrorists who carried out the attacks, and were allegedly used to establish the identity of the attackers as Syrian citizens. Text embedded in images depicting this theme questioned how the passports could have survived the heat of the blasts and fire. This conspiracy theory was then used by miscreants to label the attacks as a false flag operation, influencing citizens to question the policies and motives of their own government. The popularity of such memes on OSN platforms can have undesirable outcomes in the real world, like protests and mass unrest. It is therefore vital for first responders to be able to identify such content and counter / control its flow to avoid repercussions in the real world.

Figure 7. Example of a picture containing text relating to a conspiracy theory questioning how the Syrian passports survived the blasts. We found hundreds of images talking about this topic in our dataset.

Images posted on OSNs are a critical source of information that can be useful for law and order organizations to understand popular topics and public sentiment, especially during crisis events. Through our approach, we propose a semi-automated methodology for mining knowledge from visual content and identifying popular themes and citizens’ pulse during crisis events. Although this methodology has its limitations, it can be very effective for producing high level summaries and reducing the search space for organizations with respect to content that may need attention. We also described how our methodology can be used for automatically identifying (potentially sensitive) misinformation spread through images during crisis events, which may lead to major implications in the real world.

Here is a link to the complete technical report on this work. Big credits to Varun Bharadhwaj, Aditi Mithal, and Anshuman Suri for all their efforts. Below is an infographic of the work.

References:

[1] https://www.eyeqinsights.com/power-visual-content-images-vs-text/

[2] https://github.com/tesseract-ocr/

[3] https://www.tensorflow.org/versions/r0.11/tutorials/image_recognition/index.html

[4] http://cs231n.github.io/convolutional-networks/

[5] Gupta, Aditi, Hemank Lamba, and Ponnurangam Kumaraguru. “$1.00 per rt# bostonmarathon# prayforboston: Analyzing fake content on twitter.” In eCrime Researchers Summit (eCRS), 2013, pp. 1-12. IEEE, 2013.

[6] Vieweg, Sarah, Amanda L. Hughes, Kate Starbird, and Leysia Palen. “Microblogging during two natural hazards events: what twitter may contribute to situational awareness.” In Proceedings of the SIGCHI conference on human factors in computing systems, pp. 1079-1088. ACM, 2010.