I had an inclination towards research ever since I joined IIIT-D. I admired my Seniors who got admits from the Best CS Universities of the world, which felt like the next milestone to achieve. Looking at the profiles of several of them, I realised that there was an eminent underlying force behind, that was PK! I was awestruck by the posts that PK used to do about the achievements of his students on Social Media.
I had done research in the Security domain during my first 2 years and Security in Social Media looked like a fascinating field to explore. I joined the DHCS course in the beginning of 2016, that’s when I got to interact with PK. His way of teaching is unique, and he instills an aura of enthusiasm and interest as he delivers lectures. From him I got a new perspective to look and analyse the everyday interfaces.
I decided to apply for an Internship with Precog for the Summer of 2016. The interview was grilling and I kept my fingers crossed till the result was announced. After receiving acceptance from Precog I also received an Internship Offer from IBM Research Labs. I was faced with a dilemma, both being Great Research opportunities. I talked to PK and he offered me a part time internship with Precog.
I was excited to start working with the group. Anupama introduced me to the problem statement we were going to be working on, “Privacy Leaks through Browser Extensions”. I was thrilled to know that we were going to have a collaboration with Bimal Viswanath from Bell Labs. I had read his research papers before and was intrigued by his findings. Working with Precog feels like, you have been blessed with a network of the Top researchers across the globe with whom you can discuss, learn and work.
I moved to Bangalore to join IBM Research and worked remotely. I used to stay in office till late night reading Papers, and also Precoged during the weekends. PK used to be in constant touch as my guide and a mentor. Precog felt like a family and working on the Research Project was fun. Whenever I was stuck anywhere I was offered full support, I could message anytime and Anupama was lightning fast with her replies. I learnt the Skills of collaboration, time management and most importantly smart-working rather than hard-working.
Coming back from Bangalore I was in full swing to join the group with vigour. PK always has suggestions to improve efficiency and quality, his visions and ideas can be clearly seen imbibed in the group. Attending regular meetings, ‘whatsup’ and BM sessions, I have learned far too many things to be enumerated. Earlier I was intimidated by the rapid progress of the group, now I take pride with every new accomplishment of every member. Since every other day there is a new achievement being discussed in the group’s mailing list. Association with Precog has been a Roller-Coaster ride, where we have worked as team and partied as friends. I have been really lucky to be associated with the Coolest Research Group of IIIT-Delhi.
Below is the picture taken on Jan 4, 2016, Precog’s birthday!
India is going digital in a big way; from banking to manufacturing to agriculture, each field is seeing the penetration of technology. Police organizations also have started using technology for effective policing. Most police organisations now have an official website, a Facebook page and a Twitter handle. Police not only use these new media services to showcase their organisation but also to interact with citizens very regularly. Police posts on Facebook or tweets on Twitter include a variety of topics ranging from traffic advisories, to awareness creation to bragging about their achievements. Similarly, the growing technology savvy population of India is using these mediums to share their grievances, concerns, etc. with the police. With a handful of police officers serving 1.25 billion people, it is no surprise that a lot of posts/tweets by the citizens go unnoticed by the police. Even features like tagging police commissioners and police accounts do not always yield the expected response, causing a sense of resentment. The police too find themselves helpless given the multitude of things.
With our continued interest in empowering police organizations with technology which can help them in their day-to-day activities, we have been working in the space of online social media and policing for some time now. For our research publications in this space, please visit here. For effective communication between the citizens and police, it is necessary for the police to understand the vast amount of content generated on their social media accounts. In this direction, we started thinking about how to break up the content into important versus unimportant, urgent versus non-urgent, etc. Our main aim in this research was to help police identify ‘serviceable’ content which can be served quickly and efficiently. Requests to which police should respond, evaluate or take action are considered as serviceable requests.
We analyzed 85 official Facebook pages of police organizations in India and studied the nature of posts that citizens share on police Facebook pages. Not all posts require the same amount of attention from the police, there are some cases where immediate action needs to be taken while some can wait. Based on this analysis, we came up with six textual attributes that can identify serviceable posts; posts that need some kind of police response. We find such posts are marked by high negative emotions, more factual, and objective content such as location and time of incidences.
We identify four types of response that citizens may get on their posts:
(a) Forward: Posts which had enough information and could be forwarded to appropriate authorities for action. For instance, a resident posted, Date : 4/11/2015 (Wednesday), Time : 10:17 pm, Number : [withheld], Location : [withheld], Violations : Crossing line by way too much obstructing the vehicles which were coming from [withheld] entrance later he jumped the signal ……..
(b) Give Solution: Posts mostly included queries by residents to police that could be answered without any detail; resident asks, Admin !! Can U Explain to Me How Two Challans On Same Date Same Time in Just 5 Minutes Gap !! How Its Possible ?? Any Thing Wrong ??
(c) Acknowledge with thanks: Posts to which the police wrote “thanks for sharing the information” or “thanks for the appreciation.” For instance, resident remarks, Chennai City Traffic Police a humble salute from a fellow Chennaiite for the commendable job in such rains!!
(d) Need more details: In these resident’s posts, police inquired more details so that action could be taken, e.g., a resident asks, Cops driving wrong side [of road] near XXX hotel .. what action will be taken against them ? This post lacks information such as time and date when the incident happened.
To enhance response to serviceable posts, we propose a request – response identification framework. The approach followed in the paper is shown below:
Understanding Requests from Citizens:
Residents often use different language styles in posts while expressing their concerns and asking queries to police. Our approach includes following six category of features to characterize serviceable posts:Emotional Attributes,Cognitive and Interpersonal Attributes, Linguistic Attributes, Question Asking Attributes, Entity-Based Attributes, and Topical Attributes. These include the both handcrafted features and LDA / NMF based features that help automatically discover the latent dimensions and induce semantic features in our data.
Our analysis shows some intriguing results:
Serviceable requests show significantly higher value of negative emotional states i.e. “anger” (+15.38%), “disgust” (+47.8%), “fear” (+60%), and “sadness” (+10%) in comparison to non-serviceable requests. Most frequent topic is includes queries / question posed to police (Complaints represents complaints against cops in- correct decisions).
Comparing serviceable sub-types, we observe that 93.10% posts in Thanks sub-type did not receive a response from police. Posts in Forward sub-type received the maximum number of responses from police (63.6%, 182 posts). Table 1 below summarizes the number of posts that did not receive police responses.
Table 1: Number of posts that received responses (N of Events) and censored event showing posts that did not get response from the police.
Automated Classifier for Serviceability:
Our work explores a series of statistical models to predict serviceable posts and its different types. The model makes use of the content based measures – emotions, cognitive attributes, linguistic, question posed, entity and topical attributes. We explore five different classification algorithms – Random Forest (RF), Logistic Regression (LR), Decision Trees (DT), Adaptive Boosted Decision Trees (ADT), and Gradient Boosting Classifier (GBC) using balanced class weights. Table 2 below reports the performance of different algorithms to correctly identify serviceable posts.
Table 2: Mean Performance after 10-fold CV of different algorithms to correctly identify serviceable posts.
Through our work, we believe technological interventions can help increase the interactions between police and citizens and thereby increase the trust people have on police. The police too may have a more directed and cost-labour efficient mechanism in dealing with any law and order situation reported on their Facebook page. This will increase the overall well-being and safety of society.
Full citation & link to the paper: Sachdeva, N., and Kumaraguru, P. Call for Service: Characterizing and Modeling Police Response to Serviceable Requests on Facebook. Accepted at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), 2017. PDF
College students’ mental health concerns are a persistent issue; psychological distress in the form of depression, anxiety and other mental health challenges among college students is a growing health concern. However, very few university students actually seek help related to mental illness. This arises due to various barriers like limited knowledge about available psychiatric services and social stigma. Further, there is dearth of accurate, continuous and multi-campus data on mental well-being which presents significant challenges to intervention and mitigation strategies in college campuses.
Recent advances in HCI and social computing show that content shared on social media can enable accurate inference, tracking and understanding of the mental health concerns of users. There has also been work showing that college students appropriate social media for self-disclosure, support seeking and social connectedness. These facts, coupled with the pervasiveness of social media among college students, motivated us to examine the potential of social media as a “measure” for quantifying the mental well-being in a college population. Specifically, we focused on the following research goals:
Building and validating a machine learning model to identify mental health expressions of students in online communities
Analysing the lingusitic and temporal characteristics of the identified mental health content
Developing an index for the collective mental well-being in a campus, and examining it’s relationship with university attributes like academic prestige, enrollment size and student body demographics
We obtained a list of 150 ranked major universities in the US by crawling the US News website. We also obtained university metadata like gender distribution, tuition/fee during this crawl. Next, we crawled the Wikipedia pages for these 150 universities for extracting the student enrollment, type of university (public/private) and the setting (city/urban/suburban/rural) at each institute. Lastly, we obtained information on the racial diversity at each university from an article on Priceonomics. We study these universities in our work and use the metadata in our analysis.
For social media data, we focus on Reddit. Reddit is known to be a widely used online forum and social media sites among the college student demographic. It’s forum structure allows creation of public online communities (known as “subreddits”), including many dedicated to specific college campuses. This allowed us to collect a large sample of posts shared by university students in one place. Although Facebook is likely more popular/widespread among students, it is challenging to use Facebook in such studies since the content shared is largely private, making it challenging to obtain such large data from it. Further, the semi-anonymous nature of Reddit enables candid self-disclosure around stigmatized topics like mental health.
After a manual search for subreddits for each university, we were able to identify public subreddit pages for 146 of the 150 universities. Next, we focused on correcting the “under-adoption” bias in subreddits. Subreddits which had a small fraction of Reddit users (as compared to university enrollment) were filtered out due to being under-representated. This left us with 109 universities with adequate Reddit representation. We leveraged the data on Google BigQuery (combined with some additional data collection) to get all posts ranging from June 2011 to February 2016. The final dataset used for our analysis included 446,897 posts from 152,834 unique users.
Since Reddit data does not contain any gold standard information on whether a post in a university subreddit is a mental health expression, our first goal was to use an inductive transfer learning approach to build a model to identify such content in a university subreddit. First, we include (as ground truth data) Reddit posts made on various mental health support communities. Prior work has established that, in these communities, individuals self-disclose a variety of mental health challenges explicitly. We use these posts as the “positive” posts and, parallelly, we utilize another set of Reddit posts, made on generic subreddits unrelated to mental health, as “negative” posts. We obtain 21,734 posts for each category, which we use as the positive and negative class for building a classifier. We observed a validation accuracy of 93% and an accuracy of 97% on a test set of 500 unseen, expert-annotated posts from our university subreddit data. We then proceeded to use this classifier for labelling the 446,397 other posts across the 109 university subreddits. Our classifier identified 13,914 posts (3.1%) to be mental health expressions, whereas the rest of the 432,483 posts were marked not about the topic. This corresponded to 9010 unique users out of a total of 152,834.
Next, we looked at the linguistic characteristics of the posts identified to be mental health expressions by conducting a qualitative examination of the top n-grams uniquely occuring in these posts. We found that students appropriate the Reddit communities to converse on a number of college, academic, relationship, and personal life challenges that relate to their mental well-being (“go into debt”, “doing poorly in”, “only one homework”, “up late”, “the jobs i”). The n-grams also indicated that certain posts contained explicit mentions of mental health challenges (“psychiatric”, “depression”, “killing myself”, “suicidal thoughts”), as well as the difficulties students face in their lives due to these experiences (“life isnt”, “issues with depression”, “was doing great”, “ruin”, “cheated”). Some of the top n-grams were also used in the context of seeking support (“need help”, “i really need”, “could help me”).
For the temporal analysis of mental health content, we first study the proportion of posts with mental health expression across the years. The figure below shows the content per year (along with a least squares line fit). We observed that the proportion of posts with mental health expressions has been on the rise — there is a 16% increase in 2015, compared to that in 2011.
We then looked at how this trend varies over the course of an academic year. The plots below show the trend separately for universities following the semester system and the quarter system. Between August and April, for the universities in the semester system, we observed an 18.5% increase in mental health expression; this percentage was much higher: 78% for those in the quarter system, when compared between September and May. On the other hand, we observed a reverse trend in mental health content during summer months, for both semester and quarter system universities.
Lastly, as a part of our third research goal, we formulated an index we refer to as the Mental Well-Being Index (MWI), as a measure of the collective mental well-being in a university subreddit, based on the posts labelled as mental health related by the classifier. We then computed the MWI metric for all 109 subreddits and examined it’s relationship with the university attributes.
By visualising these relationships (as above), we gleaned several interesting observations. We found:
Universities with larger student bodies (enrollment) as well as greater proportion of undergraduates in their student bodies tend to be associated with lower MWI
MWI of the 66 public universities we consider, is lower, relative to that in the 43 private universities, by 332%
MWI is lower in the 7 rural and 33 suburban universities by 40-266% compared to others, while it is the highest in the 31 universities categorized to be in cities (by 29-77%)
Universities with higher academic prestige (or low absolute value rank) and higher tuition tend to be associated with higher MWI
MWI tends to be lower in universities with more females (or sex ratio, male to female <= 1) by 850%
Further, although our data shows a marginally lower MWI in universities with greater racial diversity, we did not find statistical significance to support this claim.
Our work here (the complete paper accepted at CHI 2017) further details our analysis in depth. Below is an infographic for our work.
Holà! It’s the first day of 2017. All of us just got done with looking back at the past year, trying to fathom how time flies and life metamorphosizes. My life has taken a leap too and this is my last blog as a part of the ‘I have been Precog-ed’ series. Earlier, I have written about my first stint at research (Part 1), a wonderful summer at the Information Sciences Institute at Marina Delray, Los Angeles (Part 2), my first paper presentation at ICWSM 2016 in Germany (Part 3), and my time at Precog. This post is about the last 6 months of my journey and an attempt to express what being a Precog-er is all about (for more on this, please read the first three parts too). Being a Precog-er for more than 3 years, I have more thoughts than I can ever pen down; from being an undergrad who joined Precog as a noob to a grad student at Carnegie Mellon University, my path has always been illuminated by the light of learning and hope.
April 2016 – I was struggling with end-sem preparations, document processing and Visa applications for my trip to ICWSM and my masters in the States, and the humdrum undergrad life when an unexpected email got an unexpected reaction from me –
We are pleased to inform you that you have been selected as an one of the 40 CERN Openlab Summer Students2016 (out of 1461 applicants)! For nine weeks, CERN will be your host for what we hope is going to be an interesting, fun and active summer…”
I have been an amateur astronomer for 9 years, and getting to work at the ‘Mecca of Particle Physics’ would have been a dream come true. I knew I wouldn’t be able to make it. I was applying for my Schengen Visa for Germany (which would take another 2 weeks), and then I had to start my application for the US visa. I needed another Schengen Visa for Switzerland in a span of one week. On top of that, the only dates I could select for the internship were overlapping with my initial orientation schedule at CMU. I almost disrupted a meeting in PK’s office to break the news to him. I was sad. Pillars (Ph.D. students at Precog) and PK were convinced that I should try and if it doesn’t work out, so be it. That’s a Precog trait – not giving up until you have given your best shot! After cutting short the duration of my summer at CERN, pushing CMU to allow me to skip the orientations (convincing them that I’ll manage when I wasn’t sure myself I’ll), and getting my Schegen for Switzerland in a day (thanks to CERN’s administrative staff who made a special request for me to the embassy), I was ready for a summer at CERN.
I worked for 2 months at CERN’s data center on a storage system of ~125PB (one of the largest in the world). CERN openlab program includes a lecture series to helps CS students understand the Physics needed for some of the projects, trips to ETH Zürich and EPFL Lausanne, hackathons, and several means to help the students gain insights about the revolutionary projects spanning across 100 hectares in Switzerland and more than 450 hectares in France! It was a humbling experience, which entailed learning something new every day. Europeans have nailed the work-life balance too. Along with finishing my project on time, I managed to check Geneva, Lausanne, Lyon, Zürich, Paris, Montreux, Bern, Engelberg, Chamonix and many more off my list!
Delhi for 2 days, and Pittsburgh was my next destination, my home for the next 16 months. I am an MSCS student at CMU now. Last to arrive and one of the youngest of the lot, thanks to PK I had ample of background knowledge about life as a student here and the city of Pittsburgh. The experience I have gained at Precog comes in handy when I have to identify research gaps and solve hard problems. I feel more equipped and confident to take up the challenges that come along with grad life at a school like CMU.
Throughout these 6 months (Jul – Dec 2016), I have been working with a few Precog-ers on what we now call the Killfie project. It has turned out to be one of the most exciting projects I have worked on as a part of the group. It is the inclination to work on interesting problems with some brilliant people, which gives me the motivation to find time for this amongst courses and projects at CMU.
I cannot finish this blog without revisiting these lines from my first blog – “…PK, the heart and brain of Precog. He is the coolest adviser I have ever met and his skills and dexterity at work are almost mind-boggling. I came to know him as my Probability and Statistics professor, the role changed to being my adviser working at Precog and now I see him as a mentor for life..”. A lot of what I have been able to achieve in the last 3 years, I owe it to PK’s unconditional support. Thank you PK for illuminating my path always and for proving what good mentorship can accomplish! My time at Precog has taught me how to help people, make friends, eliminate distractions and focus, improve daily, think big, fail often and give nothing short of your very best effort! I have had last minute unscheduled video calls in the middle of the night from the other end of the world with Precog-ers when I needed help. Pillars, interns, RAs – thank you each one of you for this experience. Even though I live in a different time-zone now and my attendance at the 4th floor Ph.D. lab has been at an all-time low, I know my association with the group will last forever. As has been rightly put – ‘Once a Precog-er, always a Precog-er!’.