Most electronic behavior traces available to social scientists offer a site-centric view of behavior. We argue that to understand patterns of interpersonal communication and media consumption, a more person-centric view is needed. The ideal research platform would capture reading as well as writing and friending, behavior across multiple sites, and demographic and psychographic variables. It wo…
Theorists have long predicted that like-minded individuals will tend to use social media to self-segregate into enclaves and that this tendency toward homophily will increase over time. Many studies have found moment-in-time evidence of network homophily, but very few have been able to directly measure longitudinal changes in the diversity of social media users’ habits. This is due in part to a…
This article explores the relative influence of individual and network-level effects on the emergence of online social relationships. Using network modeling and data drawn from logs of social behavior inside the virtual world Second Life, we combine individual- and network-level theories into an integrated model of online social relationship formation. Results reveal that time spent online and …
Twitter provides a direct method for political actors to connect with citizens, and for those citizens to organize into online clusters through their use of hashtags (i.e., a word or phrase marked with # to identify an idea or topic and facilitate a search for it). We examine the political alignments and networking of Twitter users, analyzing 9 million tweets produced by more than 23,000 random…
People create, consume, and share content online in increasingly complex ways, often including multiple news, entertainment, and social media platforms. This article explores methods for tracing political media content across overlapping communication infrastructures. Using the 2011 Occupy Movement protests and 2013 consumer boycotts as cases, we illustrate methods for creating integrated datas…
Content analysis of political communication usually covers large amounts of material, which makes the study of dynamics in issue salience a costly enterprise. In this article, we present a supervised machine learning approach for the automatic coding of policy issues, which we apply to news articles and parliamentary questions. Comparing computer-based annotations with human annotations shows that…
This article examines the prevalence and nature of negativity in news content. Using dictionary-based sentiment analysis, we examine roughly fifty-five thousand front-page news stories, comparing four affect lexicons: one for general negativity and three capturing different measures of fear and anger. We show that fear and anger are distinct measures that capture different sentiments…
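As a rough illustration of what dictionary-based sentiment analysis involves, the sketch below scores a text against several word lists and reports the share of tokens that match each one. The lexicons, the tokenizer, and the function names (`LEXICONS`, `tokenize`, `score`) are hypothetical toy stand-ins chosen for this example; they are not the lexicons or procedure used in the study.

```python
import re
from collections import Counter

# Hypothetical toy lexicons for illustration only; real affect
# lexicons contain hundreds or thousands of entries.
LEXICONS = {
    "negative": {"crisis", "attack", "fear", "angry", "threat", "loss"},
    "fear": {"fear", "threat", "panic", "afraid"},
    "anger": {"angry", "outrage", "attack", "furious"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(text, lexicons=LEXICONS):
    """Return, per lexicon, the share of tokens found in that lexicon."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter(tokens)
    return {
        name: (sum(counts[word] for word in words) / total if total else 0.0)
        for name, words in lexicons.items()
    }


if __name__ == "__main__":
    headline = "Angry crowds attack the plan amid fear of loss"
    for name, share in score(headline).items():
        print(f"{name}: {share:.3f}")
```

Because a word such as "fear" can appear in more than one list, the scores overlap; comparing lexicons on the same stories is one simple way to see whether they capture distinct sentiments.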
This study offers a systematic comparison of automated content analysis tools. The ability of different lexicons to correctly identify affective tone (e.g., positive vs. negative) is assessed in different social media environments. Our comparisons examine the reliability and validity of publicly available, off-the-shelf classifiers. We use datasets from a range of online sources that vary in th…
Researchers have long measured people’s thoughts, feelings, and personalities using carefully designed survey questions, which are often given to a relatively small number of volunteers. The proliferation of social media, such as Twitter and Facebook, offers alternative measurement approaches: automatic content coding at unprecedented scales and the statistical power to do open-vocabulary explo…
This article discusses methodological challenges of using big data that rely on specific sites and services as their sampling frames, focusing on social network sites in particular. It draws on survey data to show that people do not select into the use of such sites randomly. Instead, use is biased in certain ways, yielding samples that limit the generalizability of findings. Results show that a…
Analytic techniques developed for big data have much broader applications in the social sciences, outperforming standard regression models even—or rather especially—in smaller datasets. This article offers an overview of machine learning methods well-suited to social science problems, including decision trees, dimension reduction methods, nearest neighbor algorithms, support vector models, and …
Over the past few years, we have seen the emergence of “big data”: disruptive technologies that have transformed commerce, science, and many aspects of society. Despite the tremendous enthusiasm for big data, there is no shortage of detractors. This article argues that many criticisms stem from a fundamental confusion over goals: whether the desired outcome of big data use is “better science” o…
One of the challenges associated with high-volume, diverse datasets is whether synthesis of open data streams can translate into actionable knowledge. Recognizing that challenge and other issues related to these types of data, the National Institutes of Health developed the Big Data to Knowledge or BD2K initiative. The concept of translating “big data to knowledge” is important to the social an…
Despite the apparent partisan divide over issues such as global warming and hydraulic fracturing, little is known about what shapes citizens’ willingness to accept scientific recommendations on political issues. We examine the extent to which Democrats, Republicans, and independents are likely to defer to scientific expertise in matters of policy. Our study draws on an October 2013 U.S. nationa…
Like lay people, experts vary in their technology optimism or pessimism about scientific endeavors, for reasons that are poorly understood. We explore experts’ technology optimism through a focus on genomics; its novelty, life-and-death implications, complex technology, and broad but as yet unknown societal implications make it an excellent subject for studying views about new knowledge. We use…
The cultural cognition thesis posits that individuals rely extensively on cultural meanings in forming perceptions of risk. The logic of the cultural cognition thesis suggests that a two-channel science communication strategy, combining information content (“Channel 1”) with cultural meanings (“Channel 2”), could promote open-minded assessment of information across diverse communities. We test …
Numerous factors shape citizens’ beliefs about global warming, but there is very little research that compares the views of the public with those of key actors in the policymaking process. We analyze data from simultaneous and parallel surveys of (1) the U.S. public, (2) scientists who actively publish research on energy technologies in the United States, and (3) congressional policy advisors and find t…
Evolution deniers do not need to establish their own scientific position but merely cast doubt on some aspect of evolution or obtain a small amount of legitimacy for creationism or intelligent design to sow sufficient doubt in the mainstream. This doubt is one of three pillars, along with demands for equal time and the incompatibility of science and religion, that Eugenie Scott has argued defin…
We use an experiment with a nationally representative sample of the U.S. population to examine how political partisans consume and process media reports about nanotechnology—a scientific issue that is unfamiliar to most Americans. We manipulate the extent to which participants receive ideological cues contextualizing a news article, and follow their subsequent information seeking about nanotech…
Health issues are increasingly becoming politicized, but little is known about how politicization takes shape in the news and its effect on the public. We analyze the evolution of politicization in news coverage of two health controversies: the uproar over the 2009 mammography screening guidelines and the 2006–2007 debate over mandating the HPV vaccine as a requirement for middle school–aged gi…