From WEIRD to Wide: my keynote at CSCW 2024 in Costa Rica
This past November, I had the privilege of offering the closing keynote at CSCW 2024 – the ACM Conference on Computer Supported Cooperative Work and Social Computing – in San Jose, Costa Rica. I’m a huge admirer of CSCW and the work presented there, though it was my first time in attendance. (I was hoping to go the previous year, but my paper was rejected… which is what’s great about blind peer review.)
This was the first time CSCW has been held in a Latin American country, and the conference organizers are two brilliant Latina scholars, Rosta Farzan and Claudia López. The opening keynote was given by Paola Ricaurte, an amazing Mexican scholar who is rethinking how AI systems could be built to empower communities, rather than extracting data from them. I wanted to make sure that my remarks celebrated the presence of the conference in the Global Majority world and built on Paola’s work, which has deeply influenced my thinking.
What follows is unlikely to be an exact rendering of what I said on stage – I improvise from notes when delivering talks – but is a pretty good summary of what I hoped to say.
From WEIRD to Wide: Global Views of the Quotidian Web
November 13, 2024
Three behavioral scientists at the University of British Columbia – Joseph Henrich, Steven Heine and Ara Norenzayan – published a remarkable paper in 2010 titled “The Weirdest People in the World?” The paper observes that a huge percentage of research in psychology and behavioral economics comes from studying a very specific population of people: undergraduates at North American universities. An analysis of top journals in six subfields of psychology revealed that 96% of findings came from countries with less than 12% of the world’s population… and from a specific and narrow subset of individuals within those countries.
The authors characterize the subjects of these research studies as WEIRD: Western, Educated, from Industrialized and Rich Democracies. The bulk of their argument looks at research studies on apparently “universal” phenomena, like the Müller-Lyer illusion. The illusion was designed by sociologist Franz Carl Müller-Lyer in 1889, and many people have difficulty seeing these two lines as equal in length.
But the tendency to see these lines as different in length is strongly culturally linked. Foragers and other people who spend a lot of time outside, like the San people of southwestern Africa, generally see the lines as the same size, which they are. Of populations studied in a comparative analysis, US university students ended up at the other end of the spectrum, estimating a 20% difference between line segments. The speculation is that the illusion relies on how perspective is represented in western art, and how we learn to see corners in enclosed spaces. People who spend a lot of time in enclosed, man-made spaces with corners see this illusion while people who spend more time in natural environments are less likely to see it.
Henrich and colleagues examine dozens of comparative studies between people in the Global North and the Global Majority. They look at differences between North America and industrialized, rich populations in Asia, and between university educated people and less educated people in the United States, and come to the conclusion that, on many – not all – comparative measures, American university undergraduates are at the extremes of the distribution on psychological and behavioral economics tests.
This has significant implications for research. The studies being conducted are methodologically sound, but it’s not clear that you can extrapolate any general truths about human beings from them. “It is not merely that researchers frequently make generalizations from a narrow subpopulation. The concern is that this particular subpopulation is highly unrepresentative of the species”. In other words, it’s a really bad idea to run experiments on American undergraduates and conclude that the results are universalizable – not only is that not true, but American undergraduates are likely to be one of the least representative samples you could possibly study.
(By the way, one of the few groups even WEIRDer than American undergrads is American professors. Americans – and professors in particular – are extremely confident that we’re right. One study Henrich and his colleagues cite finds that 94% of American university professors believe that we are above average. So perhaps take everything else I have to say today with a grain of salt.)
I’m interested in this paper because I am worried that scholars may be making some similar mistakes in studying social media. The scholars who experiment on American undergraduates aren’t particularly fascinated by this population: they are a sample of convenience, the easiest population to study.
We often use samples of convenience in studying social media as well. Frequently we design research based around what data is easily available. This led to lots of studies of Twitter based on access to the Twitter API – as a field, we probably paid a disproportionate amount of attention to a platform that was less important in terms of social influence than Facebook or Instagram based on the ease of studying it. We’ve recently seen a wave of work on Reddit based on access to the excellent Pushshift corpus and the ability to do full-text searches against years of data across a whole platform. A great deal of work on mis/disinformation focused on URLs with wide reach on Meta platforms as studied with Crowdtangle, which gives insight into widely shared URLs in public groups, but omits information about more obscure links shared.
You’ll note, of course, that all three tools I just mentioned – the Twitter API, Pushshift and Crowdtangle – are all inaccessible to most researchers at the moment. This is an unfortunate trend as regards “permissioned” ways of studying social media, and given Elon Musk’s influence in Washington, it seems likely that we will see increased hostility to research on social media platforms conducted in partnership with platforms. My lab has a history of “unpermissioned research”, creating data sets without the cooperation of the platforms that are being studied. This includes Media Cloud, a large data set of news stories from around the world, and a pair of data sets my team has been developing at UMass, TubeStats and TokStats.
My colleagues Ryan McGrady, Kevin Zheng and I started a project two years ago to generate a random sample of YouTube videos. On the one hand, this is very easy to do – YouTube videos have a predictable URL sequence, and all you need to do is generate random IDs and see which ones exist on YouTube’s server. The hard part is that you have to do this billions of times to find even a handful of videos. We were able to generate a set of random videos and found some shortcuts in the process, which means we should be able to generate a set of random YouTube videos going forward and share them with other researchers.
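A minimal sketch of the sampling idea described above, not the lab's actual pipeline. It assumes IDs are fixed-length strings over a 64-symbol URL-safe alphabet (the length of 11 is an assumption), and it stands in for the real existence check with a toy lookup, since the post does not say how that check is made. The names `random_id`, `estimate_population`, `exists` and `draw` are all hypothetical. The same hit rate that makes the search slow also gives a rough estimate of how many items exist in total.

```python
import random
import string

# Assumption: IDs are fixed-length strings over a 64-symbol URL-safe alphabet.
ALPHABET = string.ascii_letters + string.digits + "-_"


def random_id(rng, length=11):
    """Draw one candidate ID uniformly at random (the length is an assumption)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def estimate_population(exists, draw, id_space_size, trials):
    """Estimate how many IDs exist from the hit rate of uniform random probes."""
    hits = sum(1 for _ in range(trials) if exists(draw()))
    return hits / trials * id_space_size, hits


# Toy demonstration on a made-up ID space of one million integers,
# 10,000 of which "exist". A real probe would query the platform instead.
rng = random.Random(0)
SPACE = 1_000_000
existing = set(rng.sample(range(SPACE), 10_000))
estimate, hits = estimate_population(
    exists=lambda i: i in existing,
    draw=lambda: rng.randrange(SPACE),
    id_space_size=SPACE,
    trials=50_000,
)
print(f"{hits} hits in 50,000 probes, estimated population about {estimate:,.0f}")
```

In the toy space one probe in a hundred is a hit. On a real platform the ID space is vastly larger than the number of videos, which is why the search has to run billions of times to turn up a handful of results.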
More recently, we’ve figured out how to do the same thing with TikTok, and we’re about to publish our first paper on that data set. In studying random videos from YouTube and TikTok, we’ve become acutely aware of two biases that we see as shaping what we know about online video in particular, and perhaps social media as a whole.
The first bias is a bias towards popularity. If you look at videos in terms of their influence on politics or on popular culture, it makes good sense to look at videos with large audiences. But it’s worth remembering how unrepresentative those videos are of the content on YouTube as a whole. Only 15% of all YouTube videos have a thousand or more views. If you’re sampling YouTube by collecting videos recommended by YouTube’s algorithm – a popular technique for developing samples of YouTube videos – you are mostly collecting videos within that top 15% and many from the top 1-5%.
If your main concern is studying the influence of videos on broad audiences, it’s perfectly reasonable to study the most popular videos. But if you’re studying the broader dynamics of a platform, particularly the experience people have creating videos on these platforms – producer, rather than consumer, dynamics – it’s essential to look at this broader sample as well, and to recognize that the samples are very, very different. The videos that receive the most attention tend to follow the economic model of “like and subscribe” – they are videos made by influencers and brands, and they are hoping to achieve as large an audience as possible. But that’s far from the only way people use YouTube. Kevin Zheng likes to explain YouTube by showing these four videos:
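To make the popularity bias concrete, here is a small sketch on synthetic data. The view counts are invented from a heavy-tailed distribution and are not the lab's sample. Drawing videos with probability proportional to their views is only a crude stand-in (an assumption) for how a recommendation-driven crawl favors popular content. The helper `share_at_least` is hypothetical.

```python
import random
import statistics


def share_at_least(views, threshold=1000):
    """Fraction of videos with at least `threshold` views."""
    return sum(v >= threshold for v in views) / len(views)


rng = random.Random(42)

# Synthetic, heavy-tailed view counts (assumption: log-normal shape).
population = [int(rng.lognormvariate(3.7, 2.5)) for _ in range(100_000)]

# Uniform random sample: every video is equally likely to be picked.
uniform = rng.sample(population, 2_000)

# View-weighted sample: popular videos are far more likely to be picked.
weighted = rng.choices(population, weights=[v + 1 for v in population], k=2_000)

for name, sample in (("uniform", uniform), ("view-weighted", weighted)):
    print(
        f"{name}: median views {statistics.median(sample):,.0f}, "
        f"share with 1,000+ views {share_at_least(sample):.0%}"
    )
```

The two samples come from the same population, yet they describe very different platforms. That is the gap between a consumer view and a producer view of the same site.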
Consider this video by John Green, a very successful YouTube influencer. It’s got 16 million views and is easily within the top 0.01% of YouTube videos – indeed, it’s helped define a category of explanatory videos and spawned countless imitators.
Let’s contrast with an example of a video from a failed influencer – here’s my TED Talk. Like John’s video, it’s been up for a very long time. And in the general scope of YouTube it’s pretty successful – it’s in the top 1% of videos. But it’s been seen by less than a million people and it’s an example of someone who was shooting for a wide audience and missed. It’s easy to assume that most YouTube videos hoped to be John Green and ended up falling short, but that’s not an accurate picture of what we’ve found online.
Contrast that, in turn, with a video that’s reached a much smaller audience, but arguably has done a better job of fulfilling its function. This is a school board meeting from Amherst MA, where our lab is based. It’s got about 175 views, putting it in the top third of all YouTube videos. But unlike my TED talk, it’s not a failure.
Indeed, let’s imagine for a moment that this school board meeting got 1 million YouTube views. That would indicate that something went really badly wrong during the meeting in question. The people who put this up weren’t looking for a huge audience – they were trying to bring transparency to a public meeting, to archive a civic event for future use. Those are entirely legitimate – and common – uses for YouTube, even if they are entirely different from influencer logic.
According to our sample, the median YouTube video has 42 views, which implies that half have even fewer views. To give audiences a sense for what these videos feel like, I showed brief clips from a dozen videos, including:
An overexposed image of a window, with a man’s voice intoning “The falling snow”
A highly produced video of a religious leader answering questions about managing one’s emotions, in an Indian language with English titles
A young girl dancing to a Mexican pop tune
Gameplay from a first-person shooter game
A brief clip from a South Asian religious ceremony
A snippet of a cartoon in Russian, possibly from the Soviet era
A mechanic explaining damage to a camshaft
Footage from Minecraft entitled “How diamonds are mined in different countries”
A woman singing a hymn in a church
A clip from a highly produced documentary with a synthesized voiceover
(I created this collage by collecting a dozen videos at random from our set retrieved from sampling YouTube. I took brief clips from each to try to show their content and ended up discarding one video that was sexually suggestive. I am not linking the videos for reasons discussed a few paragraphs below, i.e., concern that while these videos are publicly available, they may or may not be public documents.)
We’ve watched thousands of YouTube videos in our lab, and we’re starting to see some common patterns. There’s enormous amounts of videogame livestreaming, but there’s also millions of hours of religious services from all over the globe. There are ads for cars and apartments, how-to videos for simple and complicated things and lots of homework assignments.
To be clear – these videos are helpful for understanding YouTube from a production point of view, and less helpful for understanding a consumer point of view. Because our randomly selected videos often have very few views, they are not a great proxy for what people are seeing on YouTube. But they are an excellent way to see the diversity of uses people have found for posting and sharing online video. And the very breadth of uses people have put online video to creates a unique and valuable cultural archive.
I’ve started referring to this long tail of YouTube videos as “the quotidian internet”. There are thousands of ordinary, everyday ways to use YouTube or TikTok, and I think we neglect these non-influencer uses at our peril. For millions of people who publish content on YouTube, this – not launching a career as an influencer – is what YouTube is for, and the experience these users have is more like that of the teenager posting homework to YouTube than that of John Green.
Taken as a whole, these quotidian videos represent a fascinating sort of archive. Collectively they represent a picture of the world at different moments in time between 2005 and now. We can get a sense of what people were wearing, how we spoke, how our homes and our technology looked at these moments in recent history, much as we find looking through magazine ads from the 1950s or family photos from Hungary in the second half of the 20th century.
Pulling apart this sample of the quotidian internet by date to see how fashions or behavior change over time is only one way to approach this archive. We’re learning even more by pulling it apart by language and nationality. In our YouTube videos, we use OpenAI’s Whisper to guess at what language is being spoken in a given video. TikTok actually gives us the country from which each video was uploaded. That’s allowed us to create language specific corpora and work with researchers who’ve got the language and cultural knowledge to understand what’s going on.
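A hedged sketch of how language-specific corpora might be assembled. The language guess uses the open-source `whisper` Python package's documented language-detection calls. Everything else is hypothetical and not taken from the talk: the helper names, the `(video_id, audio_path)` input shape, the 0.5 confidence cut-off and the "uncertain" bucket.

```python
from collections import defaultdict


def detect_language_whisper(audio_path, model_name="base"):
    """Guess the spoken language of one audio file with the whisper package."""
    import whisper  # pip install openai-whisper; imported lazily on purpose

    model = whisper.load_model(model_name)
    # Language detection looks at a single 30-second window of audio.
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    _, probs = model.detect_language(mel)
    best = max(probs, key=probs.get)
    return best, probs[best]


def build_corpora(videos, detect, min_confidence=0.5):
    """Group video IDs by detected language; low-confidence guesses are set aside.

    `videos` is an iterable of (video_id, audio_path) pairs, and `detect`
    returns (language_code, confidence). The threshold is an assumption.
    """
    corpora = defaultdict(list)
    for video_id, audio_path in videos:
        language, confidence = detect(audio_path)
        key = language if confidence >= min_confidence else "uncertain"
        corpora[key].append(video_id)
    return dict(corpora)
```

In practice you would load the model once rather than per file, and call `build_corpora(videos, detect_language_whisper)`. Videos with music, silence or several languages are exactly where an automated guess is weakest, which is one more reason to hand each corpus to researchers who know the language.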
Jane Pyo, a scholar of South Korean news media, is working with us to understand Korean-language YouTube, and made a surprising observation after watching her first 100 Korean-language YouTube videos: 9 of the 100 were about contemporary politics or news. We were stunned, because in the English-language videos we’d seen, news was extremely uncommon. It turns out that English is the exception here – in many other languages, YouTube is a space to post clips from the news, comment on them and post your own takes on the news.
Jane’s dissertation work looks at South Korean news as a low-trust, but high engagement space. News is perceived to have a strong right bias, and her research documents conflicts between left-leaning critics of the news and their targets in mainstream media. We see those patterns unfolding in Korean news on YouTube – there’s high engagement with news disseminated on the platform, but there’s also a lively culture of political commentary, including this video from a far-right influencer. He’s a middle-aged, far right guy who’s talking about politics as he drives, characterizing the current liberal government as illegitimate, arguing that there hasn’t been a legitimate president since Park Geun-hye was impeached for abuse of power. Given the rightwing lean of Korean media we might expect Korean YouTube to lean to the left, but never underestimate the energy of a middle-aged dude who thinks the world is against him. You’ll note that he has almost two thousand followers, even if this rant has only 38 views, suggesting a community of people using video to talk through political opinion.
Harshita Snehi in our lab has been leading our work to understand Hindi language YouTube videos. Like Jane, she’s finding a lot of news videos, but she’s been seeing a very different pattern. There are thousands of newspapers and small news stations in India, and many are owned by figures like Gautam Adani, a pro-Modi billionaire, who’s been purchasing independent media outlets and influencing their coverage.
Watching clips from these stations in our random sample, Harshi is seeing evidence of a media campaign that features people from rural India talking about ways Modi has made their lives better… and evidence that these videos are the result of careful engineering by Modi’s BJP party, bringing people from rural areas to capital cities and ensuring that they are featured in local media. Harshi’s from the same state as this woman and observes that she’s married and would generally appear in public with her hair covered, as you can see from the scarf she’s wearing. It’s not an accident that a pro-Modi station has found a pro-Modi woman willing to tell the story of how rural women love Modi-ji – these videos are a new form of astroturfing, a way of documenting the public’s love for BJP and ensuring anyone searching for local information on YouTube finds these opinions.
We decided to pay special attention to Hindi because there was a huge surge in Hindi content in 2020 and 2021 – we naively assumed that it had something to do with increased internet access in India. Harshi pointed out that TikTok was banned by the Indian government for national security reasons in 2020, and millions of Indian TikTok users poured from that platform onto local platforms, onto YouTube and presumably onto Instagram as well.
This helped explain something we’d noticed about Hindi-language content on YouTube – videos that had under 40 views were more likely to have “likes” than comparable content in Russian, Spanish and English, and slightly more likely to have comments. Harshi started looking at these videos and discovered that lots of them are what we now call “friends and family” videos. They’re glimpses of daily life, of in-home religious ceremonies, likely shared with friends and family over WhatsApp or other small-group social networks.
TikTok has caught on incredibly fast in India and throughout Asia – in the graph above based on our estimates of TikTok’s growth, you can see TikTok in India outpace the rest of the world. No one catches up until over a year after India has banned the platform! Pakistan and Bangladesh, which share some cultural overlaps with India, remain two of the biggest countries for TikTok. We think TikTok is being used in these countries less as an influencer network and more like a video version of WhatsApp, a way in which friends and family are able to stay in touch with one another even when some folks in the conversation have low levels of literacy.
We’re now doing comparative studies of Hindi, Urdu and Bengali across YouTube and TikTok to try to test this hypothesis, which also helps explain why our early estimates of TikTok’s size – in terms of total videos hosted – show it to be several times larger than YouTube. That wouldn’t make much sense if TikTok were an influencer network like US YouTube, but it makes lots of sense if it’s a video-based social network. (We hope to release our paper on random sampling of TikTok, an estimate of TikTok’s total size and geographic distribution of TikTok producers in Q1 of 2025).
Our data-based work can tell us only part of the story. Testing our hypothesis of TikTok as a video-based social network will require both content analysis and ethnographic work. We will need people with linguistic and cultural experience to help us understand the content of the videos we’ve found and to conduct interviews with users of TikTok in Bangladesh and Pakistan. Our data helps us find hypotheses to test, but testing those hypotheses requires an array of mixed methods. But the ethnographic questions we are now asking come directly from analyzing data of quotidian creators.
Our discoveries with short-form video reveal a second bias we want to be careful to question: an assumption that the same technologies work the same way in different parts of the world. There’s a tendency to assume that because YouTube was started in the US that the ways it’s used in the US are the “right” ways and that these patterns will be seen in other nations.
But as we’ve drilled into how YouTube is used in different countries, it’s clear that generalizations are dangerous. And perhaps it’s especially dangerous for TikTok: if our estimates are right, about 5.9 billion of 81 billion videos on TikTok were uploaded from the US – roughly 7.3%. No European and no other North American countries appear in the top 10 – Mexico is #12. TikTok is a Chinese-built network whose userbase is predominantly from the Global Majority, and the responsible way to study it is going to be to center the perspectives of researchers from Pakistan, Bangladesh and Indonesia, where the network has the most traction.
Twenty years ago, I was a researcher at Harvard’s Berkman Center when blogs were beginning to impact American politics. Colleagues at Berkman organized a conference about blogs and politics partially focused on how left-wing US bloggers were influencing the fringe of the Democratic party in the US. My colleague Rebecca MacKinnon and I felt like an important part of the story was being left out: the rest of the world. We invited bloggers from two dozen different countries to come to Harvard and talk about the different ways blogs were being used in Iran, Iraq, Malaysia and Kenya.
What emerged from that conference was Global Voices, a worldwide network of writers, translators and activists, excited to share what’s going on in their parts of the world with the rest of the internet. Global Voices looks like a global news site, and it is, but it’s also a giant participatory ethnography project, exploring the ways in which the internet – which allows us to express ourselves to local and global audiences – gets used differently by different groups of people around the world.
Global Voices is now twenty years old and has spawned a whole family of related projects. One of the many outgrowths of the community of thousands involved with GV is a set of scholars who work all over the world understanding how digital media works differently in diverse parts of the world. I’m hoping that this is one of several networks that can provide insights into the ways different communities are using online video.
But I also think there’s an amazing opportunity to build new networks around the idea of studying digital media from a global majority point of view and integrating multiple perspectives. I’m proud to be part of a new lab at UMass called GloTech – a group of scholars including Jonathan Ong, Burcu Baykurt, Seyram Avile, Wayne Xu and Martha Fuentes-Bautista, who are doing critical work on technology rooted in local knowledge and expertise and what we can learn from multiple perspectives. In cooperation with Marcelo Alvez at Pontifical Catholic University of Rio, GloTech organized one of the best conferences I’ve been to in years, Disinformation and Elections in the Global Majority, which brought scholars, journalists and activists from India, Moldova, South Africa, Indonesia, Myanmar, Brazil and the Philippines to look at how online disinformation is evolving in different media ecosystems.
This hope of building multiperspectival research networks is why I was so excited to have the opportunity to speak at CSCW this year, and particularly in a year we’re meeting in Costa Rica. Understanding how social media – and particularly social video – work requires us to get beyond the WEIRD biases that assume that how American influencers use social media is the way the rest of the world uses social video. We need to build collaborations between qualitative and quantitative researchers around the world, researchers who’ve got a deep understanding of local politics and culture, to understand the unique and creative ways video is being used differently around the world.
The data sets we’re creating are going to be shared through SOMAR with research teams who are willing to agree to the privacy protections necessary to share this data without exposing personally identifiable information. The videos we’re collecting are “public”, but most of them have only been viewed by a few dozen people – we believe that the ethical way to handle this data is to treat it as personally identifiable information. Within those constraints, my lab is excited about working with collaborators around the world – we’d like to share data with you, to learn from you, to coauthor with you, if appropriate. (When the data set on SOMAR is live, I will link to it here.)
In other words, we’re trying to take on work, understanding the value of quotidian video online, that can only happen if we’re able to work with quantitative and qualitative researchers around the world. More broadly, it means we need more structures to feature the excellent work already being done around the world on social media and to help it reshape our preconceptions about the digital world.
Henrich and his colleagues warn that there’s a danger of WEIRD bias in behavioral research, and I think it’s likely that we’ve replicated some of those biases in social media research. But we have the potential to go from WEIRD to wide, to demonstrate that social media evolves differently in different parts of the world, that different people use tools in different and valid ways. We’d love your help exploring the wide world of quotidian video and understanding the global ways this space is evolving.
The post From WEIRD to Wide: my keynote at CSCW 2024 in Costa Rica appeared first on Ethan Zuckerman.
Source: https://ethanzuckerman.com/2025/01/13/from-weird-to-wide-my-keynote-at-cscw-2024-in-costa-rica/