The notion of a ‘home’ community seems pretty unique to Reddit, since it has self-contained subreddits. Sure, you can triangulate as user’s home on FB and Twitter, but you wouldn’t be able to anchor it to anything singular1, which is key to the idea of ‘home’. This feels like a key definition to decide on, since the term ‘home’ carries a lot of weight and implications.
Within the context of Ideolect & Reddit, I can think of a few axes to decide a user’s home. For a user $U_{a}$ who participates2 in $n_{a}$ different subreddits, you could decide their home based on: their participation volume; the karma they receive; the language they use; and where they have the most power.
Each of these variables offers something interesting to ponder over. These distinctions are important and useful: Drawing a contrast between the user’s language-home (they use language similar to r/MGTOW) versus their participation-home (they post more regularly in r/AskMen, maybe using r/MGTOW language) could explain their characteristics. As a rule, bigger subreddits like r/pics, r/videos are more likely to generate higher raw participation and upvotes, but I think that’s just a thin (often misleading) slice of the story.
This becomes the underlying assumption if you argue that participation volume is the best proxy for home. The associated claim being that the place where you spend most of your time influences your behaviour and worldviews the most.
This is a reasonable claim, in all honesty. And I haven’t even included the replies they get. If I spend a lot of time commenting in Pokemon-related subreddits, it’s fair to argue that my home on Reddit is r/pokemon. But it also equates time spent with influence: something feels off there.
If you make your decision based on where they receive the most karma (adulation), your associated claim is that positive feedback is the key variable that influences a user’s behaviour, worldview, and interests.
With this approach, we also make a very specific claim about the user: they chase adulation and praise. It’s a reasonable claim to make, since it’s what most of us do. There’s the tiny question of accounting for subreddit size – getting 100 upvotes in a niche subreddit with 200 active monthly contributors is vastly different from getting the same in a massive subreddit like r/AskReddit with millions of active monthly contributors.
You replace participation and praise for the more complex notion of language or ideology. It lines up well with the idea of a spiritual home, a safe space for yourself, where there are others like you, and don’t have to go to great efforts to explain yourself or in-jokes.
I came up with this definition after thinking about a particular case: I spend 9 hours a day at my workplace, maybe 6 waking hours at my place of residence. If you assume my sense of self-worth is attached to how my work is received, then by measures 1 & 2, my workplace is home. But the inconsistency here is that you may not be calling the shots at your workplace, that you’re a cog in the machine, where you might have least freedom. Notions of power are deeply attached to what you term as a safe space and call home.
This is why I’m particularly attached to this definition over the ones of participation and adulation alone. It’s much more nebulous though, quite understandably.
In all of this, I’ve forgotten that home is a sticky concept, you don’t change it easily or regularly. One was to address this is by computing these for a wider window of time. But that still doesn’t add a penalty3 for the change of home.
I’d also want a user’s home to be singular. It’s a stronger restriction than others here, since people present multi-faceted interests online. How do I resolve ties? I don’t know.
You could also argue that some users can’t be pigeonholed into a single a home. And you’d have support from the literature: Zhang & others4 define a home community only for users who made more than 50% of their comments in one subreddit.
I’ve calculated these (except #3) for a sliding window of +2 and -2 months to get more stable results for all users, from 2016-19.
In a big chunk of cases, 2/3 of these are the same subreddit, so I’ve not been too worried about those cases. What trips me up are the cases where each one of those give different results.
For example (and this is from the actual data): A user X had these results in early 2016:
What is this user’s home? Any ideas?
You can see the answer lies in a composite score of these 4 measures. They might also be heavily correlated. I’m not sure how I’ll construct this, especially with the constraints of inertia and being singular. Maybe this isn’t even a machine learning task.
Actually, now that I think about it, you can organize such a definition around groups/pages for FB. I don’t have an answer for Twitter, yet. ↩︎
participate = make comments in. Since I can’t get subscriber lists and voting logs. ↩︎
I actually haven’t implemented a penalty. I have to think about this. ↩︎
Zhang, Justine, William Hamilton, Cristian Danescu-Niculescu-Mizil, Dan Jurafsky, and Jure Leskovec. “Community identity and user engagement in a multi-community landscape.” In Proceedings of the International AAAI Conference on Web and Social Media, vol. 11, no. 1. 2017. ↩︎