this post was submitted on 11 Sep 2024
186 points (97.9% liked)

Map of 2000+ lemmy communities (danterious.codeberg.page)
submitted 1 week ago* (last edited 1 week ago) by Danterious@lemmy.dbzer0.com to c/fediverse@lemmy.world
 

This is my first try at creating a map of Lemmy. I based it on the overlap of commenters between communities.

I only used communities on the 35 most active instances over the past month, and limited the comments to go back no further than August 1, 2024 (sometimes less far back if I got an invalid response).

I scaled it so that it is based on the percentage of comments made by a commenter in that community.

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map
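
For readers who want a feel for the data preparation without opening the repository, here is a minimal sketch of the idea described above. It is not the code from the linked repo. The function name `build_features`, the toy comment list, and the community and user names are all made up for illustration, and it assumes one reading of the scaling step: each value is the share of a community's comments written by a given commenter.

```python
from collections import Counter, defaultdict


def build_features(comments):
    """Map each community to {user: share of that community's comments}.

    `comments` is an iterable of (community, user) pairs, one per comment.
    The row-wise percentage scaling is an assumption about the post's method.
    """
    counts = defaultdict(Counter)
    for community, user in comments:
        counts[community][user] += 1

    features = {}
    for community, per_user in counts.items():
        total = sum(per_user.values())
        # Divide by the community's total so every row sums to 1.
        features[community] = {user: n / total for user, n in per_user.items()}
    return features


# Hypothetical toy data, not taken from the real crawl.
comments = [
    ("community_a", "alice"),
    ("community_a", "alice"),
    ("community_a", "bob"),
    ("community_a", "carol"),
    ("community_b", "bob"),
    ("community_b", "dave"),
]
features = build_features(comments)
```

Communities that share commenters (here `bob`) end up with overlapping non-zero features, which is what lets a later step place them near each other.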

[–] Danterious@lemmy.dbzer0.com 23 points 1 week ago* (last edited 1 week ago) (1 children)

Well, I used dimensionality reduction to make it 2D, so the axes are just how the algorithm chose to compress the data.

In the original data, each data point is a community, and its features are how frequently each user posted in that community.

~Anti~ ~Commercial-AI~ ~license~ ~(CC~ ~BY-NC-SA~ ~4.0)~

[–] threelonmusketeers@sh.itjust.works 9 points 1 week ago (1 children)

I used dimensionality reduction to make it 2D

Huh, interesting. So is the idea to spread the data out as much as possible, while keeping "similar" communities near each other? What was the dimensionality of the original set?

[–] Danterious@lemmy.dbzer0.com 10 points 1 week ago* (last edited 1 week ago) (1 children)

Total communities: 2986

Total users: 21934

So the dimensions were reduced from (2986, 21934) to (2986, 2)

Edit: Also, yeah, it uses UMAP for the algorithm, and it does something pretty similar to what you described.
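
To make the shape change concrete, here is a rough sketch of going from per-community user frequencies to 2D coordinates. The helper names `to_dense` and `reduce_to_2d` and the toy data are hypothetical, and the UMAP settings are library defaults, not the parameters actually used for the map. It assumes the third-party `umap-learn` and `numpy` packages are installed; they are only imported when `reduce_to_2d` is called.

```python
def to_dense(features):
    """Turn {community: {user: value}} into a dense row-per-community matrix.

    Returns (communities, users, rows), where rows[i][j] is the value for
    communities[i] and users[j], and missing pairs are filled with 0.0.
    """
    communities = sorted(features)
    users = sorted({user for row in features.values() for user in row})
    rows = [[features[c].get(u, 0.0) for u in users] for c in communities]
    return communities, users, rows


def reduce_to_2d(rows, seed=42):
    """Compress (n_communities, n_users) down to (n_communities, 2) with UMAP."""
    import numpy as np  # third-party
    import umap  # third-party: pip install umap-learn

    reducer = umap.UMAP(n_components=2, random_state=seed)
    return reducer.fit_transform(np.asarray(rows))


# Hypothetical toy input: 2 communities, 3 distinct users.
features = {
    "community_a": {"alice": 0.5, "bob": 0.5},
    "community_b": {"bob": 0.25, "carol": 0.75},
}
communities, users, rows = to_dense(features)
```

With the real data, `rows` would have shape (2986, 21934) and `reduce_to_2d(rows)` would return an array of shape (2986, 2). The toy input here is far too small for UMAP to produce a meaningful layout.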


[–] keepthepace@slrpnk.net 8 points 1 week ago (1 children)

That's really interesting! It shows which communities share users. I am part of jlai.lu, a French-speaking community that is relatively isolated, but also of slrpnk.net, which seems very spread out!

Would it make sense to compute the standard deviation of each instance's communities? It would give an idea of which instances are islands and which are more spread out. Not sure whether it makes more sense to compute it on the 2 dimensions or on the original 21934, though.

[–] Danterious@lemmy.dbzer0.com 7 points 1 week ago

Yeah, that sounds like a good idea, so you can see how connected local communities are. It probably makes more sense to use the original dimensions so no extra information is lost.
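
One possible way to do that, sketched under assumptions: group communities by the instance part of their name, take the centroid of each group in the original feature space, and report the root-mean-square distance to that centroid as a multi-dimensional stand-in for standard deviation. The function name `instance_spread`, the `community@instance` naming, and the toy vectors are illustrative only, and nothing like this has been run on the real data.

```python
import math
from collections import defaultdict


def instance_spread(vectors):
    """Return {instance: RMS distance of its communities from their centroid}.

    `vectors` maps "community@instance" to a feature vector (list of floats).
    A value near 0 suggests a tight island; larger values suggest a more
    spread-out instance.
    """
    by_instance = defaultdict(list)
    for name, vec in vectors.items():
        instance = name.rsplit("@", 1)[1]
        by_instance[instance].append(vec)

    spread = {}
    for instance, vecs in by_instance.items():
        n = len(vecs)
        dim = len(vecs[0])
        centroid = [sum(v[i] for v in vecs) / n for i in range(dim)]
        mean_sq = sum(math.dist(v, centroid) ** 2 for v in vecs) / n
        spread[instance] = math.sqrt(mean_sq)
    return spread


# Hypothetical toy vectors over 3 users, not real data.
vectors = {
    "a@tight.example": [1.0, 0.0, 0.0],
    "b@tight.example": [1.0, 0.0, 0.0],
    "a@wide.example": [1.0, 0.0, 0.0],
    "b@wide.example": [0.0, 0.0, 1.0],
}
spread = instance_spread(vectors)
```

In this toy case `tight.example` gets a spread of 0 because its communities have identical commenters, while `wide.example` gets a larger value because its two communities share no one.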
