this post was submitted on 13 Apr 2025

Slop.


I was playing around with Lemmy statistics the other day, and I decided to look at the number of comments per post. It is essentially a measure of engagement: the higher the number, the more engaging the post. In other words, it measures how many people were pissed off enough to comment, or had something they felt like sharing. The average across all Lemmy instances was 8.208262964 comments per post.

So I modeled that with a Poisson distribution, in stats terms X~Po(8.20826), then found the critical regions, assuming that anything with a less than 5% chance of happening is significant. In other words, 5% is the significance level. The critical regions are the regions at either end of the distribution where the probability of landing in that region is less than 5%. The critical values are 4 comments on the lower tail and 13 comments on the upper tail, which means that getting fewer than 4 comments or more than 13 comments is a meaningful result. So I chose to interpret those results loosely: if you get 4 or fewer comments then your post is “a bad post”, and if you get 13 or more then your post is “a good post”. A good post here is literally just one that “got more comments than expected of a typical post”, and vice versa for “a bad post”.
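
The critical regions can be checked with a few lines of Python. This is a minimal sketch that assumes the 5% level is applied to each tail separately (the reading that reproduces the thresholds of 4 and 13); the function names are illustrative and not from any library.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def critical_regions(lam: float, alpha: float = 0.05):
    """Return (lower_max, upper_min) for a Poisson mean `lam`.

    lower_max: largest k with P(X <= k) < alpha, or None if there is none.
    upper_min: smallest k with P(X >= k) < alpha.
    Assumption: alpha is applied to each tail separately.
    """
    # Lower tail: accumulate the CDF until it reaches alpha.
    lower_max = None
    cdf = 0.0
    k = 0
    while True:
        cdf += poisson_pmf(k, lam)
        if cdf >= alpha:
            break
        lower_max = k
        k += 1

    # Upper tail: P(X >= k) = 1 - P(X <= k - 1).
    below = 0.0  # P(X <= k - 1)
    k = 0
    while 1.0 - below >= alpha:
        below += poisson_pmf(k, lam)
        k += 1
    upper_min = k

    return lower_max, upper_min


lower_max, upper_min = critical_regions(8.20826)
print(lower_max, upper_min)
```

With the overall mean this gives a lower region of 3 or fewer comments and an upper region of 14 or more, i.e. "fewer than 4" and "more than 13".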

You will notice that this is quite rudimentary. For example, most posts do worse when the Americans are asleep. That’s not accounted for here, because it increases the complexity beyond what I can really handle in a post.

To give you an idea of a more sweeping internet trend, consider the 1% 9% 90% adage, where 1% do the posting, 9% do the commenting, and 90% are lurkers. Assuming each person does an average of 1 thing a day, it suggests that c/p should be about 9 for all sites regardless of size.
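
The arithmetic behind that estimate can be written out directly. A small sketch, assuming every poster makes one post and every commenter one comment per day (the function and its parameters are illustrative):

```python
def expected_comments_per_post(users: int = 1000,
                               poster_share: float = 0.01,
                               commenter_share: float = 0.09,
                               actions_per_day: float = 1.0) -> float:
    """Comments per post implied by the 1% / 9% / 90% adage."""
    posts = users * poster_share * actions_per_day
    comments = users * commenter_share * actions_per_day
    return comments / posts


print(round(expected_comments_per_post(), 6))
```

The user count cancels out of the ratio, which is why the estimate is the same for sites of any size.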

Now what is more interesting is that comments per post varies by instance. lemmy.world, for example, has an engagement of 9.5 c/p and lemmy.ml has 4.8 c/p. This means that a “good post” on .ml is a post that gets 9 comments, whilst a “good post” on .world has to get 15 comments. On hexbear.net, you need 20 comments to be a “good post”. I got the numbers for instance level comments and posts from here
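
The same calculation can be repeated per instance. The sketch below (names are illustrative) finds the largest comment count that still falls outside the upper 5% tail, which appears to be the number quoted for each instance. Only the rates given in the post are used, so hexbear.net is left out.

```python
import math


def upper_threshold(lam: float, alpha: float = 0.05) -> int:
    """Largest k with P(X >= k) >= alpha for X ~ Po(lam).

    Any count above this value lies in the upper critical region.
    """
    pmf = math.exp(-lam)  # P(X = 0)
    below = 0.0           # P(X <= k - 1)
    k = 0
    while 1.0 - below >= alpha:
        below += pmf
        k += 1
        pmf *= lam / k    # recurrence: P(X = k) = P(X = k - 1) * lam / k
    return k - 1


rates = [("all instances", 8.20826), ("lemmy.world", 9.5), ("lemmy.ml", 4.8)]
for name, rate in rates:
    print(name, upper_threshold(rate))
```

This gives 13 for the overall mean, 15 for lemmy.world and 9 for lemmy.ml.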

This is a little bit silly, since a “good post”, by this metric, is really just a post that baits lots and lots of engagement, specifically in the form of comments – so if you are reading this you should comment, otherwise you are an awful person. No matter how meaningless the comment.

Anyway I thought that was cool.

[–] Awoo@hexbear.net 10 points 1 week ago (1 children)

Hexbear uses a different algorithm than other lemmy instances which highly favors comments over upvotes

Pretty sure this stopped being true when Hexbear merged onto the same branch as Lemmy.

[–] dead@hexbear.net 11 points 1 week ago* (last edited 1 week ago) (1 children)

https://hexbear.net/post/2294466

You may now notice a fresher and more engaging feed, remember you can set default sort settings in your user settings including choosing Local, All, or Subscribed communities in addition to the post sort method.

After Hexbear federated, Hexbear re-implemented the old Hexbear-exclusive 'active' algorithm, which highly favors posts with large numbers of comments.

[–] RedWizard@hexbear.net 7 points 1 week ago (1 children)

It's interesting you say that. Here is the change: https://github.com/hexbear-collective/lemmy/pull/6

The current Lemmy active sort will keep a thread at the top of the feed for a long time. The change makes the posts decay faster. If anything it does the opposite of what you say it does, or at least the default from Lemmy would be far worse.

[–] dead@hexbear.net 9 points 1 week ago* (last edited 1 week ago) (1 children)

How does that contradict what I'm saying? If the decay is more aggressive, then important news story posts with 0 comments will die even faster. Sometimes I see an important news story post get 40 upvotes in an hour and then die because nobody commented on it. However, there are engagement-bait posts (example: posting a question in the title of the post) which last multiple hours. Increasing the decay rate only makes the cycle of death quicker. That doesn't target bait posts specifically. Bait is inherently resistant to decay.

I'll try to make an analogy. Imagine that Hexbear is a greenhouse with wonderful plants. Posts are the plants. Plants need water to survive. Posts need comments to survive. Some plants are beautiful. Some plants smell really bad. Some plants come with a little sign that says "water me". We want people to water the beautiful plants, the important plants, the interesting plants. Some plants are really awful and the greenhouse visitors decide to piss on those plants. The awful plants don't mind because they are sustained by piss. The greenhouse staff notices a problem. They post a new sign in the greenhouse. It says "We've introduced new soil to the greenhouse which dries more rapidly so that plants won't live as long." Will drier soil in the greenhouse make the beautiful plants outlive the piss plants?

[–] RedWizard@hexbear.net 4 points 1 week ago* (last edited 1 week ago)

The inputs that decide what lives and dies are the same. A thread that lives under the current math would, with the same inputs, simply live longer under the old math. That's what I'm trying to say. Here, I even graphed it out so we can see what I'm talking about: the same post under both algorithms, with the same vote score (1000), getting one new comment every four hours for a week.

This is already bad enough, but look at what happens when both posts are gaining votes. This simulation simply adds a random number between 0 and 5 to the post score every hour.

The old algorithm pushes the score even higher. It makes the thread creep up and up. Sure, it decays pretty fast between comments, but people will be returning to the thread from their inbox as they reply to people replying to them. This keeps the thread pushed up, inviting more people to leave top-level comments.

Just to illustrate your point, here are two Hexbear threads, one that receives no comments, and one that gets a comment every 4 hours, both with 1000 upvotes.

A thread with comments every 4 hours will have a sub-250 rank value after about 16 hours. A thread with a similar score and no comments for the duration will have a sub-250 rank value after just 4 hours. So, that's a long ass time.

The disparity between active and "inactive" threads is even worse in the default sort. A thread with a high enough score could maintain a top-level position in the rankings for two whole days, while one that gets no comments drops off in the first 5 hours.

This is, to my understanding, the most recent version of the "Hot Rank" that all other ranks are based on in Lemmy.

CREATE OR REPLACE FUNCTION hot_rank (score numeric, published timestamp without time zone)
    RETURNS integer
    AS $$
DECLARE
    hours_diff numeric := EXTRACT(EPOCH FROM (timezone('utc', now()) - published)) / 3600;
BEGIN
    IF (hours_diff > 0) THEN
        RETURN floor(10000 * log(greatest (1, score + 3)) / power((hours_diff + 2), 1.8))::integer;
    ELSE
        RETURN 0;
    END IF;
END;
$$
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE;
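
For experimenting outside the database, the function can be ported to Python. This is a sketch, not Lemmy's code: it takes the age in hours directly instead of a timestamp, and it relies on PostgreSQL's single-argument log being base 10.

```python
import math


def hot_rank(score: float, hours_diff: float) -> int:
    """Python port of the SQL hot_rank function.

    `hours_diff` is the age in hours of the timestamp being ranked
    (an assumption of this sketch; the SQL derives it from `published`).
    """
    if hours_diff > 0:
        # PostgreSQL's log(x) is base 10, hence math.log10.
        return math.floor(
            10000 * math.log10(max(1, score + 3)) / (hours_diff + 2) ** 1.8
        )
    return 0


for hours in (1, 6, 12, 24, 48):
    print(hours, hot_rank(1000, hours))
```

Because the score only enters through a logarithm while age enters through a power of 1.8, the age of the timestamp dominates the rank after the first few hours.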

The thing that makes a Hot Rank a Hot Active Rank is whether you send the Hot Rank function the published date or the most recent comment timestamp (clamped at 48 hours; after that, it defaults to the published date).

    diesel::update(post::table.find(post_id))
      .set((
        // Normal Hot Rank: uses the published date.
        post::hot_rank.eq(hot_rank(post::score, post::published)),
        // Active Hot Rank: uses the newest_comment_time_necro date.
        post::hot_rank_active.eq(hot_rank(post::score, post::newest_comment_time_necro)),
        post::scaled_rank.eq(scaled_rank(
          post::score,
          post::published,
          interactions_month,
        )),
      ))
      .get_result::<Self>(conn)
      .await

So, for two days, threads on a Lemmy instance other than Hexbear can have their timestamp refreshed, as if they were just posted, with their current score, so long as someone is leaving a comment.
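
The effect of that refresh can be sketched numerically. The snippet below is a simplified model, not Lemmy's implementation: it reuses a Python port of the hot rank formula, covers only the first 48 hours of a thread, and assumes comments arrive at a fixed interval. The function names and the `comment_every` parameter are made up for illustration.

```python
import math


def hot_rank(score, hours_diff):
    # Port of the SQL formula; PostgreSQL's log(x) is base 10.
    if hours_diff <= 0:
        return 0
    return math.floor(
        10000 * math.log10(max(1, score + 3)) / (hours_diff + 2) ** 1.8
    )


def active_rank(score, age_hours, comment_every=None):
    """Rank under the default active sort in a thread's first 48 hours.

    comment_every: hours between comments, or None for no comments.
    With comments, the clock restarts at the latest comment; without
    them, it runs from the publish time.
    """
    if not 0 < age_hours <= 48:
        raise ValueError("this sketch only models the first 48 hours")
    if comment_every is None or age_hours < comment_every:
        hours_since_bump = age_hours
    else:
        # At the exact instant of a comment this is 0 and the formula
        # returns 0, so sample between comments.
        hours_since_bump = age_hours % comment_every
    return hot_rank(score, hours_since_bump)


for age in (3, 11, 23, 47):
    print(age, active_rank(1000, age, comment_every=4), active_rank(1000, age))
```

In this model the commented thread ranks the same at hour 47 as it did at hour 3, while the thread with no comments has decayed to a small fraction of that.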

So, this leads me to a couple of points.

  1. Yes, Hexbear threads with comments have a higher rank than threads with no comments, and decay way slower.
  2. The Hexbear algo depresses the impact of a new comment over time, meaning the thread decays faster than on normal Lemmy.
  3. The normal Lemmy active sort results in front pages that can feel stagnant for two days at a time. This suppresses new threads, unless those threads get a lot of vote traction and sustained conversation.
  4. Every few hours, new threads are climbing to the top of the Hexbear front page.
  5. In both systems, "important news story posts with no comments" die faster because they have no comments. The default Lemmy algo may actually be better about this because the first comment isn't subject to a time penalty like on Hexbear.
  6. All our "bump" bots are pretty useless because they comment nearly instantly after someone summons them, not really bumping the thread at all. Even if the comment were delayed, it would have less impact because of the decay. Those bots were way more effective under the old algo.

You would need to come up with new or different inputs, or a whole new method of ranking that doesn't leave threads with no comments in the dust. Comment count could be considered: if a thread has a low comment count, or no comments at all, then maybe it shouldn't be affected by the decay in the Hexbear algo. Or the number of comments could speed up the thread decay, while threads with longer spans between comments do not decay faster.


If people want to double-check my work, you can find it here: https://gist.github.com/The-RedWizard/d4567266537673ce4d2009c518951154

I think my implementations are correct.