I was playing around with Lemmy statistics the other day, and I decided to look at the number of comments per post. Essentially a measure of engagement – the higher the number, the more engaging the post. Or in other words, how many people were pissed off enough to comment, or had something they felt like sharing. The average across every single Lemmy instance was 8.208262964 comments per post.
So I modelled that with a Poisson distribution – in stats terms, X ~ Po(8.20826) – then found the critical regions, taking anything that had a less than 5% chance of happening to be important. In other words, 5% is the significance level. The critical regions are the regions on either side of the distribution where the probability of ending up is less than 5%. Here the lower-tail critical region is fewer than 4 comments, and the upper-tail critical region is more than 13 comments; if your post lands in either region, that's a meaningful result. So I chose to interpret those results as meaning that if your post gets fewer than 4 comments it is "a bad post", and if it gets more than 13 comments it is "a good post". A good post here literally just means "got more comments than expected of a typical post", and vice versa for a bad post.
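For anyone who wants to check the arithmetic, here's a minimal sketch of that calculation using only the standard library (`scipy.stats.poisson` would give the same numbers), with 5% as the cut-off on each tail, as above:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

lam, alpha = 8.208, 0.05  # sitewide average comments per post

# lower critical region: largest k with P(X <= k) < 5%
lo = 0
while poisson_cdf(lo + 1, lam) < alpha:
    lo += 1

# upper critical region: smallest k with P(X >= k) < 5%
hi = lo
while 1 - poisson_cdf(hi - 1, lam) >= alpha:
    hi += 1

print(lo, hi)  # 3 and 14: fewer than 4 or more than 13 comments is "meaningful"
```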
You will notice that this is quite rudimentary – for example, what about when the Americans are asleep? Most posts do worse then. That's not accounted for here, because it increases the complexity beyond what I can really handle in a post.
To give you an idea of a more sweeping internet trend, there's the 1% / 9% / 90% adage: 1% do the posting, 9% do the commenting, and 90% are lurkers. Assuming each person does an average of one thing a day, that suggests c/p should be about 9 for all sites regardless of size.
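A back-of-the-envelope sketch of that arithmetic (the community size is made up, and it cancels out anyway):

```python
# Hypothetical community where everyone does one thing per day,
# split 1% posters / 9% commenters / 90% lurkers.
users = 1000
posts = users * 1 // 100      # the 1% who post
comments = users * 9 // 100   # the 9% who comment

print(comments / posts)  # 9.0 comments per post, whatever `users` is
```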
Now what is more interesting is that comments per post varies by instance. lemmy.world, for example, has an engagement of 9.5 c/p, and lemmy.ml has 4.8 c/p. This means a "good post" on .ml is one that gets more than 9 comments, whilst a "good post" on .world has to get more than 15 comments. On hexbear.net, you need more than 20 comments to be a "good post". I got the numbers for instance-level comments and posts from here
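The per-instance thresholds fall out of the same one-tailed 5% cut-off, just with each instance's own c/p as the Poisson mean. A sketch, using the λ values quoted above:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

def good_post_threshold(lam, alpha=0.05):
    """Smallest t with P(X > t) < alpha: more than t comments is a 'good post'."""
    t = 0
    while 1 - poisson_cdf(t, lam) >= alpha:
        t += 1
    return t

for name, lam in [("lemmy.ml", 4.8), ("all instances", 8.208), ("lemmy.world", 9.5)]:
    print(f"{name}: more than {good_post_threshold(lam)} comments")
# lemmy.ml: more than 9 comments
# all instances: more than 13 comments
# lemmy.world: more than 15 comments
```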
This is a little bit silly, since a “good post”, by this metric, is really just a post that baits lots and lots of engagement, specifically in the form of comments – so if you are reading this you should comment, otherwise you are an awful person. No matter how meaningless the comment.
Anyway I thought that was cool.
EDIT: I’ve cleared up a lot of the wording and tried to make it clearer as to what I am actually doing.
Look, I survived statistics class. I will strive to defend some of my post.
Namely that much of the aim of it was to show that a metric like comment count doesn't tell you whether a post was actually good or bad – hence the bizarre engagement bait at the end, and also why all of the "good posts" were in quotes.
I’m under the impression that whilst you can do a hypothesis test by calculating the probability of the test statistic occurring, you can also do it by showing that the result lies in the critical regions. That can be useful if you want to know whether a result is meaningful from the number itself, rather than having to calculate probabilities. For a post of this nature, it makes no sense to find a p-value for a specific post, since I want comment counts that anyone, for any post, can compare against – a p-value for one observed comment count is meaningless to basically everyone on this platform.
Truthfully, I wasn’t doing a hypothesis test – and I don’t say I am in the post – although your original reply confused me into thinking I was. I was finding critical regions and interpreting them. That said, I’m also under the impression that you can do two-tailed tests, although I did make a mistake by not splitting the significance level in half for each tail. :( I should have been clearer that I wasn’t doing a hypothesis test, but rather calculating critical regions.
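For what it's worth, the corrected version – splitting the 5% as 2.5% per tail – only shifts the boundaries slightly. A sketch, reusing the same stdlib Poisson CDF as before:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

lam, alpha = 8.208, 0.05

# proper two-tailed split: alpha/2 = 2.5% in each tail
lo = 0
while poisson_cdf(lo + 1, lam) < alpha / 2:
    lo += 1
hi = lo
while 1 - poisson_cdf(hi - 1, lam) >= alpha / 2:
    hi += 1

print(lo, hi)  # 2 and 15: fewer than 3 or more than 14 comments
```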
It doesn’t seem like you’re saying I’m wrong, rather that my model sucks – which is true – and that my workings are weird – it’s a Lemmy post, not a science paper. That said, I didn’t quite expect this post to do so well, so I’ve edited the middle section to be clearer about what I was trying to do.
Well I appreciate the effort regardless. If you want any support in getting towards a more “proper” network analysis, I’ve dm’d you a link you can use to get started. If nothing else it might allow you to expand your scope or take your investigations into different directions. The script gets more into sentiment analysis for individual users, but since Lemmy lacks a basic API, the components could be retooled for anything.
Also, you might consider that all a scientific paper is, at the end of the day, is a series of things like what you’ve started here, with perhaps a little more narrative glue and the repeated cycle of scientific critique. All scientific investigations start with exactly the kind of work you are presenting here. Then your PI comes in and says “no, you’ve done this wrong and that wrong, and can’t say this or that – but this bit or that bit is interesting”, and you revise and repeat.