this post was submitted on 13 Apr 2024
That probably worked for some; I think Pushshift's API returned just the latest revision and a timestamp of when it was edited. For forums I archive, I store every event. If it's a small enough edit (using difflib, I think) then I store the deltas. If it overwrote most of the comment, I keep the latest non-overwritten version and mark it as overwritten, with the time of the last event on the comment.
Text is tiny, and with federation it's trivial to scrape, but even centralized forums barely impede data archival.
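To make the edit-handling scheme above concrete, here is a rough sketch of how it might look with `difflib`. This is not the actual archiver code: the `ArchivedComment` class, the `record_edit` function, and the 0.5 similarity cutoff are all made up for illustration, and the real tool may decide "small enough" differently.

```python
import difflib
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical cutoff: an edit at least this similar to the stored text is "small".
SIMILARITY_THRESHOLD = 0.5


@dataclass
class ArchivedComment:
    text: str  # latest non-overwritten version of the comment
    deltas: List[Tuple[str, List[str]]] = field(default_factory=list)
    overwritten_at: Optional[str] = None  # time of the last overwrite event, if any


def record_edit(comment: ArchivedComment, new_text: str, timestamp: str,
                threshold: float = SIMILARITY_THRESHOLD) -> str:
    """Store a delta for a small edit, or flag the comment as overwritten."""
    ratio = difflib.SequenceMatcher(None, comment.text, new_text).ratio()
    if ratio >= threshold:
        # Small edit: keep the diff and move the stored text forward.
        diff = list(difflib.unified_diff(
            comment.text.splitlines(), new_text.splitlines(), lineterm=""))
        comment.deltas.append((timestamp, diff))
        comment.text = new_text
        return "delta"
    # Most of the comment was replaced: keep the old text, note when it happened.
    comment.overwritten_at = timestamp
    return "overwritten"
```

With this, a typo fix ends up as a stored diff, while replacing the whole comment with something like "[deleted]" leaves the original text in place and only sets `overwritten_at`.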
Ya but it's unusual to be doing that.
In tests I was able to retrieve the text of comments deleted a year ago, from before I had ever participated in Lemmy. I certainly have no archive of anything, and if I did it wouldn't extend that far back.
I think it's fair to allow people to get rid of the low-hanging fruit if they want, even though the internet is forever. Depending on the threat model, it might be good enough.