submitted 10 months ago* (last edited 10 months ago) by Crul@lemm.ee to c/reddit@lemmy.world

Hi, I requested my data from Reddit and got 16 MB of CSVs, which is a considerable amount. Does anyone know of a tool to process / visualize / search the data? I assume the format is the same for everyone, so maybe someone has already built something like that.

EDIT: the problem is not performance; with files <5 MB I can search with Notepad++ in milliseconds. What I'm looking for is a user-friendly interface (ideally with thumbnail images, links and such).

The problem with searching for "reddit export data visualizer" is that Google shows posts from reddit about visualization of generic data.

Thanks.

top 10 comments
[-] slazer2au@lemmy.world 2 points 10 months ago

Firstly, can you even open the CSVs? If you can, then Power BI Desktop by Microsoft is the emerging go-to for data visualisation.

[-] Crul@lemm.ee 5 points 10 months ago* (last edited 10 months ago)

Yes, no problem reading the CSVs, sorry if that wasn't clear.

I was looking for something more specific. Ideally something like a local web app that renders the posts, comments, etc. in a webpage with thumbnails and links to the Reddit elements.

But that's probably asking too much :).
Thanks for the suggestion!
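
For anyone who wants to try the "local webpage" idea by hand, here is a minimal Python sketch that turns one export CSV into an HTML list of links. It is only an illustration: the column names `title`, `permalink` and `body` are assumptions, not the confirmed layout of the Reddit export, so adjust them to whatever headers your files actually have. Thumbnails are not handled.

```python
import csv
import html


def render_rows(csv_path, title_col="title", link_col="permalink", body_col="body"):
    """Return an HTML page listing every row of one export CSV.

    The default column names are hypothetical; pass the real ones
    from your own export.
    """
    items = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Use the title if present, otherwise fall back to the body text.
            text = html.escape(row.get(title_col) or row.get(body_col) or "(no text)")
            link = html.escape(row.get(link_col) or "", quote=True)
            if link:
                items.append(f'<li><a href="{link}">{text}</a></li>')
            else:
                items.append(f"<li>{text}</li>")
    return "<!doctype html>\n<meta charset='utf-8'>\n<ul>\n" + "\n".join(items) + "\n</ul>\n"
```

Writing the returned string to an `.html` file and opening it in a browser gives a clickable list, which could be a starting point for something nicer.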

[-] lemann@lemmy.one 2 points 10 months ago

If you find one, let me know pretty please...

I found a UI for my Hangouts data a while back, and I skim through those old chats once in a while. It's nice to have a tool that visualises data request files in a user-friendly way.

[-] Crul@lemm.ee 3 points 10 months ago* (last edited 10 months ago)

I'm searching GitHub for the different CSV filenames and I found a couple of projects that may be relevant:

EDIT: This one also looks interesting:

  • karlicoss/HPI: Human Programming Interface 🧑👽🤖

I'm still trying to figure out how to use them.
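
In case it helps with searching by filename, a small Python sketch like the following lists every CSV in the export folder together with its header row. The function name `list_csv_headers` and the folder layout (all CSVs directly in one directory) are assumptions for illustration.

```python
import csv
from pathlib import Path


def list_csv_headers(export_dir):
    """Map each CSV filename in export_dir to its list of column names."""
    headers = {}
    for path in sorted(Path(export_dir).glob("*.csv")):
        with open(path, newline="", encoding="utf-8") as f:
            # The first row is assumed to be the header; empty files give [].
            headers[path.name] = next(csv.reader(f), [])
    return headers
```

Printing the result gives the filenames and column names to paste into a GitHub code search.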

[-] lemann@lemmy.one 2 points 10 months ago

Those first two look interesting - thanks!

[-] Antimutt@lemmy.world 3 points 10 months ago

Power Query can search line by line through a file much bigger than your RAM without loading it all into memory.
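
The same streaming idea can be sketched outside Power Query. The snippet below is plain Python, not Power Query, and the function name `matching_rows` is made up for illustration: it reads a CSV one row at a time and yields only the rows that mention a search term, so the whole file never has to fit in memory.

```python
import csv


def matching_rows(csv_path, needle):
    """Yield rows (as lists of fields) containing needle in any field.

    Rows are read lazily, one at a time, so memory use stays small
    regardless of file size. Matching is case-insensitive.
    """
    needle = needle.lower()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if any(needle in field.lower() for field in row):
                yield row
```

Looping over `matching_rows(path, "some_nick")` and writing each row out with `csv.writer` would produce a filtered copy of the file.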

[-] Crul@lemm.ee 2 points 10 months ago

The links you posted are weird:

  • https://pixeldrain.com/u/KfgV7bqn: It offers to download a file named Antimutt in r-Excel ultra.paq8o, and I have no idea what it is for.

  • https://the-eye.eu/redarcs: It says "This Reddit Community Has Been Archived"

[-] Antimutt@lemmy.world 1 points 10 months ago

The first is the result when I extracted all lines with my nick in them from the csv, stored with the best compression around. The second is where to get the csv - and a lot of communities have been archived there, like it says.

[-] Crul@lemm.ee 2 points 10 months ago

Just to confirm I understand: you are talking about Power Query vs. Power BI for dealing with huge datasets, right?

Because, in my case, with 16 MB, I don't see the need for anything especially powerful. My problem is not performance, but convenience.

Thanks for the input.

[-] Antimutt@lemmy.world 1 points 10 months ago

Power Query is a component of Excel and Power BI.

this post was submitted on 12 Aug 2023
10 points (91.7% liked)
