pe1uca

joined 2 years ago
[–] pe1uca@lemmy.pe1uca.dev 3 points 11 months ago

Check the most upvoted answer and then look into tubearchivist, which can take your yt-dlp parameters and URLs to download the videos and process them to give you a better index of them.

[–] pe1uca@lemmy.pe1uca.dev 16 points 11 months ago (4 children)

Do you mean a community?
Like having your own !nostupidquestions@lemmy.world but with a different name?

It depends on your instance whether they allow anyone to create a community or not; there's a setting in the admin panel to restrict creating them to admins only.

If your instance allows it, then you can go to the home page and look for a button on the left side that says "Create a Community", right above "Explore Communities".
Then you just have to fill in the details and click "Create".

[–] pe1uca@lemmy.pe1uca.dev 4 points 11 months ago (1 children)

Lemmy only stores text; any other form of media is handled by other services.
Image hosting comes "integrated" because it uses pict-rs, but Lemmy itself only stores the URL pointing to that service.
For example, on my instance I disabled image hosting, so I'd need to upload to another service for that, same with video.

[–] pe1uca@lemmy.pe1uca.dev 2 points 11 months ago (1 children)

You shouldn't use the name as a replacement for the ID; you need to use a slug.
The name should be stored exactly as the user sets it, while the slug is autogenerated by your code by removing any problematic characters, so it usually contains only letters, numbers, and dashes, which makes it a perfect substitute for the numeric ID.
There should be libraries to handle this for you.
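
For example, a minimal sketch assuming Python and the python-slugify package (the names are just placeholders):

```python
# Minimal sketch of slug generation, assuming the python-slugify package
# (pip install python-slugify); it strips the problematic characters for you.
from slugify import slugify

name = "Category 1"      # stored exactly as the user typed it
slug = slugify(name)     # -> "category-1": only lowercase letters, numbers, dashes

# The slug, not the display name, is what goes in the URL and acts as the ID,
# e.g. /categories/category-1
print(slug)
```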

An ID is just something to identify a resource, so your ID in this case would be the slug.
I have a use case where the ID is formed from two fields; adapted to your case it would be something like /users/{user}/categories/{category}
So, whatever you define to be a unique way of working with an entity will be the identifier (ID) of that entity.

[–] pe1uca@lemmy.pe1uca.dev 2 points 11 months ago

Hard to tell that to a bot copying content from a bot infested site :P

[–] pe1uca@lemmy.pe1uca.dev 3 points 11 months ago (3 children)

Plus I'd suggest having a slug, so instead of memorizing a meaningless number the user gets a string that sounds like the name.

Instead of 12345, something like category-1 for "Category 1".
Especially for sharing a URL, it's more meaningful to share "domain.tld/search/categories/cat-1" than any other form of ID. (I'm annoyed with Lemmy for not having slugs for posts, it feels so shady to share anything haha)

[–] pe1uca@lemmy.pe1uca.dev 5 points 11 months ago (1 children)

Borderlands 2, specifically the Mechromancer class.
It has a perk where you get more damage each time you unload your full clip, and it resets when you manually reload.
On PC the reload action has its own key.

But I had a potato PC and was only able to play it at low settings. When I got a PS4 I bought the game again to play it with nice graphics. It quickly got very frustrating, since the reload action is bound to the same button as interact! So every time you tried to talk to someone, get into a vehicle, or even pick something up from the ground, you ran the risk of not aiming well enough and reloading by accident, which resets your buff!

[–] pe1uca@lemmy.pe1uca.dev 1 points 11 months ago

I only had to run this in my home server, behind my router which already has firewall to prevent outside traffic, so at least I'm a bit at ease for that.
In the VPS everything worked without having to manually modify iptables.

For some reason I wasn't able to make a curl call to the internet from inside docker.
I thought it could be DNS, but that was working properly when trying nslookup tailscale.com
A curl call to the same URL wasn't working at all. I don't remember the exact details of the errors, since the iptables modification fixed it.

AFAIK the only difference between the two setups was ufw being enabled on the VPS but not at home.
So I installed ufw at home, removed the rule from iptables, and everything keeps working right now.

I didn't save the output of iptables before ufw, but right now there are almost 100 rules for it.

> For example, since this is curl you're probably going to connect to ports 80 and 443, so you can add --dport to the OUTPUT rule to restrict the ports. And you should specify the interface (in this case docker0) in almost all cases.

Oh, that's a good point!
I'll later try to replicate the issue and test this, since I don't understand why OUTPUT should be solved by an INPUT rule.

[–] pe1uca@lemmy.pe1uca.dev 1 points 11 months ago

Well, it's a bit of a pipeline: I use a custom project that exposes an API where I can send files or URLs of videos to summarize.
With yt-dlp I get the video and transcribe it with faster-whisper (https://github.com/SYSTRAN/faster-whisper), then the transcription is sent to the LLM to actually make the summary.

I've been meaning to publish the code, but it's embedded in a personal project, so I need to take the time to isolate it '^_^
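
Roughly, the pipeline looks something like the sketch below. This is not the actual project code: it assumes yt-dlp, faster-whisper, and requests are installed, and the LLM endpoint and model name are just placeholders (Ollama-style) for whatever you run locally.

```python
# Rough sketch of the download -> transcribe -> summarize pipeline.
import requests
from yt_dlp import YoutubeDL
from faster_whisper import WhisperModel

def summarize_video(url: str) -> str:
    # 1. Download the audio track with yt-dlp.
    with YoutubeDL({"format": "bestaudio/best", "outtmpl": "audio.%(ext)s"}) as ydl:
        info = ydl.extract_info(url, download=True)
        audio_path = ydl.prepare_filename(info)

    # 2. Transcribe it with faster-whisper.
    model = WhisperModel("small", device="cpu", compute_type="int8")
    segments, _ = model.transcribe(audio_path)
    transcript = " ".join(segment.text for segment in segments)

    # 3. Send the transcription to an LLM for the actual summary.
    #    Placeholder Ollama-style endpoint and model; swap in your own setup.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",
            "prompt": f"Summarize this transcript:\n\n{transcript}",
            "stream": False,
        },
    )
    return resp.json()["response"]
```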

[–] pe1uca@lemmy.pe1uca.dev 11 points 11 months ago

Good tip! Also good to remember there's no way to control the seed of the generator, for example for a seeded world.
For that use case you still need to manually run a generator you created with your seed to select an item.
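
Something along these lines, as a minimal Python sketch (the item list and seed are made up):

```python
# Keep your own seeded generator instead of relying on an unseeded
# "pick a random item" helper, so a given world seed is reproducible.
import random

items = ["sword", "shield", "potion"]
world_seed = 1234

rng = random.Random(world_seed)  # your own generator, independent of the global one
item = rng.choice(items)         # deterministic: same seed -> same item
```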

[–] pe1uca@lemmy.pe1uca.dev 9 points 11 months ago (4 children)

I've used it to summarize long articles, news posts, or videos when the title/thumbnail looks interesting but I'm not sure if it's worth the 10+ minutes to read/watch.
There are other solutions, like dedicated summarizers, but I've looked into them and they only extract exact quotes from the original text; an LLM can also paraphrase, making the summary a bit more informative IMO.
(For example, one article mentioned a quote from an expert talking about a company, the summarizer only extracted the quote and the flow of the summary made me believe the company said it, but the LLM properly stated the quote came from the expert)

This project https://github.com/goniszewski/grimoire has on its roadmap a way to connect to an AI to summarize the bookmarks you make and generate 3 tags.
I've seen the code, but I don't remember the exact status of the integration.


Also, I have a few models dedicated to coding, so I've also asked for a few pieces of code and configuration just to get started on a project, nothing too complicated.

[–] pe1uca@lemmy.pe1uca.dev 1 points 11 months ago

Ah, that makes sense!
Yes, a DB would let you build this. But the point is in the word "build": you need to think about what's needed, in which format, how to properly set up all the relationships to keep the data consistent and flexible, etc.
For example, you might implement the tags as a text field, but then you still have the same issues with adding, removing, and reordering them. One fix could be a table of tags with a many-to-one relation to tasks. Then you have the problem of mistyping a tag: you might want to add TODO but forget you already have it as todo, which might not be a problem if the field is case insensitive, but what about to-do?
So there's still a lot of stuff you might overlook, and it will come up and sidetrack you from creating and doing your tasks, even if you abstract all of this into a script.
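
Just to give an idea of what even the "simple" fix involves, here's a minimal sketch using Python's built-in sqlite3 module (all table and column names are made up):

```python
# A tags table with a many-to-one relation to tasks, instead of a single
# free-text "tags" column, so adding or removing a tag never means rewriting a string.
import sqlite3

con = sqlite3.connect("todo.db")
con.executescript("""
CREATE TABLE IF NOT EXISTS tasks (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS task_tags (
    task_id INTEGER NOT NULL REFERENCES tasks(id),
    tag     TEXT    NOT NULL COLLATE NOCASE,  -- makes TODO and todo collide...
    UNIQUE (task_id, tag)
);
""")
# ...but it still won't catch "to-do" vs "todo": normalizing tags is yet
# another thing you'd have to design yourself.
```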

Specifically for todo list I selfhost https://vikunja.io/
It has an OpenAPI spec (OAS), so you can easily generate a client library in any language to build a CLI.
Each task has a lot of attributes, including the ones you want: relation between tasks, labels, due date, assignee.

Maybe you can have a project for your book list, but it might be overkill.

For links and articles to read I'd say a simple bookmarking tool could be enough, even the one built into your browser.
If you want to go a bit beyond that I'm using https://github.com/goniszewski/grimoire
I like it because it has nested categories plus tags, most other bookmark projects only have simple categories or only tags.
It also has a basic API, but it's enough for most use cases.
Another option could be an RSS reader if you want to get all articles from a site. I'm using https://github.com/FreshRSS/FreshRSS which has the option to retrieve data from sites using XPath in case they don't offer RSS.


If you still want to go the DB route then, as others have mentioned, since it'll be local and single user, SQLite is the best option.
I'd still encourage you to use an existing project, and if it's open source you can contribute the code you would have written anyway, improving it for the next person with your exact needs.

(Just paid attention to your username :P
I also love matcha, not an addict tho haha)
