this post was submitted on 17 May 2024
31 points (72.5% liked)

Technology

On Tuesday at Google I/O 2024, Google announced Veo, a new AI video-synthesis model that can create HD videos from text, image, or video prompts, similar to OpenAI's Sora. It can generate 1080p videos lasting over a minute and edit videos from written instructions, but it has not yet been released for broad use.

top 14 comments
[–] just_another_person@lemmy.world 12 points 6 months ago (1 children)
[–] far_university1990@feddit.de 12 points 6 months ago

Here, have a bed, you need it 🛏

[–] vhstape@lemmy.sdf.org 11 points 6 months ago (2 children)

cuz ... they need a healthy supply of products for the Google graveyard

[–] Grimy@lemmy.world 0 points 6 months ago

Why not?

I want to generate my own shows and movies eventually, just for my own fun. You can literally just not use it.

[–] JackGreenEarth@lemm.ee 6 points 6 months ago (1 children)

Tell me again when it's open source.

[–] BaroqueInMind@lemmy.one 7 points 6 months ago* (last edited 6 months ago) (1 children)

OpenAI is as open as an elderly Catholic nun's legs while she's reading incel posts on 4chan.

[–] JackGreenEarth@lemm.ee 2 points 6 months ago

That means not open, right? I'm only interested once the first good open source video generator is released; after hearing about the first, more closed source ones aren't interesting.

[–] Badeendje@lemmy.world 6 points 6 months ago

Until they sunset it. No use getting invested in new Google products anyway.

[–] pastermil@sh.itjust.works 4 points 6 months ago

Are they gonna demo it like they do with Gemini?

[–] deathmetal27@lemmy.world 4 points 6 months ago

After reading the wheresyouredat article I don't have much faith in this one either for any serious work. It's a curiosity at best.

[–] the_crotch@sh.itjust.works 2 points 6 months ago

Veo is a pretty prolific graffiti artist in Hartford, CT

[–] autotldr@lemmings.world 2 points 6 months ago

This is the best summary I could come up with:


Veo's example videos include a cowboy riding a horse, a fast-tracking shot down a suburban street, kebabs roasting on a grill, a time-lapse of a sunflower opening, and more.

Conspicuously absent are any detailed depictions of humans, which have historically been tricky for AI image and video models to generate without obvious deformations.

Google says that Veo builds upon the company's previous video-generation models, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere.

While the demos seem impressive at first glance (especially compared to Will Smith eating spaghetti), Google acknowledges AI video-generation is difficult.

But the company is confident enough in the model that it is working with actor Donald Glover and his studio, Gilga, to create an AI-generated demonstration film that will debut soon.

Initially, Veo will be accessible to select creators through VideoFX, a new experimental tool available on Google's AI Test Kitchen website, labs.google.


The original article contains 701 words, the summary contains 150 words. Saved 79%. I'm a bot and I'm open source!

[–] mindbleach@sh.itjust.works 1 points 6 months ago

it can generate 1080p videos lasting over a minute

Any length limit is a sign you're doing it wrong. You don't need every single frame in-memory at the same time to figure out what any specific frame should look like. Local frames matter for fine changes. Further frames matter for continued movement. Distant frames matter for continuity. It should be possible to scan across an arbitrarily long sequence and gradually remove its flaws.
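The idea above can be sketched in code. This is a hypothetical toy, not Veo's actual architecture: `refine_window` stands in for whatever model pass cleans up a local run of frames, and `scan_refine` slides it across an arbitrarily long sequence so memory use depends on the window size, not the clip length.

```python
def refine_window(frames):
    # Placeholder "model": nudge each frame toward the window mean,
    # standing in for a real denoising/refinement pass.
    mean = sum(frames) / len(frames)
    return [0.9 * f + 0.1 * mean for f in frames]

def scan_refine(frames, window=8, stride=4):
    """Slide a fixed-size window across the sequence, blending the
    overlapping refinements, so no pass needs the whole clip in memory."""
    out = list(frames)
    for start in range(0, max(1, len(frames) - window + 1), stride):
        chunk = refine_window(out[start:start + window])
        for i, f in enumerate(chunk):
            # Average with the previous value where windows overlap.
            out[start + i] = 0.5 * (out[start + i] + f)
    return out
```

With overlapping strides, nearby windows handle fine changes while repeated passes propagate consistency further out, which is roughly the local/further/distant split the comment describes.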

... though admittedly once you get to about five minutes, you've covered nearly one hundred percent of all shots in film and television. One minute is already long enough for a human editor to work with. (And evidently people hate the idea of a robot churning out a whole finished product, cuts and all.) But if the network only needs a few seconds at a time, it'll be faster to train and easier to run.