StableDiffusion

Friday update for r/StableDiffusion - all the major developments in a nutshell (old.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/OkSpot3819 on 2024-09-06 09:03:37+00:00.

SKYBOX AI: create 360° worlds with one image
Text-Guided-Image-Colorization: influence the colorisation of objects in your images using text prompts (uses SDXL and CLIP) (GITHUB)
Meta's Sapiens segmentation model is now available on Hugging Face Spaces (HUGGING FACE DEMO)
Anifusion.ai: create comic books through a web app UI
MiniMax: new Chinese text-to-video model; they also offer free music generation (https://hailuoai.com/music)
Viewcrafter: generate high-fidelity novel views from single or sparse input images with accurate camera pose control (GITHUB CODE | HUGGING FACE DEMO)
LumaLabsAI released v1.6 of Dream Machine, which now features camera controls
RB-Modulation (IP-Adapter alternative by Google): training-free personalization of diffusion models using stochastic optimal control (HUGGING FACE DEMO)
New ChatGPT Voices: Fathom, Glimmer, Harp, Maple, Orbit, Rainbow (1, 2 and 3 - not working yet), Reef, Ridge and Vale (X Video Preview)
FluxMusic: SOTA open-source text-to-music model (GITHUB | JUPYTER NOTEBOOK | PAPER)
P2P-Bridge: remove noise from 3D scans (GITHUB | PAPER)
HivisionIDPhoto: uses a set of models and workflows for portrait recognition, image cutout & ID photo generation (HUGGING FACE DEMO | GITHUB)
ComfyUI-AdvancedLivePortrait Update (GITHUB)
ComfyUI v0.2.0: support for Flux ControlNets from XLabs and InstantX; improvements to queue management; node library enhancements; quality-of-life updates (BLOG POST)
A song made with SUNO breaks 100k views on YouTube (LINK)

These will all be covered in the weekly newsletter; check out the most recent issue.

Here are the updates from the previous week:

Joy Caption Update: Improved tool for generating natural language captions for images, including NSFW content. Significant speed improvements and ComfyUI integration.
FLUX Training Insights: New article suggests FLUX can understand more complex concepts than previously thought. Minimal captions and abstract prompts can lead to better results.
Realism Techniques: Tips for generating more realistic images using FLUX, including deliberately lowering image quality in prompts and reducing guidance scale.
LoRA Training for Logos: Discussion on training LoRAs of company logos using FLUX, with insights on dataset size and training parameters.

⚓ Links, context, visuals for the section above ⚓

FluxForge v0.1: New tool for searching FLUX LoRA models across Civitai and Hugging Face repositories, updated every 2 hours.
Juggernaut XI: Enhanced SDXL model with improved prompt adherence and expanded dataset.
FLUX.1 ai-toolkit UI on Gradio: User interface for FLUX with drag-and-drop functionality and AI captioning.
Kolors Virtual Try-On App UI on Gradio: Demo for virtual clothing try-on application.
CogVideoX-5B: Open-weights text-to-video generation model capable of creating 6-second videos.
Melyn's 3D Render SDXL LoRA: LoRA model for Stable Diffusion XL trained on personal 3D renders.
sd-ppp Photoshop Extension: Brings regional prompt support for ComfyUI to Photoshop.
GenWarp: AI model that generates new viewpoints of a scene from a single input image.
Flux Latent Detailer Workflow: Experimental ComfyUI workflow for enhancing fine details in images using latent interpolation.

