Open-source model from Lightricks

LTX-2 Video

Open-source video generation model up to 4K with AI audio and 50 FPS. Long clips, high resolution, and an affordable per-second price.

6-20 sec
Duration
Up to 4K
Resolution
50 FPS
Smoothness

LTX-2 Video specifics

LTX-2 Video is an open-source model from Lightricks. It delivers high-quality visuals, but it has its quirks: the model does not always follow every prompt detail precisely and may interpret your description loosely, and Russian speech may sound accented when audio is enabled.

💡 Useful tips:

  • The Pro tier follows the prompt more precisely than Fast
  • If it didn't work the first time, regenerate: every run gives a different result
  • The more detailed the prompt and the closer it is to screenplay style, the more accurate the result

Key features

Fast tier

Fast generation up to 20 seconds. For testing ideas, drafts and experiments

6-20s, Fast

Pro tier

Maximum quality and detail. For final clips and commercial projects

6-10s, Better quality

Up to 4K + 50 FPS

Three resolutions: 1080p, 1440p, 2160p (4K). All videos at 50 FPS for smoothness

1080p / 1440p / 4K, 50 FPS

Text → Video

Generate video from a text description. Good for landscapes, abstractions and stylization

Image → Video

Image animation. The model brings a picture to life while preserving appearance and composition

AI audio

Synchronized audio generation. Can be disabled
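
The tier and resolution limits listed above can be checked before a generation is submitted. The following is a minimal sketch only: the names `GenerationSettings` and `validate` are hypothetical and not part of any LTX-2 Video or platform API, and it assumes that every resolution is available in both tiers, which this page does not state explicitly.

```python
from dataclasses import dataclass

# Limits as stated on this page. All names here are illustrative assumptions.
TIER_DURATION_SECONDS = {"fast": (6, 20), "pro": (6, 10)}
RESOLUTIONS = ("1080p", "1440p", "2160p")
FPS = 50  # every video is generated at 50 FPS, so it is not a user setting


@dataclass(frozen=True)
class GenerationSettings:
    tier: str
    duration_s: int
    resolution: str
    audio: bool = True  # AI audio can be disabled


def validate(settings: GenerationSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings fit the stated limits."""
    problems = []
    limits = TIER_DURATION_SECONDS.get(settings.tier)
    if limits is None:
        problems.append(f"unknown tier: {settings.tier!r}")
    else:
        low, high = limits
        if not low <= settings.duration_s <= high:
            problems.append(
                f"{settings.tier} supports {low}-{high} s, got {settings.duration_s} s"
            )
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution: {settings.resolution!r}")
    return problems


if __name__ == "__main__":
    # A 20-second clip is fine in Fast but exceeds the Pro limit.
    print(validate(GenerationSettings("fast", 20, "2160p")))
    print(validate(GenerationSettings("pro", 20, "1080p")))
```

Returning a list of problems instead of raising on the first one lets a form show every issue at once.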

Generation modes

Text-to-Video

Video generation from text. Works great for landscapes, abstractions, stylization and scenes without close-ups of faces.

  • No need to prepare an image
  • The model creates everything from the description
  • Best results with a detailed prompt

Image-to-Video

Animate an uploaded image. Helps preserve a specific character look, composition and style.

  • Control over appearance and composition
  • The model animates, doesn't invent
  • Good for portraits and characters

Image tips for I2V

  • Landscape 16:9 — LTX-2 Video works only in widescreen
  • High quality — sharp, not blurry, well lit
  • Clear subject — a clear main subject/character in frame
  • AI-generated images — also work great (Ideogram, Midjourney, DALL-E)
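
Because the model works only in widescreen, it can help to check an image's proportions before uploading it. The sketch below assumes you already know the pixel dimensions. The function name and the 1% tolerance are illustrative choices, not documented platform behavior.

```python
def is_widescreen_16_9(width: int, height: int, tolerance: float = 0.01) -> bool:
    """Check whether pixel dimensions are 16:9 within a relative tolerance."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = 16 / 9
    # Relative deviation of the actual aspect ratio from 16:9.
    return abs(width / height - target) / target <= tolerance


if __name__ == "__main__":
    for size in [(1920, 1080), (3840, 2160), (1080, 1920), (1024, 1024)]:
        print(size, is_widescreen_16_9(*size))
```

The tolerance lets near-16:9 sizes such as 1366×768 pass. Whether the platform crops or rejects such images is not stated here.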

LTX-2 Video prompt guide

Screenplay style — the main rule

LTX-2 Video uses a screenplay approach: write the prompt like a scene description from a film. Use coherent present-tense text, 4-8 sentences. Each sentence describes one element.

6 key elements:
1. Frame (shot, scale)
2. Scene (place, time)
3. Action
4. Characters
5. Camera
6. Audio (if sound is enabled)
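
One way to keep all six elements in view is to fill them in separately and join them into a single paragraph. This is a minimal sketch with a hypothetical `ScenePrompt` helper. It only assembles text, with one sentence per element, and does not reflect how the platform's own prompt enhancement works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScenePrompt:
    frame: str                   # shot, scale
    scene: str                   # place, time
    action: str
    characters: str
    camera: str
    audio: Optional[str] = None  # fill in only when sound is enabled

    def render(self) -> str:
        """Join the elements, in order, into one coherent paragraph."""
        parts = [self.frame, self.scene, self.action,
                 self.characters, self.camera, self.audio]
        sentences = []
        for part in parts:
            text = (part or "").strip()
            if not text:
                continue  # skip elements that were left empty
            if text[-1] not in ".!?":
                text += "."  # make sure each element ends as a sentence
            sentences.append(text)
        return " ".join(sentences)


if __name__ == "__main__":
    prompt = ScenePrompt(
        frame="Medium shot of a woman on a cliff",
        scene="Golden hour over a rugged coastline",
        action="She turns her head slowly",
        characters="Her hair and linen dress sway in the wind",
        camera="The camera dollies in gently to a medium close-up",
        audio="Waves crash against the rocks below",
    )
    print(prompt.render())
```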

Example of a good prompt:

«Medium shot of a woman standing on a cliff overlooking a rugged coastline during golden hour. She turns her head slowly, her hair and linen dress swaying in the wind. The camera dollies in gently from a wide angle to a medium close-up. Volumetric light rays break through scattered clouds. Waves crash against the rocks below, seagulls call in the distance. Cinematic, shallow depth of field.»

What NOT to do

  • Quality tags ("4K", "cinematic", "high quality", "50fps"): the model ignores them
  • Negative phrases ("no artifacts", "without morphing"): the model doesn't understand negation
  • Keyword lists ("realistic, smooth, detailed, professional"): that's not a script
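
The three pitfalls above can be caught with a rough pre-check before submitting a prompt. The following is a heuristic sketch only: `lint_prompt` is a hypothetical helper, the tag list is copied from the examples on this page, and the patterns will produce false positives (for instance, a legitimate phrase such as "casting no sharp shadows" is flagged as negation). Treat the warnings as hints, not rules.

```python
import re

# Tags taken from the examples on this page; extend as needed.
QUALITY_TAGS = ("4k", "cinematic", "high quality", "50fps")
# Crude negation check based on the two example phrasings ("no ...", "without ...").
NEGATION_PATTERN = re.compile(r"\b(no|without)\b", re.IGNORECASE)


def lint_prompt(prompt: str) -> list[str]:
    """Return warnings for quality tags, negative phrasing, and keyword lists."""
    warnings = []
    lowered = prompt.lower()
    for tag in QUALITY_TAGS:
        if re.search(rf"\b{re.escape(tag)}\b", lowered):
            warnings.append(f"quality tag: {tag}")
    if NEGATION_PATTERN.search(prompt):
        warnings.append("negative phrasing")
    # Keyword-list heuristic: three or more comma-separated fragments,
    # each only one or two words long.
    fragments = [f.strip() for f in prompt.split(",") if f.strip()]
    if len(fragments) >= 3 and all(len(f.split()) <= 2 for f in fragments):
        warnings.append("keyword list")
    return warnings


if __name__ == "__main__":
    print(lint_prompt("realistic, smooth, detailed, professional"))
    print(lint_prompt("A cat walks, 4K, no artifacts"))
```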

💡 Our prompt enhancement system automatically converts your description into the proper screenplay format

What works well

  • Landscapes, nature, architecture, urban scenes
  • Characters: their appearance is rendered well
  • Stylization — anime, 3D, art, abstractions
  • Smooth camera moves — pans, dollies, tracking
  • Dialogue and sound effects (when audio is enabled)
  • Long videos up to 20s (Fast) — a unique advantage

What to watch for

  • ⚠️ Fast mode can be loose, skipping prompt details or changing the composition. For strict prompt adherence, use Pro
  • ⚠️ Long videos (16-20s): quality may degrade toward the end
  • ⚠️ Complex physics (liquids, destruction): results are unpredictable
  • ⚠️ Text in video: readable text is generated poorly (as with most models)
  • ⚠️ Speech works best in English

💡 If the result doesn't satisfy you — regenerate. Each run gives a different outcome.

For Image-to-Video: describe only the motion

When animating a photo, the model already SEES everything in the image: appearance, clothing, background. Describe only what should change:

✅ WHAT TO DESCRIBE:

Actions, gestures, facial expressions, camera motion, physics (wind, light), sounds and dialogue

❌ DON'T DESCRIBE:

Appearance, facial features, clothing, hair color, background details — the model takes all of these from the photo

Video examples with prompts

Fisherman on a foggy lake

A lone fisherman rows across a foggy lake before sunrise, the boat creaking softly as water laps at its sides. The camera glides overhead, tracking his slow progress. His lantern casts a warm circle of light, reflecting in ripples while reeds sway gently on the shoreline. A distant bird call echoes as mist rolls across the surface, partially obscuring the horizon.

atmosphere, a coherent paragraph, landscape

Boy running in the rain

A young boy runs barefoot across a wet stone courtyard as the first raindrops begin to fall. The camera tracks behind him at low angle, catching the splashes beneath his feet. He turns sharply, arms outstretched for balance, and laughs as thunder rumbles overhead. His clothes cling slightly from the damp, and puddles reflect the flickering lanterns lining the perimeter.

present tense, actions, dynamics

Florist on a morning street

A florist arranges bouquets at an outdoor stall along a cobbled street while morning light spills between nearby buildings. She lifts a bundle of tulips and rotates it slightly as the stems are trimmed with quick, practiced motions. The camera begins in a wide shot from across the street, then slowly pushes forward at shoulder height as pedestrians blur in the foreground.

camera, wide → close-up, morning light

Girl on a cliff by the ocean

A young woman stands at the edge of a rocky cliff overlooking the ocean. She raises her hands while seagulls fly overhead. Pale blue light from an overcast sky diffuses across the scene, softening the edges of distant waves and casting no sharp shadows. The camera begins in a medium shot behind her, then slowly pulls back to reveal tall grasses swaying at the cliff's edge.

atmosphere, lighting, nature

Alien on a spaceship

A pair of elevator doors slide open inside a metallic corridor on a spaceship as thin mist rolls out from the floor vents. As the camera begins in a stationary wide shot, a tall alien figure wearing a white uniform steps forward through the haze. Then the camera glides sideways, following the alien's stride as it moves across the deck toward a glowing console.

sci-fi, temporal flow, genre language

Man in a neon alley

A middle-aged South Asian man wearing a long tan coat and dark scarf steps into a narrow alley lit by neon signage. The camera tilts up from his shoes as rain hits the cobblestones, revealing his profile in close-up as he pauses beneath a flickering sign. Steam rises from a nearby vent while green and purple reflections dance across his glasses.

character details, noir, neon

Pricing

Price = price per second × duration. Depends on the tier (Fast/Pro) and resolution.
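
The formula above can be sketched in a few lines. The per-second rates in this example are made-up placeholders, because the actual prices are not shown on this page. `PLACEHOLDER_RATES` and `estimate_cost` are hypothetical names, so substitute the real rates from the pricing table before relying on the result.

```python
# HYPOTHETICAL per-second rates in credits, keyed by (tier, resolution).
# These numbers are placeholders for illustration, not real prices.
PLACEHOLDER_RATES = {
    ("fast", "1080p"): 1,
    ("fast", "1440p"): 2,
    ("fast", "2160p"): 4,
    ("pro", "1080p"): 2,
    ("pro", "1440p"): 4,
    ("pro", "2160p"): 8,
}


def estimate_cost(tier: str, resolution: str, duration_s: int,
                  rates: dict = PLACEHOLDER_RATES) -> float:
    """Cost = price per second x duration. Audio does not affect the price."""
    try:
        rate = rates[(tier, resolution)]
    except KeyError:
        raise ValueError(f"no rate for tier={tier!r}, resolution={resolution!r}") from None
    return rate * duration_s


if __name__ == "__main__":
    # With the placeholder rates, a 6-second Fast clip at 1080p is the cheapest combination.
    print(estimate_cost("fast", "1080p", 6))
    print(estimate_cost("pro", "2160p", 10))
```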


How to use

1. Choose a mode

Text→Video for generation from scratch, or Image→Video to animate an uploaded image.

2. Upload an image (for I2V)

Horizontal 16:9, high quality. AI-generated images also work great.

3. Configure parameters

Pick a tier (Fast/Pro), duration, and resolution. Fast at 6s and 1080p is the most economical option.

4. Describe the video

Enter a description in any language. AI will create an optimized cinematic prompt.

5. Translate and generate

Translate the prompt to English and press 'Generate'. The result is ready in 2-8 minutes.

Frequently asked questions

Is LTX-2 Video open-source?

Yes! LTX-2 Video is a fully open-source model by Lightricks (GitHub, Hugging Face). It offers unique capabilities: up to 4K, 50 FPS, up to 20-second videos, and high quality at an affordable price.

How does Fast differ from Pro?

Fast — fast generation up to 20 seconds, good for drafts and experiments. Pro — best quality, follows the prompt more precisely, but limited to 10 seconds. If Fast takes too many liberties — try Pro.

Why do results sometimes differ from the prompt?

This is a quirk of LTX-2 Video — the model produces high-quality images but doesn't always follow every detail of the description precisely. The Pro tier follows prompts more accurately than Fast. If the result doesn't match expectations — regenerate; each run gives a different result.

Image-to-Video or Text-to-Video — which to pick?

It depends on the task. Text-to-Video is great for landscapes, abstractions, and stylization. Image-to-Video is better when preserving a specific character or composition matters. Both modes work well with a detailed prompt.

What is the maximum resolution?

2160p (4K Ultra HD). 1080p and 1440p are also available. All videos are generated at 50 FPS for smooth motion.

Is audio generated?

Yes, LTX-2 Video supports AI audio generation. It can be enabled/disabled in settings.

How is the cost calculated?

Cost = price per second × video duration. The per-second price depends on tier (Fast/Pro) and resolution (1080p/1440p/4K). Audio does not affect the price.

When should I pick LTX-2 Video over other models?

LTX-2 Video is the best choice when you need a long clip (up to 20 sec in Fast), high resolution (up to 4K), 50 FPS for smoothness, or a low per-second price. It's the only model on the platform with up-to-4K resolution and 50 FPS.

Ready to try LTX-2 Video?

Up to 4K, 50 FPS, up to 20-second video with AI audio — all at an affordable price

© 2026 Sixio. All rights reserved.
