Advanced photo animation by xAI: photorealism, physics, native audio
Upload one image — the model adds motion and camera work.
A clear upgrade over 1.0: detail, lighting, physics and motion stability.
Two resolutions, 8 aspect ratios and native audio (speech, SFX, music).
Flexible duration from 3 to 15 seconds, 8 seconds by default.
Pricing: 12 credits/sec at 480p or 20 credits/sec at 720p, plus 2 credits per input image. Range: 38 to 302 credits per clip.
Pick an image from your device or gallery.
Briefly describe what moves and how, and how the camera behaves. The AI will enhance your prompt.
Choose aspect ratio, resolution and duration — then start generation.
Grok Imagine 1.5 is an xAI model for photo animation (image-to-video): upload an image and describe the motion, and the model creates a 3–15 second video in 480p or 720p HD with native audio. It is a new-generation Grok model with improved photorealism, lighting, physics and facial expressions.
An input image is required. Grok Imagine 1.5 works in image-to-video mode, so each generation needs one reference image (upload it from your device or pick it from the gallery). A text prompt is optional but helps define the motion and camera work.
The price depends on duration and resolution: 12 credits/sec for 480p and 20 credits/sec for 720p, plus 2 credits per input image. For example, an 8-second 480p clip costs 98 credits.
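The pricing rule above can be sketched as a short calculation. This is only an illustration, not an official calculator: the function and constant names are made up here, and it assumes whole-second durations and exactly one input image per clip.

```python
# Rates and limits taken from the pricing text above.
# All names below are hypothetical and exist only for this sketch.
CREDITS_PER_SECOND = {"480p": 12, "720p": 20}
CREDITS_PER_INPUT_IMAGE = 2
MIN_SECONDS, MAX_SECONDS = 3, 15


def clip_cost(seconds: int, resolution: str) -> int:
    """Estimate the credit cost of one clip made from one input image."""
    if resolution not in CREDITS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(
            f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
        )
    # Per-second rate times duration, plus the flat fee for the input image.
    return seconds * CREDITS_PER_SECOND[resolution] + CREDITS_PER_INPUT_IMAGE


print(clip_cost(8, "480p"))   # 98, the example quoted above
print(clip_cost(3, "480p"))   # 38, the shortest 480p clip
print(clip_cost(15, "720p"))  # 302, the longest 720p clip
```

The last two calls reproduce the two ends of the price range: the shortest 480p clip and the longest 720p clip.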
Duration is 3 to 15 seconds (8 by default). Available aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3 and auto (fit to the image).
Version 1.5 is a clear upgrade over 1.0: improved photorealism, lighting, physics, facial expressions and native audio. It is, however, more expensive than regular Grok. If you need cheap image-to-video, text-to-video or the Fun/Spicy modes, use regular Grok.