Nano Banana is excellent at following instructions. The interesting part is what happens when those instructions describe something that doesn't exist yet.

Type "a red bottle on a black background" and that's what comes back: accurate color, accurate material, accurate background. For a lot of work, that's exactly what you want.
Problems tend to emerge when the brief asks for something that doesn't exist: an alien flower, or a surreal sculptural stack of impossible objects. Nano Banana still follows the words, but it tends to land on the most literal, recognizable version of what you described. Recraft V4.1 tends to land somewhere more art-directed, especially on prompts with an emotional or surreal edge.
We ran the same prompts through both to see where that difference actually shows up.
Credit where it's due. Nano Banana is fast, it tracks references well, and on literal, well-defined prompts (a bottle, a chair, a person tying their shoes) it returns exactly what was asked for. Accuracy isn't the gap.
The gap shows up in intention. The same brief run through Recraft tends to come back with a clearer sense of light, mood, and composition: the kind of choices an actual photographer or art director would make on set. And on prompts that ask for something invented rather than observed, Recraft commits to a stranger, more specific result instead of defaulting to the nearest familiar object.
Three prompts, same pattern each time: Nano Banana delivers a competent, accurate stock photo. Recraft delivers the same scene shot by someone with a point of view about light.
Nano Banana placed the subject on a London street in flat daylight with passersby in the background: an honest, slightly busy candid shot.
Recraft shot the same action at golden hour, cropped tighter on the hands and shoes, and let warm side light carve out the form. It looks editorial rather than candid.
Nano Banana gave a woman by a window with a mug and a houseplant: comfortable, and a little familiar as far as lifestyle stock goes.
Recraft lit the same setup from one side only, on a more sculptural chair, with a quieter, more contemplative pose. She seems pensive, perhaps even a touch melancholy, and it makes you wonder what she is thinking. Same brief, more emotion.
Nano Banana's scene is cozy and competent: warm bulb, books, a journal, all present and accounted for.
Recraft used a single hard lamp source and cropped in close on the hand and pen, dropping most of the desk into shadow. It's the same three nouns in the prompt, staged with intent instead of just included.
This category is the clearest split, because the difference is about what to do with the things in the prompt, not whether to make them.
Nano Banana included every object the prompt named. The smiley faces, the hearts, the chrome accents are all there, but the composition is busy, the background blur competes with the subject, and the surfaces land closer to glossy plastic than the "ultra polished" finish the brief called for.
Recraft built the same stack against a clean white field with sharper chrome reflections and a clearer hero object up front. It's the kind of image that belongs in a MoMA exhibit next to a Yayoi Kusama piece, not a render that just happens to include the right objects.
Nano Banana built something genuinely well crafted, but it's recognizably a flower: petals, a center, a stem, rendered in glass instead of plant matter. It followed the instructions and defaulted to the nearest real thing.
Recraft built something with no obvious real-world reference: spiked translucent fins, trailing antenna-like spines tipped in glass beads, a dense cluster of glowing blue spheres at the core. Nothing about it reads as "flower, but glass." It reads as a specimen from somewhere that doesn't exist, which is what "bio-organic alien structure" was actually asking for.
Nano Banana rendered a smooth black wetsuit on a person, faceless and glossy, but built on entirely human proportions and a human stance. It reads as someone in a costume standing in a field, not as a being that was never human to begin with.
Recraft never started from a person. The head is an elongated, faceless ovoid. The joints at the shoulders, elbows, and hips are visibly segmented, more marionette than mannequin. The proportions are deliberately wrong in a way a costume never would be, and that wrongness is exactly what makes it convincing as something else entirely. The sheep at knee height and the golden hour light do the rest. Nano Banana placed a person wearing the idea of a robot. Recraft placed the robot.
Nano Banana returned the bottle straight up: centered, evenly lit, black background exactly as specified. It's correct in every detail and looks like stock photography for a product listing.
Recraft tilted the same bottle off axis and let red light bleed across the surface, pushing the far edge into near silhouette. Same bottle, same three words of brief, but it reads like a campaign shot instead of a listing photo. Both are great photos; which one is right depends on what you need it for.
This one could also go either way. Nano Banana rendered the chair fully saturated, neon green from the seat down to the legs, exactly the color and material the prompt asked for, no notes.
Recraft took more risk with the same three words: a sculptural, almost organic chair form with the neon green concentrated in the legs and joints while the seat and back stay closer to clear glass. It's a better-lit, more striking image, but it's a slightly looser read of "neon-green chair" than the brief technically called for. Call this one a tie: Nano Banana nailed the spec, Recraft made the better photograph.
If your brief is a single literal object with no room for interpretation, Nano Banana will get there fast and get it right. If your brief has any emotional or surreal register, anything without a one-to-one match in the real world, Recraft is built to commit to that instead of rounding it down to the nearest familiar thing.
When the brief is a single real-world object with no room for interpretation: a product on a plain background, a straightforward portrait, anything where the "right" answer is obvious and you just need it rendered.
The choices a photographer or art director makes on set: where the light comes from, what falls into shadow, how tight the crop is, what gets emphasized. Recraft tends to make those calls even when the prompt doesn't spell them out.
Recraft V4.1.
Yes. Every prompt in this post is written out in full, so you can drop them straight into Recraft and see how they land for you.