Not sure, this is what I got with the same prompt...

![](https://m.stacker.news/84136)

The perfect angle of the bridge in the background is the giveaway.

That's pretty wild.

**Image**: ![](https://m.stacker.news/84107)

**Prompt**: 
```
A wide image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a tshirt wiith a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer's reflection.

The text reads:

(left)
"Transfer between Modalities:

Suppose we directly model
p(text, pixels, sound) [equation]
with one big autoregressive transformer.

Pros:
* image generation augmented with vast world knowledge
* next-level text rendering
* native in-context learning
* unified post-training stack

Cons:
* varying bit-rate across modalities
* compute not adaptive"

(Right)
"Fixes:
* model compressed representations
* compose autoregressive prior with a powerful decoder"

On the bottom right of the board, she draws a diagram:
"tokens -> [transformer] -> [diffusion] -> pixels"
```

OneOneSeven

This is going to be great for storyboards.

So, does it make sense for the person taking the photo to be reflected the way he is? I'd have to actually set up this scenario to figure that out...

Signal312

Looks like there are many improvements to their image gen, like multi-turn generation, text rendering, and better character consistency.