OpenAI recently demoed its latest leap in AI technology, named Sora. The company describes it as “an AI model that can create realistic and imaginative scenes from text instructions”. Essentially, it’s the video equivalent of the AI image generation models that have gained so much traction in recent years.
Sora is not currently publicly available and it has some limitations; however, what we’ve seen so far is hugely impressive. The majority of the examples we’ve seen from Sora at this point are 3D renderings, portraying real-life and 3D animated scenes like the one above. We’re yet to see anything produced in a 2D style, but I’m sure it won’t be long until we do.
What Can’t It Do?
According to OpenAI, Sora’s current iteration struggles with scenes that are heavily physics-dependent. The model can also find it difficult to understand instances of cause and effect. For example, OpenAI describes a scenario where “a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark”. It also sounds like it can be difficult to direct very specific shots, for instance having a moving camera follow a desired path or trajectory. These limitations sound fairly minor, all things considered. In fact, at first glance you’d be forgiven for thinking that some of these videos are real cinematic footage. For a first presentation, we’re really not seeing as much visual weirdness as we’d expect. Remember how the AI imaging models struggled to render a human hand for so long?
In the safety portion of Sora’s web page, OpenAI says that it will be implementing the same safety methods that it applies to its other products based on DALL-E 3. These safety methods should prevent users from generating content that violates OpenAI’s usage policies, such as content featuring “extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others”.
How Will AI Affect Animation Studios?
As mentioned earlier, it’s only a matter of time until we see an AI model producing flawless 2D animation. It’s going to happen. The internet is teeming with generated images of ‘Ghibli’ versions of other popular shows and movies. What happens when someone can create a 15-second Ghibli short in less than 30 seconds with a text prompt? Surely the next stage of that, a few iterations down the line, is a fully generated film.
There’s an interesting Reddit thread on r/animation, where hobbyist and professional animators alike are discussing how the future looks with Sora in the picture. One thing is certain: Sora and other AI models will not be disappearing any time soon.
For more information, you can read OpenAI’s research paper on video generation models.