It's Showtime!

#45

Born Analog
February 17, 2024

Welcome to a Saturday movie-themed special! Let’s dive in…

I remember the first time I was blown away by a computer-made video: 1997, at my local cinema, watching Titanic leave the harbour after the Captain says “Take her to sea, Mr Murdoch. Let’s stretch her legs”. After cutting to the crew in the engine room, the camera pans out to catch Leo doing his “I’m the king of the world” bit before flying over the tragic vessel in all her pre-iceberg glory. Absolutely amazing at the time, and very quaint looking at the quality of it now.

Wonderful, nostalgic stuff! The last 30 seconds of the above clip took weeks to render and were hailed as revolutionary at the time. James Cameron upped the ante again with Avatar and then again with Avatar 2 just recently, endeavours that took years and hundreds of millions of dollars to make.

So, where am I going with this?

Well, this very week has marked a sea change in the movie generation business. OpenAI have given a sneak peek at a product they will soon be releasing called SORA. Through the past year of AI-fuelled hype, many a prediction has been made about the speed at which AI will transform industries. Many thought that the Hollywood-style movie generated by a string of text was decades away: possible, yes, but so far in the distance as not to merit serious thought.

About six months ago, companies like RunwayML, Stability AI (the makers of Stable Diffusion), Pika and Midjourney all began introducing text-to-video models. They could generate about three seconds of video, and you could never get the same scene to stitch together accurately. Interesting, but still a very long way from creating something truly compelling.

SORA changes this. It was generally thought that once you mastered, say, 30 seconds of video with repeatable characters and scenes, the breakthrough moment would have passed and the movie generated by text would be only a short skip and a jump away. Up until this past week, this milestone still seemed years away. Then this happened:

When I saw this yesterday, I was Titanic-level blown away. OpenAI have created a model that can create up to one minute of video in stunning quality from a string of text, a photograph or an existing video. I have a few more examples, if you’ll indulge me:

No sound in that one, but adding a sound layer is child’s play these days. One thing to just quietly remember: this lady is not real; she is entirely generated by AI from a simple string of text. As exciting as this is, I can’t help but think that my prediction from the turn of the year, that the election season is going to show the bad side of AI, may prove prescient. Who would believe that the above video wasn’t real? Now imagine if that was Donald Trump or Rishi Sunak. Would you believe it?

Fortunately, this hasn’t been released to the public yet. OpenAI are testing it to the limits and, I dare say, adding some serious security layers in there. They will absolutely not want any of their technology blamed for anything nefarious in this year of all years. That’s the negative side of this. The positive side is mind-blowing and has me dreaming dreams of making my first movie blockbuster!