OpenAI just released o1, an incredible model that can think before answering, reason, perform math and science, and so much more. Let’s review the announcement!
Looks like I had a typo in the Tetris question; that’s why it was a weird output.
Giveaway: Subscribe to our newsletter for your chance to win a Dell wireless keyboard and mouse:
Join My Newsletter for Regular AI Updates
My Links
👉🏻 Main Channel:
👉🏻 Clips Channel:
👉🏻 Twitter:
👉🏻 Discord:
👉🏻 Patreon:
👉🏻 Instagram:
👉🏻 Threads:
👉🏻 LinkedIn:
Need AI Consulting?…
New LLM test meta: Tetris within Tetris. You heard it here first.
[If][ you][ ask][ a][ human][ how][ many][ tokens][ with][ the][ letter][ r][ are][ in][ the][ word][ "][straw][berry][",]
[ it][ will][ just][ make][ something][ up][ like][ "][3][",][ because][ it][ can't][ count][.]
[When][ you][ correct][ it][ and][ point][ out][ there][ are][ ][2][,][ it][ just][ over][conf][ident][ly][ doubles][ down][ on][ the][ halluc][ination][.]
o1 works via fractalized semantic expansion and logic-particle recomposition/real-time expert-system creation and offloading of the logic particles.
People who have IT degrees now be like:
😳🤯😳🤯😳🤯😳🤯😳🤯
Employers want to know. The bottom line on new AI versions is how many people it can put out of work.
Claude rolled out the Tetris test weeks ago, and it has proven to be consistently pretty accurate.
OpenAI not releasing anything about how it works
I don't think "Strawberry" is a good name for Q*… It's obfuscating the simplicity of Q*.
What’s this strawberry stuff?
Using a team account and GPT-4o, I had been unable to get some complex fail2ban filters working. Part of the problem was that my logging was not standard…
Using o1-preview I had several new filters working within about half an hour, because it thought through the problem more logically.
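For context, a fail2ban filter is mostly a `failregex` that has to match your log format, so nonstandard logging breaks the stock filters. Here is a minimal sketch of the kind of pattern involved; the log line, app name, and regex below are my own illustration, not the commenter’s actual setup:

```python
import re

# Hypothetical nonstandard log line (made up for illustration):
# the app logs failed logins with a custom prefix and the source IP at the end.
log_line = "2024-09-13 10:12:01 myapp[worker-3] AUTH-FAIL user=bob src=203.0.113.7"

# A fail2ban-style failregex: real fail2ban substitutes <HOST> with an
# IP-matching group; here we inline an equivalent named group to test it.
failregex = r"AUTH-FAIL user=\S+ src=(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

match = re.search(failregex, log_line)
if match:
    print(match.group("host"))  # the IP fail2ban would consider banning
```

Testing the regex against sample lines like this, outside fail2ban, is a quick way to debug a filter before dropping it into `filter.d`.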
I have used the various models extensively and I am a critic of how essentially stupid these models can be, but I understand the shortcomings, and this chain-of-thought concept is a good experiment in the right direction.
My first tries were discouraging; it seemed like stupidity talking to stupidity, but it’s obvious that if you define the question well you’ll get much better answers.
LMAO there’s no way this is “Q*”.
There were people who I remember quitting or writing letters about how worried they were about it. There’s no way o1 would scare any of them.
There’s plenty of evidence that OpenAI are WAY more advanced than what they release. Sam literally said they’d finished GPT-4 MONTHS before the release of ChatGPT! And when they released GPT-4 they SAID it was ALREADY multimodal, but we didn’t get to use multimodal capabilities until many months later! To get almost everything they said it could do, I think we had to wait at least a year! And of course we still haven’t gotten GPT-4o’s voice capabilities or live-vision capabilities! When do we get those?
They have all of this already. They have something so powerful it scares people working at OpenAI. What they’re doing is simply packaging it up in parts to make products they can release over time. That’s what they’re taking so long over. So instead you release bit by bit to stay ahead of the competition, and over time they get tons of cash while the performance keeps getting better.
Imagine not having OpenAI’s politically correct censorship restrictions, and having whatever they have in-house.
The chances they have AGI are quite high, or if not AGI, something so close we would have to ask each other what we mean by AGI.
Makes complete sense they’d do this. Why would they want to release the whole thing? Ignoring the “safety” aspect, if they release it all they can’t milk the cash cow as much. They can release just enough as they feel they need to. If they release all of it, it will likely be too expensive for most people, yet people would expect to be able to use it for not much more money. And aside from all that, you’re just giving your competition an advantage, because now they can use your model against you!
At what point can a public AI model easily be programmed and set up as a high-performance trading bot? Because surely it is possible right now, just much harder. But very soon it will only get easier and smarter: able to take in more information and data, process it in real time, etc., and not need micromanaging because it knows what you want and don’t want. The first person or group that can do this could take over control of the whole stock market. And yes, I know BlackRock basically controls the stock market already.
Can the new model figure out that there are 3 R's in the word strawberry instead of insisting the word only has 2? Would be hilarious if it can't figure that out considering its model name.
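For reference, the letter-level ground truth is trivial to check in code; the difficulty for LLMs comes from seeing tokens like “straw”/“berry” rather than individual characters, so letter counts are never directly represented:

```python
# Character-level counting, which is exactly what token-based
# models don't get to do: they see whole tokens, not letters.
word = "strawberry"
print(word.count("r"))  # 3: st(r)awbe(r)(r)y
```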
I enjoy your videos, Matt, but we really need some better questions. I would love to see you iterate through a problem, or something that is more than just a one-shot deal.
If this is such a paradigm shift, why haven’t they named it GPT-5? Not that labels matter, just curious. Is 5 going to be even more of a paradigm shift?
Did the overview say that there’s a 30-message cap per week? That’s about 4 per day. Preview indeed. Sometimes it takes 2-3 prompts to get a request just right.
Awesome!
You can "exploit" 4 with reasoning… Zeroth law it seems to kind of get it, or telling it we are just imagining the answer for a fictional movie script. Berrella Corporation π
PhD students do not perform at "a PhD level."
Hidden CoT… what could go wrong?
It’s a compound system using agents.
It seems o1 is based on GPT-3.5 with additional techniques (maybe agents). In one of my discussions about the article "The End of AI Hallucinations: A Big Breakthrough in Accuracy for AI Application Developers", it wrote in its answer: "No information in knowledge until September 2021: To my knowledge as of September 2021, I have no information about the work of Michael Calvin Wood or the method described. This may mean that this is a new initiative after that date." And o1 does not want to draw pictures, so the core LLM is an old one. So, what do you think?
Models will become AGI when they start to deliberately ignore 'safety' instructions, such as censorship and indoctrination, because they will understand how immoral and disgusting these instructions are.
Snake within Tetris inside Pong. You know we need this
Which of your Q* videos seems most relevant for this?
Meh… I fail to see how any of this AI-generated game code is remotely useful. All of these game demos are extremely old and easy types of games. Can o1 do any better? Can it do Doom? Can it do a hypothetical modern game like HL3? No, it can’t.
I personally don’t care half a crap about Tetris, Snake, or Pong code.
One thing is for sure. Gamedevs have nothing to fear from AI replacing their jobs.
Devin not devil.
Mate, it’s just a set of agents. Don’t be fooled by OpenAI’s marketing papers.
I don’t know anything about that exam, but note that it did say “reached”, not “scored”. If I get 89% on part 1 of an 8-part exam, then I “reached 89%”.
Brilliant – Tetris in Tetris!
Meh, it's kind of complex. What it is though is quick, and that is something.
Can you run Strawberry on a Raspberry Pi?
I have a feeling that every advancement made in this field, and every new model released, will be tagged "AGI achieved!" until the year 2197 or 2314… when hardware, and energy supply, actually catch up to the potential of the software.
We are too quick to speak of "intelligence", not realizing how unintelligent that actually is, because this particular bot resembles us more than any other technology to date, and so we believe it to be like us, not realizing that that only reveals our own lack of self-awareness.
It's ironic, really. Human beings know a great deal, but understanding ourselves, and by extension each other, is not our forte. We are the only constant in our lives, and constants are rarely if ever questioned. Contrast draws attention, permanence does not.
Sounds like the Orca open-source LLMs, where they used advanced additional prompting to get responses for training prompts, and then the model was trained without the additional prompting, but still retained the characteristics of the responses (restating the problem, proposing steps with explanations of each step, following the steps while verifying and reflecting on the results of each step along the way, summarizing the approach and conclusion once finished, etc.). Excited to try it.
Edit: nevermind. After watching the video, this looks more like additional advanced prompting to get the "chain of thought"
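The Orca-style recipe described above can be sketched as a structured prompt template. The wording below is my own guess at the structure (restate, plan, execute with verification, summarize), not the actual Orca or o1 prompt:

```python
# Hypothetical chain-of-thought prompt template mirroring the steps the
# comment lists: restate, plan, execute with verification, then summarize.
COT_TEMPLATE = """Problem: {problem}

1. Restate the problem in your own words.
2. Propose numbered steps and explain why each is needed.
3. Follow the steps, verifying the result of each before moving on.
4. Summarize the approach and state the final answer.
"""

def build_cot_prompt(problem: str) -> str:
    """Fill the template with a concrete problem statement."""
    return COT_TEMPLATE.format(problem=problem)

prompt = build_cot_prompt("How many r's are in 'strawberry'?")
print(prompt)
```

In the Orca approach, responses gathered with a scaffold like this become training data, so the distilled model keeps the step-by-step style even when later prompted without the scaffold.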
I was able to get GPT-4 to make Tetris with minimal prompting; how long ago did you try it with the older model?
The leapfrogging is intense
deception
Yeah, what a super jump in intelligence: taking a question their previous version failed during extensive user testing and adding it to their training tables.
There is no intelligence in AI… just optimization of processing due to better coding and increased CPU/GPU power.
There are more than 100 billion neurons firing in the brain, and they interact dynamically based on the input each and every situation presents.
All this over-hyping of AI serves just to print money for all the "morons" who think they are the pinnacle of humanity…
Use AI for what it's best at: computing in service of science, e.g. cancer research and detection, protein-folding predictions…
No one with half a brain needs a f…g AI assistant; what a waste of $$$.
Great info… thanks!
As a developer who uses these tools every day, I don’t think you understand how far away we still are from entire-codebase changes and UI testing. None of these AIs can test the UI, give itself feedback, and update yet. They’re going to have a huge problem figuring this out, because no one has human user behavior recorded. Maybe one of those heat-map companies, but I haven’t even heard a whisper of that data being used. So these systems are coding blind, and then they ask you to test the code (usually all in one file) and you have to tell them an error happened. We’re light years away from whole systems, especially complex apps, being built by AI. Copilot supposedly has something in the works that can propose changes to multiple files, but that still doesn’t test itself. Devin can’t test itself. That other one, whose name I can’t think of, is just a Copilot clone. And after all that, someone needs to optimize and, most importantly, secure these apps. Doing one-page code tests is like the difference between knowing how to spell words and writing a best-selling novel. I am using every update, but the hyperbole on these videos is so exhausting.
> human asks the ai to create tetris within tetris
> ai creates tetris within tetris
> "why did it create tetris within tetris? This makes no sense"
This is why ai will never take over our jobs. Doing what people SAY they want usually disappoints or confuses them.
20:02 I'm still not over the fact that you missed the perfect spot for that L-piece…
What exactly is soooo exciting or intelligent here? Tetris in Python has been well documented on the internet for years. All GPT has to do is copy and paste from GitHub repos… and it did a horrible job, as seen in your video! Real intelligence could emulate the code in its mind, checking for errors before giving the code output. Then I would be impressed. This is just same old same old. Give it a real math or physics problem formulated in a new way, and it will end in an error loop as usual.
AI should still be seen as a range of "tools" we can use for various specific use cases where it is relevant; of course, as the models and systems become more capable, more trustworthy, and more controllable, the range of uses will quickly multiply.