Google and OpenAI are rolling out supercharged AI assistants. Here’s how to give them a try.

This week, Google and OpenAI both revealed that they have built supercharged AI assistants: tools that can converse with you in real time and recover when you interrupt them, analyze your surroundings via live video, and translate conversations on the fly.

On Monday, OpenAI debuted GPT-4o, its new flagship model. In the live presentation, GPT-4o was shown reading bedtime stories and helping solve math problems, in a voice that sounded uncannily like Joaquin Phoenix’s AI lover from the film Her (a resemblance that CEO Sam Altman himself acknowledged).

Google unveiled its own suite of new tools on Tuesday, among them a similar conversational assistant called Gemini Live. The company also said it is building an AI agent that can essentially “do everything”; that agent is under development but isn’t expected to be released until later this year.

You’ll soon be able to test these tools yourself and decide whether they’re a sci-fi party trick that eventually loses its novelty, or something you’ll use in your daily routine as much as their creators hope. Here’s what you should know about the cost, capabilities, and availability of these new tools.

GPT-4o from OpenAI

What it can do: According to OpenAI, the model can hold a real-time conversation with a response time of roughly 320 milliseconds, comparable to natural human conversation. It can help with tasks like language translation and coding, and you can ask it to interpret whatever you point your smartphone’s camera at. It can also summarize information and generate fonts, graphics, and 3D renderings.

How to access it: OpenAI has not given a date for when GPT-4o’s text and vision features will roll out in the ChatGPT app and web interface. The company says its voice features will arrive in the coming weeks, though it hasn’t committed to a specific date for those either. Developers can access the text and vision features in the API now, but voice mode will initially be limited to a “small group” of developers.
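For developers who do have API access, a GPT-4o request looks like a standard chat completion. Below is a minimal sketch using OpenAI’s official Python client; the prompt and image URL are placeholder examples, and it assumes an OPENAI_API_KEY environment variable is set.

```python
# Minimal sketch: sending a text-plus-image prompt to GPT-4o via the
# OpenAI Python client (pip install openai). Assumes OPENAI_API_KEY is
# set in the environment; the image URL here is a placeholder.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what this image shows."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```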

How much it costs: Using GPT-4o is free, but OpenAI caps how much you can use the model before you have to upgrade to a paid plan. Joining one of OpenAI’s subscription plans, which start at $20 a month, gives you five times the GPT-4o capacity.
Gemini Live from Google

What it can do: This is the Google product that most closely resembles GPT-4o: a version of the company’s AI model that you can speak with in real time. Google says the feature will also support live video “later this year.” The company claims it will be a useful conversational assistant for things like preparing for a job interview or rehearsing a speech.

How to access it: Gemini Live will be available through Google’s premium AI plan, Gemini Advanced, starting in “the coming months.”

How much it costs: Gemini Advanced offers a free two-month trial and costs $20 a month after that.

Hold on, what about Project Astra? Astra is Google’s effort to build a do-everything AI agent. It was shown off at the company’s I/O conference, but it won’t be released until later this year.

Users will be able to access Astra via desktop computers and smartphones, but Oriol Vinyals, vice president of research at Google DeepMind, told MIT Technology Review that the company is also looking into additional possibilities, like integrating it into smart glasses or other devices.
Which is superior?

It’s hard to say without access to the full versions of these models. OpenAI chose to introduce GPT-4o in a live presentation that felt more authentic than Google’s polished video, but in both cases the models were asked to do things their designers had probably already rehearsed. The real test will come when they’re put in front of millions of users with all sorts of needs.

That said, judging from the videos the two companies have released, their offerings look quite similar, at least in terms of ease of use. Broadly, GPT-4o appears to have the edge in audio: it can produce lifelike voices, hold a fluid conversation, and even sing. Project Astra, on the other hand, shows off more advanced visual skills, like the ability to “remember” where you left your glasses. OpenAI’s decision to ship its new features sooner may mean its product gets more use at first, since Google’s won’t be fully available until later this year. It’s too early to tell which model generates more insightful responses or “hallucinates” false information less often.

Both Google and OpenAI say their models have been extensively tested. OpenAI says GPT-4o was evaluated by more than 70 experts in fields such as social psychology and misinformation, while Google says Gemini “has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity.”

Going forward, though, these companies are building AI models that search, sift through, and evaluate the world’s information for us, in order to serve up a concise answer to our questions. Even more than with simpler chatbots, it’s wise to stay skeptical of what they tell you.
