ChatGPT and generative AI are booming, but at a steep cost

OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT integration for Bing at Microsoft in Redmond, Washington, on February 7, 2023.

Jason Redmond | AFP | Getty Images

Before OpenAI’s ChatGPT emerged and captured the world’s attention for its ability to create compelling sentences, a small startup called Latitude was wowing consumers with its AI Dungeon game that let them use artificial intelligence to create fantastical tales based on their prompts.

But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled that the cost to maintain the text-based role-playing game began to skyrocket. Powering AI Dungeon’s text-generation software was the GPT language technology offered by the Microsoft-backed artificial intelligence research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.

Compounding the predicament was that Walton also discovered content marketers were using AI Dungeon to generate promotional copy, a use for AI Dungeon that his team never foresaw, but that ended up adding to the company’s AI bill.


At its peak in 2021, Walton estimates Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and Amazon Web Services in order to keep up with the millions of user queries it needed to process each day.

“We joked that we had human employees and we had AI employees, and we spent about as much on each of them,” Walton said. “We spent hundreds of thousands of dollars a month on AI and we are not a big startup, so it was a very massive cost.”


By the end of 2021, Latitude switched from using OpenAI’s GPT software to a cheaper but still capable language software offered by startup AI21 Labs, Walton said, adding that the startup also incorporated open source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.

Latitude’s pricey AI bills underscore an unpleasant truth behind the recent boom in generative AI technologies: The cost to develop and maintain the software can be extraordinarily high, both for the firms that develop the underlying technologies, generally referred to as large language models or foundation models, and for those that use the AI to power their own software.


The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to build a lead in the technology that smaller challengers can’t match.

But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom. 


The high cost of training and “inference” — actually running — large language models is a structural cost that differs from previous computing booms. Even when the software is built, or trained, it still requires a huge amount of computing power to run large language models because they do billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less calculation.
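To make the scale concrete, here is a back-of-envelope sketch; the rule of thumb and all figures below are assumptions for illustration, not numbers from the article. A common heuristic is that a decoder-only transformer performs roughly 2 floating-point operations per model parameter for each token it generates.

```python
# Back-of-envelope sketch (illustrative assumptions, not the article's
# figures): ~2 FLOPs per model parameter per generated token is a common
# rule of thumb for decoder-only transformers at inference time.

params = 175e9                    # a GPT-3-scale model: 175 billion parameters
flops_per_token = 2 * params      # ~3.5e11 FLOPs per generated token

response_tokens = 500             # a few paragraphs of output
total_flops = flops_per_token * response_tokens

print(f"~{total_flops:.1e} FLOPs per response")  # ~1.8e+14 FLOPs
```

Serving a static web page, by contrast, is dominated by I/O rather than arithmetic, which is why the per-request compute gap is so stark.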

These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processors, or GPUs, which were initially intended for 3D gaming, but have become the standard for AI applications because they can do many simple calculations simultaneously. 


Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip, the A100, costs $10,000. Scientists who build these models often joke that they “melt GPUs.”

Training models

Nvidia A100 processor


Nvidia

Analysts and technologists estimate that the critical process of training a large language model such as GPT-3 could cost more than $4 million. More advanced language models could cost over “the high single-digit millions” to train, said Rowan Curran, a Forrester analyst who focuses on AI and machine learning.


Meta’s largest LLaMA model, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens), taking about 21 days, the company said when it released the model last month.

It took about 1 million GPU hours to train. With dedicated prices from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than the current GPT models at OpenAI, like GPT-3, which has 175 billion parameters.
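A minimal sketch of how that $2.4 million figure can be reconstructed from the numbers above; the per-GPU-hour rate is an assumption (dedicated cloud pricing varies by contract).

```python
# Reconstructing the rough LLaMA training-cost estimate above.
# The hourly rate is an assumed dedicated-cloud price per A100;
# actual negotiated rates vary.

gpus = 2048
days = 21
gpu_hours = gpus * days * 24            # 1,032,192 ~= "about 1 million"

cost_per_gpu_hour = 2.40                # assumed USD per A100-hour
total = gpu_hours * cost_per_gpu_hour

print(f"{gpu_hours:,} GPU-hours -> ~${total:,.0f}")
# 1,032,192 GPU-hours -> ~$2,477,261
```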


Clement Delangue, the CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two-and-a-half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”

Retraining, which helps a model improve its abilities, costs so much that organizations building large language models must be cautious about how often they do it, he said.


“It’s important to realize that these models are not trained all the time, like every day,” Delangue said, noting that’s why some models, like ChatGPT, don’t have knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.

“We are actually doing a training right now for the version two of Bloom and it’s gonna cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing that we don’t want to do every week.”


Inference and who pays for it

Bing with Chat

Jordan Novet | CNBC


To use a trained machine learning model to make predictions or generate text, engineers use the model in a process called “inference,” which can be much more expensive than training because it might need to run millions of times for a popular product.

For a product as popular as ChatGPT — which investment firm UBS estimates to have reached 100 million monthly active users in January — Curran believes that it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.


Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.

In the case of Latitude, for instance, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for the inferencing costs that were something akin to “half-a-cent per call” on “a couple million requests per day,” a Latitude spokesperson said.
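Taking the spokesperson’s two figures at face value gives a sense of scale; both inputs are loose quotes, so treat this as an order-of-magnitude sketch rather than Latitude’s actual bill.

```python
# Order-of-magnitude sketch from the quoted figures; both inputs are
# approximate, so the result is a rough ceiling, not an actual invoice.

cost_per_call = 0.005          # "half-a-cent per call"
calls_per_day = 2_000_000      # "a couple million requests per day"

monthly = cost_per_call * calls_per_day * 30
print(f"~${monthly:,.0f} per month")   # ~$300,000 per month
```

That lands in the same ballpark as the roughly $200,000 monthly peak Walton described earlier.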


“And I was being relatively conservative,” Curran said of his calculations.

In order to sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested as much as $10 billion into GPT’s creator OpenAI, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.


As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners described on Twitter, “VC dollars shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”

Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.


“When I talk to my AI friends at the startup conferences, this is what I tell them: Do not solely depend on OpenAI, ChatGPT or any other large language models,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”

Companies such as enterprise tech firm Conversica are exploring how they can use the tech through Microsoft’s Azure cloud service at its currently discounted price.


While Conversica CEO Jim Kaskade declined to comment about how much the startup is paying, he conceded that the subsidized cost is welcome as it explores how language models can be used effectively.

“If they were truly trying to break even, they’d be charging a hell of a lot more,” Kaskade said.


How it could change


It’s unclear if AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor makers and startups all see business opportunities in reducing the price of running AI software.

Nvidia, which has about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.

Still, Nvidia CEO Jensen Huang believes that in 10 years, AI will be “a million times” more efficient because of improvements not only in chips, but also in software and other computer parts.


“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said last month on an earnings call. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
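To put the two growth rates side by side (a simple arithmetic check, not a claim from the call): 100x over a decade and 1,000,000x over a decade imply very different compound annual speedups.

```python
# Compound annual speedup implied by each decade-long improvement factor.

moores_law = 100 ** (1 / 10)           # ~1.58x per year
huang_target = 1_000_000 ** (1 / 10)   # ~3.98x per year

print(f"Moore's Law at its best: ~{moores_law:.2f}x/year")
print(f"Huang's million-x decade: ~{huang_target:.2f}x/year")
```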

Some startups have focused on the high cost of AI as a business opportunity.


“Nobody was saying ‘You should build something that was purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more processing in the computer’s memory, as opposed to on a GPU.

“People are using GPUs today, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells that cost a ton of money. The problem with inference is if the workload spikes very rapidly, which is what happened to ChatGPT, it went to like a million users in five days. There is no way your GPU capacity can keep up with that because it was not built for that. It was built for training, for graphics acceleration,” he said.


Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run, instead of the large language models that are garnering most of the attention.

Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
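Using the article’s earlier conversion of roughly 750 words to 1,000 tokens, that price works out to $0.002 per 1,000 tokens. A quick sketch of what it implies for a hypothetical workload; the request volume and response length below are made-up illustrative numbers.

```python
# What the new price implies for a hypothetical workload. The volume
# and response length are illustrative assumptions, not real figures.

price_per_1k_tokens = 0.002     # one-fifth of one cent per ~750 words
requests_per_day = 1_000_000    # hypothetical traffic
tokens_per_request = 1_000      # ~750 words of output each

daily_cost = requests_per_day * tokens_per_request / 1_000 * price_per_1k_tokens
print(f"~${daily_cost:,.0f} per day")   # ~$2,000 per day
```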


OpenAI’s lower prices have caught the attention of AI Dungeon-maker Latitude.

“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude is going to continue to evaluate all AI models to be sure we have the best game out there.”


Watch: AI’s “iPhone Moment” – Separating ChatGPT Hype and Reality

AI's "iPhone Moment" – Separating ChatGPT Hype and Reality





Source link

Advertisement


Amazon to lay off 9,000 more workers in addition to earlier cuts




The latest round will primarily impact Amazon’s cloud computing, human resources, advertising and Twitch livestreaming businesses, Amazon CEO Andy Jassy said in the memo.


Amazon is undergoing the largest layoffs in company history after it went on a hiring spree during the Covid-19 pandemic. The company’s global workforce swelled to more than 1.6 million by the end of 2021, up from 798,000 in the fourth quarter of 2019.

Jassy is also conducting a broad review of the company’s expenses as Amazon reckons with an economic downturn and slowing growth in its core retail business. The company froze hiring in its corporate workforce, axed some experimental projects and slowed warehouse expansion.


While the company aims to operate leaner this year, Jassy said he remains optimistic about the company’s “largest businesses,” retail and Amazon Web Services, as well as other, new divisions it continues to invest in.

Shares of Amazon were down more than 2% in afternoon trading Monday.

Below is the memo Jassy sent to employees:

As we’ve just concluded the second phase of our operating plan (“OP2”) this past week, I’m writing to share that we intend to eliminate about 9,000 more positions in the next few weeks—mostly in AWS, PXT, Advertising, and Twitch. This was a difficult decision, but one that we think is best for the company long term.

Let me share some additional context.


As part of our annual planning process, leaders across the company work with their teams to decide what investments they want to make for the future, prioritizing what matters most to customers and the long-term health of our businesses. For several years leading up to this one, most of our businesses added a significant amount of headcount. This made sense given what was happening in our businesses and the economy as a whole. However, given the uncertain economy in which we reside, and the uncertainty that exists in the near future, we have chosen to be more streamlined in our costs and headcount. The overriding tenet of our annual planning this year was to be leaner while doing so in a way that enables us to still invest robustly in the key long-term customer experiences that we believe can meaningfully improve customers’ lives and Amazon as a whole.

As our internal businesses evaluated what customers most care about, they made re-prioritization decisions that sometimes led to role reductions, sometimes led to moving people from one initiative to another, and sometimes led to new openings where we don’t have the right skills match from our existing team members. This initially led us to eliminate 18,000 positions (which we shared in January); and, as we completed the second phase of our planning this month, it led us to these additional 9,000 role reductions (though you will see limited hiring in some of our businesses in strategic areas where we’ve prioritized allocating more resources).


Some may ask why we didn’t announce these role reductions with the ones we announced a couple months ago. The short answer is that not all of the teams were done with their analyses in the late fall; and rather than rush through these assessments without the appropriate diligence, we chose to share these decisions as we’ve made them so people had the information as soon as possible. The same is true for this note as the impacted teams are not yet finished making final decisions on precisely which roles will be impacted. Once those decisions have been made (our goal is to have this complete by mid to late April), we will communicate with the impacted employees (or where applicable in Europe, with employee representative bodies). We will, of course, support those we have to let go, and will provide packages that include a separation payment, transitional health insurance benefits, and external job placement support.

If I go back to our tenet—being leaner while doing so in a way that enables us to still invest robustly in the key long-term customer experiences that we believe can meaningfully improve customers’ lives and Amazon as a whole—I believe the result of this year’s planning cycle is a plan that accomplishes this objective. I remain very optimistic about the future and the myriad of opportunities we have, both in our largest businesses, Stores and AWS, and our newer customer experiences and businesses in which we’re investing.


To those ultimately impacted by these reductions, I want to thank you for the work you have done on behalf of customers and the company. It’s never easy to say goodbye to our teammates, and you will be missed. To those who will continue with us, I look forward to partnering with you as we make life easier for customers every day and relentlessly inventing to do so.

Andy


OpenAI CEO Sam Altman says he’s a ‘little bit scared’ of A.I.



Sam Altman, co-founder and chief executive officer of OpenAI Inc., speaks during TechCrunch Disrupt 2019 in San Francisco, California, on Thursday, Oct. 3, 2019.

David Paul Morris | Bloomberg | Getty Images

OpenAI CEO Sam Altman said in a recent interview with ABC News that he’s a “little bit scared” of artificial intelligence technology and how it could affect the workforce, elections and the spread of disinformation.

OpenAI developed the ChatGPT bot, which creates human-like answers to questions and ignited a new AI craze.


“I think people really have fun with [ChatGPT],” Altman said in the interview.

But his excitement over the transformative potential of AI technology, which Altman said will eventually reflect “the collective power, and creativity, and will of humanity,” was balanced by his concerns about “authoritarian regimes” developing competing AI technology.

“We do worry a lot about authoritarian governments developing this,” Altman said. Overseas governments have already begun to bring competing AI technology to market.


Chinese tech company Baidu, for example, recently held a release event for its ChatGPT competitor, a chat AI called Ernie bot.

Years before Russia’s invasion of Ukraine, Russian President Vladimir Putin said whoever becomes the leader in AI technology “will be the ruler of the world.” Altman called the comments “chilling.”


Both Google and Microsoft have aggressively stepped up their AI plays. Microsoft chose to partner with Altman’s OpenAI to integrate its GPT technology into Bing search. Google parent Alphabet unveiled an internally developed chatbot called Bard AI, to mixed feedback from Google employees and early testers.

The influence of ChatGPT and AI tools like it hasn’t yet reverberated through the American election process, but Altman said the 2024 election was a focus for the company.


“I’m particularly worried that these models could be used for large-scale disinformation,” the CEO told ABC.

“Now that they’re getting at writing computer code, [models] could be used for offensive cyberattacks,” he said.


ChatGPT’s programming prowess has already made a mark on many developers. It already functions as a “co-pilot” for programmers, Altman said, and OpenAI is working toward unlocking a similar functionality for “every profession.”

The CEO acknowledged that it would mean many people would lose their jobs but said it would represent an opportunity to come up with a better kind of job.


“We can have a much higher quality of life, standard of living,” Altman said. “People need time to update, to react, to get used to this technology.”

Watch the full interview on ABC News.





Microsoft is using OpenAI to make it easier for doctors to take notes



Velib bicycles are parked in front of the U.S. computer and micro-computing company headquarters Microsoft on January 25, 2023 in Issy-les-Moulineaux, France.

Chesnot | Getty Images

Microsoft’s speech recognition subsidiary Nuance Communications on Monday announced Dragon Ambient eXperience (DAX) Express, a clinical notes application for health-care workers powered by artificial intelligence.

DAX Express aims to help reduce clinicians’ administrative burdens by automatically generating a draft of a clinical note within seconds after a patient visit. The technology is powered by a combination of ambient A.I., which forms insights from unstructured data like conversations, and OpenAI’s newest model, GPT-4.

Diana Nole, the executive VP of Nuance’s healthcare division, told CNBC that the company wants to see physicians “get back to the joy of medicine” so they can take care of more patients.


“Our ultimate goal is to reduce this cognitive burden, to reduce the amount of time that they actually have to spend on these administrative tasks,” she said.

Microsoft acquired Nuance for around $16 billion in 2021. The company derives revenue by selling tools for recognizing and transcribing speech during doctor office visits, customer-service calls, and voicemails.  


DAX Express complements services Nuance already has on the market.

Nole said the technology will be enabled through Nuance’s Dragon Medical One speech recognition application, which is used by more than 550,000 physicians. Dragon Medical One is a cloud-based workflow assistant that physicians can operate using their voices, allowing them to navigate clinical systems and access patient information quickly. Clinical notes generated by DAX Express will appear in the Dragon Medical One desktop.


DAX Express also builds on the original DAX application that Nuance launched in 2020. DAX converts verbal patient visits into clinical notes, and it sends them through a human review process to ensure they are accurate and high-quality. The notes appear in the medical record within four hours after the appointment.

DAX Express, in contrast, generates clinical notes within seconds so that physicians can review automated summaries of their patient visits immediately.


“We believe that physicians, clinicians are going to want a combination of all of these because every specialty is different, every patient encounter is different. And you want to have efficient tools for all of these various types of visits,” Nole said. 

Nuance did not provide CNBC with specifics about the cost of these applications. The company said the price of Nuance’s technology varies based on the number of users and the size of a particular health system.


DAX Express will initially be available in a private preview capacity this summer. Nole said Nuance does not know when the technology will be more widely available, as it will depend on the feedback the company receives from its first users. 

Patient information is particularly sensitive and regulated under HIPAA and other laws. Alysa Taylor, a corporate vice president in the Azure group at Microsoft, told CNBC that DAX Express adheres to the core principles of Microsoft’s responsible A.I. framework, which guides all A.I. investments the company makes, as well as additional safety measures that Nuance has in place. Nuance has strict data agreements with its customers, and the data is fully encrypted and runs in HIPAA-compliant environments.


Nole added that even though the A.I. will help physicians and clinicians carry out the administrative legwork, professionals are still involved every step of the way. Physicians can make edits to the notes that DAX Express generates, and they sign off on them before they are entered into a patient’s electronic health record.

She said, ultimately, using DAX Express will help improve both the patient experience and the physician experience. 


“The physician and the patient can just face one another, they can communicate directly,” Nole said. “The patient feels listened to. It’s a very trusted experience.”


