Artificial Intelligence Is Working Remotely

A new tool from OpenAI shows how COVID-19 set the scene for a revolution in white-collar work.

3 Feb 2025

In 2023, I wrote that AI and remote work were a match made in heaven. The COVID-19 lockdowns forced and accelerated the move of many work processes online. Today, most white-collar work and collaboration happen within digital environments: in a Slack conversation or a hybrid Zoom meeting, inside a Google Sheet or a GitHub repository. Whether we are in an office, hybrid, or fully remote, work happens in environments that are digitally accessible.

This accessibility makes it easier for humans to work from anywhere. But, more importantly, it makes it easier for non-humans to step right in and pick up human tasks. Specifically, it means that new AI models can plug directly into our conversations, documents, and work environments and make things happen. In 2023, that meant connecting a Google Sheet to ChatGPT or bringing an AI assistant to summarize a Zoom meeting. In 2025, it means much more.

OpenAI recently showcased two new tools that illustrate what this means in practice. In late January, it announced "Operator," an AI agent that can use a browser to perform tasks on the user's behalf. For example, you can ask Operator to "buy some healthy dog food for my 150-pound Great Dane dog," and it will browse to Target on its own, search for and compare different products, fill in the necessary forms, and complete the order. You can also ask it to order food, book an Uber, book hotels and flights, and more.

Yesterday, OpenAI announced "deep research," a new tool that can browse the web and use advanced reasoning to carry out detailed, multi-step research tasks and summarize the findings in reports with tables and citations, just like a human would. In OpenAI's example, an employee asks deep research to evaluate and compare mobile adoption rates and usage patterns across different markets.

Just like a human employee, the tool responds to a request with a series of clarifying questions. It wants to make sure it understands you perfectly.

The tool is available in the $200/month Pro tier of ChatGPT and will be gradually rolled out to more users. And just like a human, it can handle vague instructions, including explicit requests to rely on its own judgment. Once it is ready to begin, the model displays a feed that lets the user see what it is doing: the websites and sources it visits, the questions it is pondering, and the sections of the final answer it is working on.

The ability to adopt a human workstyle — to respond to human requests, use tools that humans use, and deliver reports in the same formats and channels humans use — is a huge advantage. The quality of the work is also impressive.

In January, a group of researchers launched a new benchmark to test the quality of AI models. Named "Humanity's Last Exam," it consists of around 3,000 challenges on a hundred topics. The challenges range from solving math and computer science problems to deciphering ancient Roman inscriptions and recalling details from Greek mythology.

In January, OpenAI's latest GPT-4o model scored 3.3% on Humanity's Last Exam, while comparable models like Claude 3.5 Sonnet and Google Gemini Thinking achieved 4.3% and 6.2%, respectively. DeepSeek-R1, the Chinese model that made waves last week, achieved 9.4%. Now, the new OpenAI deep research model has managed to achieve 26.6% on the test, and it did so without browsing the web to look for answers and without using code to solve math problems.

It looks like the latest OpenAI model is very doing well across many topics.
My guess is that Deep Research particularly helps with subjects including medicine, classics, and law. pic.twitter.com/x8Ilmq1aQS
— Dan Hendrycks (@DanHendrycks) February 3, 2025

This is an incredible result and a remarkable amount of progress in a matter of weeks. How long will it take for AI models to accurately answer nearly all questions that humans can come up with? By the looks of it, not very long.

Employers and landlords are still arguing with employees about "return to the office," with Elon Musk leading the latest battle to get government employees back at their desks — or out of their jobs. While the battle rages on, we might be missing the bigger picture. As we look back on the COVID-19 lockdowns, their most significant impact will be in paving the way for the replacement rather than the displacement of many white-collar jobs. It doesn't mean everyone will be unemployed, but it does mean people will be doing very different things, and they will likely do them in spaces that are designed and managed differently.

Deep research is available in ChatGPT's $200/month Pro tier. Have you tried it? I would love to hear about your experience.

Have a great week,

🎤 How will AI reshape our cities, offices, and markets? My speaking schedule for the spring is filling up. Visit my speaker profile and get in touch to learn more.

Future of Work Artificial Intelligence

Dror Poleg Twitter