Don't call me a programmer, I'm an "AI engineer", Musk: Start rolling natural language programming

Source: Heart of the Machine

**The job with the highest demand in the next ten years is "AI engineer"? **

After the emergence of ChatGPT, people predicted that "all industries will be reshaped by AI", some jobs will be replaced, and some jobs will change their form. What will their careers be like as programmers who build AI?

Recently, things seem to be on the spectrum. A group of engineers and scholars called out the concept of "AI engineer" and received many responses:

Due to the generalization and powerful capabilities of large language models such as GPT-4, the way we work may soon change to working with AI, and keeping up with the pace of artificial intelligence is a full-time job in itself.

It is said that this "AI engineer" is between the full-stack engineer and the machine learning engineer, occupying part of the back-end engineer and focusing on the construction of large models. Now it is still in the definition stage, but judging from the heated discussions, it should not be far from landing, after all, the speed of the ChatGPT revolution is so fast.

As soon as the idea came out, big Vs in the AI field quickly commented. Andrej Karpathy, an OpenAI scientist and former head of AI and autonomous driving at Tesla, agrees. "Large models create a whole new layer of abstraction and specialization, so far I've called it 'hinting engineers,' but now it's not just a matter of hinting."

In addition, he pointed out four main points:

  • Past machine learning work has typically involved training algorithms from scratch, and the results have typically had limited performance.
  • Large-scale model training is very different from traditional machine learning. The former system has a large workload, and a new role has been split to focus on large-scale training of Transformer on supercomputers.
  • Numerically, the number of AI engineers may be much higher than that of machine learning engineers/large model engineers.
  • You don't need any training to be successful in this role.

After reading it, Musk also said:

The position is in high demand, important and low barriers to entry. It seems exciting and anxious.

During the discussion, some people also proposed names such as "cognitive engineer" and "AI system engineer" as candidates. Nvidia AI scientist Jim Fan believes that this emerging profession should be called "gradient-free engineer" - from traditional tools 1.0 , to neural network 2.0, and then to 3.0 without gradient architecture, we finally waited for the 4.0 version of the GPT series of self-training.

In this regard, Sebastian Raschka, an assistant professor at the University of Wisconsin, said that this is only suitable for general assistants, and for most businesses, you don't need "general".

There are many names and definitions given, let us see what kind of position is this "AI engineer"?

We are witnessing a once-in-a-decade shift in applied AI, fueled by the breakthrough capabilities of fundamental models and open-source large models and APIs.

AI tasks that took five years and a research team to accomplish in 2013 now require only APIs, documentation, and a spare afternoon in 2023.

However, it’s the details that make the difference – the challenges of applying and productizing AI are endless:

  • On the model, there are from the largest GPT-4 and Claude model, to the open source Huggingface, LLaMA and other models;
  • Tools, from the most popular linking, retrieval, and vector search tools (such as LangChain, LlamaIndex, and Pinecone) to the emerging field of autonomous agents (such as Auto-GPT and BabyAGI);
  • Technically, the number of new papers, models, and techniques submitted each day has grown exponentially with interest and funding, to the point that understanding it all has become almost a full-time job.

If this situation is taken seriously, it should be considered a full-time job. As a result, software engineering will spawn a new subdiscipline dedicated to the application of artificial intelligence and effectively employing the emerging stack, like "Site Reliability Engineers" (SREs), "DevOps Engineers", "Data Engineers" and The same is true for the emergence of "analytical engineers".

The brand new (and least awesome) version of this role appears to be: artificial intelligence engineer.

We know that every startup has some kind of Slack channel for discussing AI use, and soon those channels will transition from informal groups to formal teams. Thousands of software engineers are currently working on producing AI APIs and OSS models, whether during office hours or evenings and weekends, in corporate Slacks or independent Discords, all professionalized and centralized under one title: AI engineer.

This is likely to be the most in-demand engineering job in the next decade.

AI engineers will be found everywhere, from tech giants like Microsoft and Google, to leading startups like Figma, Vercel, and Notion, to independent developers like Simon Willison, Pieter Levels, and Riley Goodside. They earn $300,000 a year for their engineering practice at Anthropic and $900,000 a year building software at OpenAI. They spend their free weekends pondering ideas at AGI House and sharing tips on the /r/LocalLLaMA subreddit on Reddit.

What they all have in common is the ability to translate advances in artificial intelligence into practical products used by millions of people almost overnight. And in it, you don't see a Ph.D. title. When delivering AI products, you need engineers, not researchers.

The big reversal of AI engineers and ML engineers

A set of data on the Indeed website shows that the number of positions for machine learning engineers is 10 times that of AI engineers, but in comparison, the growth rate in the AI field is faster, and it is predicted that this proportion will be within five years. The inversion occurs and there will be many times as many AI engineers as ML engineers.

HN Who's Hiring (which is a monthly post on Hacker News that provides a platform for employers to post job postings) Monthly Employment Trends by Category

The debate on the differences between AI and ML has been endless, but cautious. We also know that AI software can be built by ordinary software engineers. Recently, however, discussions have revolved around another issue, namely, a popular thread on Hacker News "How to get into AI engineering" has aroused widespread interest. This popular post also illustrates the basic limiting principles that still exist in the market, The distinction between each position is still very fine.

*Screenshot of a June 2023 post on Hacker News: "How to Get Into AI Engineering" Top Voted Answers. *

Until now, many people thought of AI engineering as a form of ML engineering or data engineering, so when someone asks how to get into a field, they tend to recommend the same prerequisites, as in the answers above, many people Recommend Andrew Ng's Coursera course. But none of those effective AI engineers have completed Wu Enda's course on Coursera, they are not familiar with PyTorch, and they don't know the difference between Data Lake (Data Lake) and Data Warehouse (Data Warehouse).

In the near future, no one is going to suggest that you start learning AI engineering by reading the Transformer paper "Attention is All You Need", any more than you start learning driving by reading blueprints for the Ford Model T. Of course, it is helpful to understand the fundamentals and the historical development of technology, which can help you find ways to improve your thinking and efficiency. But sometimes you can also use products to learn their characteristics through practical experience.

The reversal of AI engineers vs. ML engineers won't happen overnight, and for someone with a good data science and machine learning background, engineering and AI engineering may not look good for a long time. However, over time, the economics of demand and supply will prevail, and people's views on AI engineering will change.

**Why AI engineers will rise? **

At the model level, many basic models are now few-shot learners with strong context learning and zero-shot transfer capabilities. The performance of the model often exceeds the original intention of the training model. In other words, the people who create these models don't fully know the scope of the models' capabilities. And those who are not LLM (Large Language Model) experts can discover and exploit these capabilities by interacting more with the model and applying it to domains underestimated by research.

At the talent level, Microsoft, Google, Meta, and large basic model laboratories have monopolized scarce research talents, and they provide APIs for "AI research as a service". You may not be able to hire this kind of researcher, but you can rent their services. There are now about 5,000 LLM researchers and 50 million software engineers worldwide. This supply constraint dictates that AI engineers in the “middle” category will rise to meet talent demand.

At the hardware level, major technology companies and institutions have hoarded GPUs in large quantities. Of course, OpenAI and Microsoft were the first to do so, but Stability AI started the GPU competition for startups by emphasizing their 4,000 GPU clusters.

In addition, some new startups such as Inflection ($1.3B), Mistral ($113M), Reka ($58M), Poolside ($26M) and Contextual ($20M) have generally started raising huge sums Seed round of financing to own its own hardware facilities.

US tech executive and investor Nat Friedman even announced their Andromeda initiative, a $100 million GPU cluster with 10 exaflops of computing power dedicated to supporting the startups it invests in. On the other side of the API landscape, more AI engineers will be able to use models, not just train them.

In terms of efficiency, instead of requiring data scientists and machine learning engineers to perform tedious data collection before training a single domain-specific model and putting it into production, product managers and software engineers can build and verify product ideas by interacting with LLM.

Let's say the latter (data, ML engineers) outnumber the former (AI engineers) by 100 to 1000 times, and the way you work by interacting with LLM will get you 10 to 100 times faster than traditional machine learning. As a result, AI engineers will be able to validate AI products 10,000 times cheaper than before.

At the software level, there will be changes from Python to Java. The data and AI world has traditionally been centered around Python, as were the first AI engineering tools such as LangChain, LlamaIndex, and Guardrails. However, there should be at least as many Java developers as there are Python developers, so tools are increasingly extending in this direction, from LangChain.js and Transformers.js to Vercel's new AI SDK. The overall size of the market and opportunity for Java is impressive.

Whenever a subgroup comes along with a completely different background, speaks a completely different language, makes a completely different product, uses a completely different tool, they end up splitting into their own group.

The role of code in the evolution of software 2.0 to software 3.0

6 years ago, Andrej Karpathy wrote a very influential article describing Software 2.0, contrasting classical stacks of hand-written programming languages that accurately model logic with new stacks of machine learning neural networks that approximate logic. The article shows that software can solve many more problems than humans can model.

This year, Karpathy went on to post that the hottest new programming language is English, as hints from generative AI can be understood as human-designed code, in many cases in English, and interpreted by LLMs, eventually filling the gaps in his chart. gray area.

*Note: The classic stack of Software 1.0 (Software 1.0) is written in Python, C++ and other languages. Software 2.0 was written using neural network weights, and no one was involved in the process of writing this code because there are many weights. *

Last year, Engineering became a popular topic, and people started to apply GPT-3 and Stable Diffusion to work. People scoff at AI startups as OpenAI wrappers and worry about the vulnerability of LLM applications to hint injection and reverse hint engineering.

But a very important theme in 2023 is about re-establishing the role of code written by humans, from the giant Langchain with more than 200 million US dollars to the Voyager backed by Nvidia, showing the importance of code generation and reuse. Engineering is both overhyped and persistent, but the reemergence of the Software 1.0 paradigm in Software 3.0 applications is both a huge opportunity and a new space for a plethora of startups:

As human engineers learn to utilize AI, and AI increasingly takes over engineering jobs, in the future, when we look back, it will be difficult to tell the difference between the two.

Reference content:

View Original
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
0/400
No comments
Trade Crypto Anywhere Anytime
qrCode
Scan to download Gate App
Community
English
  • 简体中文
  • English
  • Tiếng Việt
  • 繁體中文
  • Español
  • Русский
  • Français (Afrique)
  • Português (Portugal)
  • Bahasa Indonesia
  • 日本語
  • بالعربية
  • Українська
  • Português (Brasil)