OpenAI’s cofounder and former chief scientist, Ilya Sutskever, made headlines earlier this year when he left to start his own AI lab, Safe Superintelligence Inc. He has avoided the limelight since his departure but made a rare public appearance in Vancouver on Friday at the Conference on Neural Information Processing Systems (NeurIPS).
“Pre-training as we know it will unquestionably end,” Sutskever said onstage. Pre-training is the first phase of AI model development, in which a large language model learns patterns from vast amounts of unlabeled data, typically text from the internet, books, and other sources.
During his NeurIPS talk, Sutskever said that, while he believes existing data can still take AI development further, the industry is running out of new data to train on. That shortage, he said, will eventually force a shift away from the way models are trained today. He compared the situation to fossil fuels: just as oil is a finite resource, the internet contains a finite amount of human-generated content.
“We’ve achieved peak data and there’ll be no more,” according to Sutskever. “We have to deal with the data that we…