FIELD
AI and Natural Sciences
DATE
May 14 (Wed), 2025
TIME
14:00 ~ 16:00
PLACE
7323
SPEAKER
Lee, Seok Hyeong
HOST
Choi, Jaesung
INSTITUTE
Seoul National University
TITLE
A simple model describing skill emergence and scaling laws
ABSTRACT
Deep learning models can appear to acquire the ability to solve a new problem suddenly as input resources (training time, training data, or model size) increase, a phenomenon known as emergence. Another widely observed property of deep learning models is their scaling laws: how the learning loss scales with those input resources, following power laws that are universally observed. In this work, we present a simple framework that describes both phenomena. We assume a model in which each skill is learnt separately via a simple multilinear network. This lets us solve the model explicitly and find analytic expressions for skill emergence and for the scaling laws in training time, data size, model size, and optimal compute. Using a single fit parameter, our simple model captures the sigmoidal emergence of multiple new skills in the neural network as training time, data size, or model size increases. This is joint work with Yoonsoo Nam, Nayara Fonseca, Chris Mingard, and Ard A. Louis.
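As a rough illustration of why a multilinear network can produce sigmoidal emergence, here is a minimal sketch that is not taken from the talk. It assumes a toy setup: a single skill is a scalar target fitted by a product of two tied weights under gradient flow with squared loss. Under that assumption the learned strength w = u*u obeys dw/dt = 2w(target - w), whose solution is a logistic curve. The names `simulate_skill` and `logistic_closed_form`, the initial value `w0`, and the step size are all hypothetical choices made for this example.

```python
import math


def simulate_skill(target=1.0, w0=1e-3, dt=1e-3, steps=10000):
    """Euler-integrate gradient flow for a two-factor product model.

    Assumed toy model (not the speakers' exact setup):
    loss = 0.5 * (target - u * u) ** 2, with both factors tied to u,
    so each factor follows du/dt = u * (target - u * u).
    Returns the learned strength w = u * u at every step.
    """
    u = math.sqrt(w0)
    history = []
    for _ in range(steps):
        history.append(u * u)
        u += dt * u * (target - u * u)
    return history


def logistic_closed_form(t, target=1.0, w0=1e-3):
    """Exact solution of dw/dt = 2 * w * (target - w) with w(0) = w0."""
    return target / (1.0 + (target / w0 - 1.0) * math.exp(-2.0 * target * t))


if __name__ == "__main__":
    dt = 1e-3
    history = simulate_skill(dt=dt)
    # Print a coarse trace: simulated strength next to the analytic curve.
    for k in range(0, len(history), 1000):
        t = k * dt
        print(f"t={t:5.2f}  simulated={history[k]:.4f}  "
              f"analytic={logistic_closed_form(t):.4f}")
```

In this sketch the strength stays near zero for a while, rises quickly, then saturates at the target, which is the sigmoidal shape the abstract refers to. A smaller `w0` delays the rise without changing its shape.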