Seminars
- FIELD
- AI and Natural Sciences
- DATE
- May 14 (Wed), 2025
- TIME
- 14:00 ~ 16:00
- PLACE
- 7323
- SPEAKER
- Lee, Seok Hyeong
- HOST
- Choi, Jaesung
- INSTITUTE
- Seoul National University
- TITLE
- A simple model describing skill emergence and scaling laws
- ABSTRACT
- Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as input resources (training time, training data, or model size) increase, a phenomenon known as emergence. Another widely observed property of deep learning models is scaling laws: the loss decreases with those input resources following power laws that are observed almost universally. In this work, we present a simple framework that describes both phenomena. We assume a model in which each skill is learned separately via a simple multilinear network, which allows us to solve the model explicitly and find analytic expressions for skill emergence and for scaling laws in training time, data size, model size, and optimal compute. Using a single fit parameter, our simple model captures the sigmoidal emergence of multiple new skills in a neural network as training time, data size, or model size increases.
This is joint work with Yoonsoo Nam, Nayara Fonseca, Chris Mingard, and Ard A. Louis.
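As a rough illustration of how sigmoidal skill emergence and a smoothly decaying total loss can coexist, here is a minimal toy sketch in Python. It is not the speaker's model. It assumes (these are illustrative choices, not stated in the abstract) that skill k appears with power-law frequency p_k proportional to k^-(alpha+1), and that each skill's strength follows logistic dynamics dR/dt = rate * p_k * R * (1 - R) starting from a small value r0. All function names and parameters (`skill_frequencies`, `skill_strength`, `emergence_time`, `total_loss`, `alpha`, `rate`, `r0`) are hypothetical.

```python
import math


def skill_frequencies(num_skills, alpha):
    """Normalized power-law frequencies p_k ~ k^-(alpha+1) (assumed form)."""
    weights = [k ** -(alpha + 1.0) for k in range(1, num_skills + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def skill_strength(t, p, rate=1.0, r0=1e-3):
    """Closed-form logistic solution of dR/dt = rate * p * R * (1 - R), R(0) = r0.

    The curve is sigmoidal in t: near r0 at first, then a rapid rise toward 1.
    """
    return 1.0 / (1.0 + (1.0 / r0 - 1.0) * math.exp(-rate * p * t))


def emergence_time(p, rate=1.0, r0=1e-3):
    """Time at which a skill with frequency p reaches strength 0.5."""
    return math.log(1.0 / r0 - 1.0) / (rate * p)


def total_loss(t, freqs, rate=1.0, r0=1e-3):
    """Frequency-weighted squared error summed over independently learned skills."""
    return 0.5 * sum(
        p * (1.0 - skill_strength(t, p, rate, r0)) ** 2 for p in freqs
    )


if __name__ == "__main__":
    freqs = skill_frequencies(num_skills=50, alpha=0.6)
    for t in (0.0, 10.0, 100.0, 1000.0, 10000.0):
        learned = sum(1 for p in freqs if skill_strength(t, p) > 0.5)
        print(f"t={t:>8.0f}  skills above 0.5: {learned:>2d}  loss={total_loss(t, freqs):.5f}")
```

In this toy, frequent skills cross the 0.5 threshold earlier than rare ones, so individual skills switch on abruptly one after another while the summed loss declines gradually. With many skills, one would heuristically expect the loss to fall roughly as a power law in t, since at any time the remaining loss is dominated by the tail of skills not yet learned.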