Ethan Caballero

I'm interested in finding all the downstream evaluations/measurements that matter and finding whatever scales best according to all of those evaluations ~simultaneously, primarily in the largest-scale regime (largest compute, largest dataset, ~largest model). These interests encompass all aspects of Artificial Neural Networks (unsupervised learning, reinforcement learning, capabilities, alignment, all modalities, science of deep learning, etc.).

I'm currently a PhD student at Mila working mostly with David Krueger, Irina Rish, and Blake Richards. Before I started focusing on the scaling perspective, I mostly worked on out-of-distribution generalization and generalization theory with Yoshua's and Aaron's students.

x = ethan ; y = victor ; z = caballero

email addresses: ;

email / cv / linkedin / twitter / google_scholar / github
