Choose Architecture Type

🏗️

ResNet-50 Architecture

Interactive visualization of ResNet-50, a 50-layer network with skip connections

25.6M params, 2015, D3.js, interactive

👁️

Vision Transformers

Transformer-based architectures for computer vision

ViT, Swin, DeiT, CLIP

Available Architectures

ResNet-50

Residual network with 50 layers; skip connections mitigate the vanishing-gradient problem

25.6M params, 2015
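
As a rough illustration of the skip connection mentioned above, the sketch below adds a branch's output back onto its input, y = F(x) + x. The names `residual_block`, `zero_branch`, and `halve` are hypothetical and stand in for real convolutional layers.

```python
# Minimal sketch of a residual (skip) connection: y = F(x) + x.
# Plain lists stand in for feature maps; all names here are illustrative.

def residual_block(x, branch):
    """Add the branch output F(x) back onto the input x, element-wise."""
    fx = branch(x)
    if len(fx) != len(x):
        raise ValueError("branch must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]

def zero_branch(x):
    # A branch that contributes nothing, so the block reduces to the identity.
    return [0.0 for _ in x]

def halve(x):
    # A toy branch that outputs half of its input.
    return [0.5 * v for v in x]

x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_branch)  # the input passes through unchanged
scaled_out = residual_block(x, halve)          # 0.5 * x + x
```

Because the input is always added back, the signal (and its gradient) has a direct path around the branch, which is the property the card refers to.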

VGG-16

Very deep convolutional network with 16 weight layers, built from small 3×3 filters

138M params, 2014
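
One way to see why small 3×3 filters are attractive is to compare a stack of them with a single larger filter that covers the same receptive field. The sketch below is simple arithmetic under stated assumptions (stride 1, equal input and output channels, biases ignored); the function names and the channel count of 64 are illustrative.

```python
def conv_weights(kernel, channels, layers=1):
    """Weights in a stack of kernel x kernel convs with `channels` in and out (no biases)."""
    return layers * kernel * kernel * channels * channels

def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return layers * (kernel - 1) + 1

C = 64  # illustrative channel count
stacked = conv_weights(3, C, layers=2)  # two 3x3 layers, receptive field 5
single = conv_weights(5, C)             # one 5x5 layer, receptive field 5
```

With these assumptions the two 3×3 layers use 18·C² weights against 25·C² for the single 5×5 layer, and they add an extra nonlinearity in between.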

EfficientNet-B0

Baseline EfficientNet; compound scaling of depth, width, and resolution balances accuracy and efficiency

5.3M params, 2019
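
Compound scaling can be sketched as raising three fixed bases to a shared exponent. The bases below (1.2, 1.1, 1.15) are the values reported in the EfficientNet paper; the function name `compound_scale` is hypothetical, and the sketch makes no claim about how any particular released variant was rounded.

```python
# Depth, width, and resolution bases as reported in the EfficientNet paper.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Return (depth, width, resolution, approx FLOPs) multipliers for coefficient phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # FLOPs grow linearly with depth and quadratically with width and resolution.
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops

baseline = compound_scale(0)  # phi = 0 leaves the B0 baseline unchanged
one_step = compound_scale(1)  # one step roughly doubles the FLOPs
```

The bases are chosen so that ALPHA · BETA² · GAMMA² is close to 2, meaning each unit increase in phi costs about twice the compute.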

MobileNet-V2

Mobile-optimized network with inverted residuals and linear bottlenecks

3.5M params, 2018
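
The inverted residual block can be sketched as three steps: a 1×1 expansion, a depthwise 3×3 convolution, and a 1×1 linear projection back to a narrow bottleneck. The weight count below assumes an expansion factor of 6 and ignores biases and batch-norm parameters; the function names and the 32-channel example are illustrative.

```python
def inverted_residual_weights(c_in, c_out, expansion=6, kernel=3):
    """Weights in one inverted residual block (biases and batch norm ignored)."""
    hidden = c_in * expansion
    expand = c_in * hidden                # 1x1 pointwise expansion
    depthwise = kernel * kernel * hidden  # one kernel x kernel filter per channel
    project = hidden * c_out              # 1x1 linear projection, no activation after it
    return expand + depthwise + project

def full_conv_weights(channels, kernel=3):
    """Weights in a standard kernel x kernel conv with `channels` in and out."""
    return kernel * kernel * channels * channels

block = inverted_residual_weights(32, 32)  # bottleneck of 32, hidden width 192
dense = full_conv_weights(32 * 6)          # standard 3x3 conv at the hidden width
```

Under these assumptions the block needs 14,016 weights, while a standard 3×3 convolution operating at the same 192-channel hidden width would need 331,776.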

ViT-Base

Base Vision Transformer with 12 transformer blocks and an embedding dimension of 768

86.6M params, 2020

ViT-Large

Large Vision Transformer with 24 transformer blocks and an embedding dimension of 1024

304M params, 2020
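
The parameter counts listed for the two ViT entries can be approximately reproduced from the block count and embedding dimension alone. The sketch below assumes settings the cards do not state: 16×16 patches, 224×224 RGB input, an MLP hidden size of four times the embedding dimension, a class token, learned position embeddings, and a 1000-class head. The function name `vit_param_count` is hypothetical.

```python
def vit_param_count(dim, depth, patch=16, image=224, channels=3,
                    mlp_ratio=4, classes=1000):
    """Count learnable parameters of a plain ViT under the assumptions above."""
    tokens = (image // patch) ** 2 + 1                  # patches plus the class token
    patch_embed = patch * patch * channels * dim + dim  # linear patch projection
    cls_and_pos = dim + tokens * dim                    # class token + position embeddings
    attention = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)  # qkv + output projection
    hidden = mlp_ratio * dim
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)
    norms = 2 * (2 * dim)                               # two layer norms per block
    block = attention + mlp + norms
    final_norm = 2 * dim
    head = dim * classes + classes
    return patch_embed + cls_and_pos + depth * block + final_norm + head

vit_base = vit_param_count(dim=768, depth=12)
vit_large = vit_param_count(dim=1024, depth=24)
```

Under these assumptions the totals come to about 86.6M for the base model and about 304.3M for the large one, in line with the figures on the cards.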

Swin Transformer

Hierarchical Vision Transformer that computes attention within shifted local windows for efficiency

87.8M params, 2021
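
The shifted-window idea can be illustrated by tracking which window a token lands in before and after the grid is cyclically shifted by half a window. The sketch below uses an 8×8 token grid and a window size of 4 purely as an example; `window_of` is a hypothetical helper, and it leaves out the attention masking that a real implementation applies to tokens that wrap around the border.

```python
def window_of(row, col, window, shift=0, height=8, width=8):
    """Return the (window_row, window_col) a token falls into after a cyclic shift."""
    r = (row - shift) % height  # cyclic shift toward the top-left
    c = (col - shift) % width
    return (r // window, c // window)

WINDOW = 4
SHIFT = WINDOW // 2  # shift by half the window size

# Two neighbouring tokens that sit on opposite sides of a window border.
a, b = (3, 3), (4, 4)
regular = (window_of(*a, WINDOW), window_of(*b, WINDOW))
shifted = (window_of(*a, WINDOW, SHIFT), window_of(*b, WINDOW, SHIFT))
```

In the regular layout the two tokens fall into different windows and cannot attend to each other; after the shift they share a window, which is how alternating layers pass information across window borders.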

DeiT

Data-efficient Image Transformer trained with knowledge distillation

86.6M params, 2021
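
A minimal sketch of the hard-label distillation objective, assuming the student has two output heads: a class head trained on the true label and a distillation head trained on the teacher's predicted label, each weighted by one half. The function names are hypothetical and plain lists stand in for logit tensors.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]

def hard_distillation_loss(cls_logits, dist_logits, label, teacher_logits):
    """Average of the true-label loss and the teacher-label loss."""
    # The teacher's hard decision is its highest-scoring class.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return (0.5 * cross_entropy(cls_logits, label)
            + 0.5 * cross_entropy(dist_logits, teacher_label))

# An untrained student with uniform logits over three classes.
loss = hard_distillation_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                              label=0, teacher_logits=[0.1, 2.0, 0.3])
```

With uniform student logits both terms equal ln 3, so the combined loss is also ln 3; as either head becomes confident in its target, the corresponding term shrinks toward zero.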