This post was originally published on Substack. Click the link to read the full article.
Why replacing a 7×7 convolution with three 3×3 layers isn’t about parameters — it’s about nonlinear expressivity.
Read the full article on Substack
This post was originally published on Substack. Click the link to read the full article.
Why replacing a 7×7 convolution with three 3×3 layers isn’t about parameters — it’s about nonlinear expressivity.
Read the full article on Substack