I recently came across an interesting thread on Twitter discussing a hypothetical scenario in which research papers are published on GitHub and each subsequent paper is a diff over the original. Information overload is a real problem in ML, with so many new papers coming out every month.
"If you could represent a paper as a code diff, many papers could be compressed down to <50 lines :) The diff would also be more intuitive to read and eval standardized. Some ideas are so different that this wouldn’t apply, but I think it would work well for the majority." https://t.co/JoAcIK9Cm7
Denny Britz (@dennybritz), April 25, 2020
This post is a fun experiment showing what the commit history could look like for the BERT paper and some of its subsequent variants.