8,000+ lines in a single file??? I’m going to be sick
Oh that’s not uncommon in the industry. Especially when dealing with legacy code.
Personal best was 40k lines in a file called misc.c, containing all the global functions that didn’t fit anywhere else. Runner-up was the one where each developer dumped their miscellaneous functions in their own file so they didn’t have to deal with merge conflicts. Which means we had x1.c, x2.c, x3.c … etc.
Best I can offer is a combined UI and logic class with 12,500 lines currently. It started out with fewer than 3,000 lines in the year 2000 (using the brand new Java 1.3), grew to 14,000 over time and survived our recent project-wide, year-long cleanup with only minor losses of code lines.
You should see the Firefox source code, there are many files like that. Honestly it’s better than having 100,000 files, which is what you would end up with at the size of Firefox.
As someone who professionally works in a project with many, many thousands of files (I don’t know the exact number right now, but we’re coming close to 10 million lines of code), many of them having thousands of lines (see my other comment): No, longer files are not better than more files.
It depends. Obviously if stuff is unrelated then it should be in separate files, but having one folder with 1000 files, each containing a single function, would I think be very exhausting to search through to understand the code.
Yep, Longest Common Subsequence diffing is usually greedy, and what you see is the earliest set of lines that satisfies the match. That happens when you just treat a file as a list of lines and only match those.
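To make that concrete, here is a rough sketch of a line-only LCS match that always takes the earliest matching line. The function name lcs_pairs and the sample lines are made up for illustration, and real diff tools use more refined algorithms, so treat this as a toy and not as how any particular tool works.

```python
def lcs_pairs(a, b):
    """Return (i, j) index pairs of one longest common subsequence of a and b,
    always taking the earliest available match."""
    n, m = len(a), len(b)
    # t[i][j] = length of the LCS of a[i:] and b[j:]
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                t[i][j] = t[i + 1][j + 1] + 1
            else:
                t[i][j] = max(t[i + 1][j], t[i][j + 1])

    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            # Greedy step: accept the first equal line we run into.
            pairs.append((i, j))
            i += 1
            j += 1
        elif t[i + 1][j] >= t[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


# A new documented function is added in front of an existing one.
old = ["/**", " * a", " */", "void a() {}"]
new = ["/**", " * b", " */", "void b() {}",
       "/**", " * a", " */", "void a() {}"]

matched_new = {j for _, j in lcs_pairs(old, new)}
added = [line for j, line in enumerate(new) if j not in matched_new]
print(added)
```

The old "/**" gets matched to the first "/**" in the new file, so the lines reported as added start in the middle of one comment and end with the opening of the next one. The match is still a valid longest common subsequence, it just isn't the one a human would pick.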
You can get better results with more syntax or content awareness. Chunk into paragraphs or code blocks or functions, then sentences or statement lists, then lines, then words, etc. I think Beyond Compare can do this.
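A minimal sketch of that chunk-first idea, assuming blocks are simply runs of lines separated by blank lines (a real tool would want actual syntax awareness). It uses Python's difflib.SequenceMatcher, which is not strictly an LCS algorithm, and the helper names split_blocks and block_diff are made up here. I have no idea whether Beyond Compare does it anything like this.

```python
import difflib


def split_blocks(text):
    """Split text into blocks separated by blank lines.
    Each block is a tuple of lines; the blank lines themselves are dropped."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(tuple(current))
            current = []
    if current:
        blocks.append(tuple(current))
    return blocks


def block_diff(old_text, new_text):
    """Match whole blocks first; only compare lines inside changed regions."""
    old, new = split_blocks(old_text), split_blocks(new_text)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace":
            # Blocks differ on both sides: go one level down and match lines.
            old_lines = [l for blk in old[i1:i2] for l in blk]
            new_lines = [l for blk in new[j1:j2] for l in blk]
            inner = difflib.SequenceMatcher(None, old_lines, new_lines,
                                            autojunk=False)
            for t, a1, a2, b1, b2 in inner.get_opcodes():
                if t in ("delete", "replace"):
                    changes += [("-", l) for l in old_lines[a1:a2]]
                if t in ("insert", "replace"):
                    changes += [("+", l) for l in new_lines[b1:b2]]
        else:
            # Pure insert or delete: report the blocks as whole units.
            changes += [("-", l) for blk in old[i1:i2] for l in blk]
            changes += [("+", l) for blk in new[j1:j2] for l in blk]
    return changes


old_text = "/**\n * a\n */\nvoid a() {}\n"
new_text = "/**\n * b\n */\nvoid b() {}\n\n/**\n * a\n */\nvoid a() {}\n"
for sign, line in block_diff(old_text, new_text):
    print(sign, line)
```

Because the unchanged block is matched as one unit before any line is looked at, the new function shows up as a single inserted block starting at its own "/**". The same idea could be repeated further down (statements, then words) for the regions that really did change.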