any problems with doing this?
int i = new StreamReader("file.txt").ReadToEnd().Split(new char[] {'\n'}).Length
-
Well, the problem is that you allocate a lot of memory when doing this on large files.
I would rather read the file line by line and manually increment a counter. This may not be a one-liner but it's much more memory-efficient.
Alternatively, you may load the data in even-sized chunks and count the line breaks in each chunk. This is probably the fastest way.
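A rough sketch of that chunked approach (my own illustration, not from the original answer): read fixed-size blocks from a FileStream and count '\n' bytes, assuming an ASCII-compatible encoding such as UTF-8 where '\n' is always the single byte 0x0A. It assumes a using System.IO directive and that the method sits inside a class; the 64 KB buffer size is arbitrary.

public static int CountLineBreaks(string path)
{
    const int BufferSize = 64 * 1024;  // arbitrary chunk size
    byte[] buffer = new byte[BufferSize];
    int count = 0;

    using (FileStream fs = File.OpenRead(path))
    {
        int bytesRead;
        // Read the file in even-sized chunks and scan each chunk for '\n'.
        while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < bytesRead; i++)
            {
                if (buffer[i] == (byte)'\n')
                {
                    count++;
                }
            }
        }
    }
    return count;
}

Only one buffer is ever allocated, so memory use stays constant no matter how large the file is.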
-
Sure - it reads the entire stream into memory. It's terse, but I can create a file today that will make this fail hard.
Read a character at a time and increment your count on newline.
EDIT - after some quick research: if you want terse and want that shiny new generic feel, consider this:
public class StreamEnumerator : IEnumerable<char>
{
    StreamReader _reader;

    public StreamEnumerator(Stream stm)
    {
        if (stm == null)
            throw new ArgumentNullException("stm");
        if (!stm.CanSeek)
            throw new ArgumentException("stream must be seekable", "stm");
        if (!stm.CanRead)
            throw new ArgumentException("stream must be readable", "stm");
        _reader = new StreamReader(stm);
    }

    public IEnumerator<char> GetEnumerator()
    {
        int c = 0;
        while ((c = _reader.Read()) >= 0)
        {
            yield return (char)c;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
which defines a new class that lets you enumerate over the characters of a stream; your counting code can then look like this:
StreamEnumerator chars = new StreamEnumerator(stm);
int lines = chars.Count(c => c == '\n');
which gives you a nice terse lambda expression to do (more or less) what you want. (The Count call is LINQ's Enumerable.Count, so you'll need a using System.Linq directive.)
I still prefer the Old Skool:
public static int CountLines(Stream stm)
{
    StreamReader _reader = new StreamReader(stm);
    int c = 0, count = 0;
    while ((c = _reader.Read()) != -1)
    {
        if (c == '\n')
        {
            count++;
        }
    }
    return count;
}
NB: Environment.NewLine version left as an exercise for the reader
spoulson: This wouldn't work when searching for Environment.NewLine, which is usually a two-character string (CrLf).
JMD: He's got the right idea, though. So how about using a RegEx to search for instances of Environment.NewLine?
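For what it's worth, a sketch of JMD's RegEx idea (my own illustration, not the answerer's code) could count occurrences of Environment.NewLine with Regex.Matches. Note it still reads the entire file into a string, which was the original objection, and it counts line breaks rather than lines. It assumes using directives for System, System.IO and System.Text.RegularExpressions.

public static int CountNewLines(string path)
{
    string text = File.ReadAllText(path);                // whole file in memory again
    string pattern = Regex.Escape(Environment.NewLine);  // "\r\n" on Windows
    return Regex.Matches(text, pattern).Count;           // counts line breaks, not lines
}
-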
Assuming the file exists and you can open it, that will work.
It's not very readable or safe...
-
If you're looking for a short solution, I can give you a one-liner that at least saves you from having to split the result:
int i = File.ReadAllLines("file.txt").Length;
But that has the same problem as your original: it reads the entire file into memory. You should really use a StreamReader and count the line breaks as you read, until you reach the end of the file.
-
The method you posted isn't particularly good. Let's break this apart:
// new StreamReader("file.txt").ReadToEnd().Split(new char[] {'\n'}).Length
// becomes this:
var file = new StreamReader("file.txt").ReadToEnd();  // big string
var lines = file.Split(new char[] {'\n'});            // big array
var count = lines.Length;
You're actually holding this file in memory twice: once as the single big string from ReadToEnd, and once as the array produced by Split. The garbage collector hates that.
If you like one-liners, you can write
System.IO.File.ReadAllLines(filePath).Length
but that still retrieves the entire file into an array. There's no point doing that if you aren't going to hold onto the array. A faster solution would be:
int TotalLines(string filePath)
{
    using (StreamReader r = new StreamReader(filePath))
    {
        int i = 0;
        while (r.ReadLine() != null)
        {
            i++;
        }
        return i;
    }
}
The code above holds (at most) one line of text in memory at any given time. It's going to be efficient as long as the lines are relatively short.
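Calling it is then effectively a one-liner again, using the same file.txt from the question:

int lineCount = TotalLines("file.txt");  // streams the file; holds at most one line in memory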