C++ fstream - use '\n' instead of std::endl
tl;dr version: If you have to write a lot of lines to a file and time is of the essence in your C++ code, don’t use std::endl to print a line delimiter. Use a simple "\n" instead. Why?
Story time
When coding, you inevitably get to a point where you need to write some stuff to a file. C++ offers the fstream library, with ifstream for input and ofstream for output. You also have access to the C-style stdio.h and its FILE type. On the C# side, there are many ways of accessing files, but StreamReader and StreamWriter are the commonly recommended ones.
While working on my dissertation project, I ended up directly comparing the performance of C++ and C# at sorting 200,000 numbers. That required an input file with 200,000 integers, one per line (technically 200,001 lines, as the first line states how many numbers follow).
The run times were scary:
C#: 85ms
C++ (fstream): 1100ms
The algorithm employed by both languages for the default sort function (Array.Sort and std::sort) is Introsort, so performance should be equivalent. Looking into this more, I decided to replace fstream with stdio.h. New C++ result?
C++ (FILE): 110ms
Now isn’t that interesting. Here’s the relevant code for comparison:
C++ fstream:
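int n = 1;
ifstream f("input.txt");
f >> n;                        // first line: how many numbers follow
int* v = new int[n];
for (int i = 0; i < n; i++)
{
    f >> v[i];
}
f.close();
std::sort(v, v + n);
ofstream f2("output.txt");     // output stream used below
for (int i = 0; i < n; i++)
{
    f2 << v[i] << endl;        // endl writes '\n' and flushes the stream
}
f2.close();
delete[] v;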
C++ FILE:
int n = 1;
FILE * pFile;
pFile = fopen("input.txt", "rb");
if (pFile == NULL) { fputs("File error", stderr); exit(1); }
fscanf(pFile, "%d", &n);       // first line: how many numbers follow
int* v = new int[n];
for (int i = 0; i < n; i++)
{
    fscanf(pFile, "%d", &v[i]);
}
fclose(pFile);
std::sort(v, v + n);
FILE * oFile = fopen("output.txt", "w+");
for (int i = 0; i < n; i++)
{
    fprintf(oFile, "%d\n", v[i]);  // newline only, no flush per line
}
fclose(oFile);
delete[] v;
The numbers above come from tests run on a Windows 8.1 machine. The C++ code was compiled with both gcc and Microsoft’s C++ compiler, with no difference in runtime between the two. Each time, -O2 (or /O2) was used to optimize the code. Many other parameters were tried empirically, but none made a noticeable difference.
Now, I was curious where the problem was: reading from the input or writing to the output. Removing all output-related code and leaving just input and sorting resulted in these times:
C++ (FILE): 48ms
C++ (fstream): 51ms
Well, doesn’t that immediately show us where the problem is? Yes, the main issue is writing to the file. But what if we put the output-related code back, but replace f2<<v[i]<<endl with f2<<v[i]<<"\n"?
C++ (FILE): 110ms
C++ (fstream): 111ms
std::endl seems to be… not particularly fast.
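For anyone who wants to repeat this kind of measurement, here is a minimal timing sketch. It is not the code used for the numbers above: the function name time_write_ms, its parameters, and the choice of std::chrono::steady_clock are assumptions made for illustration, and it writes the integers 0..n-1 rather than sorted input.

```cpp
#include <chrono>
#include <fstream>
#include <string>

// Hypothetical helper: writes the integers 0..n-1 to `path`, one per line,
// and returns the elapsed wall-clock time in milliseconds.
// `use_endl` selects std::endl (newline + flush) or a plain '\n'.
double time_write_ms(const std::string& path, int n, bool use_endl) {
    const auto start = std::chrono::steady_clock::now();
    {
        std::ofstream out(path);
        for (int i = 0; i < n; ++i) {
            if (use_endl) {
                out << i << std::endl;  // newline, then flush on every line
            } else {
                out << i << '\n';       // newline only
            }
        }
    }  // the ofstream is closed (and flushed one last time) inside the timed region
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_write_ms("a.txt", 200000, true) and time_write_ms("b.txt", 200000, false) from main and printing both results gives a rough comparison. The absolute values will depend on the machine, the file system, and the standard library in use.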
Of course, I couldn’t rule out my machine being the root of the issue, so I compiled and tested the code on a dedicated Linux Mint server. Results:
C++ (FILE): 60ms
C++ (fstream with std::endl): 270ms
C++ (fstream without std::endl): 61ms
Purely for consistency I ran the C# executables through WINE.
C# (WINE): 350ms
Of course, I also made a C# application that does nothing, to see how much of that time is actually just WINE starting up and initialising. On average, the C# “Does Nothing” code ran under WINE in 250ms, so we can say that the code itself needed roughly 100ms to run. The Windows .NET libraries are different from the Mono .NET libraries, so no real conclusion can be drawn from these C# numbers. Also, this wasn’t a very scientific approach.
You’re now yelling at me: std::endl is equivalent to "\n" AND a buffer flush!!!! Yes, it is. And this seems to be a very costly operation. And the numbers above show just how costly this is on two different platforms.
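The flush itself can be made visible without touching a file. The sketch below is an illustration, not part of the original test code: CountingBuf and the two helper functions are hypothetical names. It relies on the fact that std::endl calls flush() on the stream, which ends up in the stream buffer's virtual sync() function.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>

// Hypothetical helper: an in-memory stream buffer that counts how many
// times the stream asks it to synchronize (that is, to flush).
class CountingBuf : public std::stringbuf {
public:
    std::size_t flushes = 0;

protected:
    int sync() override {
        ++flushes;   // one call per flush request
        return 0;    // report success; nothing to write out for a string buffer
    }
};

// Writes `lines` integers terminated by std::endl and returns the flush count.
std::size_t flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << i << std::endl;
    }
    return buf.flushes;
}

// Writes `lines` integers terminated by '\n' and returns the flush count.
std::size_t flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << i << '\n';
    }
    return buf.flushes;
}
```

With these helpers, flushes_with_endl(200000) reports one flush per line, while flushes_with_newline(200000) reports none. For a real ofstream, each of those flushes pushes the buffered data out to the operating system, which is where the time goes.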
Conclusion?
Avoid std::endl unless you need to flush. Heh. Also, C# performance is amazing. Also, there are no obvious differences in performance during basic I/O operations between stdio.h and fstream.
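If a flush is needed at a specific point (say, once the whole batch is written), it can be requested explicitly instead of on every line. A minimal sketch, with write_lines and lines_to_string as hypothetical names chosen for illustration:

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes one value per line using '\n', then flushes exactly once at the end.
void write_lines(std::ostream& out, const std::vector<int>& values) {
    for (int v : values) {
        out << v << '\n';   // newline only, no flush per line
    }
    out << std::flush;      // a single explicit flush, where it is actually wanted
}

// Small usage example: run write_lines against an in-memory stream.
std::string lines_to_string(const std::vector<int>& values) {
    std::ostringstream out;
    write_lines(out, values);
    return out.str();
}
```

Because write_lines takes a std::ostream&, the same function works with an ofstream for the real output file. The ofstream would also flush on close, so the explicit std::flush matters mostly when the stream stays open afterwards.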