Microsoft have made it impossible to use any standard C or C++ library facilities for opening/closing files on Windows (1).
To the naïve observer, fopen or fstream may appear to do the job, but they only work as long as the path you want to open can be represented in the codepage of the current System Locale. This is because Microsoft, in their wisdom, decided against using UTF-8 stored in a char array as the canonical representation of a path; instead, they chose UTF-16 stored in a wchar_t array. The result is that a char array, the standard way to represent paths in C and hence C++, cannot represent all the possible paths on a Windows system. Hence, fstream and the other standard facilities cannot be used reliably on a Windows system, because they only deal with char paths.
I've tried to convert wchar_t paths to char with wcstombs and similar functions, but for paths that cannot be represented in the system locale this merely produces paths in which the offending characters are replaced by ? characters. For example, consider the following program:
```cpp
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

#include <windows.h>

int main () {
    // Enumerate every entry in the current directory using the wide-character API.
    WIN32_FIND_DATAW d;
    HANDLE h = FindFirstFileW (L"*", &d);
    if (h == INVALID_HANDLE_VALUE)
        throw std::runtime_error ("FindFirstFile failure");

    for (bool more = true; more; more = FindNextFileW (h, &d) != 0) {
        // Convert the UTF-16 name to the multibyte encoding of the current locale.
        char mb_path[MAX_PATH] = {'\0'};
        std::size_t n = std::wcstombs (mb_path, d.cFileName, MAX_PATH - 1);
        // wcstombs reports failure as (size_t)-1; the byte count it returns on
        // success need not equal the number of wide characters.
        if (n == static_cast<std::size_t> (-1))
            throw std::runtime_error ("wcstombs failed");

        std::cout << mb_path;
        std::cout << '\t';

        // Try to open the file through the converted char path.
        std::ifstream f (mb_path);
        std::cout << (f ? "ok" : "error") << '\n';
    }

    if (GetLastError () != ERROR_NO_MORE_FILES)
        throw std::runtime_error ("FindNextFileW failure");

    if (! FindClose (h))
        throw std::runtime_error ("FindClose failure");
}
```
If run in a directory containing a file with a name such as bi☣hazard 鉄人.txt, it will produce the following output:
```
bi?hazard ??.txt    error
wopen.cpp           ok
wopen.exe           ok
```
Microsoft's C implementation provides alternative facilities that behave like the standard facilities, but that take their paths as wchar_t*, such as _wfopen. Portable programs will have to use these functions and somehow parametrize the type that they use for storing paths.
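One way to parametrize the path type is sketched below. The names native_path, NATIVE_TEXT and open_native are hypothetical, invented for this illustration; only _wfopen and fopen are real library functions. The sketch assumes that _WIN32 is defined exactly when _wfopen is available.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#ifdef _WIN32
#  include <wchar.h>                 // _wfopen (Microsoft extension)
typedef std::wstring native_path;    // hypothetical name: wide paths on Windows
#  define NATIVE_TEXT(s) L##s        // hypothetical macro: wide string literal
#else
typedef std::string native_path;     // plain char paths everywhere else
#  define NATIVE_TEXT(s) s
#endif

// Opens `path` with an fopen-style `mode`, using whichever function accepts
// the platform's native path type. Returns a null pointer on failure, exactly
// as fopen and _wfopen do.
inline std::FILE* open_native (const native_path& path, const native_path& mode) {
    assert (!mode.empty ());
#ifdef _WIN32
    return _wfopen (path.c_str (), mode.c_str ());
#else
    return std::fopen (path.c_str (), mode.c_str ());
#endif
}
```

A caller would write open_native (NATIVE_TEXT ("log.txt"), NATIVE_TEXT ("w")) and then use the returned FILE* with the ordinary stdio functions; only the code that builds paths needs to know about the typedef.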
Microsoft's C++ implementation also extends the standard classes such as std::basic_fstream so that they can deal with wchar_t* paths. However, this extension is not available in other C++ implementations, such as MinGW, which uses the GNU Standard C++ Library.
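As a rough illustration of relying on that extension, the sketch below picks the path type according to the compiler. The names stream_path, STREAM_TEXT and write_line are hypothetical, and testing _MSC_VER is only an assumption about how to detect Microsoft's library; with any other implementation the code falls back to ordinary char paths and so inherits the limitation described above.

```cpp
#include <cassert>
#include <fstream>
#include <string>

#if defined(_MSC_VER)
typedef std::wstring stream_path;   // hypothetical name: wchar_t paths (MSVC extension)
#  define STREAM_TEXT(s) L##s       // hypothetical macro: wide string literal
#else
typedef std::string stream_path;    // standard char paths elsewhere
#  define STREAM_TEXT(s) s
#endif

// Writes one line to the file at `path`, replacing any previous contents.
// With Microsoft's library the constructor call resolves to the non-standard
// wchar_t* overload; elsewhere it is the standard const char* constructor.
inline bool write_line (const stream_path& path, const std::string& line) {
    assert (!path.empty ());
    std::ofstream out (path.c_str ());
    out << line << '\n';
    out.flush ();
    return out.good ();
}
```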
The solution below consists of a basic_win32_ofstream class, which extends the standard basic_ostream class. It uses _wopen to actually open a file; it then gives the resulting file descriptor to a stdio_filebuf, which it attaches to its basic_ostream. The rest of the program can then use the standard std::ostream interface as normal.
I chose to use _wopen & file descriptors rather than _wfopen & FILE*s because the stdio_filebuf automatically closes the former when it is destroyed, but not the latter.
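The difference in ownership can be seen in a small sketch that uses char paths for brevity. It assumes the GNU Standard C++ Library, since stdio_filebuf is a libstdc++ extension; the helper names open_for_writing, write_via_fd and write_via_file_pointer are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <ostream>
#include <string>

#include <fcntl.h>
#include <sys/stat.h>
#include <ext/stdio_filebuf.h>   // __gnu_cxx::stdio_filebuf (libstdc++ extension)
#ifdef _WIN32
#  include <io.h>                // _open
#endif

// Hypothetical helper: create or truncate `path`, returning a descriptor or -1.
inline int open_for_writing (const char* path) {
    assert (path != 0);
#ifdef _WIN32
    return _open (path, _O_WRONLY | _O_CREAT | _O_TRUNC, _S_IREAD | _S_IWRITE);
#else
    return open (path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
#endif
}

// Descriptor variant: the filebuf owns fd, so there is no explicit close.
inline bool write_via_fd (const char* path, const std::string& text) {
    int fd = open_for_writing (path);
    if (fd == -1)
        return false;
    __gnu_cxx::stdio_filebuf<char> buf (fd, std::ios_base::out);
    std::ostream os (&buf);
    os << text;
    return os.flush ().good ();
}   // buf is destroyed here, which closes fd

// FILE* variant: the filebuf only borrows fp, so the caller must fclose it.
inline bool write_via_file_pointer (const char* path, const std::string& text) {
    std::FILE* fp = std::fopen (path, "w");
    if (!fp)
        return false;
    bool ok;
    {
        __gnu_cxx::stdio_filebuf<char> buf (fp, std::ios_base::out);
        std::ostream os (&buf);
        os << text;
        ok = os.flush ().good ();
    }   // buf is destroyed here, but fp stays open
    return std::fclose (fp) == 0 && ok;
}
```

With the FILE* variant, a stream class would have to remember the FILE* and close it in its own destructor; with a descriptor, the filebuf takes care of that.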
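```cpp
#include <cerrno>
#include <cstring>
#include <ostream>
#include <string>

#include <fcntl.h>               // _O_* flags
#include <io.h>                  // _wopen
#include <sys/stat.h>            // _S_IREAD, _S_IWRITE
#include <ext/stdio_filebuf.h>   // __gnu_cxx::stdio_filebuf (libstdc++ extension)
#include <tr1/memory>            // std::tr1::shared_ptr

template <typename Ch, typename Tr = std::char_traits<Ch> >
class basic_win32_ofstream: public std::basic_ostream<Ch, Tr> {
    std::tr1::shared_ptr<__gnu_cxx::stdio_filebuf<Ch, Tr> > buf;
    int error_code;
public:
    // Start with no buffer attached; one is installed only if _wopen succeeds.
    basic_win32_ofstream (const std::wstring& path)
        : std::basic_ostream<Ch, Tr> (0), error_code (0) {
        int fd = _wopen (path.c_str (),
                         _O_WRONLY | _O_TEXT | _O_CREAT | _O_TRUNC,
                         _S_IREAD | _S_IWRITE);

        if (fd == -1) {
            error_code = errno;   // remember why the open failed
            this->setstate (std::ios_base::badbit);
            return;
        }

        // The filebuf takes ownership of fd and closes it when destroyed.
        buf.reset (new __gnu_cxx::stdio_filebuf<Ch, Tr> (fd, std::ios_base::out));
        this->rdbuf (buf.get ());   // attaching a buffer also clears the error state
    }

    // Describes the failure recorded by the constructor, if any.
    const char* error () const {
        return std::strerror (error_code);
    }
};

typedef basic_win32_ofstream<char> win32_ofstream;
```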
Ugly, but at least the ugliness can be confined to a Windows-specific implementation module. A more complete and powerful alternative would be to use Boost.Filesystem, which provides its own fstream classes that can be initialized from its own platform-independent path classes.
(1) I can't possibly think of the reason why Microsoft would want this to be the case...