6
\$\begingroup\$

I didn't realize I am not a very good programmer at recently. I just come out from Uni, I want to become a reliable teammate. Please help me to improve my coding style/habbits. I have carefully re-written a file indexer class which I didn't do very well in my work. I come here, hope I can improve it in terms of code correctness, best practices and design pattern usage.

--FileIndexer.h--

#ifndef FILE_INDEXER_90738592_HA8965371_DE0E_4ff7_95A0_11B956535E12_H // INDEXER is a name which is too common. It has potential to conflict with other macros. it's better to use a descriptive name.
#define FILE_INDEXER_90738592_HA8965371_DE0E_4ff7_95A0_11B956535E12_H
#include <vector>
#include <map>
#include <string>
// Try to avoid including the entire std, since it contaminates the namespace and may cause ambiguous matches.
//using namespace std; 
class FileIndexer // : public baseClass, it's a wise way to inherit basic functionalities from its super class, if necessary.
{
public :
 typedef unsigned int FileType;
 enum File_Type_Action
 {
 FILE_NEED_NOACTION = 0x0000,
 FILE_NEED_BACKUP = 0x0001,
 FILE_NEED_SCAN = 0x0002
 };
 struct FileTypeInfo
 {
 FileType type;
 unsigned actionRequired;
 };
 typedef std::map<std::wstring, FileTypeInfo> DynamicFileType; // we use the map data structure to allow to add more file types easily in future. 
 struct FileDef
 {
 // using the standard constant FILENAME_MAX is much safer than a hard-coded constant.
 wchar_t drive[_MAX_DRIVE];
 wchar_t name[_MAX_FNAME];
 wchar_t path[_MAX_DIR];
 wchar_t ext[_MAX_EXT];
 _fsize_t size;
 unsigned int type;
 };
 FileIndexer ();
 virtual ~FileIndexer ();
 unsigned int getMaxIndex ()
 {
 return m_maxIndices;
 }
 bool setMaxIndices (const unsigned int& maxIndices);
 bool processDirectory (const wchar_t* name);
 bool getFirstFile (const wchar_t* ext, unsigned int& fileNo, FileDef* fd);
 bool getNextFile (unsigned int& fileNo, FileDef* fd);
 void listFiles (const wchar_t* ext);
 bool needsAction (const File_Type_Action& action, const FileDef* fd);
 bool addSupportedFileType (const wchar_t* ext, const File_Type_Action& actionRequired = FILE_NEED_NOACTION);
 void listSupportedFileTypes ();
 // a possible functionality in future : import multiple file types
 bool importSupportFileTypes (const wchar_t** exts, const File_Type_Action* actionRequired) {};
 // a possible functionality in future : export multiple file types to a file
 bool exportSupportFileTypes (const wchar_t* filename) {};
 void _testIndexer ();
protected:
 DynamicFileType m_supportedFileTypes; //dynamical supported File types
 std::vector<FileDef> m_fileIndices; //allow more flexible memory allocation for file indexing
 unsigned int m_maxIndices; // the maximum number of indices supported
private:
 //Since data members are not pointer types, the default copy constructor and the default assignment operator are safe to use.
 //FileIndexer (FileIndexer const & other );
 //FileIndexer& operator= (FileIndexer const & other );
};
#endif //INDEXER_H

--FileIndexer.cpp--

#include "stdafx.h"
#include "FileIndexer.h"
#include "stdlib.h"
#include "stdio.h"
#include "io.h"
#include <stack>
FileIndexer::FileIndexer()
{
 m_maxIndices = 1024;
 m_fileIndices.reserve( m_maxIndices ); // This saves computational time for std::vector repeatly realllocating the memory, when the elements are added one by one.
 // Initialise the supported file types
 FileTypeInfo tmp;
 tmp.type = 1;
 tmp.actionRequired = FILE_NEED_NOACTION;
 m_supportedFileTypes[std::wstring(L".txt")] = tmp;
 tmp.type = 2;
 tmp.actionRequired = FILE_NEED_NOACTION;
 m_supportedFileTypes[std::wstring(L".xml")] = tmp;
 tmp.type = 3;
 tmp.actionRequired = FILE_NEED_SCAN;
 m_supportedFileTypes[std::wstring(L".exe")] = tmp;
 tmp.type = 4;
 tmp.actionRequired = FILE_NEED_BACKUP;
 m_supportedFileTypes[std::wstring(L".doc")] = tmp;
 tmp.type = 5;
 tmp.actionRequired = FILE_NEED_NOACTION;
 m_supportedFileTypes[std::wstring(L".xls")] = tmp;
 tmp.type = 6;
 tmp.actionRequired = FILE_NEED_NOACTION;
 m_supportedFileTypes[std::wstring(L".ppt")] = tmp;
}
FileIndexer::~FileIndexer()
{
 m_maxIndices = 0;
}
bool FileIndexer::setMaxIndices (const unsigned int& maxIndices)
{
 if (maxIndices < m_fileIndices.max_size() ) //This checks against the theoretical limit on the size of a vector.
 {
 m_maxIndices = maxIndices;
 return true;
 }
 else
 return false;
}
bool FileIndexer::addSupportedFileType(const wchar_t* ext, const File_Type_Action& _actionRequired)
{
 if (!ext) return false;
 std::wstring s(ext);
 DynamicFileType::iterator it = m_supportedFileTypes.find(s);
 if (it == m_supportedFileTypes.end())
 {
 FileTypeInfo tmp;
 tmp.type = m_supportedFileTypes.size() + 1;
 tmp.actionRequired = _actionRequired;
 m_supportedFileTypes[s] = tmp;
 }
 else
 {
 it->second.actionRequired = _actionRequired;
 }
 return true;
}
bool FileIndexer::processDirectory(const wchar_t* name) 
{
 if ( !name ) return false;
 if (m_fileIndices.size() >= m_maxIndices) return false;
 std::stack<std::wstring> pathNameST; //using stack is much more computationally efficient than the recursive function
 std::wstring initialPath = name;
 pathNameST.push(initialPath);
 std::wstring curS, search, fname;
 FileDef f;
 DynamicFileType::const_iterator it;
 do
 {
 curS = pathNameST.top();
 pathNameST.pop();
 search = curS + L"\\*";
 _wfinddata_t fd;
 intptr_t handle = _wfindfirst(search.c_str(), &fd);
 if (handle != -1)
 {
 do
 {
 // if it's a sub-dir and not ".." and "." string, then push to stack
 if ( fd.attrib & _A_SUBDIR && wcscmp(fd.name, L"..") && wcscmp(fd.name, L"."))
 {
 fname = curS + L"\\" + fd.name;
 pathNameST.push(fname);
 } /* else if it's a file (except system files) then */
 else if (!(fd.attrib & _A_SYSTEM) && !(fd.attrib & _A_SUBDIR))
 {
 f.size = fd.size;
 fname = curS + L"\\" + fd.name;
 _wsplitpath( fname.c_str(), f.drive, f.path, f.name, f.ext );
 it = m_supportedFileTypes.find ( std::wstring(f.ext) );
 if ( it != m_supportedFileTypes.end())
 {
 f.type = it->second.type;
 m_fileIndices.push_back(f);
 // if fileIndex.size is equal to the maximum number of indices, the process should be terminated.
 if (m_fileIndices.size() == m_maxIndices)
 {
 _findclose (handle);
 return true;
 }
 }
 }
 }
 while (_wfindnext(handle, &fd) == 0);
 }
 else
 {
 int err;
 _get_errno(&err);
 switch (err)
 {
 case EINVAL:
 case ENOMEM:
 _findclose (handle);
 return false;
 case ENOENT:
 break;
 default:
 _findclose (handle);
 return false;
 }
 }
 _findclose (handle);
 }
 while (!pathNameST.empty());
 return true;
}
bool FileIndexer::getFirstFile(const wchar_t* ext, unsigned int& fileNo, FileDef* fd)
{
 if (!ext || !fd) return false; 
 DynamicFileType::const_iterator it;
 bool ret = false;
 it = m_supportedFileTypes.find(std::wstring(ext));
 if (it == m_supportedFileTypes.end()) return false; 
 FileType type = it->second.type;
 for (unsigned int i = 0; i < (unsigned int) m_fileIndices.size(); i++)
 if (m_fileIndices[i].type == type)
 {
 *fd = m_fileIndices[i];
 fileNo = i;
 ret = true;
 break;
 }
 return ret;
}
bool FileIndexer::getNextFile(unsigned int& fileNo, FileDef* fd)
{
 if (!fd) return false; 
 bool ret = false;
 for (unsigned int i = fileNo + 1; i < (unsigned int) m_fileIndices.size(); i++)
 if (m_fileIndices[i].type == m_fileIndices[fileNo].type)
 {
 *fd = m_fileIndices[i];
 fileNo = i;
 ret = true;
 break;
 }
 return ret;
}
void FileIndexer::listSupportedFileTypes ()
{
 DynamicFileType::const_iterator it;
 wprintf(L"%10s%15s\r\n", L"File Type", L"Extension");
 for (it = m_supportedFileTypes.begin(); it != m_supportedFileTypes.end(); it++)
 {
 wprintf(L"%5u%15s\r\n", it->second.type , it->first.c_str());
 }
}
void FileIndexer::listFiles(const wchar_t* ext)
{
 FileDef fd;
 unsigned int fileNo;
 if ( getFirstFile(ext, fileNo, &fd) )
 {
 wprintf(L"%s files:\n", ext);
 wprintf(L"%90s%15s\r\n", L"NAME", L"SIZE");
 do
 {
 wprintf(L"%90s%15i\r\n", fd.name, fd.size);
 }
 while ( getNextFile(fileNo, &fd));
 }
}
bool FileIndexer::needsAction(const File_Type_Action& action, const FileDef* fd)
{
 if (!fd) return false;
 DynamicFileType::const_iterator it = m_supportedFileTypes.find ( std::wstring(fd->ext));
 if (it != m_supportedFileTypes.end())
 {
 return static_cast<bool>(it->second.actionRequired & action);
 }
 else
 return false;
}
void FileIndexer::_testIndexer()
{
 bool ret1 = setMaxIndices(3024);
 bool ret2 = addSupportedFileType(L".pdf");
 listSupportedFileTypes();
 bool ret3 = processDirectory(L"D:\\");
 bool ret4 = processDirectory(L"C:\\");
 //processDirectory("C:\\test directory1" );
 //processDirectory("C:\\test directory2" );
 listFiles(L".exe");
 listFiles(L".txt");
 listFiles(L".xml");
 listFiles(L".bat");
}

The code is compiled and tested in Windows. I have already tried to carefully improve it by myself. I know there are still lots of things I could miss. Please help me on this.

200_success
145k22 gold badges190 silver badges478 bronze badges
asked Feb 24, 2013 at 12:40
\$\endgroup\$
1
  • \$\begingroup\$ You should certainly get rid of "stdio.h" and "stdlib.h" \$\endgroup\$ Commented Feb 24, 2013 at 13:56

2 Answers 2

5
\$\begingroup\$

Presumable this is supposed to be enum File_Type_Action:

struct FileTypeInfo
{
 FileType type;
 unsigned actionRequired; // File_Type_Action actionRequired;
};

Prefer std::wstring over character array:

struct FileDef
{
 wchar_t drive[_MAX_DRIVE];
 wchar_t name[_MAX_FNAME];
 wchar_t path[_MAX_DIR];
 wchar_t ext[_MAX_EXT];
 _fsize_t size;
 unsigned int type;
};

This is not a polymorphic type. So no need to have a virtual destructor:

/* virtual*/ ~FileIndexer ();
// ^^^^^^^^^ Remove (unless you want people to inhert and extend!

By making you destructor virtual you are indicating that this class can be extended. But you have no virtual methods. So I see no point in using vvirutal here (it is also misleading).

Pass by reference rather than pointer:

bool getFirstFile (const wchar_t* ext, unsigned int& fileNo, FileDef* fd);
// ^^^^ ^^^^^

In the case of ext: I would use std::wstring const&.
Unless fd can be NULL pass by reference. This is an indication that you don't accept NULL values. FileDef& fd

You can simplify and make this more readable:

FileTypeInfo tmp;
tmp.type = 1;
tmp.actionRequired = FILE_NEED_NOACTION;
m_supportedFileTypes[std::wstring(L".txt")] = tmp;
// In C++03 add a constructor.
m_supportedFileTypes[std::wstring(L".txt")] = FileTypeInfo(1, FILE_NEED_NOACTION);
// In C++11 use the new syntax:
m_supportedFileTypes[std::wstring(L".txt")] = {1, FILE_NEED_NOACTION};

The destructor is useless. Remove it:

FileIndexer::~FileIndexer()
{
 m_maxIndices = 0; // The object is gone after this.
 // Thus this value no longer exists.
 // Thus accessing it would be undefined behavior
 // Thus is does not matter its value.
}

Try not to use multiple lines to do very simple tasks. It makes it harder to read:

std::stack<std::wstring> pathNameST;
std::wstring initialPath = name;
pathNameST.push(initialPath);
// Can be done like this:
pathNameST.push(name);
answered Feb 24, 2013 at 19:38
\$\endgroup\$
3
  • \$\begingroup\$ thank you for your great work. 1. if one file type has multiple actions required to be performed. For example, ".exe" is required to be scanned and backed up. Then, actionRequired = FILE_NEED_BACKUP | FILE_NEED_SCAN; Is this a right way to do? \$\endgroup\$ Commented Feb 25, 2013 at 4:34
  • \$\begingroup\$ I used void _wsplitpath( const wchar_t *path, wchar_t *drive, wchar_t *dir, wchar_t *fname, wchar_t *ext ); which splits the full path name into separate wchar_t * buffers. Can I use std::string in this case? or are there any better way to accompanish this task if I want to use std::wstring? \$\endgroup\$ Commented Feb 25, 2013 at 5:03
  • \$\begingroup\$ boost::filesystem provides a comprehensive set of file system manipulation tools. \$\endgroup\$ Commented Feb 25, 2013 at 6:27
4
\$\begingroup\$

Here's a few things you may want to consider:

  1. Remove commented out code and statements right away. Particularly the ones that seems to be left in there for educational purposes are just noise.

  2. Wrap your class into a namespace, if this is part of a project you should at least have a project scope namespace.

  3. Don't use fixed length char arrays for strings, this is C++ so use std::string or another suitable string implementation instead.

  4. Make the FileDef type a class with it's own methods instead of manipulating the members directly. It can probably be generalized so it won't need to be a private class under FileIndexer.

You may also want to consider breaking the class into separate smaller classes, like DirectoryProcessor, FileIterator etc.

answered Feb 24, 2013 at 14:16
\$\endgroup\$
4
  • \$\begingroup\$ For 3, I used void _wsplitpath( const wchar_t *path, wchar_t *drive, wchar_t *dir, wchar_t *fname, wchar_t *ext ); which splits the full path name into separate wchar_t * buffers. Can I use std::string in this case? or are there any better way to accompanish this task? \$\endgroup\$ Commented Feb 25, 2013 at 5:01
  • \$\begingroup\$ For 4, I put it inside the FileIndex class, because I try to design it simple and specific for the FileIndexing. I don't know how much efforts need to design a general FileDef that deals with all situations for the various formats. So I keep it simple and put it inside the class to avoid conflicting with other implementations. Thank you. I will improve my code accordingly. \$\endgroup\$ Commented Feb 25, 2013 at 5:43
  • \$\begingroup\$ _wsplitpath is not a portable function, and looks unsafe too. Wrap it in an independent class or function, which can use std::string. As for #4, the namespace (#2) will avoid conflict with other implementations. \$\endgroup\$ Commented Feb 25, 2013 at 8:05
  • \$\begingroup\$ "Wrap it in an independent class or function, which can use std::string." I think this is a nice way to make it more general. Thanks lot! \$\endgroup\$ Commented Feb 26, 2013 at 7:59

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.