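九天雁翎's Blog
If you want to succeed in the software business, use the most powerful language you know, use it to solve the hardest problem you know, and wait for your competitors' managers to settle for mediocrity. -- Paul Graham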

Book Index Creator


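Many C++ books published abroad come with an index, which keeps them valuable as references long after the first read. Classics such as "TC++PL" and "C++ Primer" all have one, and "The C++ Standard Library" is especially well known for its rich cross-references, which helped make it a classic too. By contrast, most books published in China, and even some foreign classics, have no index, so going back to look something up after reading is inconvenient. I felt this most strongly while reading "Programming Windows with MFC". So I decided to write a program that makes it easy to build such an index yourself. It should be even more useful to people who read scanned, image-based e-books from the web, where searching is harder still.

The program deliberately uses wstring for its strings to make Chinese text easier to handle, although in my experience string can cope with Chinese in many cases as well.

This is a very simple tool for building a book index. Users only need to watch two things. First, the '!' that ends input must be the ASCII (half-width) exclamation mark. Second, the file being read must really have been created by this program, or at least follow the format this program writes; otherwise the results are not guaranteed.

Having everyone build an index for their own books would be tedious, but if people are willing to share the indexes they create, everyone can use them, like any other shared resource. Also, if anyone modifies the program, I hope they keep existing indexes usable, that is, stay backward compatible. If there is a better way for the program to handle things, please at least provide a tool that converts existing files. Thanks for using it. You can download a compiled binary from http://groups.google.com/group/jiutianfile/files.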
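To make the file format concrete, here is a minimal sketch of how an index can be held in memory and rendered as text. It assumes the layout used by the listing in this post: the term left-aligned in a 40-column field, followed by the page numbers, each ending with a comma. The names `Index`, `addEntry` and `renderIndex` are made up for this illustration and do not appear in the program.

```cpp
#include <iomanip>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical type: each term maps to its pages.
// std::set keeps the pages sorted and drops duplicates.
typedef std::map<std::wstring, std::set<int> > Index;

// Hypothetical helper: record that `term` appears on `page`.
void addEntry(Index& index, const std::wstring& term, int page) {
    index[term].insert(page);  // operator[] creates an empty set for a new term
}

// Hypothetical helper: render one line per term, assuming the
// "term padded to 40 columns, then 3,12," layout.
std::wstring renderIndex(const Index& index) {
    std::wostringstream out;
    for (Index::const_iterator it = index.begin(); it != index.end(); ++it) {
        out << std::left << std::setw(40) << it->first;
        for (std::set<int>::const_iterator p = it->second.begin();
             p != it->second.end(); ++p) {
            out << *p << L',';
        }
        out << L'\n';
    }
    return out.str();
}
```

Adding pages 12, 3 and 12 for the same term yields a single line whose page list is `3,12,`. One thing worth noting about this layout: a term of 40 characters or more leaves no space before the page list, so a reader that splits on whitespace could no longer separate the two.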

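Usage: BookIndexCreator filename.txt [ -s | -i | -c ]

filename.txt   the name of the file to process

-s   search for an entry in filename.txt.

-i   insert entries into filename.txt.

-c   create a new index as filename.txt.

If the file name contains spaces, enclose it in quotation marks. Only one of the three options may be used at a time.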
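As a rough sketch of the option handling described above, the flag can be mapped to a mode in one small function. The names `Mode` and `parseMode` are hypothetical; the program itself uses `strcmp` and an enum called `argumentType`. Like the program, the sketch accepts both lower-case and upper-case letters.

```cpp
#include <string>

// Hypothetical enum for this sketch; Invalid covers any unrecognised flag.
enum class Mode { Create, Insert, Search, Invalid };

// Map a flag such as "-s" or "-S" to a mode.
Mode parseMode(const std::string& flag) {
    if (flag.size() != 2 || flag[0] != '-') {
        return Mode::Invalid;
    }
    switch (flag[1]) {
        case 'c': case 'C': return Mode::Create;
        case 'i': case 'I': return Mode::Insert;
        case 's': case 'S': return Mode::Search;
        default:            return Mode::Invalid;
    }
}
```

A caller would print the help text and exit when `parseMode` returns `Mode::Invalid`, which is also what the program does for an unknown option.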

 

The source code follows:

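// ================================================================
//
//  Copyright (C) 2007  九天雁翎
//
//  ---------------------------------------------------------------
// This is free, open-source software. I hope it is useful to you or
// helps with your studies. You may modify it in any way under the
// terms of the GNU General Public License. I accept no liability
// for any problems caused by running this software.
// ################################################################
//
//  Author:  九天雁翎
//  Program: Book Index Creator
//  File:    BookIndexCreator.cpp
//  Version: 0.1
//  Description:
//    A very simple tool for building a book index. Users only need to
//    note that the '!' that ends input must be the ASCII exclamation
//    mark, and that the file being read must have been created by this
//    program or follow the format it writes; otherwise the results are
//    not guaranteed. Having everyone build an index for their own books
//    would be tedious, but if people share the indexes they create,
//    everyone can use them. If anyone modifies the program, please keep
//    existing indexes usable, that is, stay backward compatible. If
//    there is a better way to handle things, please at least provide a
//    tool that converts existing files. Thanks for using it.
//  Download: groups.google.com/group/jiutianfile
//  Blog: blog.csdn.net/vagrxie
//  E-mail: vagr@163.com
//  QQ      : 375454
//
//  You are welcome to post on the pages above or write to me to discuss
//  the program, report bugs, or suggest improvements.
//
//  Last modified: 2007
// ################################################################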

 

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <cstring>
#include <cctype>     // tolower
#include <cstdlib>    // exit, EXIT_FAILURE
#include <algorithm>
#include <map>
#include <set>
#include <boost/lexical_cast.hpp>
using namespace std;

wfstream file; // the index file being processed
enum argumentType{TYPEC,TYPEI,TYPES};  // which command-line option was given

 

///////////////////////////////////////////////////////////////////////////////
// Print the help text
///////////////////////////////////////////////////////////////////////////////

void printHelp()
{
    cout << "How to use the BookIndexCreator:\n"
         << "bic filename.txt [ -s | -i | -c ]\n"
         << "No argument: display this help (the same as ?)\n"
         << " -s      Search for an entry in filename.txt\n"
         << " -i      Insert more entries into filename.txt\n"
         << " -c      Create a new index as filename.txt\n"
         << " Notice: if the filename contains spaces, you must quote it.\n"
         << "**********************************************************************\n"
         << "Let's make books more useful!\n"
         << "You can also search an index in any text editor once it has been created"<<endl;
}

 

///////////////////////////////////////////////////////////////////////////////
// Read and validate a Y/N answer
///////////////////////////////////////////////////////////////////////////////

bool checkYN(string wrongInformation)
{
    string tempInput;
    int retries = 0;   // invalid answers so far
    while(cin >> tempInput)
    {
        // lower-case every answer, including retries, so Y, Yes and YES all work
        transform(tempInput.begin(),tempInput.end(),tempInput.begin(),::tolower);
        if(tempInput == "y" || tempInput == "yes")
        {
            return true;
        }
        if(tempInput == "n" || tempInput == "no")
        {
            return false;
        }
        if(++retries > 5) // number of retries allowed
        {
            return false;
        }
        cout <<wrongInformation <<"? (Yes/No)" <<endl;
    }
    return false;   // input ended without a valid answer
}

 

///////////////////////////////////////////////////////////////////////////////
// Parse the command-line arguments
///////////////////////////////////////////////////////////////////////////////

argumentType checkArgument(int argc, char *argv[])
{
    if(argc < 3)
    {
        printHelp();
        exit(EXIT_FAILURE);
    }
    string filename(argv[1]);
    if(!strcmp(argv[2], "-c") || !strcmp(argv[2], "-C"))
    {
        ifstream ifile(filename.c_str()); // temporary stream: checks whether the file already exists
        if(ifile)
        {
            ifile.close();
            cout<<"A file with the same name already exists. Overwrite "<<filename<<"? (Yes/No)"<<endl;

            // if the user declines to overwrite, the program can only exit
            if(!checkYN("A file with the same name already exists. Overwrite"))
            {
                cerr << "No file was created!" <<endl;
                exit(EXIT_FAILURE);
            }
        }

        // open the file
        file.open(filename.c_str(), ios::out | ios::trunc);
        if(file)
        {
            cout << "Created an index named \""<<filename<<"\""<<endl;
        }
        else
        {
            cerr << "Failed to create the file!" <<endl;
            exit(EXIT_FAILURE);
        }
        return TYPEC;     // the option was -c
    }
    else if(!strcmp(argv[2], "-i") || !strcmp(argv[2], "-I"))
    {
        file.open(filename.c_str());
        if(!file)
        {
            cerr << "There is no file named " <<filename <<"!"<<endl;
            exit(EXIT_FAILURE);
        }
        return TYPEI; // the option was -i
    }
    else if(!strcmp(argv[2], "-s") || !strcmp(argv[2], "-S"))
    {
        file.open(filename.c_str(), ios::in);
        if(!file)
        {
            // narrow literal: a wide literal sent to cerr would print as a pointer value
            cerr << "There is no file named " <<filename <<"!"<<endl;
            exit(EXIT_FAILURE);
        }
        return TYPES;      // the option was -s
    }
    else      // unrecognised option
    {
        printHelp();
        exit(EXIT_FAILURE);
    }
}

int main(int argc, char *argv[])
{
    switch(checkArgument(argc,argv))
    {

///////////////////////////////////////////////////////////////////////////////
// Option -c: create a new index file
///////////////////////////////////////////////////////////////////////////////

 

        case TYPEC:
            {
                typedef map< wstring, set<int> >::iterator MapIter;
                typedef set<int>::iterator SetIter;
                map< wstring, set<int> > index;
                cout<<"Enter index entries in the form \"xxxxx 123\"\n"
                    <<"xxxxx is the term to be looked up.\n"
                    <<"123 is the page number of xxxxx in the book.\n"
                    <<"End input with '!'"<<endl;
                wstring content;
                int pageNumber;

                // keep reading until the user enters '!'
                while(true)
                {
                    while(wcin >>content)
                    {
                        // '!' ends the input
                        if(content == L"!")
                        {
                            break;
                        }

                        // guard against bad input: leave the inner loop if no page number can be read
                        if( !(wcin >>pageNumber))
                        {
                            break;
                        }

                        // the set keeps page numbers sorted and free of duplicates;
                        // operator[] creates an empty set when the key is new
                        index[content].insert(pageNumber);
                        cout<<"OK,inserted,input next:" <<endl;
                        content.clear();
                    }

                    // stop at '!' or when the input stream has ended
                    if(content == L"!" || wcin.eof())
                    {
                        break;
                    }
                    wcin.clear();    // recover from the failed read so input can continue

                    // read the offending token once more and discard it
                    wcin >> content;
                    content.clear();
                    cout <<"The last line was invalid, please continue:" <<endl;
                }

                // write the index to the file
                for(MapIter iter = index.begin(); iter != index.end(); ++iter)
                {
                    file <<left <<setw(40)<<iter->first ;
                    for(SetIter pos = iter->second.begin(); pos != iter->second.end(); ++pos)
                    {
                        file <<*pos <<L",";
                    }
                    file <<endl;
                }
                break;
            }

 

///////////////////////////////////////////////////////////////////////////////
// Option -i: insert entries into an existing index
///////////////////////////////////////////////////////////////////////////////

        case TYPEI:
            {
                typedef map<wstring, wstring>::iterator MapIter;
                map<wstring, wstring> index;
                wstring content;
                int pageNumber;
                wstring strNumbers;

                // read in the existing file first
                while(file >>content >>strNumbers)
                {
                    index[content] = strNumbers;
                    static int progress = 0;
                    ++progress;
                    if( !(progress % 50) )   // print a progress dot every 50 entries
                    {
                        cout<<".";
                    }
                }
                cout<<endl;
                cout<<"OK, the file was read successfully. Enter the entries you want to insert:" <<endl
                    <<"Enter index entries in the form \"xxxxx 123\"\n"
                    <<"xxxxx is the term to be looked up.\n"
                    <<"123 is the page number of xxxxx in the book.\n"
                    <<"End input with '!'"<<endl;

                // keep reading until the user enters '!'
                while(true)
                {
                    while(wcin >>content)
                    {
                        // '!' ends the input
                        if(content == L"!")
                        {
                            break;
                        }

                        // stop on bad input
                        if( !(wcin >>pageNumber))
                        {
                            break;
                        }

                        // if the key already exists, add the page to its list
                        MapIter pos = index.find(content);
                        if(pos != index.end())
                        {
                            set<int> intSet;  // temporary set: sorts and removes duplicates automatically

                            // convert the ','-separated wstring into ints
                            wstring::size_type idx1 = 0;
                            wstring::size_type idx2 = pos->second.find(L',');
                            while(idx2 != wstring::npos)
                            {
                                intSet.insert(boost::lexical_cast<int>(pos->second.substr(idx1,idx2 - idx1)));
                                idx1 = idx2 + 1;
                                idx2 = pos->second.find(L',',idx1);
                            }

                            // add the new page number once the conversion is done
                            intSet.insert(pageNumber);
                            strNumbers.clear();

                            // convert the temporary set back into a ','-separated wstring
                            for(set<int>::iterator iter = intSet.begin();
                                iter != intSet.end(); ++iter)
                            {
                                strNumbers.append(boost::lexical_cast<wstring>(*iter));
                                strNumbers.push_back(L',');
                            }
                        }

                        // no such key in the index yet: store the page number directly
                        else
                        {
                            // keep the trailing ',' so the entry matches the file format;
                            // without it the search would cut off the last digit
                            strNumbers = boost::lexical_cast<wstring>(pageNumber) + L",";
                        }
                        index[content] = strNumbers;
                        cout<<"\nOK,inserted,input next:" <<endl;
                        content.clear();
                        strNumbers.clear();
                        pageNumber = 0;
                    }

                    // stop at '!' or when the input stream has ended
                    if(content == L"!" || wcin.eof())
                    {
                        break;
                    }
                    wcin.clear();  // recover from the failed read so input can continue

                    // discard the offending token so the next entry can be read
                    wcin >> content;
                    content.clear();
                    cout <<"The last line was invalid, please continue:" <<endl;
                }

                // reopen the file for writing; if this fails the existing index is lost,
                // which is a serious problem
                file.clear();
                file.close();
                file.open(argv[1], ios::out | ios::trunc);
                if(!file)
                {
                    cerr<<"file input error"<<endl;    // seeing this means the existing index was lost; to be improved
                    exit(EXIT_FAILURE);
                }

                // write the updated index back
                for(MapIter iter = index.begin(); iter != index.end(); ++iter)
                {
                    file <<left <<setw(40) <<iter->first <<iter->second <<endl;
                }
                break;
            }

 

///////////////////////////////////////////////////////////////////////////////
// Option -s: search the index
///////////////////////////////////////////////////////////////////////////////

        case TYPES:
            {
                typedef map<wstring, wstring>::iterator MapIter;
                map<wstring, wstring> index;
                wstring content;
                wstring strNumbers;

                // read in the existing file
                while(file >>content >>strNumbers)
                {
                    index[content] = strNumbers;
                    static int progress = 0;
                    ++progress;
                    if( !(progress % 50) )   // print a progress dot every 50 entries
                    {
                        cout<<".";
                    }
                }

                cout<<endl;
                cout<<"OK, the file was read successfully. Enter the term you want to search for," <<endl
                    <<"and end input with '!'." <<endl;
                MapIter pos;

                // keep reading until the user enters '!'
                while(true)
                {
                    content.clear();

                    // stop at '!' or when the input stream has ended
                    if( !(wcin >>content) || content ==L"!")
                    {
                        break;
                    }

                    // look the term up
                    pos = index.find(content);

                    // not found
                    if(pos == index.end())
                    {
                        wcout<<L"There is no index entry for " <<content <<L"." <<endl
                             <<L"Input next:" <<endl;
                        continue;
                    }

                    // found: print the page numbers without the trailing ','
                    wcout <<left <<setw(40) <<pos->first <<pos->second.substr(0, pos->second.size() - 1 ) <<endl
                          <<L"\nInput next:" <<endl;
                }
                break;
            }
        default:
            break;
    }

     

    // end of program
    cout<<"OK,this program ended."<<endl;
    file.close();      // close the file explicitly, although it hardly seems necessary by now
    return 0;
}
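The -i branch above stores each page list as a comma-separated string and converts it to a set and back every time a page is added. That round trip can be isolated in one small function. The following is only a sketch using the standard library instead of boost::lexical_cast; the name `mergePage` is hypothetical and is not part of the program.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: merge `newPage` into a list such as L"12,40,"
// and return the sorted, duplicate-free list with a trailing comma.
// std::stoi throws if a token is not a number, so this assumes
// well-formed input.
std::wstring mergePage(const std::wstring& pages, int newPage) {
    std::set<int> sorted;
    std::wistringstream in(pages);
    std::wstring token;
    while (std::getline(in, token, L',')) {
        if (!token.empty()) {           // ignore empty fields
            sorted.insert(std::stoi(token));
        }
    }
    sorted.insert(newPage);

    std::wstring result;
    for (int page : sorted) {
        result += std::to_wstring(page);
        result += L',';                 // every page ends with a comma
    }
    return result;
}
```

Because `std::getline` also returns the final token when no comma follows it, this version accepts lists that lack the trailing comma, so `mergePage(L"12", 12)` gives `12,`.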

 

 

 


Posted by 九天雁翎 at 九天雁翎's Blog on 2007-12-01
