2009年06月14日

C++中的成员函数调用原理及this指针的传递方式

write by 九天雁翎(JTianLing) – www.jtianling.com

测试源代码：

#include   
#include   
 
**class** CTestThisPointer  
{  
**public** :  
CTestThisPointer(**int** ai):mi(ai) { }  
**int** Add(**int** ai)  
{  
mi += ai;  
**return** mi;  
}  
  
**private** :  
**int** mi;  
};  
 
 
**int** main()  
{  
CTestThisPointer loTest(10);  
 
printf("%d",loTest.Add(5));  
**return** 0;  
}  


反汇编：

.text:00401000 ; int __cdecl main(int argc, const char **argv, const char *envp)

.text:00401000 _main proc near ; CODE XREF: _mainCRTStartup+AFp

.text:00401000

.text:00401000 this = byte ptr -4

.text:00401000 argc = dword ptr 8

.text:00401000 argv = dword ptr 0Ch

.text:00401000 envp = dword ptr 10h

.text:00401000

.text:00401000 push ebp ; 在默认优化的时候，总算看到了正统的一个push ebp;

.text:00401000 ; mov ebp,esp的保护堆栈并用ebp的过程

.text:00401001 mov ebp, esp

.text:00401003 push ecx

.text:00401004 push 0Ah ; 构造函数的除了默认this外的唯一参数,10

.text:00401006 lea ecx, [ebp+this] ; 这个参数时this指针，通过寄存器ecx传递

.text:00401009 call CTestThisPointer__CTestThisPointer

.text:0040100E push 5 ; Add函数除默认的this外唯一的参数,5

.text:00401010 lea ecx, [ebp+this] ; this指针

.text:00401013 call CTestThisPointer__Add

.text:00401018 push eax ; 将add的返回值入栈，调用printf

.text:00401019 push offset aD ; "%d"

.text:0040101E call _printf

.text:00401023 add esp, 8 ; 变长参数的函数，只能是通过__cdecl调用约定罗，由main

.text:00401023 ; 函数来维护栈平衡

.text:00401026 xor eax, eax

.text:00401028 mov esp, ebp

.text:0040102A pop ebp

.text:0040102B retn

.text:0040102B _main endp

构造函数：

.text:00401030 CTestThisPointer__CTestThisPointer proc near ; CODE XREF: _main+9p

.text:00401030

.text:00401030 var_4 = dword ptr -4

.text:00401030 arg_0 = dword ptr 8

.text:00401030

.text:00401030 push ebp

.text:00401031 mov ebp, esp

.text:00401033 push ecx ; 将this指针入栈保存

.text:00401034 mov [ebp+var_4], ecx

.text:00401037 mov eax, [ebp+var_4] ; 将this的值传给eax，默认优化的代码实在够繁琐，接下来

.text:00401037 ; 几步都是的，其实本质无非就是一个

.text:00401037 ; mov [ecx],[ebp+arg_0]的过程，但是因为不能直接的从

.text:00401037 ; 内存mov到内存，所以中间用了临时变量，但是因为要用

.text:00401037 ; 寄存器做临时变量，所以得先保存寄存器。。。。。

.text:00401037 ; 其实还有那么多寄存器，咋不用呢？呵呵，因为是默认优化

.text:0040103A mov ecx, [ebp+arg_0]

.text:0040103D mov [eax], ecx

.text:0040103F mov eax, [ebp+var_4]

.text:00401042 mov esp, ebp

.text:00401044 pop ebp

.text:00401045 retn 4

.text:00401045 CTestThisPointer__CTestThisPointer endp

Add函数：

.text:00401050 CTestThisPointer__Add proc near ; CODE XREF: _main+13p

.text:00401050

.text:00401050 var_4 = dword ptr -4

.text:00401050 arg_0 = dword ptr 8

.text:00401050

.text:00401050 push ebp

.text:00401051 mov ebp, esp

.text:00401053 push ecx

.text:00401054 mov [ebp+var_4], ecx ; 这个临时变量用的最费

.text:00401057 mov eax, [ebp+var_4]

.text:0040105A mov ecx, [eax]

.text:0040105C add ecx, [ebp+arg_0] ; 因为返回的是mi，所以以下是老老实实的取出this指针中的

.text:0040105C ; 地址到寄存器，然后在通过mov re,[re]的形式取出值

.text:0040105C ; BTW:看没有优化的代码实在不习惯

.text:0040105F mov edx, [ebp+var_4]

.text:00401062 mov [edx], ecx

.text:00401064 mov eax, [ebp+var_4]

.text:00401067 mov eax, [eax]

.text:00401069 mov esp, ebp

.text:0040106B pop ebp

.text:0040106C retn 4

.text:0040106C CTestThisPointer__Add endp

write by 九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月14日

C/C++与汇编的函数相互调用分析

write by九天雁翎(JTianLing) – www.jtianling.com**

讨论新闻组及相关文件下载**

昨天好好研究了一下内嵌汇编的情况。。。。。更进一步的，该是看看独立编译的汇编程序与C/C++互相调用的情况了。

呵呵，最近怎么好像老在搞这个，想当年学习的时候，一门心思的学C++，到现在老是在弄诸如怎么在C/C++中调用LUA函数，或者反过来，怎么C/C++中调用Python函数，或者反过来，今天又是怎么在C/C++中调用汇编的函数，或者反过来。。。。。。。。。。。。。呵呵，自从学习的语言越来越多，类似的情况碰到的也就越来越多了，但是，只懂一门语言就不能在合适的时候使用合适的语言来解决问题，并且看问题会带有狭隘的偏见，谁说的来着？毕竟无论是lua,python,asm都会有用的上的时候，我最近似乎老是喜欢说闲话。。。。这是某些哥们说的自己的见解，还是某些时候无聊的牢骚呢？谁知道呢，过年了嘛，还不让人多说几句话啊。。。。。。-_-!

首先来看

C中调用汇编的函数

先添加一个汇编文件写的函数吧，在VS2005中对此有了明显的进步，因为就《加密与解密》一书描述，在2003中需要自己配置编译选项，但是在VS2005中很明显的，当你添加asm文件后，可以自己选定masm的编译规则，然后一切就由IDE把你做好了，这也算是IDE的一个好用的地方吧。。。。

非常不好的一点就是，VS2005中对于汇编没有任何语法高亮。。。。damnit!IDE怎么做的？就这点而言，其甚至不如一般的文本编辑工具！。。。又是废话了。。

因为是C，这个目前全世界可能是最具有可移植性的语言，所以问题要简单的多。但是。。。也不全是那么简单，先看看直觉的写法：

汇编代码：

PUBLIC GetArgument

.486                      ; create 32 bit code

.model flat       ; 32 bit memory model

;option casemap :none      ; case sensitive

_TEXT SEGMENT PUBLIC 'CODE'

GetArgument PROC

    MOV EAX, [ESP+4]

    RETN 

GetArgument ENDP

_TEXT ENDS

END

C语言代码：

#include <stdio.h>
#include <windows.h>

int GetArgument(int);
int _tmain(int argc, _TCHAR* argv[])
{

    printf("%d/n",GetArgument(10));

    system("PAUSE");

    return 0;
}

声明是必不可少的，毕竟汇编没有头文件给你包含，不过多的话，可以考虑组织一个专门用于包含汇编函数实现的头文件。但是在编译时却不会通过。

1>InlineAsm.obj : error LNK2001: 无法解析的外部符号_GetArgument
1>    D:/My Document/Visual Studio 2005/Projects/InlineAsm/Release/InlineAsm.exe : fatal error LNK1120: 1 个无法解析的外部命令

看到错误原因也知道是怎么回事了，C中的函数名被编译器处理时多了个前置的下划线，当然，这个问题好解决。

一种方式是改变汇编代码的函数，直接添加一个前置下划线就完了，一种方式是将其声明为C语言的方式，那么链接程序也知道正确的链接了。两种方式分别如下：

直接改变函数名：

PUBLIC _GetArgument

.486                      ; create 32 bit code

.model flat       ; 32 bit memory model

;option casemap :none      ; case sensitive

_TEXT SEGMENT PUBLIC 'CODE'

_GetArgument PROC

    MOV EAX, [ESP+4]

    RETN 

_GetArgument ENDP

_TEXT ENDS

END

改变.model声明：

PUBLIC GetArgument

.486                      ; create 32 bit code

.model flat,c       ; 32 bit memory model

;option casemap :none      ; case sensitive

_TEXT SEGMENT PUBLIC 'CODE'

GetArgument PROC

    MOV EAX, [ESP+4]

    RETN 

GetArgument ENDP

_TEXT ENDS

END

个人推荐第2种方式，因为看起来最舒服，将改变函数名的工作交给编译和链接程序吧。

假如是在C++中调用ASM函数的话，相对复杂一点,因为没有.model C++的生命方式。。。这个世界是不公平对待C和C++的。。。。呵呵，但是C++有完整的向C靠拢的机制，这就够了。

汇编代码不变，C++调用时用如下形式：

#include <stdio.h>
#include <windows.h>

extern "C" int _cdecl GetArgument(int);
int _tmain(int argc, _TCHAR* argv[])
{

    printf("%d/n",GetArgument(10));

    system("PAUSE");

    return 0;
}

即将C++函数完整的声明为C语言的形式。。。。。但是我在MSDN中看到了.model起码有stdcall的声明方式，有这种声明方式为什么不用呢？呵呵，用一下。

将汇编语言的.model声明改成下面这样：

.model flat,c,stdcall       ; 32 bit memory model

C++中函数声明为下面这样：

extern “C” int __stdcall GetArgument(int);

结果却是链接错误：

1> InlineAsm.obj : error LNK2001: 无法解析的外部符号_GetArgument@4

当我自以为声明一致时，却不知道发生了什么，假如是以前，我可能得一筹莫展。。。但是现在嘛。。。。既然知道obj文件其实也是可读的，那么，看看链接的时候出了什么问题，为什么汇编出来的obj文件中没有这个符号呢？可以在obj文件的最后一行看到答案：

原来汇编的代码声明stdcall后函数符号被解析成_GetArgument@0了，那不是表示没有参数吗？看来是我汇编写错了。

改成如下形式：

PUBLIC GetArgument

.486                      ; create 32 bit code

.model flat,c,stdcall       ; 32 bit memory model

;option casemap :none      ; case sensitive

_TEXT SEGMENT PUBLIC 'CODE'

GetArgument PROC x:DWORD

    MOV EAX, x

    RETN 4

GetArgument ENDP

_TEXT ENDS

END

然后再运行程序，崩溃。。。。。。

看看原因：

GetArgument PROC x:DWORD

00401030  push        ebp 
00401031  mov         ebp,esp 
    MOV EAX, x
00401033  mov         eax,dword ptr [x] 
    RETN 4
00401036  ret         4

很明显汇编编译后自动加了push ebp；mov ebp,esp这两句来保护栈指针esp，问题是却没有自动生成还原的代码。。。那还不崩溃？典型的栈错误。可以用下面的方式修复

GetArgument PROC x:DWORD

    MOV EAX, x

    pop ebp

RETN 4

但是这样做个人感觉实在是太不优美了。。。。。。。奇怪的是编译器为什么要这样解析代码。。。。呵呵，即便你是用汇编。。。特别是伪汇编。。。你都会发现编译器在你的背后动了太多手脚，很多，是你根本不想要它去做的。这一点也可能是我汇编代码声明或写的有问题，导致编译器奇怪的处理了，有知道正确结果的高人请指点一下.

下面，在汇编中调用C/C++函数：

在此不分辨C和C++了，差别仅在于一个extern “C”而已，调用约定采用__stdcall其他请参考

汇编代码如下：

PUBLIC GetArgument2

.486                      ; create 32 bit code

.model flat,stdcall       ; 32 bit memory model

;option casemap :none      ; case sensitive

GetArgument PROTO  :DWORD   ; 函数声明

_TEXT SEGMENT PUBLIC 'CODE'

GetArgument2 PROC x:DWORD

    INVOKE GetArgument,x

    MOV EAX, x

    POP EBP

    RETN 4

GetArgument2 ENDP

_TEXT ENDS

END

C++代码：

#include <stdio.h>
#include <windows.h>

extern "C" int __stdcall GetArgument(int ai)
{
    return ai;
}

extern "C" int __stdcall GetArgument2(int ai);

int _tmain(int argc, _TCHAR* argv[])
{

    printf("%d/n",GetArgument2(10));

    system("PAUSE");

    return 0;
}

至此，你会高兴的发现，一个10，从C++中调用汇编的函数GetArgument2，再从汇编中调用C++的函数GetArgument再返回，得到正确结果。。。真不容易啊。。。这例子举得真够折腾的：）呵呵，说明问题就好了。最重要的一句就在于GetArgument PROTO :DWORD ; 函数声明

一行，另外，这一行应该在.model声明以后，不然编译器不知道你该采用那种调用约定和名字编码方式。

参考资料：MSDN

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月14日

《数据结构与算法分析C++描述》搜索二叉树的C++实现

《数据结构与算法分析C++描述》
搜索二叉树的C++实现**
**

write by 九天雁翎(JTianLing) – www.jtianling.com

《数据结构与算法分析c++描述》 Mark Allen Weiss著
人民邮电大学出版
中文版第93-100面，搜索二叉树

需要说明一点的是，此搜索二叉树并没有平衡算法，所以可能会导致有可能出现O(M logN)的最坏情况。

并且几乎所有代码都用递归实现，所以效率并不是太高，并且当N足够大的时候，很多操作都可能导致栈溢出。但是因为对于树的操作用递归描述起来理解上还是比循环好的多，并且以后可以用平衡算法，所以这里都用递归了。

搜索二叉树的实现：

 
#ifndef __BINARY_SEARCH_TREE_H__
#define __BINARY_SEARCH_TREE_H__
 
**template** <**typename** T>
**class** CBinarySearchTree
{
**public** :
CBinarySearchTree():mpRoot(NULL) { }
CBinarySearchTree(**const** CBinarySearchTree& aOrig)
{
mpRoot = Clone(aOrig.mpRoot);
}
~CBinarySearchTree()
{
MakeEmpty();
}
 
////////////////////////////////////////////
// const member function
////////////////////////////////////////////
**const** T* FindMin() **const** ;
**const** T* FindMax() **const** ;
 
**bool** Contains( **const** T& aElement) **const** ;
**bool** IsEmpty() **const**
{
**return** (mpRoot != NULL) ? true : false;
}
 
// I don't know how to print it in a good format
//void PrintTree() const;
 
////////////////////////////////////////////
// non-const member function
////////////////////////////////////////////
**void** MakeEmpty();
**void** Insert( **const** T& aElement);
**void** Remove( **const** T& aElement);
 
**const** CBinarySearchTree& **operator** =(**const** CBinarySearchTree& aOrig);
 
**private** :
**struct** CBinaryNode
{
CBinaryNode(**const** T& aElement, CBinaryNode* apLeft, CBinaryNode* apRight)
: mElement(aElement),mpLeft(apLeft),mpRight(apRight) { }
 
T mElement;
CBinaryNode *mpLeft;
CBinaryNode *mpRight;
};
 
// Root Node
CBinaryNode *mpRoot;
 
////////////////////////////////////////////
// private member function to call recursively
////////////////////////////////////////////
 
// I don't like to use reference to pointer
// so I used pointer to pointer instead
**void** Insert(**const** T& aElement, CBinaryNode** appNode) **const** ;
**void** Remove(**const** T& aElement, CBinaryNode** appNode) **const** ;
 
CBinaryNode* FindMin(CBinaryNode* apNode) **const** ;
CBinaryNode* FindMax(CBinaryNode* apNode) **const** ;
**bool** Contains(**const** T& aElement, CBinaryNode * apNode) **const** ;
**void** MakeEmpty(CBinaryNode** apNode);
//void PrintTree(CBinaryNode* apNode) const;
CBinaryNode* Clone(CBinaryNode* apNode) **const** ;
 
};
 
 
**template** <**typename** T>
**bool** CBinarySearchTree::Contains(**const** T& aElement) **const**
{
**return** Contains(aElement, mpRoot);
}
 
**template** <**typename** T>
**bool** CBinarySearchTree::Contains(**const** T &aElement;, CBinaryNode *apNode) **const**
{
**if**( NULL == apNode )
{
**return** false;
}
**else** **if** ( aElement < apNode->mElement )
{
**return** Contains(aElement, apNode->mpLeft);
}
**else** **if** ( aElement > apNode->mElement )
{
**return** Contains(aElement, apNode->mpRight);
}
**else**
{
**return** true; // Find it
}
}
 
**template** <**typename** T>
**void** CBinarySearchTree::Insert(**const** T &aElement;)
{
Insert(aElement, &mpRoot;);
}
 
**template** <**typename** T>
**void** CBinarySearchTree::Insert(**const** T& aElement, CBinaryNode** appNode) **const**
{
CBinaryNode *lpNode = *appNode;
**if**(NULL == lpNode)
{
*appNode = **new** CBinaryNode(aElement, NULL, NULL);
}
**else** **if**( aElement < lpNode->mElement )
{
Insert(aElement, &(lpNode->mpLeft) );
}
**else** **if**( aElement > lpNode->mElement)
{
Insert(aElement, &(lpNode->mpRight) );
}
 
// had not deal with duplicate
}
 
**template** <**typename** T>
**void** CBinarySearchTree::Remove(**const** T &aElement;)
{
Remove(aElement, &mpRoot;);
}
 
**template** <**typename** T>
**void** CBinarySearchTree::Remove(**const** T &aElement;, CBinaryNode** appNode) **const**
{
CBinaryNode* lpNode = *appNode;
**if**(NULL == lpNode)
{
**return** ; // Item removing is not exist
}
 
**if**( aElement < lpNode->mElement )
{
Remove(aElement, &(lpNode->mpLeft) );
}
**else** **if**( aElement > lpNode->mElement )
{
Remove(aElement, &(lpNode->mpRight) );
}
**else** **if**( NULL != lpNode->mpLeft && NULL != lpNode->mpRight) // Two children
{
lpNode->mElement = FindMin(lpNode->mpRight)->mElement;
Remove( lpNode->mElement, &(lpNode->mpRight) );
}
**else**
{
CBinaryNode *lpOldNode = lpNode;
// Even if lpNode equal NULL, this is still the right behavior we need
// Yeah,When lpNode have no children,we make lpNode equal NULL
*appNode = (lpNode->mpLeft != NULL) ? lpNode->mpLeft : lpNode->mpRight;
**delete** lpOldNode;
}
}
 
 
**template** <**typename** T>
**const** T* CBinarySearchTree::FindMin() **const**
{
CBinaryNode* lpNode = FindMin(mpRoot);
**return** (lpNode != NULL) ? &(lpNode->mElement) : NULL;
}
 
 
// damn it! So redundant words to fit to C++ syntax
// the only way to fix this problom is compositing defines and declares
// I even doubt that are there programmers could write it right
**template** <**typename** T>
**typename** CBinarySearchTree::CBinaryNode * CBinarySearchTree::FindMin(CBinaryNode* apNode) **const**
{
**if**( NULL == apNode)
{
**return** NULL;
}
**else** **if**( NULL == apNode->mpLeft)
{
// Find it
**return** apNode;
}
**else**   
{
**return** FindMin(apNode->mpLeft);
}
}
 
**template** <**typename** T>
**const** T* CBinarySearchTree::FindMax() **const**
{
CBinaryNode* lpNode = FindMax(mpRoot);
**return** (lpNode != NULL) ? &(lpNode->mElement) : NULL;
}
 
**template** <**typename** T>
**typename** CBinarySearchTree::CBinaryNode * CBinarySearchTree::FindMax(CBinaryNode* apNode) **const**
{
**if**( NULL == apNode)
{
**return** NULL;
}
**else** **if**( NULL == apNode->mpRight)
{
// Find it
**return** apNode;
}
**else**   
{
**return** FindMax(apNode->mpRight);
}
}
 
**template** <**typename** T>
**void** CBinarySearchTree::MakeEmpty()
{
MakeEmpty(&mpRoot;);
}
 
 
**template** <**typename** T>
**void** CBinarySearchTree::MakeEmpty(CBinaryNode** appNode)
{
CBinaryNode* lpNode = *appNode;
**if**( lpNode != NULL)
{
MakeEmpty( &(lpNode->mpLeft) );
MakeEmpty( &(lpNode->mpRight) );
**delete** lpNode;
}
 
*appNode = NULL;
}
 
// how long the syntax is...............
**template** <**typename** T>
**const** CBinarySearchTree& CBinarySearchTree::**operator** =(**const** CBinarySearchTree &aOrig;)
{
**if**(&aOrig; == **this**)
{
**return** * **this** ;
}
 
MakeEmpty();
mpRoot = Clone(aOrig.mpRoot);
 
**return** * **this** ;
 
}
 
// when you use nest class and template both,you will find out how long the C++ syntax is.....
// I use it once,I ask why couldn't we have a short once again.
**template** <**typename** T>
**typename** CBinarySearchTree::CBinaryNode* CBinarySearchTree::Clone(CBinaryNode *apNode) **const**
{
**if**(NULL == apNode)
{
**return** NULL;
}
 
// abuse recursion
**return** **new** CBinaryNode(apNode->mElement, Clone(apNode->mpLeft), Clone(apNode->mpRight));
}
 
 
 
 
#endif // __BINARY_SEARCH_TREE_H__

测试代码：

#include   
#include "BinarySearchTree.h"
**using** **namespace** std;
 
**int** _tmain(**int** argc, _TCHAR* argv[])
{
CBinarySearchTree<**int** > loTree;
 
loTree.Insert(10);
loTree.Insert(20);
loTree.Insert(30);
loTree.Insert(40);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) "<20) <
loTree.Remove(40);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) " <20) <
loTree.Remove(30);
loTree.Remove(20);
loTree.Remove(10);
 
 
loTree.Insert(40);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) " <20) <
loTree.Insert(30);
loTree.Insert(20);
loTree.Insert(10);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) " <20) <
loTree.Remove(40);
loTree.Remove(30);
loTree.Remove(20);
loTree.Remove(10);
 
loTree.Insert(30);
loTree.Insert(40);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) " <20) <
loTree.Insert(10);
loTree.Insert(20);
cout <<"Min: " <<*loTree.FindMin() <<" Max: " <<*loTree.FindMax() <<" IsContains(20) " <20) <
CBinarySearchTree<**int** > loTree2 = loTree;
cout <<"Min: " <<*loTree2.FindMin() <<" Max: " <<*loTree2.FindMax() <<" IsContains(20) " <20) <
 
loTree.MakeEmpty();
 
 
 
system("pause");
**return** 0;
}


write by 九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月14日

《数据结构与算法分析C++描述》分离链接(separate chaining)哈希表的C++实现

write by 九天雁翎(JTianLing) – www.jtianling.com

《数据结构与算法分析c++描述》 Mark Allen Weiss著
人民邮电大学出版
中文版第138-142面，分离链接(separate chaining)哈希表,侯捷将其翻译成开链

这应该是最容易实现的哈希表方法了,次容易的应该是线性搜索.

想起我到目前公司干的第二件事情，就是实现了一个文件系统，其核心模块就是一个类似MPQ的打包文件格式.而这个打包格式的核心模块就是一个线性哈希表的实现。只不过这次的实现不是在内存中，而是在文件上。这里顺便想说明是MPQ的实现是个很有意思的东西，感兴趣的可以去看看

http://shadowflare.samods.org/inside_mopaq/

inside mopaq是我见过最详细也最有用的资料，至于我刚开始工作的一些原始的资料记录就是非常凌乱了，但是希望有人在做同样工作的时候还能有一些参考价值吧。

http://www.jtianling.com/archive/2008/06/02/2504503.aspx

http://www.jtianling.com/archive/2008/06/02/2504515.aspx

并且，因为我以前已经实现了这么一个线性搜索的哈希表了，所以此次也不准备再实现一次了。

最后。。。。暴雪那个哈希算法的确是很不错，要求和一般的哈希算法不一样，一般的要求是哈希表总数为质数，其要求为2的幂。我在一次测试中发现，2万个文件的冲突次数大概在2千次,即1/10,远远低于书中低于1.5次的预期.

这一次是在VS中实现的,直接拷贝过来了,没有用vim了.

分离链接(separate chaining)哈希表的实现：

#ifndef
__SL_HASH_TABLE_H__

#define
__SL_HASH_TABLE_H__

#include


#include


#include

using
namespace
std;

// 两个Hash函数,第一个由书上的例子提供，散列效果不明

int
hash( const
string& key)

{

  int
  liHashVal = 0;

  for( int
  i = 0; i < key.length(); ++i)

  {

    liHashVal = 37 * liHashVal \+ key[i];

  }
  return
  liHashVal;

}

// 书中没有提供这个散列函数的实现。。。。。郁闷了,随便写一个了。。。。

int
hash( int
key)

{

  return
  key;

}

// 参考了<>

static
const
int
gPrimesCount = 10;

static
unsigned
long
gPrimesArray[gPrimesCount] =

{

  53, 97, 193, 389, 769,

  1543, 3079, 6151, 12289, 24593

};

inline
unsigned
long
NextPrime(unsigned
long
n)

{

  const
  unsigned
  long* first = gPrimesArray;

  const
  unsigned
  long* last = gPrimesArray \+ gPrimesCount;

  const
  unsigned
  long* pos = lower_bound(first, last, n);

  return
  pos == last ? *(last \- 1) : *pos;

}

template <typename
HashedObj>

class
CSLHashTable

{

public:

  // 书中无实现，无提示,我第一次编译才发现。。。。。

  explicit
  CSLHashTable(size_t
  aiSize = 101) : miCurrentSize(aiSize)

  {

    moLists.resize(aiSize);

  }

  bool
  Contains( const
  HashedObj& x ) const

  {

    const
    list<HashedObj> & liListFinded = moLists[ MyHash(x)];

    return
    find( liListFinded.begin(), liListFinded.end(), x) != liListFinded.end();

  }

  void
  MakeEmpty()

  {

    for( int
    i=0; i<moLists.size(); ++i)

    {

      moLists[i].clear();

    }

  }

  bool
  Insert( const
  HashedObj& x)

  {

    list<HashedObj> & liListFinded = moLists[ MyHash(x)];

    if( find( liListFinded.begin(), liListFinded.end(), x) != liListFinded.end() )

    {

      return
      false;

    }

    liListFinded.push_back(x);

    if(++miCurrentSize > moLists.size())

    {

      ReHash();

    }

    return
    true;

  }

  bool
  Remove( const
  HashedObj& x)

  {

    list<HashedObj>& liListFinded = moLists[ MyHash(x)];

    list<HashedObj>::iterator
    lit = find(liListFinded.begin(), liListFinded.end(), x);

    if(lit == liListFinded.end())

    {

      return
      false;

    }

    liListFinded.erase(lit);

    \--miCurrentSize;

    return
    true;

  }

private:

  vector<list<HashedObj> > moLists;

  size_t
  miCurrentSize;

  void
  ReHash()

  {

    vector<list<HashedObj> > loOldLists = moLists;

    // 书中又一次的没有提供相关关键函数的实现,而且没有一点提示，NextPrime的含义自然是移到下一个素数上

    moLists.resize( NextPrime( 2 * moLists.size()));

    for( int
    j=0; j<moLists.size(); ++j)

    {

      moLists[j].clear();

    }

    miCurrentSize = 0;

    for(int
    i=0; i<loOldLists.size(); ++i)

    {

      list<HashedObj>::iterator
      lit = loOldLists[i].begin();

      while(lit != loOldLists[i].end())

      {

        Insert(*lit++);

      }

    }

  }

  int
  MyHash( const
  HashedObj& x) const

  {

    int
    liHashVal = hash(x);

    liHashVal %= moLists.size();

    if(liHashVal < 0)

    {

      liHashVal += moLists.size();

    }

    return
    liHashVal;

  }

};

#endif
// __SL_HASH_TABLE_H__

测试代码

#include
"SLHashTable.h"

#include

#include

using
namespace
std;

// 这里为了稍微纠正我最近用宏上瘾的问题。。。。强制自己使用了模板

// 其实还是有个问题。。。呵呵，具体的名字没有办法输出来了，当然，每次调用函数

// 输入字符串永远不在考虑的范围内

// 另外.....看到最后标准库的类型全名的时候,总是会感叹一下...实在是太长了,记得

// 有一次,一个复杂的带string的map,我根本没有办法从鼠标下面看到即时显示的调试信息

// 原因是类型太长了,加起来超出了一个屏幕!!!,所以实际的调试数值被挤到了屏幕以外!

// 所以只能通过添加watch的方式才能看到值-_-!!

template <typename
HashedObj, typename
Table >

void
Test(HashedObj
x, Table& table)

{

  if(table.Contains(x))

  {

    cout <<typeid(table).name() <<" Constains " <<x <<endl;

  }

  else

  {

    cout <<typeid(table).name() <<" don't Constains " <<x <<endl;

  }

}

int
main()

{

  // test Int

  CSLHashTable<int> loIntTable;

  loIntTable.Insert(10);

  loIntTable.Insert(20);

  loIntTable.Insert(30);

  loIntTable.Insert(40);

  loIntTable.Insert(50);

  Test(20, loIntTable);

  Test(30, loIntTable);

  Test(40, loIntTable);

  Test(60, loIntTable);

  Test(70, loIntTable);

  CSLHashTable<string> loStrTable;

  loStrTable.Insert(string("10"));

  loStrTable.Insert(string("20"));

  loStrTable.Insert(string("30"));

  loStrTable.Insert(string("40"));

  loStrTable.Insert(string("50"));

  Test(string("20"), loStrTable);

  Test(string("30"), loStrTable);

  Test(string("40"), loStrTable);

  Test(string("60"), loStrTable);

  Test(string("70"), loStrTable);

  return 0;

}

write by 九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月09日

ASIO—下一代C++标准可能接纳的网络库（3）UDP网络应用

write by九天雁翎(JTianLing) – www.jtianling.com**

讨论新闻组及文件

一、综述

接着前面

《ASIO—下一代C++标准可能接纳的网络库（1）简单的应用》

《ASIO—下一代C++标准可能接纳的网络库（2）TCP网络应用》

继续，讲了简单应用，讲了TCP,自然而然是UDP了。其实个人感觉UDP与TCP的接口假如经过封装是可以做到接口比较一致的，但是遗憾的是asio没有遵循这样的接口设计方案。

二、 Tutorial

1. Daytime.4 - A synchronous UDP daytime client（同步UDP daytime客户端）

还是先看看普通的socket API的情况：

#include <stdio.h>
#include <string.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_cli(SOCKET sockfd, const struct sockaddr* pservaddr, int servlen)
{
    int n;
    char recvline[MAXLINE] = {0};
    char sendline[2] = {0};
    sendto(sockfd, sendline, 2, 0, pservaddr, servlen);

    n = recvfrom(sockfd, recvline, MAXLINE, 0, NULL, NULL);
    recvline[n] = 0;
    printf("%s", recvline);
}


int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );

    // windows下此初始化为必须，实际是初始化WinsockDLL的过程
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
       return -1;
    }
    SOCKET               sockfd;
    struct sockaddr_in   servaddr;

    if (argc != 2)
    {
       printf("usage: tcpcli <IPaddress>");
       exit(1);
    }

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13);
    servaddr.sin_addr.s_addr = inet_addr(argv[1]);

    str_cli(sockfd, (const struct sockaddr*)&servaddr, sizeof(servaddr));

    system("pause");
    WSACleanup();
    exit(0);
}

相对来说，假如用了socket API，会发现UDP的程序编写逻辑是比TCP要简单的，起码省略了connect的过程，但是难就难在当网络情况不好时UDP程序的处理。这么简单的程序不再加更多说明了。

看看asio的例子:

#include <iostream>
#include <boost/array.hpp>
#include <boost/asio.hpp>
using boost::asio::ip::udp;

int main(int argc, char* argv[])
{
    try
    {
       if (argc != 2)
       {
           std::cerr << "Usage: client <host>" << std::endl;
           return 1;
       }
       boost::asio::io_service io_service;
       udp::resolver resolver(io_service);
       udp::resolver::query query(udp::v4(), argv[1], "daytime");
       udp::endpoint receiver_endpoint = *resolver.resolve(query);
       udp::socket socket(io_service);
       socket.open(udp::v4());
       boost::array<char, 1> send_buf = { 0 };
       socket.send_to(boost::asio::buffer(send_buf), receiver_endpoint);
       boost::array<char, 128> recv_buf;
       udp::endpoint sender_endpoint;
       size_t len = socket.receive_from(
           boost::asio::buffer(recv_buf), sender_endpoint);
       std::cout.write(recv_buf.data(), len);
    }
    catch (std::exception& e)
    {
       std::cerr << e.what() << std::endl;
    }
    return 0;
}

甚至没有感觉到有任何简化。一大堆的resolver(为了适应ipv6)，一大堆的array,其实并不优雅，在很简单的程序中，会发现，似乎asio就是简单的为socket进行了非常浅的封装一样，你还得了解一大堆本来可以不了解的东西，asio内在的高效率，异步啊，用的那些模式啊，都看不到。。。。。。。。。。呵呵，也许socket API本来就是Make the simple things simple吧，而asio就是为了应付绝对复杂的情况而做出相对复杂设计的吧。这样的例子没有任何说服力能让人放弃socket API而使用asio。。。。。。。不知道asio的文档中加入这些干啥。。。仅仅为了说明？-_-!

2. A synchronous UDP daytime server（同步的UDP daytime服务器）

还是先来个原始的socket API写的版本：

#include <time.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_svr(SOCKET sockfd, struct sockaddr* pcliaddr, int clilen)
{
    int n = 0;
    time_t ticks = 0;
    int len;
    char recvline[2] = {0};
    char sendline[MAXLINE] = {0};
    for(;;)
    {
       len = clilen;
       if( (n = recvfrom(sockfd, recvline, 2, 0, pcliaddr, &len)) == INVALID_SOCKET)
       {
            printf("recvfrom failed: %d/n", WSAGetLastError());
            return;
       }

       ticks = time(NULL);
       _snprintf(sendline, sizeof(sendline), "%.24s/r/n", ctime(&ticks));

       sendto(sockfd, sendline, strlen(sendline), 0, pcliaddr, len);
    }
}


int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );

    // windows下此初始化为必须，实际是初始化WinsockDLL的过程
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
       return -1;
    }

    SOCKET               sockfd;
    struct sockaddr_in   servaddr,cliaddr;
    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family      = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port        = htons(13);  /* daytime server */

    if( bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr))
       == SOCKET_ERROR) 
    {
       printf("bind failed: %d/n", WSAGetLastError());
       closesocket(sockfd);
       WSACleanup();
       return 1;
    }

    str_svr(sockfd, (struct sockaddr*)&cliaddr, sizeof(cliaddr));

    closesocket(sockfd);
    WSACleanup();
    return 1;
}

与上篇文章中tcp服务器的例子很像，基本上来说，用socket写客户端还是相对简单一些，但是写个服务器就相对要复杂很多，这个例子还没有精细的判断每个返回值（比如send函数），但是已经比较复杂了。

接着是asio版本：

#include <ctime>
#include <iostream>
#include <string>
#include <boost/array.hpp>
#include <boost/asio.hpp>
using boost::asio::ip::udp;

std::string make_daytime_string()
{
    using namespace std; // For time_t, time and ctime;
    time_t now = time(0);
    return ctime(&now);
}

int main()
{
    try
    {
       boost::asio::io_service io_service;
       udp::socket socket(io_service, udp::endpoint(udp::v4(), 13));
       for (;;)
       {
           boost::array<char, 1> recv_buf;
           udp::endpoint remote_endpoint;
           boost::system::error_code error;
           socket.receive_from(boost::asio::buffer(recv_buf),
              remote_endpoint, 0, error);
           if (error && error != boost::asio::error::message_size)
              throw boost::system::system_error(error);
           std::string message = make_daytime_string();
           boost::system::error_code ignored_error;
           socket.send_to(boost::asio::buffer(message),
              remote_endpoint, 0, ignored_error);
       }
    }
    catch (std::exception& e)
    {
       std::cerr << e.what() << std::endl;
    }
    return 0;
}

我甚至觉得在asio中写服务器比写客户端更加简单-_-!也许一开始asio就是为了写高性能服务器而设计的，所以导致写客户端相对那么麻烦，但是写服务器却又简单很多吧。不过，熟悉socket API对于使用asio也是有意义的，比如这里的receive_from,send_to不过是对应的socket API函数换汤不换药的版本而已，使用起来除了参数传递方式上的变化，最后效果一致。

3. An asynchronous UDP daytime server(异步 UDP daytime 服务器)

又是有点意思了的程序了，asio的命名就是表示异步的io，所以展现异步的程序才能体现asio的实力及其简化了底层操作的本事。TCP的应用是这样，这里也不例外。

不过出于UDP应用相对于TCP应用本身的简单性，所以这个示例程序比对应的TCP版本就要简化很多，只是，不知道asio的UDP实现了更丰富的UDP特性没有，比如超时重发等机制。。。。

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月07日

ASIO—下一代C++标准可能接纳的网络库（2）TCP网络应用

write by九天雁翎(JTianLing) – www.jtianling.com**

讨论新闻组及文件

一、综述

本文仅仅是附着在boost::asio文档的一个简单说明和讲解，没有boost::asio文档可能你甚至都不知道我在讲什么，boost::asio的文档自然是需要从www.boost.org上去下。

基本上，网络编程领域的”Hello World”程序就是类似Echo，daytime等服务器应用了。大牛Stevens经典的《Unix Network Programming》一书更是在这两个服务器上折腾了半本书，Comer的《Internetworking With TCP/IP vol III》也不例外。boost::asio的文档也就更不例外了，全部的网络方面的例子都是以daytime服务为蓝本来讲解的。呵呵，大家这样做是有道理的，毕竟从讲解网络编程的原理来看，echo,daytime等足够的简单：）

二、 Tutorial

首先，因为客户端程序相对服务器程序更为简单，所以一般都从客户端开始，boost::asio也是如此，第一节，给出了一个TCP 的Daytime的实现所谓示例，这里，我不拷贝其源码了，只是列出一个用windows 下用套接字接口实现的同样程序作为对比。

1. A synchronous TCP daytime client（一个同步的TCP daytime客户端程序）

原始的套接字实现：

#include <stdio.h>
#include <string.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000

void str_cli(SOCKET sockfd)
{
    char recvline[MAXLINE] = {0};
    while ( (recv(sockfd, recvline, MAXLINE, 0)) != NULL) 
    {
       printf("%s", recvline);
    }
    closesocket(sockfd);
}


int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );

    // windows下此初始化为必须，实际是初始化WinsockDLL的过程
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
       return -1;
    }
    SOCKET               sockfd;
    struct sockaddr_in   servaddr;

    if (argc != 2)
    {
       printf("usage: tcpcli <IPaddress>");
       exit(1);
    }

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(13);
    servaddr.sin_addr.s_addr = inet_addr(argv[1]);

    if( SOCKET_ERROR == connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)))
    {
       printf("connet failed, Error Code: %d", WSAGetLastError());
       closesocket(sockfd);
       return -1;
    }

    str_cli(sockfd);     /* do it all */

    system("pause");
    exit(0);
}

共六十一行，并且需要处理socket创建，初始化等繁琐细节，做任何决定时基本上是通过typecode，其实相对来说也不算太难，因为除了socket的API接口属于需要额外学习的东西，没有太多除了C语言以外的东西需要学习，并且因为BSD socket是如此的出名，以至于几乎等同与事实的标准，所以这样的程序能被大部分学习过一定网络编程知识的人了解。

为了方便对比，我还是贴一下boost::asio示例中的代码：

#include <iostream>
#include <boost/array.hpp>
#include <boost/asio.hpp>
using boost::asio::ip::tcp;

int main(int argc, char* argv[])
{
    try
    {
       if (argc != 2)
       {
           std::cerr << "Usage: client <host>" << std::endl;
           return 1;
       }
       boost::asio::io_service io_service;
       tcp::resolver resolver(io_service);
       tcp::resolver::query query(argv[1], "daytime");
       tcp::resolver::iterator endpoint_iterator = resolver.resolve(query);
       tcp::resolver::iterator end;
       tcp::socket socket(io_service);
       boost::system::error_code error = boost::asio::error::host_not_found;
       while (error && endpoint_iterator != end)
       {
           socket.close();
           socket.connect(*endpoint_iterator++, error);
       }
       if (error)
           throw boost::system::system_error(error);
       for (;;)
       {
           boost::array<char, 128> buf;
           boost::system::error_code error;
           size_t len = socket.read_some(boost::asio::buffer(buf), error);
           if (error == boost::asio::error::eof)
              break; // Connection closed cleanly by peer.
           else if (error)
              throw boost::system::system_error(error); // Some other error.
           std::cout.write(buf.data(), len);
       }
    }
    catch (std::exception& e)
    {
       std::cerr << e.what() << std::endl;
    }
    return 0;
}

boost::asio的文档中的实现也有47行，用了多个try,catch来处理异常，因为其实现的原因，引入了较多的额外复杂度，除了boost::asio以外，即便你很熟悉C++,你也得进一步的了解诸如boost::array, boost:system等知识，（虽然其实很简单）并且，从使用上来说，感觉并没有比普通的socket API简单，虽然如此，boost::asio此例子还是有其优势的，比如ipv4,ipv6的自适应（原socket API仅仅支持ipv4），出错时更人性化的提示(此点由C++异常特性支持，相对比C语言中常常只能有个error code)。

当然，此例子过于简单，而asio是为了较大规模程序的实现而设计的，假如这么小规模的程序用原始的套接字就足够了。这点是需要说明的。

2. Daytime.2 - A synchronous TCP daytime server（同步的TCP daytime服务器）

有了客户端没有服务器，那客户端有什么用呢？^^所以，接下来boost::asio适时的给出了一个daytime的服务器实现，这里还是先给出使用一个原始套接字的例子：

#include <time.h>
#include "Winsock2.h"
#include "errno.h"
#include "stdlib.h"

#define MAXLINE 1000
int main(int argc, char **argv)
{
    WORD wVersionRequested = 0;
    WSADATA wsaData;
    int err;

    wVersionRequested = MAKEWORD( 2, 2 );

    // windows下此初始化为必须，实际是初始化WinsockDLL的过程
    err = WSAStartup( wVersionRequested, &wsaData );
    if ( err != 0 ) {
       return -1;
    }

    SOCKET               listenfd, connfd;
    struct sockaddr_in   servaddr;
    char              buff[MAXLINE];
    time_t            ticks;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);

    ZeroMemory(&servaddr, sizeof(servaddr));
    servaddr.sin_family      = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port        = htons(13);  /* daytime server */

    if( bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr))
        == SOCKET_ERROR) 
    {
           printf("bind failed: %d/n", WSAGetLastError());
           closesocket(listenfd);
           WSACleanup();
           return 1;
    }

    if (listen( listenfd, SOMAXCONN ) == SOCKET_ERROR)
    {
       printf("Error listening on socket./n");
       WSACleanup();
       return 1;
    }

    for ( ; ; ) 
    {
       connfd = accept(listenfd, (struct sockaddr*) NULL, NULL);
       if (connfd == INVALID_SOCKET) 
       {
           printf("accept failed: %d/n", WSAGetLastError());
           closesocket(listenfd);
           WSACleanup();
           return 1;
       } 

       ticks = time(NULL);
       _snprintf(buff, sizeof(buff), "%.24s/r/n", ctime(&ticks));
       if( SOCKET_ERROR == send(connfd, buff, strlen(buff), 0))
       {
           printf("send failed: %d/n", WSAGetLastError());
           closesocket(connfd);
           WSACleanup();
           return 1;
       }
       closesocket(connfd);
    }

    WSACleanup();
    return 0;
}

全程序75行，大部分用于socket的初始化，及其状态的转换，直到真正的进入监听状态并开始接受连接，每个socket API的调用都需要判断返回值，这也算是C语言程序共同的特点。

另外，看看boost::asio的实现。

#include <ctime>
#include <iostream>
#include <string>
#include <boost/asio.hpp>
using boost::asio::ip::tcp;

std::string make_daytime_string()
{
    using namespace std; // For time_t, time and ctime;
    time_t now = time(0);
    return ctime(&now);
}
int main()
{
    try
    {
       boost::asio::io_service io_service;
       tcp::acceptor acceptor(io_service, tcp::endpoint(tcp::v4(), 13));
       for (;;)
       {
           tcp::socket socket(io_service);
           acceptor.accept(socket);
           std::string message = make_daytime_string();
           boost::system::error_code ignored_error;
           boost::asio::write(socket, boost::asio::buffer(message),
              boost::asio::transfer_all(), ignored_error);
       }
    }
    catch (std::exception& e)
    {
       std::cerr << e.what() << std::endl;
    }
    return 0;
}

全程序35行，比使用原始套接字的版本省略了一半，并且还是保持着可移植性（我的例子只能在windows下运行）。

从其文档和实现来看，实现上将很多函数转化为类了，使用上也没有简化一些。。。做出这样的结论还是当我处于对socket API的熟悉程度要远大于boost::asio的情况。也许对于纯粹的初学者，要学习asio会比socket API简单更多一些。毕竟相当多的细节，比如各种情况下的错误返回，各类接口需要传入的适当的参数，甚至套接字初始化，状态的转换等等在boost::asio中都简化了太多。此例中的例子就是accept函数在boost::asio中实现为了acceptor类。

另外，这里值得说明一下，虽然BSD socket套接字属于事实上的标准，但是其实同一套程序不经过一定的更改要放在Linux,Windows上同时运行是不可能的，因为其中总有些细微的差别，总记得刚开始工作的时候，拿着《Unix Network Programming》在Windows下去学习，结果一个小程序都用不了。。。结果是完全不知道Windows下特有的WSAStartup初始化-_-!但是boost::asio就彻底的消除了这样的差别。这也应该算是boost::asio的一个优势吧。

3. An asynchronous TCP daytime server（异步TCP daytime服务器）

与原有asio的简单应用一样，从第三个例子开始就已经是有点意思了的程序了，程序的复杂性上来了，异步相对同步来说效率更高是不争的事实，并且其不会阻塞的特性使得应用范围更广，并且异步也是大部分高性能服务器实际上使用的方式，比如Windows下的完成端口，Linux下的Epoll等，asio的底层就是用这些方式实现的，只不过将其封装起来，使得使用更加简单了。这里提供异步的例子就不是那么简单了-_-!呵呵，偷懒的我暂时就不提供了。其实用select也是可以模拟出异步的特性的，asio在操作系统没有很好的支持异步特性的API时，就是利用select模拟出异步的。但是作为select的例子，可以参考我以前学习时写的《服务器 Select模型的实现》。**

例子中用tcp_server类处理accept事件，用tcp_connection类来处理连接后的写入事件，并且用shared_ptr来保存tcp_connection类的对象。

总结：

boost::asio的确在某种程度上简化了网络客户端/服务器程序的编写，并且易于编写出效率较高的网络应用，（效率能高到什么程度没有实测）但是，作为与C程序员一脉相承的C++程序员，在完全不了解诸如asio:: async_write，asio:: async_accept等函数的实现时有多大的胆量去放心使用，这是个问题。说要去真的理解其实现吧。。。那么就将陷入真正的boost精密C++技巧使用的泥潭，因为boost::asio与其他boost库结合的是如此的紧密，特别是boost::bind,而boost::bind现在的实现实在不是那么优美，并且在下一版的C++标准中variadic templates的加入，是会使其实现简化很多的，这样说来，用boost::asio还是不用。。。是个问题。也许真正能让人下定决心在项目中使用boost::asio的时候，就是在下一代C++标准中其变成了std::asio的时候吧^^

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月07日

ASIO—下一代C++标准可能接纳的网络库（1）简单的应用

write by九天雁翎(JTianLing) – www.jtianling.com**

讨论新闻组及文件

一、综述

第一次看boost::asio的文档，觉得非常详细，可是因为boost库的惯例，与其他boost库结合的比较紧密，而个人除了对一些很基础的boost库,比如智能指针，还有正则表达式boost::regex有所了解外，以前对boost库还是没有太多的了解的，为了很好的学习和了解boost::asio，我做了很多前期工作，到今天，总算可以正式的开始boost::asio的学习了，另外，今天，从使用开始，对asio的学习作为前段时间网络编程方面学习的一种延续，不是像其他boost库一样，仅仅会使用即可，我要的是深入其原理及精髓。。。。（其实已经知道boost::asio在windows下使用的是完成端口，在Linux下使用的是EPoll）

基本上，asio的文档很详尽也是有道理的（从Overview,到Tutorial到Examples到Reference一共870页），作为一个封装良好的网络(接口？)库，虽然对普通的网络接口进行了很详尽的封装，但是因为网络程序本身的复杂性，asio在使用方式上的复杂度还是比较大，这在B.S.的语言中是，绝对复杂的事情需要相对复杂的工具来解决。

二、 Tutorial

首先，从使用上对ASIO库进行一定的了解，因为文档如此的详尽，Tutorial解释如此的精细，我就不再干C-C,C-V的工作了，也就为E文不太好的兄弟们稍微解释一下各个例子，大家可以对照boost::asio的文档来看。

1. Timer.1 - Using a timer synchronously（使用同步定时器）

就连asio库的使用也是从”Hello world”开始，可见K&R影响之深远。此例解释了展示了asio库的基本使用，包括包含boost/asio.hpp头文件，使用asio需要有boost::asio::io_service对象。还有asio::deadline_timer的使用（在此例中的使用和sleep区别不大）

2. Timer.2 - Using a timer asynchronously（使用异步定时器）

异步定时就是你写好函数等其调用的方式了，这里比同步多的复杂性在于你的函数/callable对象（称为handler）的编写，其他基本一样，不同的在于asio::deadline_timer的成员函数调用从wait换成了async_wait。

3. Timer.3 - Binding arguments to a handler（绑定参数到handler）

已经是有点意思了的程序了，程序的复杂性上来了，在异步调用中继续异步调用，形成类似嵌套的结构，用expires_at来定位超时，用boost::bind来绑定参数。bind也算是比较有意思的boost库之一，也是几乎已经拍板进入下一代C++标准的东西了，值得大家去看看其文档：）

4. Timer.4 - Using a member function as a handler

基本上也就是例3的一个类的翻版，展示的其实也就是boost::bind对于类成员函数的绑定功能，比起例三并没有太多新意（假如你熟悉boost::bind库的话，呵呵，还好我扎实过boost基本功），并且因为无端的使用了类结构，增加了很多程序的复杂性，当然，对于在一个类中组织类似的程序还是有一定的指导意义。

5. Timer.5 - Synchronising handlers in multithreaded programs

加入了多线程的实现，展示了boost::asio::strand类在多线程程序中同步回调handler的用法。

这里我看到几个关键点，首先，asio保证，所有的回调函数只在调用了io_service::run()函数的线程中才可能被调用。其次，假如需要多个线程同时能调用回调函数(其实这里都不算太准，因为callable的东西都行，这里用callback object也许更好)，可以再多个线程中调用io_service::run()，这样，可以形成一种类似线程池的效果。

这里展示的同步方式是将多个callback object用一个strand包装起来实现的。其实，用其他的线程同步方式明显也是可行的。

没有展示太多asio的新东西（假如你熟悉boost::thread的话，关于boost::thread可以参考《boost::thread 库，奇怪的文档没有Tutorial的库，但是却仍然相当强大》，呵呵，关于boost的学习不是白学的：），懂的了一些相关库，看asio的例子起码是没有难度的。。。。。想当年我第一次看的时候那个一头雾水啊。。。。。

其实按文档中的例子还可能让初学者有点不清楚，

将原文中的两句改为（不会还要问我是哪两句吧？-_-!以下是在Windows中的例子）

std::cout «“ThreadID:” «GetCurrentThreadId() «” Timer 1: “ « count_ « “/n”;

std::cout «“ThreadID:” «GetCurrentThreadId() «” Timer 2: “ « count_ « “/n”;

这样的形式，那么就能很显著的看到多个线程了。

boost::thread t(boost::bind(&boost::asio::io_service::run, &io));

这样的形式其实是利用boost::thread库创建了一个新的线程，创建新线程的回调又是利用了boost::bind库绑定类成员函数的用法，传递&io作为成员函数的第一参数this，调用io_service::run（），紧接着主线程又调用了io.run()，这样就形成了同时两个线程的情况。

6. MyExample1：Synchronising handlers in multithreaded programs in normal way

这里展示不用boost::asio::strand而是利用常规线程同步的手段来完成线程的同步。

#include <iostream>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>

class printer
{
public:
    printer(boost::asio::io_service& io):
       timer1_(io, boost::posix_time::seconds(1)),
       timer2_(io, boost::posix_time::seconds(1)),
       count_(0)
    {
      timer1_.async_wait(boost::bind(&printer::print1, this));
      timer2_.async_wait(boost::bind(&printer::print2, this));
    }
    ~printer()
    {
      std::cout << "Final count is " << count_ << "/n";
    }
    void print1()
    {
      mutex_.lock();
      if (count_ < 10)
      {
          std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 1: " << count_ << "/n";
          ++count_;
          timer1_.expires_at(timer1_.expires_at() + boost::posix_time::seconds(1));
          timer1_.async_wait(boost::bind(&printer::print1, this));
      }
      mutex_.unlock();
    }
    void print2()
    {
      mutex_.lock();
      if (count_ < 10)
      {
          std::cout <<"ThreadID:" <<GetCurrentThreadId() <<" Timer 2: " << count_ << "/n";
          ++count_;
          timer2_.expires_at(timer2_.expires_at() + boost::posix_time::seconds(1));
          timer2_.async_wait(boost::bind(&printer::print2, this));
      }
      mutex_.unlock();
    }

private:
    boost::asio::deadline_timer timer1_;
    boost::asio::deadline_timer timer2_;
    int count_;
    boost::mutex mutex_;
};

int main()
{
    boost::asio::io_service io;
    printer p(io);
    boost::thread t(boost::bind(&boost::asio::io_service::run, &io));
    io.run();
    t.join();
    return 0;
}

这样的效果和原boost::asio的例5是差不多的，boost::asio除了支持原生的线程同步方式外还加入了新的asio::strand是有意义的，因为这两种方式还是有区别的。

1. 用mutex的方式阻塞的位置是已经进入printe函数以后，而strand是阻塞在函数调用之前的。

2. 相对来说，当大量的同样回调函数需要同步时，asio::strand的使用更为简单一些。

3. 用mutex的方式明显能够更加灵活，因为不仅可以让线程阻塞在函数的开始，也可以阻塞在中间，结尾。

4. 对于同步的对象来说，asio::strand就是对其支持的回调对象，mutex是对本身线程的一种同步。

基本上，两者是相辅相成的，各有用处，但是实际上，假如从通用性出发，从额外学习知识触发，个人感觉strand似乎是可有可无的，不知道有没有必须使用strand的情况。。。。

到此，asio文档中tutorial中的timer系列例子是结束了。其实这里展示的以asio基本原理为主，甚至都还没有接触到任何与网络相关的东西，但是，这些却是进一步学习的基础。。。。。。

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月06日

boost::thread库，奇怪的文档没有Tutorial的库，但是却仍然相当强大

boost::thread 库，奇怪的文档没有Tutorial的库，但是却仍然相当强大

write by 九天雁翎(JTianLing) – www.jtianling.com

讨论新闻组及文件

一直以来感觉boost的库作为开源的库文档是非常详细的，绝大部分库的文档由浅入深，一般先有Overview,从Introduction到简单的Tutorial到复杂的example,再到rationale，应有尽有，但是boost::thread是个例外，没有任何Introduction,Tutorial的内容，上来就是class/type的member function，头文件列举，列举完了了事，连一个example也没有，最奇怪的 boost库文档绝对非其莫属，甚至《Beyond the C++ Standard Library: An Introduction to Boost》这本书中也只字未提thread库，这样的确为学习boost::thread库加大了难度。对于初学者就更难受了，毕竟，本来多线程就是一个不那么容易的东西。。。。

但是，不要以为此库就是boost中最默默无名的库了，为C++添加多线程库的呼声一直比较高（虽然B.S.以前在D&E中认为其应该由第三方库来完成这样和操作平台相关性比较大的内容），亏boost::thread库还提案了好几次，结果文档都没有完善-_-!起码也算是可能进入C++标准的东西，咋能这样呢？

最新的提案信息，可以在其文档中搜寻到，已经进入Revision 1的阶段了。《Multi-threading Library for Standard C++ (Revision 1)》

其实，个人认为，一个多线程库可以很简单，实现简单的临界区用于同步就足够应付绝大部分情况了，相对而言，boost::thread这样的库还是稍微庞大了一点。类似于Python中的thread库其实就不错了（据《Programming Python》作者说原型来自于JAVA），通过继承形式使用线程功能（template method模式），还算比较自然，其实我们公司自己内部也实现了一套与之类似的C++版的线程库，使用也还算方便。但是boost::thread走的是另一条路。由于其文档中没有Introduction和Tutorial，我纯粹是摸石头过河似的实验，有用的不对的地方那也就靠大家指出来了。

一、 Introduction：

boost::thread不是通过继承使用线程那种用了template method模式的线程模型，而是通过参数传递函数(其实不仅仅是函数，只要是Callable，Copyable（因为需要复制到线程的本地数据）的就行）。这种模型是好是坏，我一下也没有结论，但是boost::thread库的选择总归是有些道理的，起码从个人感觉来说，也更符合标准库一贯的优先使用泛型而不是继承的传统和作风，这样的模型对于与boost::function,boost::bind等库的结合使用的确也是方便了很多，

1. 题外话：

假如你对win32/linux下的多线程有一定的了解有助于理解boost::thread的使用，假如没有win32/linux的多线程使用经验，那么起码也需要对多线程程序有概念性的了解，起码对于3个概念要有所了解，context switching,rare conditions, atomic operation，最好也还了解线程间同步的一些常见形式，假如对于我上面提及的概念都不了解，建议先补充知识，不然，即便是HelloWorld，估计也难以理解。另外，毕竟本文仅仅是个人学习boost::thread库过程中的一些记录，所以不会对操作系统，线程等知识有透彻的讲解，请见谅。

2. boost::thread的HelloWorld:

example1:

#include <windows.h>
#include <boost/thread.hpp>
#include <iostream>

using namespace std;
using namespace boost;

void HelloWorld()
{
    char* pc = "Hello World!";
    do
    {
       cout <<*pc;
    }while(*pc++);
    cout <<endl;
}

void NormalFunThread()
{
    thread loThread1(HelloWorld);
    thread loThread2(HelloWorld);
    HelloWorld();

    Sleep(100);
}

int main()
{
    NormalFunThread();

    return 0;
}

不知道如此形式的程序够不够的上一个thread的helloworld程序了。但是你会发现，boost::thread的确是通过构造函数的方式，（就是构造函数），老实的给我们创建了线程了，所以我们连一句完成的helloworld也没有办法正常看到，熟悉线程的朋友们，可以理解将会看到多么支离破碎的输出，在我的电脑上，一次典型的输出如下：

HHeellloHl eoWl olWrool rdWl!od

l d

呵呵，其实我不一次输出整个字符串，就是为了达到这种效果-_-!这个时候需要同步，join函数就是boost::thread为我们提供的同步的一种方式，这种方式类似于利用windows API WaitForSingleObject等待线程结束。下面利用这种方式来实现。

example2:

#include <boost/thread.hpp>
#include <iostream>

using namespace std;
using namespace boost;

void HelloWorld()
{
    char* pc = "Hello World!";
    do
    {
       cout <<*pc;
    }while(*pc++);
    cout <<endl;
}

void NormalFunThread()
{
    thread loThread1(HelloWorld);
    loThread1.join();
    thread loThread2(HelloWorld);
    loThread2.join();
    HelloWorld();

}

int main()
{
    NormalFunThread();

    return 0;
}

这样，我们就能完成的看到3句hello world了。但是这种方式很少有意义，因为实际上我们的程序同时还是仅仅存在一个线程，下一个线程只在一个线程结束后才开始运行，所以，实际中使用的更多的是其他同步手段，比如，临界区就用的非常多，但是我在boost::thread中没有找到类似的使用方式，倒是有mutex（互斥），其实两者对于使用是差不多的。下面看使用了mutex同步线程的例子：

example3:

#include <windows.h>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

using namespace std;
using namespace boost;

mutex mu;
void HelloWorld()
{
    mu.lock();
    char* pc = "Hello World!";
    do
    {
       cout <<*pc;
    }while(*pc++);
    cout <<endl;
    mu.unlock();
}

void NormalFunThread()
{
    thread loThread1(HelloWorld);
    thread loThread2(HelloWorld);
    HelloWorld();

    loThread1.join();
    loThread2.join();
}

int main()
{
    NormalFunThread();

    return 0;
}

我们还是能看到3个完好的helloworld，并且，这在实际使用中也是有意义的，因为，在主线程进入HelloWorld函数时，假如第一个线程还没有执行完毕，那么，可能同时有3个线程存在，第一个线程正在输出，第二个线程和主线程在mu.lock();此句等待（也叫阻塞在此句）。其实,作为一个多线程的库，自然同步方式不会就这么一种，其他的我就不讲了。

作为boost库，有个很大的有点就是，互相之间结合的非常好。这点虽然有的时候加大了学习的难度，当你要使用一个库的时候，你会发现一个一个顺藤摸瓜，结果都学会了,比如现在，关于boost库的学习进行了很久了，（写了4，5篇相关的学习文章了），从boost::for_each,boost::bind,boost::lambda,boost::function,boost:: string_algo,到现在的boost::thread，其实原来仅仅是想要好好学习一下boost::asio而已。当你真的顺着学下来，不仅会发现对于C++语言的理解，对STL标准库的理解，对于泛型的理解，等等都有更深入的了解，我甚至在同时学习python的时候，感觉到boost库改变了C++的很多语言特性。。。虽然是模拟出来的。呵呵，题外话说多了，其实要表达的意思仅仅是boost::thread库也是和其他boost库有很多紧密结合的地方，使得其使用会非常的方便。这里一并的在一个例子中演示一下。

example4:

#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

#include <boost/function.hpp>
#include <boost/bind.hpp>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
using namespace std;
using namespace boost;

void HelloWorld()
{
    char* pc = "Hello World!";
    do
    {
       cout <<*pc;
    }while(*pc++);
    cout <<endl;
}

void NormalFunThread()
{
    thread loThread1(HelloWorld);
    thread loThread2(HelloWorld);
    HelloWorld();

    loThread1.join();
    loThread2.join();
}

void BoostFunThread()
{
    thread loThread1(HelloWorld);
    function< void(void) > lfun = bind(HelloWorld);
    thread loThread2(bind(HelloWorld));
    thread loThread3(lfun);

    loThread1.join();
    loThread2.join();
    loThread3.join();
}

int main()
{
//  NormalFunThread();
    BoostFunThread();

    return 0;
}

一如既往的乱七八糟：

HHHeeelllllolo o W WoWoorrrlldld!d!

但是，正是这样的乱七八糟，告诉了我们，我们进入了真实的乱七八糟的多线程世界了-_-!

还记得可怜的Win32 API怎么为线程传递参数吗？

看看其线程的原型

**DWORD ThreadProc(**

  **LPVOID** _lpParameter_

**);**

这里有个很大的特点就是，运行线程的函数必须是这样的，规矩是定死的，返回值就是这样，参数就是LPVOID(void*)，你没有选择，函数原型没有选择，参数传递也没有选择，当你需要很多数据时，唯一的办法就是将其塞入一个结构，然后再传结构指针，然后再强行使用类型转换。其实这是很不好的编程风格，不过也是无奈的折衷方式。

注意到没有，其实我们的HelloWold根本就是没有符合这个要求，不过我们一样使用了，这也算是boost::thread的一个很大优点，最大的优点还是在于参数传递的方式上，彻底摆脱了原来的固定死的框架，让你到了随心所欲的使用线程的地步。

看个例子：

example5:

#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

#include <boost/function.hpp>
#include <boost/bind.hpp>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
using namespace std;
using namespace boost;

mutex mu;
void HelloTwoString(char *pc1, char *pc2)
{
    mu.lock();
    if(pc1)
    {
       do
       {
           cout <<*pc1;
       }while(*pc1++);
    }
    if(pc2)
    {
       do
       {
           cout <<*pc2;
       }while(*pc2++);
       cout <<endl;
    }
    mu.unlock();
}

void BoostFunThread()
{
    char* lpc1 = "Hello ";
    char* lpc2 = "World!";
    thread loThread1(HelloTwoString, lpc1, lpc2);
    function< void(void) > lfun = bind(HelloTwoString, lpc1, lpc2);
    thread loThread2(bind(HelloTwoString, lpc1, lpc2));
    thread loThread3(lfun);

    loThread1.join();
    loThread2.join();
    loThread3.join();
}

int main()
{
    BoostFunThread();

    return 0;
}

这里不怀疑线程的创建了，用了同步机制以方便查看结果，看看参数的传递效果，是不是真的达到了随心所欲的境界啊：）

最最重要的是，这一切还是建立在坚实的C++强类型机制磐石上，没有用hack式的强制类型转换，这个重要性无论怎么样强调都不过分，这个优点说他有多大也是合适的。再一次的感叹，当我责怪牛人们将C++越弄越复杂的时候。。。。。。。。先用用这种复杂性产生的简单的类型安全的高效的库吧。。。。。。关于boost::thread库就了解到这里了，有点浅尝辄止的感觉，不过，还是先知其大略，到实际使用的时候再来详细了解吧，不然学习效率也不会太高。

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

2009年06月01日

Python爱好者的幽默，关于SIP的命名

为方便C/C++为Python写扩展，有个很著名的工具，叫SWIG（本身也是一个单词，意为痛饮），英文全称是The Simplified Wrapper and Interface Generator，即简单化的包装和接口生成器，然后呢，当Python的爱好者准备建立一个轻量级的专门为Python建立的类似工具时，就命名为SIP（小口的喝），其命名让人叫绝。。。。。。。。。。。。我发现基本上用C/C++的程序员碰到程序相关话题的幽默感都比较强,呵呵，不要小看了SIP,PyQT就是依赖其建立的。

阅读全文....

2009年05月30日

继续boost的相关学习， boost::string_algo让std::string这样的废柴类有用一点

继续 boost的相关学习， boost::string_algo让std::string这样的废柴类有用一点

write by 九天雁翎(JTianLing) – www.jtianling.com

讨论新闻组及文件

一、对string态度的慢慢转变

作为一个初学者，本来没有批评大牛们创造的语言及其标准库的资本，初期看的一些教学书籍总是推荐你在绝大多数时候不要使用char*，使用string,比如C++ Primer,C++ Effective等，但是再接着看深点，就开始有书批评C++中的某些库了，比如C++ STL Library，（批评string的书比较多，我倒是一下子举不出例子-_-!）但是，甚至在看过很多书批评std::string后，我还是感觉尽量去使用std::string比较好，知道越用感觉越多的不方面，除了与标准库算法搭配较为方便外，std::string甚至比不过CString好用，性能上的问题也是比我想象的要严重的多，以前在做代码优化的时候发现，按照常用的for循环记法，绝大多数标准库容器都会将begin(),end()这样的迭代器函数调用内联，但是唯独std::string不会，傻傻的每次都调用（在VS2005上测试,全优化结果一致）。

假如这方面你还可以说是MS在string实现上的缺陷的话（事实上现有STL几乎统一是来源于HP的库），那么作为一个string，在std中甚至连一个大小写转换的函数都没有，甚至连大小写不敏感的比较都没有，那就实在说不过去了，想想python,lua等语言中一个string是何其的强大！难道是因为C ++的设计者们都做操作系统，一般不处理字符串？-_-!没有正则表达式的历史原因我就不多说了。。。。

二、轮子没有总会有人造的

于是乎，在需要相关功能的时候，需要用replace，stricmp加上一大串参数来完成，从理解的角度来说其实并不那么直观，对于普通功能的实现我能理解，但是对于常用功能老是这样用，感觉总是不爽，于是问题又来了，自己做个自己的库吧，下次重复利用就完了，才发现，当一种语言需要人造一个本应该存在的轮子的时候，这个语言本身就有问题。

再于是乎，造轮子的人boost::string_algo诞生了，可见std::string的恶劣。。。。自然，像这种库要进标准肯定是没有戏了，自己偷着用吧。

三、 boost::string_algo个人概述

string_algo我粗略的看了一下，相对而言还是比较强大，实现了很多功能。个人觉得必要的功能，比如大小写转换：to_upper，to_lower有了。在python中用的很愉快的字符串处理函数，很有用的功能,比如trim, trim_left , trim_right (Python中对应的函数名字是strip，lstrip,rstrip)，Python中经常用到的starts_with也有了，相关的很多find函数也出来了，最最让我高兴的是，splice函数也有了。。。。这个函数在处理CSV数据的时候相当有用，一个函数往往可以节省很多的工作量。另外，其实从函数的参数来看就能看出两种语言的文化，C++的很多算法喜欢以_if的形式提供，要求用户提供判断函数，甚至是trim_left这样的函数也不提供一个以字符形式提供出来的参数以指定到底削除啥，而是提供了trim_left_if的函数来由你判断。。。。个人感觉，虽然_if的用法有的时候能极大的提高算法的适用范围，但是对于简单应用来说，实在离make simple things easy原则，何况，在没有lambda语法，没有好用的bind,却有强类型的复杂函数定义形式，复杂的函数指针语法，超复杂的成员函数指针语法的C++中，要好好的用对，用好一个类似的情况，实在并不见得那么简单。。。。。。。。。。。

最近老是将C++与Python做比较，感觉自己似乎更像是一个初学C++的Python程序员在不停的抱怨。。。其实本人还是完全靠C++吃饭的，仅仅是业余时间学习一下Python(工作中也用，但是比较少)，本人的观点一向是，虽然将不关心效率，奇慢的动态类型语言Python和以效率为根本的强类型C++做语法比较对C++非常不公平，但是，来自另一个语言的经验其实对于反思一个语言需要什么和怎么样更好的使用现有语言还是很有帮助的，就是因为这种相互的促进，我才会从需求的角度去学习了很多boost的库，比如《多想追求简洁的极致，但是无奈的学习C++中for_each的应用》，《其实C++比Python更需要lambda语法，可惜没有。。。。》，《boost::function，让C++的函数也能当第一类值使用》不然，都是纯粹的跟着文档走的学习，那个效果要差的太多。

四、例子：

这里将我试用我感兴趣函数的例子列出来，其实，知道个大概以后，留份参考文档，以后有需要的时候，会发现工具箱中实在是又多了很重要的一项工具。

1. 大小写转换

void CaseChange()
{
    string lstr("abcDEF");
    string lstr2(lstr);

    // 标准形式
    transform(lstr.begin(), lstr.end(), lstr.begin(), tolower);
    cout <<lstr <<endl;

    // string_algo
    to_lower(lstr2);

    cout <<lstr2 <<endl;
}

比较奇怪的就是，为啥to_lower这样的函数不返回转换后的字符串，这样可以支持很方便的操作，难道也是出于不为可能不需要的操作付出代价的原则考虑？。。。。。。很有可能，想想标准的map.erase函数吧，看看《C++ STL Library》中相关的评价及说明。这是C++程序员的一贯作风。

2. 大小写不敏感的比较

void CaseIComp()
{
    string lstrToComp("ABCdef");
    string lstr("abcDEF");
    string lstr2(lstr);

    string lstrTemp;
    string lstrTemp2;
    // 标准形式worst ways
    transform(lstr.begin(), lstr.end(), back_inserter(lstrTemp), tolower);
    transform(lstrToComp.begin(), lstrToComp.end(), back_inserter(lstrTemp2), tolower);
    cout <<(lstrTemp == lstrTemp2) <<endl;

    // 标准形式
    cout << !stricmp(lstr.c_str(), lstrToComp.c_str()) <<endl;

    // string_algo 1
    cout << (to_lower_copy(lstr2) == to_lower_copy(lstrToComp)) <<endl;

    // no changed to original values
    cout << lstr2 <<" " <<lstrToComp <<endl;

    // string_algo 2 best ways
    cout << iequals(lstr2, lstrToComp) <<endl;
}

好了，我们有比调用stricmp更好的办法了，如例子中表示的。

3. 修剪

void TrimUsage()
{
    // 仅以trim左边来示例了，trim两边和右边类似
    string strToTrim="     hello world!";

    cout <<"Orig string:[" <<strToTrim <<"]" <<endl;

    // 标准形式:
    string str1;
    for(int i = 0; i < strToTrim.size(); ++i)
    {
       if(strToTrim[i] != ' ')
       {
           // str1 = &(strToTrim[i]);  at most time is right,but not be assured in std
           str1 = strToTrim.c_str() + i; 
           break;
       }
    }
    cout <<"Std trim string:[" <<str1 <<"]" <<endl;
   
    // string_algo 1
    string str2 = trim_left_copy(strToTrim); 
    cout <<"string_algo string:[" <<str2 <<"]" <<endl;

    // string_algo 2
    string str3 = trim_left_copy_if(strToTrim, is_any_of(" "));
    cout <<"string_algo string2:[" <<str3 <<"]" <<endl;
}

可见一个常用操作在std中多么复杂，甚至直接用char*实现这样的操作都要更简单，效率也要更高，string版本不仅失去了效率，也失去了优雅。但是在有了合适的库后多么简单，另外，假如std可以有更简化的办法的话，欢迎大家提出，因为毕竟我在实际工作中也没有boost库可用。

4. 切割

不是经常处理大量数据的人不知道字符串切割的重要性和常用长度，在比较通用的数据传递方式中，CSV（Comma separated values ）是比较常用的一种，不仅仅在各种数据库之间传递很方便，在各种语言之间传递数据，甚至是用excel打开用来分析都是非常方便，Python中string有内置的split函数，非常好用，我也用的很频繁，C++中就没有那么好用了，你只能自己实现一个切割函数，麻烦。

示例：

void SplitUsage()
{
    // 这是一个典型的CSV类型数据
    string strToSplit("hello,world,goodbye,goodbye,");

    typedef vector< string > SplitVec_t;
    SplitVec_t splitVec1;
    // std algo 无论在任何时候string原生的搜寻算法都是通过index返回而不是通过iterator返回，总是觉得突兀
    // 还不如使用标准库的算法呢，总怀疑因为string和STL的分别设计，是不是string刚开始设计的时候,还没有加入迭代器？
    string::size_type liWordBegin = 0;
    string::size_type liFind = strToSplit.find(',');
    string lstrTemp;
    while(liFind != string::npos)
    {
       lstrTemp.assign(strToSplit, liWordBegin, liFind \- liWordBegin);
       splitVec1.push_back(lstrTemp);
       liWordBegin = liFind+1;
       liFind = strToSplit.find(',', liWordBegin);
    }
    lstrTemp.assign(strToSplit, liWordBegin, liFind \- liWordBegin);
    splitVec1.push_back(lstrTemp);

    BOOST_FOREACH(string str, splitVec1)
    {
       cout <<" split string:" <<str <<endl;
    }

    // string_algo
    SplitVec_t splitVec2; 
    split( splitVec2, strToSplit, is_any_of(",") ); 

    BOOST_FOREACH(string str, splitVec2)
    {
       cout <<" split string:" <<str <<endl;
    }
}

5. 其他：

至于其他的搜寻，替换，由于标准库中的已经比较强大了，string_algo也就算锦上添花吧，我的对搜寻和替换的需求不是那么强烈。大家参考boost本身的文档吧。在搜寻中提供搜寻迭代器的方式倒是还是使易用性有一定提高的。至于正则表达式这个字符串处理的重头戏嘛，因为tr1中都有了regex库了，平时使用的时候还是以此为主吧，相对string_algo这个宿命就是存在于boost的库来说，tr1的regex库在将来肯定可移植性及通用性更强。

五、稀泥终究糊不上墙

虽然string_algo可以弥补std::string的功能缺陷，但是string的性能问题又怎么解决啊？。。。。因为string如此的弱，导致stringstream的使用都有很大的问题（在内存上），有兴趣的可以再网上搜搜，当年工作的时候就是因为太相信标准库，结果用了stringstream来实现整数到string的转换，结果经历除了痛苦就是痛苦，公司服务器因为此问题全服停止了2次。stringstream在作为局部变量存在的时候都会无止尽的消耗内存，这点我以前怎么也无法想象。

实际中的使用也是反应了std::string的悲哀，看看有多少实际的项目中没有实现自己的一套string?

其实寄希望于regex库出来后，std::string就能涅槃也不是太可能，std::string的string可能在平时的使用中是个简化编程的好办法，但是在对性能和内存占用稍微有点追求的程序，估计，std::string的使用还是会少之又少。。。。。。。。。。。

write by九天雁翎(JTianLing) – www.jtianling.com

阅读全文....

上一页 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 下一页

文章列表

C++中的成员函数调用原理及this指针的传递方式

C/C++与汇编的函数相互调用分析

C中调用汇编的函数

下面，在汇编中调用C/C++函数：

《数据结构与算法分析C++描述》 分离链接(separate chaining)哈希表的C++实现

分离链接(separate chaining)哈希表的实现：

测试代码

ASIO—下一代C++标准可能接纳的网络库（3）UDP网络应用

一、 综述

二、 Tutorial

1. Daytime.4 - A synchronous UDP daytime client（同步UDP daytime客户端）

2. A synchronous UDP daytime server（同步的UDP daytime服务器）

3. An asynchronous UDP daytime server(异步 UDP daytime 服务器)

ASIO—下一代C++标准可能接纳的网络库（2）TCP网络应用

一、 综述

二、 Tutorial

1. A synchronous TCP daytime client（一个同步的TCP daytime客户端程序）

2. Daytime.2 - A synchronous TCP daytime server（同步的TCP daytime服务器）

3. An asynchronous TCP daytime server（异步TCP daytime服务器）

ASIO—下一代C++标准可能接纳的网络库（1）简单的应用

一、 综述

二、 Tutorial

1. Timer.1 - Using a timer synchronously（使用同步定时器）

2. Timer.2 - Using a timer asynchronously（使用异步定时器）

3. Timer.3 - Binding arguments to a handler（绑定参数到handler）

4. Timer.4 - Using a member function as a handler

5. Timer.5 - Synchronising handlers in multithreaded programs

6. MyExample1：Synchronising handlers in multithreaded programs in normal way

一、 Introduction：

1. 题外话：

2. boost::thread的HelloWorld:

一、 对string态度的慢慢转变

二、 轮子没有总会有人造的

三、 boost::string_algo个人概述

四、 例子：

1. 大小写转换

2. 大小写不敏感的比较

3. 修剪

4. 切割

5. 其他：

五、 稀泥终究糊不上墙

《数据结构与算法分析C++描述》分离链接(separate chaining)哈希表的C++实现

一、综述

一、综述

一、综述

一、对string态度的慢慢转变

二、轮子没有总会有人造的

四、例子：

五、稀泥终究糊不上墙