Quake3 启示录 (topameng's blog)
April 21
is_integral macro expansion

struct integral_c_tag { static const int value = 0; };

template< bool C_ > struct bool_
{
    static const bool value = C_;
    typedef integral_c_tag tag;
    typedef bool_ type;
    typedef bool value_type;
    operator bool() const { return this->value; }
};

typedef bool_<true> true_;
typedef bool_<false> false_;

template< typename T, T N > struct integral_c;

template <class T, T val>
struct integral_constant : public integral_c<T, val>
{
    typedef integral_constant<T,val> type;
};

template<> struct integral_constant<bool,false> : false_
{
    typedef integral_constant<bool,false> type;
};

template<> struct integral_constant<bool,true> : true_
{
    typedef integral_constant<bool,true> type;
};

template< typename T > struct is_integral : integral_constant<bool,false> { };

template<> struct is_integral<unsigned char> : integral_constant<bool,true>
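{
    /*BOOST_TT_AUX_BOOL_TRAIT_VALUE_DECL(true) \
    BOOST_MPL_AUX_LAMBDA_SUPPORT_SPEC(1,is_integral,(unsigned char)) */
};
BOOST_TT_AUX_BOOL_TRAIT_SPEC1(is_integral,unsigned char const,true)
BOOST_TT_AUX_BOOL_TRAIT_SPEC1(is_integral,unsigned char volatile,true)
BOOST_TT_AUX_BOOL_TRAIT_SPEC1(is_integral,unsigned char const volatile,true)

March 18
_beginthread or CreateThread
http://www.diybl.com/

I. Background
Today a friend asked me whether a program should use _beginthread or CreateThread, and told me that choosing wrongly can leak memory. I had only a partial understanding of this question myself, so to give my friend a responsible answer I read through the VC runtime library (CRT) source code and finally found the answer.

II. CRT
The CRT (C/C++ Runtime Library) is the collective name for the functions and code that support running C/C++ programs. There is no precise definition, but it is what calls your main, and it provides the functions you call every day such as strlen, strtok, time and atoi. We use the CRT shipped with Microsoft Visual Studio .NET 2003 as the example. Assuming .NET 2003 is installed in C:\Program Files\Microsoft Visual Studio .NET 2003, the CRT source code is in C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\crt\src. With the implementation source at hand, we can find every explanation we need.

III. _beginthread/_endthread
What does this function actually do? Its code is in thread.c. Reading it shows that it ultimately creates the thread through CreateThread as well. The main difference is that it first allocates a _tiddata and calls _initptd to initialize the allocated block. That pointer is later passed to the CRT's thread wrapper function _threadstart, which stores it in TLS (Thread Local Storage). _threadstart then calls the thread function we passed in, and calls _endthread after that function returns. You can also see that _threadstart wraps our function in a __try/__except block and calls exit if an exception occurs. (The code for _threadstart and _endthread is also in thread.c.)

IV. CreateThread and the CRT
Some may say: I created threads with CreateThread, called C runtime functions in them, and exited with ExitThread, yet my program runs fine. It neither crashed because the CRT was uninitialized nor leaked memory because I forgot to call _endthread. Why is that? Let us continue our tour of the CRT.

V. Functions that use the ptd
So which functions use _getptd? A great many! Search the CRT directory for _getptd and you will find many unexpected functions using it. Besides functions such as strtok and rand that need to keep state, there are all the string-related functions, because they use the locale information in the ptd; all the mbcs functions, because they use the mbcs information in the ptd; and so on.

VI. Test code
Below is a piece of test code (leaker uses atoi, which needs the ptd):
Code:
#include <windows.h>
volatile bool threadStarted = false;
void leaker()
DWORD __stdcall CreateThreadFunc( LPVOID )
DWORD __stdcall CreateThreadFuncWithEndThread( LPVOID )
void __cdecl beginThreadFunc( LPVOID )
int main()

VII. Summary
If you link the CRT as a DLL, or you only create a small number of threads once, you can perhaps take the ostrich approach and ignore the problem. The third method in the code in the previous section relies on knowledge of the CRT's internals, but there is no guarantee that it is a good method, because the CRT may change with each version of VC. So unless your head is clear enough to remember all of this, or you are willing to check the CRT source every time you call a C function, always using _beginthread (or its sibling _beginthreadex) is a good choice.

[Postscript]

March 11
tokenizer usage example
#include<iostream>
int main()
string str = "just test;;Hello|world||-foo--bar;yow;baz|";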
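March 06
How to write exception-safe C++ code
http://tech.163.com 2006-04-17 09:19:59 Source: 网易学院 (Guangzhou)

There is no end of argument about exceptions in C++, but much of it rests on misunderstandings that do not match the facts. Exceptions used to be a language feature that was hard to use well. Fortunately, as the C++ community has accumulated experience, we now know enough to write exception-safe code with ease, and writing exception-safe code generally has no impact on performance either.
3. It has some overhead; avoid the C++ exception handling mechanism in frequently executed critical code.
7. Combining structured exception handling with C++ exception objects, or converting it to them, gives better handling of the exceptions that occur in Windows programs.
Used in just the right measure, C++ exceptions show their beauty!
catch(…)
1.7 Exception specifications:
1.8 Pitfall: exception specifications in derived classes

Using dbghelp to get the call stack: a debugging methodology for release builds
Author: Kevin Lynx
When software is shipped to users as a release build, it is hard to find the cause when the program crashes. A common technique is to write a LOG file and analyze it
To get the call stack you need to examine (unwind) the contents of the stack. We could conceivably attempt to unwind the
The declaration of StackWalk64 is as follows:
See MSDN for the meaning of each parameter. A word on the ContextRecord parameter: it specifies the contents of the CPU registers. StackFrame specifies the contents of the stack frame. What a stack frame is, I do not know either. (= =)
StackWalk64 needs the caller to supply the address of the current frame and the current instruction address. Both pieces of information are filled into ContextRecord and then passed to StackWalk64.
So how do you get the current stack frame address and the current instruction address? As mentioned earlier, you can use inline assembly. (For the instruction address this means reading the EIP register, which software cannot access directly.) You can also use GetThreadContext to fetch the contents of all CPU registers for the current thread in one call. To add: the current frame address is in the EBP register and the current instruction address is in the EIP register. However, as MSDN's description of GetThreadContext says, the function may return incorrect register contents (You cannot get a valid context for a running thread).
Another way to get the Context (including EBP and EIP) is SEH (structured exception handling): call GetExceptionInformation in __except. GetExceptionInformation returns an LPEXCEPTION_POINTERS pointer to an EXCEPTION_POINTERS structure, which contains a pointer to the Context. With that you have what you need and can call StackWalk.
You can also call StackWalk directly; StackWalk is #defined as StackWalk64 (depending on the Windows platform).
After unwinding the stack you can go on to get details of a stack frame, such as the function name. This involves SymFromAddr, which returns the symbol name (function name) for an address. Another interesting function is SymGetLineFromAddr, which returns the source file name and line number for the function.
Of course, all of this depends on the program database (pdb) file produced by VC and on dbghelp.dll, which provides the API functions above.
A simple piece of code for reference:
#include <windows.h>
#pragma comment( lib, "dbghelp.lib" )
void dump_callstack( CONTEXT *context )
sf.AddrPC.Offset = context->Eip;
DWORD machineType = IMAGE_FILE_MACHINE_I386;
HANDLE hProcess = GetCurrentProcess();
for( ; ; )
if( sf.AddrFrame.Offset == 0 )
pSymbol->SizeOfStruct = sizeof( symbolBuffer );
DWORD64 symDisplacement = 0;
IMAGEHLP_LINE lineInfo = { sizeof(IMAGEHLP_LINE) };
if( SymGetLineFromAddr( hProcess, sf.AddrPC.Offset, &dwLineDisplacement, &lineInfo ) )
DWORD excep_filter( LPEXCEPTION_POINTERS lpEP )
dump_callstack( lpEP->ContextRecord );
if( SymCleanup( GetCurrentProcess() ) )
return EXCEPTION_EXECUTE_HANDLER;
void func1( int i )
void func2( int i )
void func3( int i )
void test( int i )
int main()
return 0;
The code above needs optimization turned off in release mode, otherwise the call stack is displayed incorrectly (some functions are removed?), and it also needs the pdb file. When users run the program it can simply report the address at the time of the crash, and the developer can then use software such as crashfinder to launch the program that has the pdb

March 04
smart_ptr usage example
With smart pointers you do not leak memory by forgetting to delete a pointer. Also, when a function in a third-party lib returns a pointer and the client uses it, the lib loses control over the returned pointer, so the job of deleting it usually falls to the calling client. If the client forgets to call delete, or calls it at the wrong time, problems can follow; in such cases a smart pointer is the best choice.
shared_ptr <boost/shared_ptr.hpp>: use shared_ptr to manage object lifetime automatically, making shared ownership of a resource effective and safe.
scoped_ptr <boost/scoped_ptr.hpp>: ensures that a dynamically allocated object is deleted correctly. scoped_ptr has properties similar to std::auto_ptr; the biggest difference is that it cannot transfer ownership, whereas auto_ptr can. In fact, scoped_ptr can never be copied or assigned! scoped_ptr owns the resource it points to and never gives up that ownership.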
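void PrintIfString(const boost::any& Any)
const boost::weak_ptr<std::string>* s1 =
}
int main(int argc, char* argv[])
return 0;

How the scoped_ptr source supports the if(smartptr) test
The dizzying type unspecified_bool_type is actually a "pointer to a member variable of the class"; do not think of it as an ordinary pointer type (C++ Common Knowledge explains this). (See Inside the C++ Object Model for what such an offset value is.) unspecified_bool_type can be used as a bool; it is either 0,

February 07
Gamebryo is limited to 4 material layers
Because terrain textures are very large, a single global texture cannot be used. Instead, multiple layers of tiling textures are blended together through alpha channels. Large outdoor terrain rendering in games such as WOW[1] and Lineage II mostly uses this technique. Each tiling texture layer still uses one global alpha map, and there is one static light and shadow map for the whole terrain. Depending on the number of multitexture units on the graphics card, rendering mixes multi-pass and multitexture.
With optimization, the 4 alpha layers can be combined into one texture, plus one lightmap. That still means at least 2 global textures are needed to draw the terrain surface.

December 23
WildMagic D3D memory leak
With the settings in dxcpl.exe I found that WildMagic 4.8 has a memory leak. Tracing it down showed that the default font set up by the author cannot be unloaded with the UnloadFont function, and the D3D device is not released either, so I added that as well. The changes are as follows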
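Dx9Renderer::~Dx9Renderer ()
{
    // release all fonts
    for (int i = 1; i < (int)m_kFontArray.size(); i++)
    {
        UnloadFont(i);
    }
    m_kFontArray[0]->Release();   // added

    // clean up cursor
    if (!m_bCursorVisible)
    {
        ShowCursor(true);
    }

    m_pqDevice->Release();   // added
    m_pqMain->Release();     // added
}
For learning purposes this is a good engine. Versions from 4.0 onward are entirely shader driven and no longer use the fixed-function pipeline. It uses Cg shaders, and the Cg shader scripts need to be compiled with NVIDIA's cgc compiler.

October 10
A few odds and ends
Point the thumb of your left (right) hand along the positive direction of the axis and grip the axis; the direction in which the four fingers curl is the positive rotation direction.
Whether the system is left-handed or right-handed, and whether you use row vectors or column vectors, matrices are almost always stored in [row][col] form.
How should the matrix be built in that case?
Some may ask why we do not distinguish between left-handed and right-handed coordinate systems.
If you use row vectors: a row vector can only stand on the left of the matrix, as in v * M (pay attention to which operand multiplies which).
Gamebryo matrices multiply in the same order as DX matrices: Rx * Ry * Rz applies Rx first, then Ry, and finally Rz.
For the scene graph represented by NiNode: to save work you can build 2 scene graphs. Put static objects in one; these objects do not need Update. Put animated objects in the other; this scene graph needs to be updated all the time.
A material only references a shader.
Scene graph bounding spheres are culled against the camera. The terrain system uses a quadtree.
How does NiMesh differ from NiGeometry?
Because of the use of Floodgate and the like, a more flexible system had to be provided, and NiMesh replaces NiGeometry. More things can be shared on a mesh, and there are features such as GPU instancing. Previously there were separate types for triangles, lines and so on; now everything is unified as NiMesh.
Gamebryo 2.5 supports a terrain size of 1024*1024. Because Gamebryo maps to Max units 1:1, this corresponds to 1 square kilometer. The LightSpeed version seems to have no such limit. It also need not be 1 square kilometer: in World of Warcraft, for example, one terrain grid cell represents about 3.333 units rather than 1. In other words, characters, buildings and the like from Max can be scaled down to 1/3 before being placed in the terrain scene, which makes the terrain effectively larger. If the requirements are not too demanding you can scale down further.
A renderclick is a single rendering pass. It can set the camera and the culler. The multiple renderclicks of one renderstep operate on the same rendertargetgroup.
NiShaderFactory::UpdateGlobalShaderConstant("LightDiffuse",sizeof(afLightDiffuse), &afLightDiffuse);

July 23
Lighting
The light's ambient, diffuse and specular colors are each multiplied by the current material's ambient, diffuse and specular components and then added to the vertex color.
The ambient, diffuse and specular shaders in the book 《Direct3D 图形与动画程序设计》 are shaders for a directional light. A point light additionally needs attenuation, and a spotlight needs inner and outer cone angles.
Of course, for lights you can use DX's fixed-function pipeline lighting, which avoids writing shaders (but is limited to 8 lights). Below is an excerpted OpenGL shader lighting model. It may differ somewhat from DirectX in the specular term.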
Next, the parameters that the app needs to pass in:
Main function:
// For directional lights:
// For point lights:
// For spotlights:
With this, the number of lights that can be computed for any object in the scene is no longer limited to 8.

July 19
Diffuse Lighting (Direct3D 9)
After adjusting the light intensity for any attenuation effects, the lighting engine computes how much of the remaining light reflects from a vertex, given the angle of the vertex normal and the direction of the incident light. The lighting engine skips to this step for directional lights because they do not attenuate over distance. The system considers two reflection types, diffuse and specular, and uses a different formula to determine how much light is reflected for each. After calculating the amounts of light reflected, Direct3D applies these new values to the diffuse and specular reflectance properties of the current material. The resulting color values are the diffuse and specular components that the rasterizer uses to produce Gouraud shading and specular highlighting.
Diffuse lighting is described by the following equation.
Diffuse Lighting = sum[Cd*Ld*(N.Ldir)*Atten*Spot]
The value for Cd is either:
Note: If either DIFFUSEMATERIALSOURCE option is used, and the vertex color is not provided, the material diffuse color is used.
To calculate the attenuation (Atten) or the spotlight characteristics (Spot), see Attenuation and Spotlight Factor (Direct3D 9).
Diffuse components are clamped to be from 0 to 255, after all lights are processed and interpolated separately. The resulting diffuse lighting value is a combination of the ambient, diffuse and emissive light values.

Example
In this example, the object is colored using the light diffuse color and a material diffuse color. The code is shown below.

D3DMATERIAL9 mtrl;
ZeroMemory( &mtrl, sizeof(mtrl) );
D3DLIGHT9 light;
ZeroMemory( &light, sizeof(light) );
light.Type = D3DLIGHT_DIRECTIONAL;
D3DXVECTOR3 vecDir;
vecDir = D3DXVECTOR3(0.5f, 0.0f, -0.5f);
D3DXVec3Normalize( (D3DXVECTOR3*)&light.Direction, &vecDir );
// set directional light diffuse color
light.Diffuse.r = 1.0f;
light.Diffuse.g = 1.0f;
light.Diffuse.b = 1.0f;
light.Diffuse.a = 1.0f;
m_pd3dDevice->SetLight( 0, &light );
m_pd3dDevice->LightEnable( 0, TRUE );
// if a material is used, SetRenderState must be used
// vertex color = light diffuse color * material diffuse color
mtrl.Diffuse.r = 0.75f;
mtrl.Diffuse.g = 0.0f;
mtrl.Diffuse.b = 0.0f;
mtrl.Diffuse.a = 0.0f;
m_pd3dDevice->SetMaterial( &mtrl );
m_pd3dDevice->SetRenderState(D3DRS_DIFFUSEMATERIALSOURCE, D3DMCS_MATERIAL);

According to the equation, the resulting color for the object vertices is a combination of the material color and the light color.
These two images show the material color, which is gray, and the light color, which is bright red.
The resulting scene is shown below. The only object in the scene is a sphere. The diffuse lighting calculation takes the material and light diffuse color and modifies it by the angle between the light direction and the vertex normal using the dot product. As a result, the backside of the sphere gets darker as the surface of the sphere curves away from the light.
Combining the diffuse lighting with the ambient lighting from the previous example shades the entire surface of the object. The ambient light shades the entire surface and the diffuse light helps reveal the object's 3D shape.
Diffuse lighting is more intensive to calculate than ambient lighting. Because it depends on the vertex normals and light direction, you can see the object's geometry in 3D space, which produces more realistic lighting than ambient lighting. You can use specular highlights to achieve a more realistic look.

Ambient Lighting (Direct3D 9)
Ambient lighting provides constant lighting for a scene. It lights all object vertices the same because it is not dependent on any other lighting factors such as vertex normals, light direction, light position, range, or attenuation. It is the fastest type of lighting but it produces the least realistic results. Direct3D contains a single global ambient light property that you can use without creating any light. Alternatively, you can set any light object to provide ambient lighting. The ambient lighting for a scene is described by the following equation.
Ambient Lighting = Ca*[Ga + sum(Atti*Spoti*Lai)]
Where:
The value for Ca is either:
Vertex color 1, if AMBIENTMATERIALSOURCE = D3DMCS_COLOR1 and the first vertex color is supplied in the vertex declaration.
Note: If either AMBIENTMATERIALSOURCE option is used, and the vertex color is not provided, then the material ambient color is used.
To use the material ambient color, use SetMaterial as shown in the example code below.
Ga is the global ambient color. It is set using SetRenderState(D3DRS_AMBIENT). There is one global ambient color in a Direct3D scene. This parameter is not associated with a Direct3D light object.
Lai is the ambient color of the ith light in the scene. Each Direct3D light has a set of properties, one of which is the ambient color. The term sum(Lai) is a sum of all the ambient colors in the scene.

Example
In this example, the object is colored using the scene ambient light and a material ambient color.

#define GRAY_COLOR 0x00bfbfbf
// create material
D3DMATERIAL9 mtrl;
ZeroMemory(&mtrl, sizeof(mtrl));
mtrl.Ambient.r = 0.75f;
mtrl.Ambient.g = 0.0f;
mtrl.Ambient.b = 0.0f;
mtrl.Ambient.a = 0.0f;
m_pd3dDevice->SetMaterial(&mtrl);
m_pd3dDevice->SetRenderState(D3DRS_AMBIENT, GRAY_COLOR);

According to the equation, the resulting color for the object vertices is a combination of the material color and the light color.
These two images show the material color, which is gray, and the light color, which is bright red.
The resulting scene is shown below. The only object in the scene is a sphere. Ambient light lights all object vertices with the same color. It is not dependent on the vertex normal or the light direction. As a result, the sphere looks like a 2D circle because there is no difference in shading around the surface of the object.
To give objects a more realistic look, apply diffuse or specular lighting in addition to ambient lighting.

July 11
Gamebryo/Artist's Guides/Gamebryo 3ds max Plug in/Geometry and Performance

Introduction to Geometry, Performance, and 3ds Max
The fundamental geometry question facing every real-time 3D artist is "How should I build my scene so that it keeps a good real-time rendering speed?"
The Triangle/Mesh Ratio
Transform-Rate, Fill-Rate
Clipping & Culling
Triangle/Mesh Ratios vs. Clipping and Culling
Grouping
Skinning & Morphing
Cloning and Instancing
Multi/Sub-Object and Triangle/Mesh Ratios
Multiple UVs, Smoothing Groups and Vertices
Precache
Custom Attributes

Terminology
The following section will introduce key terms and phrases that you should understand before discussing geometry.
Stripification
This term describes an operation in which a list of independent triangles is transformed into a list of triangles that are linked together in a chain. Although this operation can increase performance, it is mutually exclusive with the Vertex Cache optimization, which only operates on triangle lists. Vertex cache optimization generally gives better performance results. (Translator's note: stripification only pays off when the strips are long enough, and many triangles do not form good strips.)
Mesh (or NiMesh)
This term describes the Gamebryo representation of geometry. A mesh contains all of the per-vertex information about a piece of geometry, including vertex positions, normals, vertex colors, and UV sets. A mesh is usually composed of independent triangle primitives, but if the mesh has been stripified it will be composed of a set of triangle strips.

The Triangle/Mesh Ratio
The triangle to mesh ratio is the most important geometric metric for game performance. The issue with triangles and meshes is that when rendering a mesh, Gamebryo must do a fixed amount of work on the CPU (property-state setup, texture swapping, etc.) each time it passes down the set of triangles, no matter how big. You should, thus, try to pack as many triangles as possible into each mesh.
In general, a game should never have fewer than 20 triangles per mesh. You don't have to increase the number of triangles just to improve the triangle/mesh ratio, but if doing so will improve vertex lighting or some geometric detail, it won't hurt the performance. Another way to tackle improving the ratio is to collapse similar meshes with the same materials that are close together in a scene. This collapsed mesh will be converted to a single mesh instead of several separate meshes, thus improving the overall ratio. The importance of a large triangle/mesh ratio (i.e. a lot of triangles per mesh) is increased on hardware transform and lighting cards (high-end graphics cards). Hardware transform and lighting cards perform vertex transformation, lighting, and rasterization on the graphics card. Earlier cards could only perform the rasterization while the CPU was forced to do the vertex transformation and lighting. Hardware transform and lighting cards both free the CPU of this task and perform it faster than the CPU ever could. This division of labor decreases the time required to render an individual polygon but leaves the fixed amount of work that Gamebryo must do for each mesh (discussed earlier) unchanged. In a low triangle/mesh situation the CPU will become the bottleneck (doing the rendering setup) and the full rendering capabilities of the graphics card will not be used. In contrast, a high triangle/mesh ratio will allow the graphics card to draw as many polygons as possible and leave the CPU free to perform other operations. Performance Metrics Performance analysis done at nVidia and ATI revealed that on a 1 GHz CPU, you can render 25,000 objects per second before you spend all of your time on the CPU, a circumstance you wish to avoid at almost all costs. What do these statistics mean to an artist? For performance of 60 fps on a 1 GHz CPU, try to keep the number of visible objects in any given scene below 417 objects. In other words, make every object count! 
As the CPU speed of the target machine increases, the number of visible objects that may be rendered per frame will correspondingly increase. Transform-Rate, Fill-Rate Transform-Rate The transform-rate is the number of vertices a graphics card can process in a given time period. When the number of vertices to be transformed (moved, rotated, and lit) per time period exceeds the graphics card's capabilities, an application is said to be "transform limited." The following table shows the maximum T&L rate for various PC graphics hardware:
These values are, of course, theoretical peaks and do not represent real-world game situations. However, these numbers can be useful in examining performance. If you wish to achieve 60 fps on a GeForce 3, the absolute maximum number of vertices you can transform in a frame is 666,666. Let's say that every object you render requires two passes. The theoretical maximum you could transform is now halved to 333,333 vertices. Mind you, these vertices all belong to one object and are untextured and flat shaded, drawn as optimally as possible with absolutely nothing else happening in the application. No interesting game could ever hope to achieve this situation.

Fill-Rate
Not only is a graphics card limited in the number of vertices it can transform, it is also limited in the number of pixels it can write to the backbuffer per second. The backbuffer is the portion of a graphics card's memory that is used as a scratch pad while the final image is being assembled. When this process of writing to the backbuffer exceeds the graphics card's capability, an application is described as being "fill-rate limited."
These values are, of course, theoretical peaks and do not represent real-world game situations. However, these numbers can be useful in examining performance. Let's see what we can do with a GeForce 3 at a display resolution of 1024 by 768. We'll assume for the moment that transformation and lighting comes for free (which it never does). 1024 by 768 resolution is 786,432 pixels. We'd like to run at 60 fps, so that involves rendering that 1024 by 768 image 60 times for a grand total of 47.19 million pixels. Assuming each pixel is drawn more than once, the maximum number of pixel writes we can do on each pixel is roughly 17. This, of course, assumes that all operations that write a pixel cost the same. Multitexturing and the complexity of pixel shaders quickly lower this number. Overuse of complex pixel shaders can quickly make an application fill rate limited. Clipping & Culling The clipping/culling behavior of Gamebryo is another issue that must be kept in mind when creating scene geometry. Clipping is the process of dividing polygons into only the fragments that will appear on screen. Culling is an attempt to avoid clipping by rejecting whole NiMesh objects if no part of them appears on the screen. In general, culling is preferable to clipping because clipping is vastly more expensive than culling. You can structure your scenes to improve culling by limiting the volume of space an NiMesh occupies. A small NiMesh is more likely to be completely off screen while a very large one (e.g. a huge floor) is likely to be, at least, partially on screen all the time. Triangle/Mesh Ratios vs. Clipping & Culling These two issues, the triangle/mesh ratio and the culling/clipping behavior, place somewhat contradictory demands on you. To improve the triangle/mesh ratio you must have the most triangles in a mesh possible (mesh collapsing, increasing the number of triangles in the mesh, etc.). Simultaneously, the meshes must be kept compact to allow for efficient culling. 
There is no simple solution to this problem and you must constantly balance the two constraints. In general, triangles should be grouped into meshes so that culling will still be effective, but the meshes should contain the most triangles possible.
For example, when modeling the four walls of a room, if the walls are relatively complex, each wall should be in its own mesh. Having all the walls in a single mesh would improve the triangle/mesh ratio but would force clipping on all the geometry. By dividing the walls into four meshes, two of them will usually be culled, leaving the other two to be clipped. However, if the walls were very simple (i.e. 2 triangles each) then it might make sense to clump all the wall triangles together to avoid having several 2-triangle meshes.
In modern hardware, side plane clipping is avoided as much as possible through various tricks. Near plane clipping remains a problem, however.

Gamebryo/Artist's Guides/Gamebryo 3ds max Plug in/Geometry and Performance (II)

Grouping and 3ds Max
As with Multi/Sub-Object materials, grouping should be used with care. Everything in a scene in Max has a corresponding node in its scene graph. In Gamebryo, that node is called an NiNode. An NiMesh represents a group of triangles while an NiNode contains a list of children and a list of transforms. Whenever a collection of objects in Max is grouped, the Gamebryo 3ds max plug-in has to add another NiNode to group them together in Gamebryo.
Indiscriminate use of grouping can have a significant effect on the triangle/mesh ratio. In general, a high-triangle single object has much better frame-rate performance than many objects grouped together.

Skinning & Morphing with 3ds Max
Skinning for Artists
There are a few important bits of hardware knowledge that a character animator and modeler should be aware of before jumping into skinning a character. The hardware skinning pipeline is broken into two important numbers, the maximum number of bones the hardware can handle and the maximum number of bones that can influence a vertex. Each number has an important role to play in determining how your skinned character will perform in a game.
The maximum number of bones the hardware can handle by default for most platforms is four. What exactly does this mean? Four is a really small number for a character. Behind the scenes, Gamebryo will break the skinned mesh apart into pieces that obey this limit. These pieces are referred to as skin partitions. For each partition we generate, we have to render the partition's geometry in a separate rendering call. For instance, a skinned character with 28 bones could be broken into 23 partitions in order to obey the maximum bone limit. This means that we are rendering the model in 23 separate pieces. Furthermore, each partition can only be composed of the vertices that use only those bones. If you're not careful in your weight assignments, you could end up with a partition with only one triangle!
The maximum number of bones influencing a vertex by default on most platforms is also four.
This means that a vertex that is influenced by five or more bones will only use the four most influential bones in its skinning. This may result in cracks in sections where many bones come together like the back of the neck and the crotch. Careful modeling and weight assignment will help to minimize this problem. This does not mean that you should always use four bones per vertex. In fact, it is best to use as few weights as are visually acceptable per vertex. This will help substantially when partitioning the mesh because more vertices will be able to fit into a partition. What should you take away from this discussion? First off, how you skin your character will directly transfer into performance for that character. Listed below are some useful hints on how to analyze your skinning performance and tips for getting good performance in general.
Physique vs. Skin modifier
The Gamebryo Max Plug-in supports the two major modifiers for skinning. We have found the recent versions of Skin to be more reliable and robust than Physique, even when applied to bipeds; however, both are supported. Gamebryo doesn't support floating bones for Physique. If your model requires this capability, we suggest modeling with Skin instead of Physique. In the Physique Level-Of-Detail panel, we only support Rigid Skin Update. When using Deformable, the exporter treats it like Rigid.

Linking your skinned object
Do not attach a skin to a bone in the hierarchy below that to which the skin is bound. This causes the mesh to translate twice, once for the skin binding and another for the child translation. You can create a node above the Bip01 or base node to which the skin is attached, or use a character node to organize a bone structure and skin together.
The mesh that has the Skin or Physique modifier uses the bones to determine its bounding volume. The hierarchy is updated by a depth-first traversal of the scene. If the skin occurs before the bones in a depth-first traversal of the scene, the skin's bounding volume will lag one frame. Move the skin so that it occurs after the bone hierarchy in a depth-first traversal to avoid this problem.

Scaling your skinned object
Do not scale a mesh after binding. If you scale a mesh you must unbind it first and rebind it after it has been properly scaled and translated. Scale it first and reset its transforms before you bind it to the bone system for best results.

Morph and Skin work together
Even though we support combining skin & morph targets we do not suggest it. If you want to make a character have facial animation morph targets, detach the head from the rest of the body and have it attached to the head bone as a direct link. The head mesh can use a morph target and translate via its relation with the head bone while the body uses a skin or physique modifier.
When both modifiers are combined on a mesh it is much slower, as all the vertices are being transformed twice, once in software and once in hardware. Please take this into consideration when making characters that have facial expression. See the section on Morphing Faces on Skinned Characters for more information.

Morph Targets
The Morph Modifier and Morph compound object are both supported in Gamebryo. We have found the Morph Modifier more reliable than the compound object in Max. Morph is inherently slower than skinning due to the fact that morphing is still done in software. However, there are many things that are easier to do with morph targets than with bone movement. Suggested uses for Morph targets are things like animated flags, facial expression and non-uniform scales.

Instancing with Skinning and Morphing
Objects that are skinned or morphed are exported such that clones (instances) are independently controlled and animated. However, when the Mesh Instancing tool plug-in is used, CPU skinned and morphed objects that are instances or exact copies in the Max scene will be exported as hardware instances. As a result, there is only one animated object and every other instance appears exactly the same. Move CPU skinned and morphed instances to another Max scene or use export selected to prevent this behavior when using the Mesh Instancing plug-in.

Reading Skin Analyzer Plug-in Output
The following section is a sample output from the skin analyzer plug-in.
The model was broken into 23 partitions out of 28 bones. This model will likely perform moderately well. The key problems come around partition 19 in the list. Here we drop below 40 triangles per partition. At this point we are paying a fair amount of overhead for 6 partitions that don't have that many triangles in them. DirectX especially pays for this overhead.
Often, this is unavoidable in modeling, as some sections are the nexus of many bones and will be prone to small partitions (the neck and groin in particular). In general, the thing to be wary of is when the number of partitions exceeds the number of bones. This is horrible for performance because it also means that many of the partitions are incredibly small.
Listed below the main chunk of text is a breakdown of each partition. This lists the bones involved in that partition and what the average weights were for that bone. This text can often be useful in pinpointing problem vertices since the bones in that partition would influence them.

Cloning and Instancing with 3ds Max
Instanced objects in Max are shared in Gamebryo as well. This technique can be extremely useful on memory-starved consoles.
Note: If you adjust the pivot or non-uniform scale of an instance, it will become unique. Uniform scale is supported, but non-uniform scale cannot work with instances in Gamebryo because non-uniform scale is baked into the geometry.
It should also be noted that instanced objects in Gamebryo must share the same material, unlike in Max. Instanced objects that need different materials should be made unique.

Non-Uniform & Uniform Scale
Gamebryo does not support animating non-uniform scales (i.e. scales with different values in the x, y, and z axes); an animated non-uniform scale will be treated like a uniform scale. All static non-uniform scales are baked into the geometry data itself.
If you need to have something non-uniformly scaled or squashed, use a morph target to achieve the action. Morph targets can be created by squashing an object in 3ds max and using Tools / Snapshot to grab a desired morph target.

Multi/Sub-Object and Triangle/Mesh Ratios
A convenient way of texturing an object in Max is to use the Multi/Sub-Object material. However, Gamebryo only supports one material per mesh because most hardware can only handle one material per set of triangles. When the Gamebryo 3ds max Plug-in encounters a Multi/Sub-Object material, it must split the Max Mesh into multiple NiMesh objects (one for each material).

Overuse of Multi/Sub-Object materials is the most common performance offender in the data sets that support receives. If used indiscriminately, Multi/Sub-Object materials can lay waste to the triangle/mesh ratio, and you end up with a single object made of many little pieces. Multi/Sub-Object materials do not need to be avoided completely, but they should be used cautiously and with the triangle/mesh ratio constraints in mind. For characters and discrete objects, a single texture "skin" for the subject is recommended.

Multiple UVs, Smoothing Groups and Vertices
Another geometric concern is Max's Mesh representation itself. Max can have more than one normal and UV (within a single UV channel) per vertex, while real-time engines cannot. To resolve this incompatibility, our 3ds max Plug-in adds vertices so that the NiMesh it creates has only one normal and UV per vertex. The NiMesh will look the same as the Max Mesh but will have more vertices. This vertex bloat increases the transform load, but it is seldom a big problem.
Using a lot of smoothing groups or complicated UV mappings will make the problem worse, but some vertex bloat is unavoidable. Smooth-shaded geometry is faster than flat-shaded geometry because smoothed faces can share vertices. Vertexbloat.max is a good example of this phenomenon: a 120-triangle sphere with 1 smoothing group has 62 vertices, while a flat or un-shaded version has 358 vertices.
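
The vertex splitting described above can be sketched in a few lines of C++. This is only an illustration of the idea, not the plug-in's actual code: the `Corner` layout, `countExportedVertices` and `makeQuad` are hypothetical names invented for this example. It assumes the authoring mesh indexes positions, normals and UVs separately, and that the exporter emits one vertex per distinct (position, normal, UV) combination.

```cpp
#include <array>
#include <cstddef>
#include <set>
#include <tuple>
#include <vector>

// One corner of a triangle as an authoring tool might store it: position,
// normal and UV are indexed separately (hypothetical layout).
struct Corner {
    int position;  // index into the position list
    int normal;    // index into the normal list
    int uv;        // index into the UV list
};

using Triangle = std::array<Corner, 3>;

// A real-time vertex carries exactly one normal and one UV, so every distinct
// (position, normal, uv) combination becomes its own exported vertex.
std::size_t countExportedVertices(const std::vector<Triangle>& tris) {
    std::set<std::tuple<int, int, int>> unique;
    for (const Triangle& t : tris)
        for (const Corner& c : t)
            unique.insert(std::make_tuple(c.position, c.normal, c.uv));
    return unique.size();
}

// Builds a quad made of two triangles that share an edge.
// smooth == true : one normal per position (a single smoothing group).
// smooth == false: one normal per triangle (flat shading).
std::vector<Triangle> makeQuad(bool smooth) {
    const int idx[2][3] = {{0, 1, 2}, {0, 2, 3}};
    std::vector<Triangle> tris;
    for (int t = 0; t < 2; ++t) {
        Triangle tri{};
        for (int k = 0; k < 3; ++k) {
            const int p = idx[t][k];
            tri[k] = Corner{p, smooth ? p : t, p};
        }
        tris.push_back(tri);
    }
    return tris;
}
```

Calling `countExportedVertices(makeQuad(true))` and `countExportedVertices(makeQuad(false))` shows the effect on a tiny mesh: the smooth quad keeps its shared vertices, while the flat-shaded quad has to duplicate the two vertices on the shared edge.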

Mesh Profile Custom Attribute
Every piece of geometry has a Mesh Packing Profile associated with it. This is used by the Packer tool plug-in to create the platform-specific geometry streams used by the graphics card. Depending on the situation, it can be necessary for a specific mesh to override the scene default packing profile and use its own profile. To override the scene default packing profile, click the "Add/Remove Mesh Profile attribute" button to add a Mesh Profile Attribute to the selected mesh. For additional information on mesh profiles see Introduction to Mesh Profiles.

Precache Custom Attributes
These custom attributes tell the Gamebryo renderers how to treat an object once they have created the platform-specific version of it. Creating the platform-specific version is called "pre-caching" the object. In some cases it may be useful for applications to keep information around after the object has been pre-cached. As an important example, triangle-triangle collision detection will not work if the triangles are thrown away once the renderer has pre-cached its data. Typically, artists will not need to add these attributes, since any application that pre-caches geometry can also set the "Keep" flags to hold on to the data it needs. The UI exists so that advanced users can have complete control over how specific art assets are used by the renderers. These flags can be added through the user interface via the Gamebryo toolbar: click on the "Add/Remove Precache flags" icon.

Geometry Data Consistency
This radio button allows you to set how the renderer treats the data at runtime. "Default" lets Gamebryo choose the consistency it determines to be best for the data. "Static" means that once the data is in the renderer it will never change. This is a good setting for set pieces like buildings. "Mutable" means that the object may change from time to time. This would be useful for an application effect like changing a car's geometry to reflect hits it has taken on the road while driving. "Volatile" means that the object changes every frame. This setting is best when you are changing vertex colors on the fly or manually animating the UV coordinates.

Data To Preserve
These check boxes allow the user to set which data is kept around after the "pre-cache" has occurred.
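
June 06 terrain
In 2.3, terrain is a mesh exported from Max or Maya. The export may contain only the terrain itself; flowers, grass, trees and other things on the ground have to be added in the scene and then stitched together. 2.5 is said to ship a scene editor with a built-in terrain editor. I do not know whether it is any good.

Differences between gamebryo release and ship
The Release builds include optimized code, but with the NiMemory system, NiMetrics, and release mode logging enabled. The Shipping builds do not have these systems enabled.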
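The ship and release project settings are essentially the same, except that ship has no NiMemory, NiMetrics, or release mode logging. Ship is more like the build you use once everything is very stable. Release also produces an XML file that records memory leaks, and this file keeps growing.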
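Ship also initializes class member variables somewhat differently from release. I remember one bug: a variable was never initialized and was then multiplied by 0. In release the result was 0 as expected, but in ship the result was not 0. Faint.

June 04 Gamebryo2.2 debugging errors on 8*00-series graphics cards
Hi, I am getting assertions in my debug builds when using Geforce 8600GT. Everything was working fine when using 6600GT...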
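If you have a copy of Gamebryo 2.3 lying around, you could move the NiDX9SystemDesc.h, .inl, and .cpp files from 2.3 into your copy of 2.2. You'll need to change any instances of "NIASSERT" to "assert", and there may be other minor changes necessary as well, but fixing this bug was the only significant change to occur in those files between those two branches.

May 30 GameBryo notes
I signed up for Google Notebook a long time ago but never used it. I am now going to use it to record some GameBryo knowledge points. The notebook can also be edited by several people, so leave me a message if you would like to edit it. Address:

May 29 GameBryo2.2 SceneDesigner fails to start
After installing gb 2.2 I also installed gb 2.3, and once the 2.3 scene editor had been launched, the 2.2 one would no longer start. The two programs store their configuration files in the same directory under the same name, so when 2.2 loads the file it finds the wrong version and fails to start.

May 21 Detecting whether a debugger is present
BOOL IsDebuggerPresent()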
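
As a minimal sketch of how the Win32 call `IsDebuggerPresent` might be used, the function below wraps it so that the rest of a program can ask a plain yes/no question. The wrapper name `debuggerAttached` and the non-Windows fallback are assumptions made for this illustration; only `IsDebuggerPresent` itself comes from the Windows API.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper (not part of any API): reports whether a user-mode
// debugger is attached to the current process.
bool debuggerAttached() {
#ifdef _WIN32
    // IsDebuggerPresent returns a nonzero BOOL when a debugger is attached.
    return IsDebuggerPresent() != 0;
#else
    // Assumption for illustration: off Windows this sketch reports false.
    return false;
#endif
}
```

A typical use would be to skip a time-out or enable extra logging only when `debuggerAttached()` returns true.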