July 11

Gamebryo/Artist's Guides/Gamebryo 3ds max Plug in/Geometry and Performance

Introduction to Geometry, Performance, and 3ds Max

The basic geometry question facing every real-time 3D artist is: "How should I build my scene to maintain good real-time rendering speed?"

Topics: The Triangle/Mesh Ratio; Transform-Rate, Fill-Rate; Clipping & Culling; Triangle/Mesh Ratios vs. Clipping and Culling; Grouping; Skinning & Morphing; Cloning and Instancing; Multi/Sub-Object and Triangle/Mesh Ratios; Multiple UVs, Smoothing Groups and Vertices; Precache; Custom Attributes

Terminology

The following section introduces key terms and phrases that you should understand before discussing geometry.

Stripification

This term describes an operation in which a list of independent triangles is transformed into a list of triangles that are linked together in a chain (a triangle strip). Although this operation can increase performance, it is mutually exclusive with the Vertex Cache optimization, which only operates on triangle lists. Vertex cache optimization generally gives better performance results. (Translator's note: stripification also only pays off when the strips are long enough, and many triangles cannot be arranged into good strips.)

Mesh (or NiMesh)

This term describes the Gamebryo representation of geometry. A mesh contains all of the per-vertex information about a piece of geometry, including vertex positions, normals, vertex colors, and UV sets. A mesh is usually composed of independent triangle primitives, but if the mesh has been stripified it will be composed of a set of triangle strips.

The Triangle/Mesh Ratio

The triangle-to-mesh ratio is the most important geometric metric for game performance. When rendering a mesh, Gamebryo must do a fixed amount of work on the CPU (property-state setup, texture swapping, etc.) each time it passes down a set of triangles, no matter how large that set is. You should therefore try to pack as many triangles as possible into each mesh. In general, a game should never have fewer than 20 triangles per mesh.
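The guideline above can be checked mechanically. The following is a minimal sketch, assuming the only data available is a list of per-mesh triangle counts; the function names and the sample scene are hypothetical and are not part of Gamebryo or the 3ds Max plug-in.

```python
MIN_TRIANGLES_PER_MESH = 20  # guideline quoted in the text


def triangle_mesh_ratio(triangle_counts):
    """Average number of triangles per mesh for a scene."""
    if not triangle_counts:
        raise ValueError("scene has no meshes")
    return sum(triangle_counts) / len(triangle_counts)


def undersized_meshes(triangle_counts, minimum=MIN_TRIANGLES_PER_MESH):
    """Indices of meshes that fall below the per-mesh guideline."""
    return [i for i, count in enumerate(triangle_counts) if count < minimum]


# Hypothetical scene: one entry per mesh, holding its triangle count.
scene = [1200, 2, 2, 2, 2, 640]
print(triangle_mesh_ratio(scene))  # 308.0
print(undersized_meshes(scene))    # [1, 2, 3, 4]
```

Reporting the small meshes separately is a design choice of this sketch: a healthy scene-wide average can hide a handful of two-triangle meshes, each of which still pays the fixed per-mesh CPU cost.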
You don't have to increase the number of triangles just to improve the triangle/mesh ratio, but if doing so will improve vertex lighting or some geometric detail, it won't hurt performance. Another way to improve the ratio is to collapse similar meshes that share the same materials and sit close together in a scene. The collapsed geometry is converted to a single mesh instead of several separate meshes, which improves the overall ratio.

The importance of a large triangle/mesh ratio (i.e. a lot of triangles per mesh) increases on hardware transform and lighting cards (high-end graphics cards). Hardware transform and lighting cards perform vertex transformation, lighting, and rasterization on the graphics card. Earlier cards could only perform the rasterization, while the CPU was forced to do the vertex transformation and lighting. Hardware transform and lighting cards both free the CPU of this task and perform it faster than the CPU ever could. This division of labor decreases the time required to render an individual polygon but leaves the fixed amount of work that Gamebryo must do for each mesh (discussed earlier) unchanged. In a low triangle/mesh situation the CPU becomes the bottleneck (doing the rendering setup) and the full rendering capabilities of the graphics card are not used. In contrast, a high triangle/mesh ratio allows the graphics card to draw as many polygons as possible and leaves the CPU free to perform other operations.

Performance Metrics

Performance analysis done at nVidia and ATI revealed that on a 1 GHz CPU, you can render 25,000 objects per second before you spend all of your time on the CPU, a circumstance you want to avoid at almost all costs. What do these statistics mean to an artist? For 60 fps on a 1 GHz CPU, try to keep the number of visible objects in any given scene below 417 (25,000 / 60 ≈ 417). In other words, make every object count!
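The 417-object figure is the per-second limit divided by the target frame rate. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the 30 fps case is an extra illustration that does not come from the text.

```python
def objects_per_frame(objects_per_second, target_fps):
    """Visible-object budget per frame for a CPU-bound object rate."""
    return objects_per_second / target_fps


# 25,000 objects per second is the 1 GHz CPU figure quoted in the text.
print(round(objects_per_frame(25_000, 60)))  # 417
print(round(objects_per_frame(25_000, 30)))  # 833
```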
As the CPU speed of the target machine increases, the number of visible objects that can be rendered per frame increases correspondingly.

Transform-Rate, Fill-Rate

Transform-Rate

The transform-rate is the number of vertices a graphics card can process in a given time period. When the number of vertices to be transformed (moved, rotated, and lit) per time period exceeds the graphics card's capabilities, an application is said to be "transform limited." The following table shows the maximum T&L rate for various PC graphics hardware:
These values are, of course, theoretical peaks and do not represent real-world game situations. However, these numbers can be useful in examining performance. If you wish to achieve 60 fps on a GeForce 3, the absolute maximum number of vertices you can transform in a frame is 666,666 (the card's peak rate divided by 60). Let's say that every object you render requires two passes. The theoretical maximum you could transform is now halved to 333,333 vertices. Mind you, this assumes the vertices all belong to one object and are untextured and flat shaded, drawn as optimally as possible with absolutely nothing else happening in the application. No interesting game could ever hope to achieve this situation.

Fill-Rate

Not only is a graphics card limited in the number of vertices it can transform, it is also limited in the number of pixels it can write to the backbuffer per second. The backbuffer is the portion of a graphics card's memory that is used as a scratch pad while the final image is being assembled. When the rate of writing to the backbuffer exceeds the graphics card's capability, an application is described as being "fill-rate limited."
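Both limits reduce to the same division: a peak rate per second spread over the frames in that second. The sketch below is a rough illustration only. The function names are hypothetical, and both peak rates are assumptions inferred from the arithmetic in this document (666,666 vertices per frame at 60 fps, and roughly 17 writes per pixel at 1024 by 768 and 60 fps), not values taken from a hardware table.

```python
def per_frame_budget(peak_rate_per_second, target_fps, passes=1):
    """Theoretical per-frame budget when every item is processed `passes` times."""
    return peak_rate_per_second / (target_fps * passes)


def max_writes_per_pixel(peak_pixels_per_second, width, height, target_fps):
    """How many times each screen pixel could be written per frame, at best."""
    return per_frame_budget(peak_pixels_per_second, target_fps) / (width * height)


# Assumption: 40 million vertices/s, implied by 666,666 vertices per frame at 60 fps.
PEAK_VERTICES_PER_SECOND = 40_000_000
# Assumption: 800 million pixels/s, implied by roughly 17 writes per pixel
# at 1024 by 768 and 60 fps.
PEAK_PIXELS_PER_SECOND = 800_000_000

print(int(per_frame_budget(PEAK_VERTICES_PER_SECOND, 60)))                 # 666666
print(int(per_frame_budget(PEAK_VERTICES_PER_SECOND, 60, passes=2)))       # 333333
print(round(max_writes_per_pixel(PEAK_PIXELS_PER_SECOND, 1024, 768, 60)))  # 17
```

These are ceilings, not targets: as the text stresses, real scenes with textures, lighting, and multiple objects land well below them.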
These values are, of course, theoretical peaks and do not represent real-world game situations. However, these numbers can be useful in examining performance. Let's see what we can do with a GeForce 3 at a display resolution of 1024 by 768. We'll assume for the moment that transformation and lighting come for free (which they never do). A 1024 by 768 image contains 786,432 pixels. We'd like to run at 60 fps, so we must render that image 60 times per second, for a grand total of 47.19 million pixels per second. Assuming each pixel is drawn more than once, the maximum number of pixel writes we can do on each pixel is roughly 17. This, of course, assumes that all operations that write a pixel cost the same. Multitexturing and the complexity of pixel shaders quickly lower this number. Overuse of complex pixel shaders can quickly make an application fill-rate limited.

Clipping & Culling

The clipping/culling behavior of Gamebryo is another issue that must be kept in mind when creating scene geometry. Clipping is the process of cutting polygons down to only the fragments that will appear on screen. Culling is an attempt to avoid clipping by rejecting whole NiMesh objects if no part of them appears on the screen. In general, culling is preferable to clipping because clipping is vastly more expensive than culling. You can structure your scenes to improve culling by limiting the volume of space an NiMesh occupies. A small NiMesh is more likely to be completely off screen, while a very large one (e.g. a huge floor) is likely to be at least partially on screen all the time.

Triangle/Mesh Ratios vs. Clipping & Culling

These two issues, the triangle/mesh ratio and the culling/clipping behavior, place somewhat contradictory demands on you. To improve the triangle/mesh ratio you must put as many triangles as possible in each mesh (mesh collapsing, increasing the number of triangles in the mesh, etc.). At the same time, the meshes must be kept compact to allow for efficient culling.
There is no simple solution to this problem and you must constantly balance the two constraints. In general, triangles should be grouped into meshes so that culling will still be effective, but the meshes should contain as many triangles as possible.

For example, when modeling the four walls of a room, if the walls are relatively complex, each wall should be in its own mesh. Having all the walls in a single mesh would improve the triangle/mesh ratio but would force clipping on all the geometry. By dividing the walls into four meshes, two of them will usually be culled, leaving the other two to be clipped. However, if the walls are very simple (i.e. 2 triangles each), then it might make sense to clump all the wall triangles together to avoid having several 2-triangle meshes.

On modern hardware, side plane clipping is avoided as much as possible through various tricks. Near plane clipping remains a problem, however.
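The trade-off can be illustrated with a bounding-sphere test against the planes of a view volume. This is a minimal sketch under stated assumptions, not Gamebryo's actual culling code: the `Plane` class, `classify_sphere`, the box-shaped view volume (no top or bottom planes), and all positions and radii are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Plane:
    """Plane nx*x + ny*y + nz*z + d = 0; the unit normal points into the view volume."""
    nx: float
    ny: float
    nz: float
    d: float

    def signed_distance(self, x, y, z):
        return self.nx * x + self.ny * y + self.nz * z + self.d


def classify_sphere(center, radius, planes):
    """Return 'culled', 'clipped', or 'inside' for a mesh's bounding sphere."""
    straddles = False
    for plane in planes:
        distance = plane.signed_distance(*center)
        if distance < -radius:
            return "culled"    # entirely outside one plane: skip the whole mesh
        if distance < radius:
            straddles = True   # crosses this plane: may need clipping
    return "clipped" if straddles else "inside"


# Simplified view volume: x in [-10, 10], z in [1, 100], camera looking down +z.
view_volume = [
    Plane(1, 0, 0, 10),    # left
    Plane(-1, 0, 0, 10),   # right
    Plane(0, 0, 1, -1),    # near
    Plane(0, 0, -1, 100),  # far
]

print(classify_sphere((0, 0, -30), 20, view_volume))  # culled  (wall behind the camera)
print(classify_sphere((0, 0, 30), 20, view_volume))   # clipped (wall ahead, wider than the view)
print(classify_sphere((0, 0, 0), 36, view_volume))    # clipped (both walls merged into one mesh)
print(classify_sphere((0, 0, 50), 1, view_volume))    # inside  (small prop fully in view)
```

Kept separate, the wall behind the camera is rejected outright. Merged, its triangles travel with the visible wall and the whole mesh has to be processed, which is the cost the four-walls example describes.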