oktane

Shader discussion [Technical]


Signed release is here: http://forums.bistudio.com/showthread.php?t=97853

OK, so hopefully this discussion is allowed and in the right place. The generic 'addons discussion' forum is pretty busy, so I thought the Arma2 Editing subforum would be a better place for it.

This thread is not for a release; no addon exists yet. When one does, a proper thread will be made. (Test release is here: http://forums.bistudio.com/showthread.php?p=1491288)

If you don't have a technical, focused insight to add on the following topics, this thread is not for you:

  • GPU Shaders
  • Assembly Code (ASM)
  • HLSL
  • BIShaderCache file format and specification
  • Game's loading and handling of the shaders

Here's work so far... and some things to get started.

Documentation on shader ASM and the binary file format for compiled shaders (the output of fxc.exe, the offline shader compiler; get it from the DXSDK)

Useful tutorial

Tools:

DXSDK

Hex Workshop or other Hex editor

Programmers text editor (EditPlus, ultraedit)

Your favorite tool to investigate/debug binaries

Patience!

Here is a list of opcodes I have made (hex on left, dec on right)

00	D3DSIO_NOP		0,
01	D3DSIO_MOV		1,
02	D3DSIO_ADD		2,
03	D3DSIO_SUB		3,
04	D3DSIO_MAD		4,
05	D3DSIO_MUL		5,
06	D3DSIO_RCP		6,
07	D3DSIO_RSQ		7,
08	D3DSIO_DP3		8,
09	D3DSIO_DP4		9,
0A	D3DSIO_MIN		10,
0B	D3DSIO_MAX		11,
0C	D3DSIO_SLT		12,
0D	D3DSIO_SGE		13,
0E	D3DSIO_EXP		14,
0F	D3DSIO_LOG		15,
10	D3DSIO_LIT		16,
11	D3DSIO_DST		17,
12	D3DSIO_LRP		18,
13	D3DSIO_FRC		19,
14	D3DSIO_M4x4		20,
15	D3DSIO_M4x3		21,
16	D3DSIO_M3x4		22,
17	D3DSIO_M3x3		23,
18	D3DSIO_M3x2		24,
19	D3DSIO_CALL		25,
1A	D3DSIO_CALLNZ		26,
1B	D3DSIO_LOOP		27,
1C	D3DSIO_RET		28,
1D	D3DSIO_ENDLOOP		29,
1E	D3DSIO_LABEL		30,
1F	D3DSIO_DCL		31,
20	D3DSIO_POW		32,
21	D3DSIO_CRS		33,
22	D3DSIO_SGN		34,
23	D3DSIO_ABS		35,
24	D3DSIO_NRM		36,
25	D3DSIO_SINCOS		37,
26	D3DSIO_REP		38,
27	D3DSIO_ENDREP		39,
28	D3DSIO_IF		40,
29	D3DSIO_IFC		41,
2A	D3DSIO_ELSE		42,
2B	D3DSIO_ENDIF		43,
2C	D3DSIO_BREAK		44,
2D	D3DSIO_BREAKC		45,
2E	D3DSIO_MOVA		46,
2F	D3DSIO_DEFB		47,
30	D3DSIO_DEFI		48,
40	D3DSIO_TEXCOORD		64,
41	D3DSIO_TEXKILL		65,
42	D3DSIO_TEX		66,
43	D3DSIO_TEXBEM		67,
44	D3DSIO_TEXBEML		68,
45	D3DSIO_TEXREG2AR	69,
46	D3DSIO_TEXREG2GB	70,
47	D3DSIO_TEXM3x2PAD	71,
48	D3DSIO_TEXM3x2TEX	72,
49	D3DSIO_TEXM3x3PAD	73,
4A	D3DSIO_TEXM3x3TEX	74,
4B	D3DSIO_RESERVED0	75,
4C	D3DSIO_TEXM3x3SPEC	76,
4D	D3DSIO_TEXM3x3VSPEC	77,
4E	D3DSIO_EXPP		78,
4F	D3DSIO_LOGP		79,
50	D3DSIO_CND		80,
51	D3DSIO_DEF		81,
52	D3DSIO_TEXREG2RGB	82,
53	D3DSIO_TEXDP3TEX	83,
54	D3DSIO_TEXM3x2DEPTH	84,
55	D3DSIO_TEXDP3		85,
56	D3DSIO_TEXM3x3		86,
57	D3DSIO_TEXDEPTH		87,
58	D3DSIO_CMP		88,
59	D3DSIO_BEM		89,
5A	D3DSIO_DP2ADD		90,
5B	D3DSIO_DSX		91,
5C	D3DSIO_DSY		92,
5D	D3DSIO_TEXLDD		93,
5E	D3DSIO_SETP		94,
5F	D3DSIO_TEXLDL		95,
60	D3DSIO_BREAKP		96,
D3DSIO_PHASE		0xfffd,
D3DSIO_COMMENT		0xfffe,
D3DSIO_END		0xffff,
D3DSIO_FORCE_DWORD	0xffffffff
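
To see how these opcodes show up in the compiled bytecode, here is a small Python sketch that decodes one instruction token as it appears in a hex editor. It assumes the standard Direct3D 9 token layout from the Microsoft documentation (low 16 bits are the opcode, and for shader model 2.0 and later bits 24 to 27 give the number of DWORDs that follow), not anything BIS-specific. The `OPCODES` dict is only a hand-picked subset of the list above, and the function name is made up for this example.

```python
import struct

# Hand-picked subset of the opcode list above (value -> mnemonic).
OPCODES = {
    0x00: "NOP", 0x01: "MOV", 0x02: "ADD", 0x04: "MAD", 0x05: "MUL",
    0x1F: "DCL", 0x42: "TEX", 0x51: "DEF", 0x58: "CMP", 0xFFFF: "END",
}


def decode_instruction_token(hex_bytes):
    """Decode 4 bytes as shown in a hex editor (a little-endian DWORD).

    Returns (mnemonic, operand_dwords). The operand count is only
    meaningful for shader model 2.0+ bytecode (assumption, see above).
    """
    (token,) = struct.unpack("<I", bytes.fromhex(hex_bytes))
    opcode = token & 0xFFFF          # bits 0-15
    length = (token >> 24) & 0xF     # bits 24-27
    return OPCODES.get(opcode, "op_0x%02X" % opcode), length


if __name__ == "__main__":
    for raw in ("51000005", "02000003", "1F000002", "00000000", "FFFF0000"):
        print(raw, decode_instruction_token(raw))
```

Because the bytes are stored little-endian, the opcode is the first byte you see in the hex editor, which is why `51000005` is a DEF followed by five DWORDs.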


Bit layout of the destination parameter token (bit ranges in brackets):

[10:00]
   Bits 0 through 10 indicate the register number (offset in register file). 
[12:11]
   Bits 11 and 12 are the fourth and fifth bits [3,4] for indicating the register type. 
[13]
   For vertex shader (VS) version 3_0 and later, bit 13 indicates whether relative addressing mode is used. If set to 1, relative addressing applies.

   For all pixel shader (PS) versions and vertex shader versions earlier than 3_0, bit 13 is reserved and set to 0x0.
[15:14]
   Reserved. This value is set to 0x0.
[19:16]
   Write mask. The bits of this mask have the following components:
   Bit	Component
   16	Component 0 (X;Red)
   17	Component 1 (Y;Green)
   18	Component 2 (Z;Blue)
   19	Component 3 (W;Alpha)

[23:20]
   Bits 20 through 23 indicate the result modifier. Multiple result modifiers can be used. The following result modifier types can be ORed together in this 4-bit value:
   Value	Result modifier type
   0x1	Saturate (vertex shaders)
   0x2	Partial precision (pixel shaders)
   0x4	Centroid (pixel shaders)

[27:24]
   For PS versions earlier than 2_0, bits 24 through 27 specify the result shift scale (signed shift).
   For PS version 2_0 and later and VS, these bits are reserved and set to 0x0.
[30:28]
   Bits 28 through 30 are the first three bits [0,1,2] for indicating the register type.
[31]
   Bit 31 is 0x1. 
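
The bit layout above can be turned into a short decoder. This is a minimal Python sketch, assuming the token is read as a little-endian DWORD (which matches how the bytes look in a hex editor). The register type names in `REGISTER_TYPES` are a small subset taken from the D3D9 headers and are not part of the description above; the function and key names are invented for the example.

```python
import struct

# Subset of D3D9 register types (value -> short name). Assumption: taken
# from the D3D9 headers, only the ones that appear in the blur shader.
REGISTER_TYPES = {0: "r", 1: "v", 2: "c", 8: "oC", 10: "s"}


def decode_dest_token(hex_bytes):
    """Decode a destination parameter token given as 8 hex digits."""
    (token,) = struct.unpack("<I", bytes.fromhex(hex_bytes))
    # Register type: bits 28-30 are type bits 0-2, bits 11-12 are type bits 3-4.
    reg_type = ((token >> 28) & 0x7) | (((token >> 11) & 0x3) << 3)
    mask_bits = (token >> 16) & 0xF   # bits 16-19
    mod_bits = (token >> 20) & 0xF    # bits 20-23
    return {
        "register": token & 0x7FF,    # bits 0-10
        "type": reg_type,
        "type_name": REGISTER_TYPES.get(reg_type, "type%d" % reg_type),
        "relative": bool((token >> 13) & 1),
        "write_mask": "".join(
            c for i, c in enumerate("xyzw") if mask_bits & (1 << i)
        ),
        "saturate": bool(mod_bits & 0x1),
        "partial_precision": bool(mod_bits & 0x2),
        "centroid": bool(mod_bits & 0x4),
    }


if __name__ == "__main__":
    # 00002380 is the destination of "add_pp r0.xy, ..." in the dump.
    print(decode_dest_token("00002380"))
```

For example, `00082780` decodes to register type 8 (color output), register 0, write mask xyz, partial precision, i.e. the `oC0.xyz` destination of a `_pp` instruction.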

Here are the 'blur' shaders located in one of the BI 'shadercache' files. PS is a pixel shader, VS is a vertex shader.

Shaders_DefPP.shdc:

  • PsPpRadialBlur.ps_3_0 - when running, blurs sides of screen
  • PSPostProcessGaussianBlurH.ps_3_0
  • PSPostProcessGaussianBlurV.ps_3_0
  • PSPostProcessGaussBlur.ps_3_0
  • PsPpRotBlur.ps_3_0 - some kind of fullscreen blur
  • VsPpDynamicBlurFinal.vs_3_0
  • PsPpDynamicBlurFinal.ps_3_0
  • VsPpDynamicBlur.vs_3_0
  • PsPpDynamicBlur.ps_3_0

I have successfully disabled the first one on that list. I need help figuring out what the rest of them do. My personal intent was to disable the two blur effects I had a problem with: the edge blur when running (success, but it doesn't help much), and the full-screen Vaseline blur, which I think is what causes the blur when you turn around (I haven't found or gotten to that one yet). FYI: I am using the 'Low' Postprocess setting.

Questions:

  • What is the difference between Shaders_DefPP and ShadersPP cache files? Why would there be two separate ones?
  • What in game PP setting (low/high) corresponds with which shaders?
  • If the shaders are removed from the stock bin.pbo in an attempt to load them 'uncompressed' with $PBOPREFIX$, it doesn't seem to work. This raises the question: do updated shaders in external addons (ex: beta folder) get loaded? The latest beta patches have new and modified shaders, but are they being used?
  • What is the best way to disable them? My first guess was NOPs, but I tried that and the screen was black. I think the frame still needs to be written to the output register for the GPU, but I don't entirely understand the code.
  • When this is done, how will we release it as an addon? I would prefer it be part of kju's PROPER releases. Are we able to package a modified shader, sign it, and get it loaded by the game? Why did Keg release his old shader mod the way he did (as an overwrite, rather than modfolder capable)? He must have had the same issues?

Example of crude modification by editing constants:

PsPpRadialBlur.ps_3_0 - when running, blurs sides of screen - in Shaders_DefPP

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.26.952.2844
//
// Parameters:
//
//   float4 projConsts;
//   float4 radialBlurPars;
//   sampler2D samplers[15];
//
//
// Registers:
//
//   Name           Reg   Size
//   -------------- ----- ----
//   radialBlurPars c0       1
//   projConsts     c6       1
//   samplers       s0       1
// NOTE: In this dump I have placed the binary code above each line of assembly text. So if you want to find the first 'def' line, search for 5100000501000FA0000000BF000000000000803F00004040
// If you want to disable this effect, modify that line to be 51000005 01000FA0	 00000000 00000000 00000000 00000000
ps_3_0

//offset in Build_60141 bin.pbo's Shaders_DefPP
// Changing the 4 constants (the last four DWORDs) on the next line to 0 kills the effect.
000064C2	51000005 01000FA0	 000000BF 00000000 0000803F 00004040
	def           c1,	     -0.5,       0,       1,       3
000064DA	51000005 02000FA0	 2249123E 00000000 00000000 00000000
	def           c2,     0.142857149,       0,       0,       0
1F000002 05000080 00002390 
dcl_texcoord_pp v0.xy

1F000002 00000090 00080FA0
dcl_2d s0

02000003 00002380 010000A0 0000E490
add_pp	r0.xy,       c1.x,       v0

58000004 00002C80 00004481 010055A0 0100AAA0
cmp_pp r0.zw,       -r0.xyxy,       c1.y,       c1.z
58000004 01002380 0000E480 010055A1 0100AAA1
cmp_pp r1.xy,       r0,       -c1.y,       -c1.z
02000003 00001380 0000E48B 0000EEA1
add_sat r0.xy,       r0_abs,       -c0.zwzw
05000003 000003800000E4800000E4A1
mul r0.xy,       r0,       -c0
02000003 00002C800000E48001004480
add_pp r0.zw,       r0,       r1.xyxy
05000003 000023800000E4800000EE80
mul_pp r0.xy,       r0,       r0.zwzw
05000003 00002480000055800600AAA0
mul_pp r0.z,       r0.y,       c6.z
05000003 00002480000055800600AAA0
mad_pp r0.yw,       r0.xxzz,       c1.w,       v0.xxzy
02000003 010023800000E8810000ED80
add_pp r1.xy,       -r0.xzzw,       r0.ywzw
42000003 02002F800000ED800008E4A0
texld_pp r2,       r0.ywzw,       s0
42000003 03002F800100E4800008E4A0
texld_pp r3,       r1,       s0
02000003 00002A800000A08101006080
add_pp r0.yw,       -r0.xxzz,       r1.xxzy
02000003 010027800200E4800300E480
add_pp r1.xyz,       r2,       r3
42000003 02002F800000ED800008E4A0
texld_pp r2,       r0.ywzw,       s0
02000003 00002A800000A0810000E480
add_pp r0.yw,       -r0.xxzz,       r0
02000003 010027800100E4800200E480
add_pp r1.xyz,       r1,       r2
42000003 02002F800000ED800008E4A0
texld_pp r2,       r0.ywzw,       s0
02000003 00002A800000A0810000E480
add_pp r0.yw,       -r0.xxzz,       r0
02000003 010027800100E4800200E480
add_pp r1.xyz,       r1,       r2
42000003 02002F800000ED800008E4A0
texld_pp r2,       r0.ywzw,       s0
02000003 00002A800000A0810000E480
add_pp r0.yw,       -r0.xxzz,       r0
02000003 000025800000E4810000F580
add_pp r0.xz,       -r0,       r0.yyww
42000003 03002F800000ED800008E4A0
texld_pp r3,       r0.ywzw,       s0
42000003 00002F800000E8800008E4A0
texld_pp r0,       r0.xzzw,       s0
02000003 010027800100E4800200E480
add_pp r1.xyz,       r1,       r2
02000003 010027800300E4800100E480
add_pp r1.xyz,       r3,       r1
02000003 000027800000E4800100E480
add_pp r0.xyz,       r0,       r1
05000003 000827800000E480020000A0
mul_pp oC0.xyz,       r0,       c2.x

01000002 00082880 0100AAA0
mov_pp oC0.w,       c1.z

// end
FFFF0000 00AAAA00
// approximately 30 instruction slots used (7 texture,       23 arithmetic)
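
For anyone who wants to check the constant edit without counting bytes by hand, here is a rough Python sketch. It decodes the first `def` line from the dump into its four floats and builds the zeroed version of the same instruction. It assumes the constants are stored as little-endian 32-bit floats (which matches -0.5, 0, 1, 3 in the disassembly); the function names are made up for this example.

```python
import struct

# First 'def' line from the dump above (c1 = -0.5, 0, 1, 3).
DEF_C1 = "51000005 01000FA0 000000BF 00000000 0000803F 00004040"


def decode_def(hex_line):
    """Return (constant register number, four floats) for a 'def' line."""
    raw = bytes.fromhex("".join(hex_line.split()))
    opcode_token, dest_token = struct.unpack_from("<II", raw, 0)
    if opcode_token & 0xFFFF != 0x51:
        raise ValueError("not a def instruction")
    values = struct.unpack_from("<4f", raw, 8)
    return dest_token & 0x7FF, values


def zeroed_def(hex_line):
    """Same instruction with all four constants set to 0.0, same length."""
    raw = bytes.fromhex("".join(hex_line.split()))
    patched = raw[:8] + struct.pack("<4f", 0.0, 0.0, 0.0, 0.0)
    return patched.hex().upper()


if __name__ == "__main__":
    print(decode_def(DEF_C1))
    print(zeroed_def(DEF_C1))
```

The zeroed line comes out as the opcode and destination DWORDs followed by sixteen zero bytes, which is the replacement suggested in the note at the top of the dump.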

Understandably, this is a touchy subject because of IP. I apologize if this thread breaks any rules (I looked first) and I understand if it is not allowed on these forums.

---------- Post added at 02:02 PM ---------- Previous post was at 01:25 PM ----------

A quick test after restoring everything to stock:

It is indeed possible to load shaders as mods (as evidenced by the 'beta' folder). It is also possible to work on them uncompressed using PBOPREFIX, which will make working on them a breeze.

To do the PBOPREFIX trick, make a bin folder in your game root that contains the shaders you have modified. These will override (in memory) the built-in shaders. Note that the shaders are updated frequently in beta builds, so you'd want to keep yours up to date, or you may have big problems with the beta exe calling shaders that don't exist.

UPDATE: It is possible to reload the modified shaders on the fly by using Num - and FLUSH! No more restarting the game! Edit: This only works for the POST PROCESSING shader file (PP).

---------- Post added at 02:49 PM ---------- Previous post was at 02:02 PM ----------

A simple script to help in finding the rotation blur shader:

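// Spins the given unit on the spot so the rotation blur is easy to see.
_man = _this select 0;
_dir = 0;
// Set the global 'abort' to true from another script to stop spinning.
abort = false;
while {!abort} do {
	for [{_dir = 0}, {_dir <= 360}, {_dir = _dir + 1}] do {
		_man setDir _dir;
		// change depending on your FPS/computer
		sleep 0.0005;
	};
};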

Edited by oktane


Okay, so as I mentioned, NOPing out the code doesn't work because the shader must write its output to register oC0. The minimum set of instructions required to disable a shader such as PsPpRadialBlur is the following asm:

    ps_3_0
   dcl_texcoord v0.xy
   dcl_2d s0
   texld oC0, v0, s0

This compiles to the following bytecode:

1F00000205000080000003901F0000020000009000080FA04200000300080F800000E4900008E4A0

What it does is sample the incoming image (s0) at the texture coordinate handed over from the vertex shader (somewhere else in the pipeline) and copy it straight to the output register: essentially a bypass.
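
As a sanity check on the bypass bytecode, here is a small Python sketch that splits it back into instructions. It relies on an assumption from the Direct3D 9 documentation rather than from this thread: in shader model 2.0+ bytecode, bits 24 to 27 of each instruction token give the number of operand DWORDs that follow. The `NAMES` dict and the function name are invented for the example.

```python
import struct

BYPASS_HEX = (
    "1F00000205000080000003901F00000200000090"
    "00080FA04200000300080F800000E4900008E4A0"
)

# Only the two opcodes used by the bypass (0x42 is D3DSIO_TEX, i.e. texld).
NAMES = {0x1F: "dcl", 0x42: "texld"}


def split_instructions(code):
    """Return a list of (mnemonic, [operand DWORDs as hex]) tuples."""
    out = []
    pos = 0
    while pos < len(code):
        (token,) = struct.unpack_from("<I", code, pos)
        opcode = token & 0xFFFF
        count = (token >> 24) & 0xF   # operand DWORDs after this token
        operands = [
            code[pos + 4 + 4 * i: pos + 8 + 4 * i].hex().upper()
            for i in range(count)
        ]
        out.append((NAMES.get(opcode, hex(opcode)), operands))
        pos += 4 * (1 + count)
    return out


if __name__ == "__main__":
    code = bytes.fromhex(BYPASS_HEX)
    print(len(code), "bytes")
    for name, operands in split_instructions(code):
        print(name, " ".join(operands))
```

The 40 bytes split into exactly three instructions (two `dcl` and one `texld`), matching the three lines of asm after the `ps_3_0` version line.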

It works! And we're learning about shaders and stuff! (I had to talk to a graphics programmer today to verify these minimal instructions)

:yay:

Steps to reproduce:

  • Get beta (all hex offsets in the pics are for build 60141) and install it (beta is not required, but these instructions assume the use of it)
  • Depbo bin.pbo from the arma2\beta\dta\ directory into a bin folder in the root of the Arma folder, i.e. d:\arma2\bin\*.* (the files from bin.pbo go in there)
  • This will cause the game to load that directory instead of the normal bin.pbo. When you are done playing, rename it to something other than 'bin' so that it isn't loaded all the time! If you launch the normal non-beta game, it will load the folder too, so don't do that unless it's renamed. (The game will probably crash with a mismatched version of the bin files.)
  • Inside of that bin directory is config.bin; rename that to config.bak so the game doesn't load it.
  • Inside of that bin directory is Shaders_DefPP.shdc; make a backup of it, and open the original in a hex editor.
  • This is the BIS shadercache file. The best I can describe it: it's a big file filled with compiled shaders, i.e. 'fxo' files. So it's a bunch of fxo files back to back inside one file, separated by 00AAAA00, which signals that one fxo has ended and the next is beginning.
  • Search for the text string PpRadialBlur in the file.
  • You will see an area like this picture. In it I have highlighted/bookmarked some things for clarity. The red is the END signal: FFFF0000 is the end of the compiled shader per MS standards, and 00AAAA00 is the separator for BIS's shadercache parser, so I have grouped them. Highlighted in green is the header of the pixel shader. It defines the parameters (registers) that the shader can take as input, such as 'blur strength' in this instance, but it isn't really important to me since I'm modifying the shader's actual code, which is highlighted in yellow. As a rule of thumb, the shader code always starts after the string "HLSL Shader Compiler 9.26.952.2844" plus one null byte 00, and it always ends with FFFF0000. You can change anything you want between those two markers, adhering to two rules:
    -The shader must compile and run without errors, otherwise the game will not start (or will crash if you already had it running).
    -The bytecode must be the same length as the original BIS bytecode. This is because of a length int variable somewhere (probably in the header) that I haven't looked for yet.
  • To kill the old shader, overwrite the whole yellow area with zeros. (I highlight it and use the Fill command.) An all-zero DWORD is the shader assembly 'NOP' instruction (opcode 00 in the list above).
  • If you were to run the game now after saving, it would work fine, except the screen would be blank. So we need to write the incoming image to the output buffer with the bypass code. Put the bypass bytecode I noted above into the code section, overwriting the zeros rather than inserting. (I highlight the number of bytes the bypass uses, 40 bytes or 0x28, and do Paste Special.. CF_TEXT with Interpret as Hex String checked.)
  • Here is what it looks like in the hex editor when you are done. The black code is the bypass asm bytecode, the rest is zero'd out.
  • When you are done and have saved, the file should be the same size as it was before you started.
  • You will notice a lack of silly blur around the edges of your screen when you move the player! (don't forget to turn PP back on) In both pictures, I was sprinting.
  • To undo the changes, rename or delete the bin folder from your Arma2 directory.
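
The manual hex-editor steps above could be scripted roughly like this. It is only a sketch under several assumptions that come from the rule of thumb in the steps, not from any format specification: the shader name appears in the file before its code, the code starts one byte after the compiler string, and the first FFFF0000 found on a 4-byte boundary after that is the END token. All function names are hypothetical.

```python
END_TOKEN = bytes.fromhex("FFFF0000")   # end of one compiled shader
COMPILER_TAG = b"HLSL Shader Compiler 9.26.952.2844"
BYPASS = bytes.fromhex(
    "1F00000205000080000003901F00000200000090"
    "00080FA04200000300080F800000E4900008E4A0"
)


def find_code_region(data, shader_name):
    """Return (start, end) offsets of the shader code, END token excluded."""
    name_at = data.find(shader_name)
    if name_at < 0:
        raise ValueError("shader name not found")
    tag_at = data.find(COMPILER_TAG, name_at)
    if tag_at < 0:
        raise ValueError("compiler string not found after shader name")
    start = tag_at + len(COMPILER_TAG) + 1   # +1 skips the null byte
    pos = start
    while pos + 4 <= len(data):              # walk one DWORD at a time
        if data[pos:pos + 4] == END_TOKEN:
            return start, pos
        pos += 4
    raise ValueError("END token not found")


def apply_bypass(data, shader_name, replacement=BYPASS):
    """Overwrite the code with `replacement` plus zero (NOP) padding."""
    start, end = find_code_region(data, shader_name)
    room = end - start
    if len(replacement) > room:
        raise ValueError("replacement is longer than the original code")
    padding = b"\x00" * (room - len(replacement))
    patched = data[:start] + replacement + padding + data[end:]
    assert len(patched) == len(data)         # file size must not change
    return patched
```

On a real file this would be used by reading Shaders_DefPP.shdc in binary mode, calling `apply_bypass(data, b"PsPpRadialBlur")` and writing the result back, after making the backup described above. The same-size check mirrors the second rule in the steps.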

This concludes tonight's :cool: lesson.

A note on performance: shader cost is usually rated in number of instructions. As I understand it, the NOP is indeed an instruction. Because the original opcodes plus their parameters take up several DWORDs each while every NOP is a single DWORD, we have theoretically ended up with many MORE instructions than the original shader, all of them NOPs. But I didn't notice any performance hit. This problem (if it is one) will be solved when the length field is found in the header; then we won't have to fill out the rest of the space with zeros (NOPs).

Edited by oktane
fixed image links, accidentally had them embedded - added note


One more thing:

shader_wonderful_noblur.jpg

:bounce3::bounce3::bounce3:

This is PsPpRotBlur patched. In both pictures, the gun was spinning at high speed. The difference is, well, obvious. :D

---------- Post added at 09:46 PM ---------- Previous post was at 08:33 PM ----------

Found the length bytes..

Picture

In the picture, you can see the length bytes highlighted with the thin green border; they always come after the pattern 11 22 22 11. In this picture, the length bytes = 788. You can see I have added a bookmark called RadialBlurLength in the bookmarks window that looks at those 4 bytes. The black area that I have selected (neon green and blue included) is the rest of the shader, and it is 788 bytes, as seen on the status bar at the lower right.

The length is a signed or unsigned int16.
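
Reading and rewriting that length field could look something like the Python sketch below. Several things here are assumptions and not confirmed by the thread: that the length sits immediately after the 11 22 22 11 pattern, that it is little-endian, and that it is an unsigned 16-bit value (the `fmt` parameter is there so a different width can be tried). The function names are hypothetical, and the demo data is synthetic rather than taken from a real shadercache file.

```python
import struct

MARKER = bytes.fromhex("11222211")


def read_length_after_marker(data, search_from=0, fmt="<H"):
    """Return (offset of the length field, length value)."""
    at = data.find(MARKER, search_from)
    if at < 0:
        raise ValueError("marker 11 22 22 11 not found")
    offset = at + len(MARKER)
    (length,) = struct.unpack_from(fmt, data, offset)
    return offset, length


def write_length(data, offset, new_length, fmt="<H"):
    """Return a copy of `data` with the length field replaced."""
    buf = bytearray(data)
    struct.pack_into(fmt, buf, offset, new_length)
    return bytes(buf)


if __name__ == "__main__":
    # Synthetic stand-in for a shader header, not real file contents.
    blob = b"\xAA" * 5 + MARKER + struct.pack("<H", 788) + b"\x00\x00"
    offset, length = read_length_after_marker(blob)
    print(offset, length)
```

If the length turns out to cover exactly the bytes that follow, a shortened shader would also need the data after it shifted, so this only covers the field itself.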

Edited by oktane


Brilliant work. The PsPpRotBlur patch is the most important one for the guys in my squad who are having performance issues, as they claim it just makes them sick.


I know this is an old thread, but someone directed me to it after I asked a question about modifying ARMA shaders and adding new shaders. This is a very interesting thread, and deserving of some more of my time. Though I've been up to par with HLSL and ASM for years, I'm relatively new to playing/editing/modding ARMA and don't know much about its inner workings yet.

I'm curious, Mr. Oktane, if you're still around and have made any more progress on this front? I'm very interested in finding out more about it.
