making of 1 million particles

this time I will give you some insights into how to create a gpu driven particle system with opengl and glsl. for most of my opengl work I use cinder and highly recommend giving it a try. knowing cinder already is not essential, but it makes the text easier to follow. also, since this is just a making of, not a step by step guide, some OpenGL and shader knowledge is required.

before we dive into the code I think it’s good to get an overview of how the system works. the base of this particle system is a so-called ping-pong framebuffer object. ping-pong means that you have two framebuffer objects (fbo) which swap roles every frame. while fbo A is drawn, fbo B is used for calculations. on the next frame B is drawn and A is used for calculations, and so on. the particle movement is calculated by a glsl shader, and all results (current position, velocity,…) are saved into textures. the drawing of the particles is also controlled by a shader, which sets opacity and size. each particle has a time to live: once it’s old enough it is respawned with its initial velocity (and, where needed, at its initial position). you see there is not that much going on, so now let’s look at the code a little bit deeper!
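to make the ping-pong idea a bit more tangible, here is a minimal cpu-only sketch in plain C++. it is not the cinder code from this project: the struct PingPong, its members and the "+ delta" update are made up for illustration, and two std::vector buffers stand in for the two fbos.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// CPU stand-in for the two FBOs (hypothetical type, not part of the project):
// each buffer holds one float per particle.
struct PingPong {
    std::array<std::vector<float>, 2> buffers;
    int writeIndex = 0;  // buffer that receives this frame's results
    int readIndex = 1;   // buffer that holds the previous frame's results

    explicit PingPong(std::size_t count, float initial = 0.0f) {
        buffers[0].assign(count, initial);
        buffers[1].assign(count, initial);
    }

    // One simulation step: read the old state, write the new state, swap roles.
    void step(float delta) {
        const std::vector<float>& src = buffers[readIndex];
        std::vector<float>& dst = buffers[writeIndex];
        assert(src.size() == dst.size());
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst[i] = src[i] + delta;  // placeholder for the real particle update
        }
        std::swap(readIndex, writeIndex);
    }

    // After step(), readIndex points at the freshest data (the one to draw).
    const std::vector<float>& latest() const { return buffers[readIndex]; }
};
```

the important part is that a step never reads from and writes to the same buffer, which is exactly why two fbos are needed on the gpu.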

buffer setup

at first we are going to initialize all the buffers which will contain our particle data. the first buffer contains the position of each particle. A good starting point is to spawn them at random points inside the window space.

the velocity of each particle is also initialized with random values. For best results it’s necessary to play around with the values. in my case, very small values work best. the noise texture is used to perturb the movement of each particle. the code is taken from the “Hello Cinder” tutorial and does all it has to: create a nice noise texture.

the last thing to do is to fill the surfaces with all the calculated values. you see, nothing spectacular here.

//initialize buffers (one RGBA float pixel per particle)
Surface32f mPosSurface = Surface32f(PARTICLES,PARTICLES,true);
Surface32f mVelSurface = Surface32f(PARTICLES,PARTICLES,true);
Surface32f mInfoSurface = Surface32f(PARTICLES,PARTICLES,true);
Surface32f mNoiseSurface = Surface32f(PARTICLES,PARTICLES,true);

Surface32f::Iter iterator = mPosSurface.getIter();

while(iterator.line())
{
    while(iterator.pixel())
    {
        //random start position, normalized to 0.0 - 1.0
        mVertPos = Vec3f(Rand::randFloat(getWindowWidth()) / (float)getWindowWidth(), Rand::randFloat(getWindowHeight()) / (float)getWindowHeight(),0.0f);

        //velocity
        Vec2f vel = Vec2f(Rand::randFloat(-.005f,.005f),Rand::randFloat(-.005f,.005f));

        //perlin noise sample for this pixel
        float nX = iterator.x() * 0.005f;
        float nY = iterator.y() * 0.005f;
        float nZ = app::getElapsedSeconds() * 0.1f;
        Vec3f v( nX, nY, nZ );
        float noise = mPerlin.fBm( v );

        float angle = noise * 15.0f ;

        //vel = Vec3f( cos( angle ) * 6.28f, cos( angle ) * 6.28f, 0.0f );

        //noise
        mNoiseSurface.setPixel(iterator.getPos(),
                               Color( cos( angle ) * Rand::randFloat(.00005f,.0002f), sin( angle ) * Rand::randFloat(.00005f,.0002f), 0.0f ));

        //position + mass
        mPosSurface.setPixel(iterator.getPos(),
                             ColorA(mVertPos.x,mVertPos.y,mVertPos.z,
                             Rand::randFloat(.00005f,.0002f)));

        //velocity + decay
        mVelSurface.setPixel(iterator.getPos(), Color(vel.x,vel.y, Rand::randFloat(.01f,1.00f)));

        //particle age + maximum age
        mInfoSurface.setPixel(iterator.getPos(),
                              ColorA(Rand::randFloat(.007f,1.0f), 1.0f,0.00f,1.00f));
    }
}
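the noise part of the setup boils down to turning one scalar noise value into a tiny 2d push. here is a small standalone sketch of that step in plain C++. Force2 and noiseToForce are names I made up for this example, and unlike the setup code it uses a single magnitude for both axes instead of two separate random values.

```cpp
#include <cassert>
#include <cmath>

struct Force2 { float x; float y; };  // hypothetical helper type

// Turns a scalar noise sample into a small 2D push.
// angleScale = 15 matches the factor used in the setup code; magnitude plays
// the role of the random factor in the range .00005 to .0002 (assumption:
// one shared magnitude for x and y).
inline Force2 noiseToForce(float noise, float magnitude, float angleScale = 15.0f) {
    assert(magnitude >= 0.0f);
    const float angle = noise * angleScale;
    return { std::cos(angle) * magnitude, std::sin(angle) * magnitude };
}
```

because the angle is derived from smooth noise, neighbouring particles get similar directions, which is what gives the swarm its flowing look.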

vertex buffer object

the next step is to create a dummy vbo. why dummy? since we are creating our own data (particles with position and color) we normally don’t want anything else drawn by OpenGL. but things don’t work this way. OpenGL needs at least something to draw, so we give it a vertex buffer object with as many vertices as we have particles. each vertex only carries the texture coordinate of its particle.

//fill dummy vbo
	vector<Vec2f> texCoords;
	vector<Vec3f> vertCoords, normCoords;
	vector<uint32_t> indices;

	gl::VboMesh::Layout layout;
	layout.setStaticIndices();
	layout.setStaticPositions();
	layout.setStaticTexCoords2d();
	layout.setStaticNormals();

	mVbo = gl::VboMesh(PARTICLES*PARTICLES,PARTICLES*PARTICLES,layout,GL_POINTS);

	for (int x = 0; x < PARTICLES; ++x) {
		for (int y = 0; y < PARTICLES; ++y) {
			indices.push_back( x * PARTICLES + y);
			texCoords.push_back( Vec2f( x/(float)PARTICLES, y/(float)PARTICLES));
		}
	}

	mVbo.bufferIndices(indices);
	mVbo.bufferTexCoords2d(0, texCoords);

when the vbo is filled the gpu gets something to draw and everyone is happy… time to move on to the real thing!
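the mapping between a grid cell, its particle index and its texture coordinate can be sketched in plain C++ like this. all names here (TexCoord, particleIndex, texelCenter, buildTexCoords) are made up for the example. note one deliberate difference: the loop above uses x/(float)PARTICLES, which points at the edge of a texel, while this sketch adds half a texel so the lookup lands in the middle. that offset is my assumption here, not something the original code does, and it may or may not matter depending on your texture filtering.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct TexCoord { float s; float t; };  // hypothetical helper type

// Index of the particle at grid cell (x, y), same formula as x * PARTICLES + y.
inline std::uint32_t particleIndex(int x, int y, int side) {
    assert(x >= 0 && x < side && y >= 0 && y < side);
    return static_cast<std::uint32_t>(x) * static_cast<std::uint32_t>(side)
         + static_cast<std::uint32_t>(y);
}

// Texture coordinate at the centre of texel (x, y).
// The +0.5 offset is an assumption of this sketch (see text above).
inline TexCoord texelCenter(int x, int y, int side) {
    assert(side > 0);
    return { (static_cast<float>(x) + 0.5f) / static_cast<float>(side),
             (static_cast<float>(y) + 0.5f) / static_cast<float>(side) };
}

// One texture coordinate per particle, stored in particleIndex() order.
inline std::vector<TexCoord> buildTexCoords(int side) {
    std::vector<TexCoord> coords;
    coords.reserve(static_cast<std::size_t>(side) * static_cast<std::size_t>(side));
    for (int x = 0; x < side; ++x) {
        for (int y = 0; y < side; ++y) {
            coords.push_back(texelCenter(x, y, side));
        }
    }
    return coords;
}
```

with side = 1000 this gives the one million entries the title promises.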

updating particle behavior

in the update method all the magic will happen. for this we have to bind one of our ping-pong FBOs to do the math. additionally our texture targets need to be registered with GL since we want to store our results in there. things which need to be read inside the shader (particle position, velocity,…) have to be bound too (the three textures of the other FBO plus the textures with the initial data). after that you just need to configure the uniforms of the shader and then draw a full screen quad. the rest is done by the shader, but more on this later.

//render into this fbo
mFbo[mBufferIn].bindFramebuffer();

//set viewport to fbo size
gl::setMatricesWindow( mFbo[0].getSize(), false ); // false to prevent vertical flipping
gl::setViewport( mFbo[0].getBounds() );

//write to all three color attachments (position, velocity, information)
GLenum buffer[3] = { GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT, GL_COLOR_ATTACHMENT2_EXT };
glDrawBuffers(3,buffer);

//read the results of the previous frame from the other fbo
mFbo[mBufferOut].bindTexture(0,0);
mFbo[mBufferOut].bindTexture(1,1);
mFbo[mBufferOut].bindTexture(2,2);

//initial velocities and positions (for respawning) and the noise texture
mVelTex.bind(3);
mPosTex.bind(4);
mNoiseTex.bind(5);

mVelShader.bind();
mVelShader.uniform("positions",0);
mVelShader.uniform("velocities",1);
mVelShader.uniform("information",2);
mVelShader.uniform("oVelocities",3);
mVelShader.uniform("oPositions",4);
mVelShader.uniform("noiseTex",5);

//full screen quad, one fragment per particle
glBegin(GL_QUADS);
glTexCoord2f( 0.0f, 0.0f); glVertex2f( 0.0f, 0.0f);
glTexCoord2f( 0.0f, 1.0f); glVertex2f( 0.0f, PARTICLES);
glTexCoord2f( 1.0f, 1.0f); glVertex2f( PARTICLES, PARTICLES);
glTexCoord2f( 1.0f, 0.0f); glVertex2f( PARTICLES, 0.0f);
glEnd();

mVelShader.unbind();

mFbo[mBufferOut].unbindTexture();

mVelTex.unbind();
mPosTex.unbind();

mFbo[mBufferIn].unbindFramebuffer();

//swap the roles of the two fbos for the next frame
mBufferIn = (mBufferIn + 1) % 2;
mBufferOut = (mBufferIn + 1) % 2;

particle drawing

The draw method uses buffer #2, which stores the results of the last update(). Again, bind the frame buffer and set up the textures which are needed for particle drawing (in my case it’s the position and velocity texture and an additional texture for my point sprites). After that just draw the dummy vbo. As you can see I also moved everything just to see all the particles in the middle of the screen.

gl::pushMatrices();

glScalef(getWindowHeight() / (float)PARTICLES , getWindowHeight() / (float)PARTICLES ,1.0f);

// draw particles
gl::draw( mVbo );

gl::popMatrices();

That’s almost everything you have to do on the cpu side. The rest is done by two GLSL shaders which will be explained now.

shader #1 – particle calculations

the vertex shader is the simplest shader you can write, so no explanation is required. in the fragment shader a little bit more is done.

at first all the saved values from the textures have to be read. after that it’s time to update the velocity and the particle positions too. it’s always good to spawn new particles to make things more interesting. in my case I chose a lifetime for every particle. when the maximum age is reached the particle is “killed” and a new one is created with its initial velocity. its position is only reset to the initial one if the particle has left the visible area.

    age += tStep;

    vel += vec3(noise.x,noise.y,0.0);

    pos.x += vel.x;
    pos.y += vel.y;

    if( age >= maxAge ) 
    {
        vec3 origVel = texture2D(oVelocities, texCoord.st).rgb;
        vec3 origPos = texture2D(oPositions, texCoord.st).rgb;

        age = 0.0;

        if(pos.x > 1.0 || pos.x < 0.0 || pos.y > 1.0 || pos.y < 0.0 )
            pos = origPos;

        vel = origVel;
    }

The last thing is to save the new values back into the textures, which are attached to the fbo as color attachments 0, 1 and 2.

//position + mass
gl_FragData[0] = vec4(pos, mass);
//velocity + decay
gl_FragData[1] = vec4(vel, decay);
//age information
gl_FragData[2] = vec4(age, maxAge, 0.0, 1.0);
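if you want to test the update logic without a gpu, the same steps can be written as a plain C++ function. this is only a cpu mirror of the shader fragment above, reduced to 2d. the types Particle and SpawnState and the function names are invented for this sketch, and mass and decay are left out because the shader excerpt does not show how they are used.

```cpp
#include <cassert>

// Hypothetical CPU mirror of one texel of particle state.
struct Particle {
    float posX, posY;
    float velX, velY;
    float age, maxAge;
};

// Initial values, i.e. what the shader reads from oPositions / oVelocities.
struct SpawnState { float posX, posY, velX, velY; };

inline bool outsideUnitSquare(float x, float y) {
    return x > 1.0f || x < 0.0f || y > 1.0f || y < 0.0f;
}

// Same order of operations as the fragment shader:
// age, add noise to velocity, move, then respawn if too old.
inline Particle updateParticle(Particle p, const SpawnState& spawn,
                               float noiseX, float noiseY, float tStep) {
    assert(tStep >= 0.0f);
    p.age += tStep;

    p.velX += noiseX;
    p.velY += noiseY;

    p.posX += p.velX;
    p.posY += p.velY;

    if (p.age >= p.maxAge) {
        p.age = 0.0f;
        // the position is only reset when the particle left the 0..1 area
        if (outsideUnitSquare(p.posX, p.posY)) {
            p.posX = spawn.posX;
            p.posY = spawn.posY;
        }
        p.velX = spawn.velX;
        p.velY = spawn.velY;
    }
    return p;
}
```

running this over a small array of particles for a few hundred steps is a cheap way to check values for tStep and the noise strength before moving them into the shader.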

shader #2 – particle drawing

the shader for the drawing does a bit more on the vertex side. here the position saved in our texture is transformed to our view space (remember: particle data is stored in values from 0.0 to 1.0). be careful with the scaling of the point size. GL is not made for point cloud rendering and performance drops hard when you scale the size.

dv = texture2D( posTex, gl_MultiTexCoord0.st );

age = texture2D(infTex, gl_MultiTexCoord0.st).r;

//scale vertex position to screen size
newVertexPos = vec4(scale * dv.x, scale * dv.y, scale * dv.z, 1.0);

//adjust point size, increasing size kills performance
gl_PointSize = 1.0 - (1.0 * age);

the fragment shader colors each particle by its position. opacity is controlled by particle age – the older a particle is, the more it will be visible. alternatively you can read from a sprite texture to give particles a nicer look but again, be careful about performance drops.

vec4 colFac = vec4(1.0);//texture2D(spriteTex, gl_PointCoord);
colFac.rgb *= texture2D( posTex, gl_TexCoord[0].st ).rgb;

colFac.a *= age;

gl_FragColor = colFac;
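the relation between age, point size and opacity is easy to lose track of, so here it is once more as a tiny C++ function. PointStyle and styleForAge are made-up names, and the age is assumed to be normalized to 0.0 – 1.0. the lower bound minSize is also my addition and not part of the shaders above: a point size of zero or less gives undefined results in GL, so clamping seemed like a reasonable precaution for a sketch.

```cpp
#include <algorithm>
#include <cassert>

struct PointStyle { float size; float alpha; };  // hypothetical helper type

// Mirrors gl_PointSize = 1.0 - age and colFac.a *= age:
// young particles are large and transparent, old ones small and opaque.
// minSize is an assumption of this sketch (see text above).
inline PointStyle styleForAge(float age, float minSize = 0.25f) {
    assert(age >= 0.0f && age <= 1.0f);
    PointStyle style;
    style.size = std::max(1.0f - age, minSize);
    style.alpha = age;
    return style;
}
```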

in short, this is everything needed to create a nice looking gpu driven particle system. as I mentioned in the beginning, this is just a making of, not a step by step tutorial but if there are any questions left, please write them down in the comments section!

here’s the link to my code on github

Hackbarth GFX
