23 December 2019

5 Level loader & Design

The engine uses pre-drawn low-resolution bitmaps as input. A bitmap consists of an array of pixels, each with a certain color and X, Y position. Each level uses a grid where each cell contains one instance of the "Block" class.

There are 11 types of blocks already defined inside the engine. The following list shows each block type together with its description and color code in 24-bit RGB hexadecimal format:

  • EMPTY [0x000000] (non-solid transparent block, like air)
  • SOLID WALL [0xFFFFFF] (solid visible wall) 
  • PLAYER SPAWN [0xFFD800] (player will spawn on this position)
  • GRASS SPRITE [0x1FD500] (simple grass sprite)
  • WATER TILE [0x0000FF] (water floor tile with simple scrollable animation)
  • WOODEN PATH [0x441515] (static wooden path floor texture)
  • MONUMENT [0x00FFFF] (monument sprite with four orbiting orb sprites)
  • DOWN LADDER TILE [0xFF0000] (static ladder floor tile, level switcher)
  • UP LADDER TILE [0xFFFF00] (static ladder ceiling tile, level switcher)
  • BANNER 1 [0xB200FF] (simple banner block type 1)
  • BANNER 2 [0x6900FF] (simple banner block type 2)

Each block's color code has been chosen arbitrarily, and its functionality is defined in the "Block" class. Given a bitmap and its color distribution, the engine loads the level by iterating over each pixel in the bitmap; based on the pixel's X, Y coordinates and color, the engine places the corresponding block into the game world.
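Here's roughly what that loading loop could look like in C#. Note that "PlaceBlock", "SpawnPlayer" and the block class names are placeholders I made up for this sketch; only the color codes come from the table above:

    using System.Drawing;

    void LoadLevel(string path)
    {
        using (Bitmap levelBitmap = new Bitmap(path))
        {
            for (int y = 0; y < levelBitmap.Height; y++)
            {
                for (int x = 0; x < levelBitmap.Width; x++)
                {
                    // Strip the alpha byte so the value matches the 24-bit codes above
                    int color = levelBitmap.GetPixel(x, y).ToArgb() & 0xFFFFFF;
                    switch (color)
                    {
                        case 0x000000: PlaceBlock(x, y, new EmptyBlock()); break;
                        case 0xFFFFFF: PlaceBlock(x, y, new SolidWallBlock()); break;
                        case 0xFFD800: SpawnPlayer(x, y); break;
                        // ...remaining block types follow the color table above
                    }
                }
            }
        }
    }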

I have already predefined two levels – a Park level & an Underground level. Each level showcases different engine rendering capabilities. The figure below shows both the Park and Underground level bitmaps with some block types highlighted:

Bitmaps of both demo levels

Additional levels can be added by drawing a new bitmap the same way as shown in the image above and by following the steps below (a code sketch follows the list):
  1. Place your new bitmap inside the "\Resources\" folder
  2. Create a new class named "yourNewLevel.cs"
  3. Inherit all the properties from the "Level" class and add the base constructor call
  4. Override the "Draw" method so that your new level can be drawn
  5. Create a new instance of your level inside the "Form1" class
  6. Call the "Draw" function for your new level


6 Fundamental graphical operations

Coming soon, no worries m8.

4 Drawing strategy in C#.NET and GDI+

For its rendering, the Windows Forms framework uses a library called GDI+ (Graphics Device Interface) [12, 13]. It is a managed wrapper library over the standard C/C++ based API called GDI, which interacts with device drivers on behalf of applications. GDI+ provides simple graphical functionality, from drawing basic shapes such as triangles, circles or squares to drawing bitmaps, setting filter options and more. The only functionality I chose to use from GDI+ is setting and drawing a bitmap onto the screen at a given resolution. This is enough, since the bitmap itself is filled by the Render3D class, which I have created from scratch.

The drawing strategy I've chosen to implement consists of the following steps (sketched in code after the list):
  1. Let the Render3D class fill the entire frame buffer with pixel colors
  2. Initialize a new variable of type Bitmap
  3. Copy the frame buffer data into the bitmap by looping over each element of the frame buffer array and calling the Bitmap.SetPixel(x, y, color) method, which sets each pixel of the given bitmap to a certain color
  4. Let GDI+ draw the bitmap onto the screen
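A minimal sketch of this naive strategy, assuming the frame buffer of 0xRRGGBB colors comes from the Render3D class (the actual field names may differ):

    using System.Drawing;

    Bitmap CreateFrameBitmap(int[] frameBuffer, int width, int height)
    {
        Bitmap bmp = new Bitmap(width, height);              // step 2
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int rgb = frameBuffer[x + y * width];
                Color c = Color.FromArgb((rgb >> 16) & 0xFF, // red
                                         (rgb >> 8) & 0xFF,  // green
                                         rgb & 0xFF);        // blue
                bmp.SetPixel(x, y, c);                       // step 3: locks bits on every call
            }
        }
        return bmp;                                          // step 4: hand off to GDI+
    }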
There was, however, a performance problem with the Bitmap.SetPixel(x, y, color) method. Since the set pixel operation is called on every loop iteration, it must lock and unlock bits in memory many times per frame, which is a time-expensive operation [15].
So instead of using the built-in set pixel function, I had to implement my own function for writing bytes into bitmap memory, which I called "RenderIntoMemory".

4.1 "RenderIntoMemory" function


The purpose of this method is to overcome the limitation of the set pixel operation [11] and to provide a more efficient way to read and write the memory. The RenderIntoMemory method executes the following steps in sequence (a code sketch follows the list):

  1. Use the built-in Bitmap.LockBits(RECT, LOCK_MODE, PIXEL_FORMAT) function, which locks the bitmap into the system's memory, thus gaining access to the BitmapData structure it returns.
  2. Initialize an RGB byte[] array whose size is determined by BitmapData.Stride multiplied by the screen height. BitmapData.Stride is usually defined as the screen width multiplied by the number of color channels the given pixel format uses. In the standard pixel format I use, the stride can be calculated as s = W * 4, where W represents the screen width and the value 4 represents the four color channels (blue, green, red, alpha).
  3. Initialize each R, G and B component of the RGB byte array by decomposing the Render3D frame buffer hex color; the alpha channel A is not used, hence it is set to 255.
  4. Invoke the built-in Marshal.Copy(RGB[], OFFSET, SCAN_0, LENGTH) function, which copies the contents of the RGB byte array initialized in step 3 into the memory the BitmapData structure points to.
  5. Unlock the bits by calling Bitmap.UnlockBits(BITMAP_DATA). This final call sets the bitmap contents and unlocks it from memory, making it ready for final drawing by GDI+.
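Put together, the five steps can look roughly like this. This is a sketch, assuming a 32-bit BGRA pixel format and a frame buffer of 0xRRGGBB colors; the identifiers are mine, not necessarily the engine's:

    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Runtime.InteropServices;

    void RenderIntoMemory(Bitmap bmp, int[] frameBuffer, int width, int height)
    {
        // Step 1: lock the whole bitmap into system memory
        BitmapData data = bmp.LockBits(new Rectangle(0, 0, width, height),
                                       ImageLockMode.WriteOnly,
                                       PixelFormat.Format32bppArgb);

        // Step 2: byte buffer of size stride * height (stride = width * 4 here)
        byte[] rgb = new byte[data.Stride * height];

        // Step 3: decompose each 0xRRGGBB color into B, G, R, A bytes
        for (int i = 0; i < width * height; i++)
        {
            int color = frameBuffer[i];
            int j = i * 4;
            rgb[j]     = (byte)(color & 0xFF);          // blue
            rgb[j + 1] = (byte)((color >> 8) & 0xFF);   // green
            rgb[j + 2] = (byte)((color >> 16) & 0xFF);  // red
            rgb[j + 3] = 255;                           // alpha (unused)
        }

        // Step 4: copy the byte array into the locked bitmap memory
        Marshal.Copy(rgb, 0, data.Scan0, rgb.Length);

        // Step 5: unlock the bits, making the bitmap ready for GDI+ drawing
        bmp.UnlockBits(data);
    }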

4.2 GDI+ Bitmap Drawing


After the "RenderIntoMemory" function has finished execution, the final step is to draw the bitmap itself onto the screen. Before this operation can be executed an InterpolationMode has to be set to "NearestNeighbour" so that no additional image interpolation will be applied to the bitmap to prevent blurriness. Then Graphics.DrawImage(BITMAP, RECT) function is called to send the given bitmap onto the graphics device for final rendering. 

The following benchmark table shows the average performance difference between using the set pixel operation and my own RenderIntoMemory function while iterating over an entire image. The data were measured in the first second of execution for the defined pixel areas:

Table of measured time based on resolution

The table above shows performance data for five different resolutions with an aspect ratio of 4:3. This and all following performance tests have been measured on a computer with the following parameters:

  • OS: Windows 10 Home of type x64
  • CPU: Intel Core i7-7700 CPU @ 3.60GHz – 4.02GHz
  • GPU: Nvidia GeForce GTX 1080
  • RAM: 16.0 GB

One can see that choosing the CPU as the main rendering unit instead of the GPU can be very inefficient, since the CPU is already busy managing other applications as well. This was one of the many reasons why GPUs were later added to computer hardware to accelerate rendering.

The following graph shows the time intervals, measured in ticks, plotted against the resolution for both methods:


It is clear how the performance of the "SetPixel" function scales linearly as the resolution doubles. The same applies to the "RenderIntoMemory" function, except that RenderIntoMemory is approximately 12 to 21 times faster than the SetPixel method.

From the table above, one can see that the higher the resolution, the greater the difference between the two rendering methods. This is caused by the lock / unlock bits operation, which is executed for each pixel in the SetPixel method, while RenderIntoMemory locks the pixels once at the beginning and unlocks them at the end, which improves the performance of the application.
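The intervals were measured in ticks, which in C# can be done with the Stopwatch class [9, 10], roughly like this (the render calls below are placeholders for the two methods compared above):

    using System;
    using System.Diagnostics;

    Stopwatch watch = Stopwatch.StartNew();
    RenderWithSetPixel();   // or: RenderIntoMemory(bmp, frameBuffer, width, height)
    watch.Stop();
    Console.WriteLine("Elapsed ticks: " + watch.ElapsedTicks);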



17 December 2019

3 Modern Rendering Pipelines

Rendering pipelines used by modern rendering frameworks consist of the following stages:
  • Data Loading
  • Shading & Lighting
  • Optional rendering stages

In the sections below, I describe each stage in detail to showcase what modern rendering frameworks share with my rendering framework. The picture below summarizes the entire rendering pipeline:


Modern rendering pipeline diagram

Red block "DATA" represents input to the pipeline while the blue block "SCREEN" represents final step (output) where contents of Frame Buffer are displayed onto a screen.


3.1 Data Loading


To draw a certain object, the CPU needs to load its data file (an fbx, obj or 3ds file) from disk into RAM, from where the object data can be passed into VRAM, which is closer to the GPU and therefore offers faster reading speeds. The CPU and GPU communicate via the Command (Ring) Buffer, which dictates what objects should be drawn with what materials.
Modern GPUs use batching, which merges objects with the same materials together to save memory and draw calls. When the Command Buffer contains a "Render State" command, the GPU starts its drawing procedure, which is expensive; this is why batching is needed.


3.2 Shading & Lighting


After the data are loaded into VRAM, a special type of program is executed on the GPU for each vertex and pixel that belongs to the currently rendered object. This program is called a "shader" and is specifically designed to run on GPUs.

There are several types of shaders:
  • Vertex shader
  • Fragment shader
  • Surface shader
  • Compute shader

Bibliography

[1] J. Vijin (Tonc), Mode 7, 2018 [modified 8.4.2010]
[2] tigrou, SNES Mode 7 & Affine Transformation, 2018 [modified 9.8.2015]
[3] Wikipedia, Mode 7, 2018 [modified 11.11.2017]
[4] Kavita Bala, Perspective correct texturing and quaternion interpolation, 2012
[5] Nina Amenta, Perspective-Correct Texture mapping
[6] Wikipedia, Texture mapping, 2018
[7] Mikael Kalms, Perspective Texturemapping, 1997
[8] MSDN, Thread.Join() Method, 2018
[9] MSDN, Stopwatch.Frequency Field, 2018
[10] MSDN, Stopwatch.ElapsedTicks Property, 2018
[11] Michal Franc, LockBits vs Get Pixel Set Pixel - Performance, 2016
[12] MSDN, GDI+, 2018
[13] c#corner, Introduction to GDI+ in .NET, 2003
[14] Wikipedia, Affine transformation, 2018
[15] Robert W. Powell, Using the LockBits method to access image data [online], 2003

11 December 2019

2 Introduction

This pseudo 3D engine will be implemented from scratch using only the Windows Forms & C#.NET framework in Microsoft Visual Studio 2017.

To showcase my engine's functionality I have created two demo levels: a Park level with a scrollable skybox, animated water floor texture, walls and moving sprites, and an Underground level with a solid floor and ceiling plus an in-game teleport in the form of a ladder to move the user (player) between those levels. In-game levels are loaded in the form of low-resolution bitmaps where each pixel color represents a certain block type. These bitmaps are user-created and can be edited at any time.

The implemented graphical API can render floors, sprites and walls, with each pixel on the screen calculated manually, without using any other external framework or library.

Nowadays, rendering APIs like OpenGL and Direct3D use polygons and shaders to properly draw models, which are mathematically defined as volumetric shapes, onto a screen. The difference between my rendering framework and modern frameworks is that modern rendering procedures work with triangles while my engine works directly with pixels.

The reason why the triangle is the most popular elementary surface is that a minimum of three points is needed to create a flat surface, and three points always form a triangle. A simplified rendering pipeline model can be seen in the picture below:

Simplified rendering pipeline diagram


Procedures colored green indicate that both the vertex and fragment shader functions can be programmed to do specific tasks. Each model file format (fbx, obj or 3ds) contains data like vertex positions, normals, tangents, triangle indices and UV coordinates. These data are passed into the vertex shader function, which applies certain transformations so they can be rasterized and passed further into the fragment shader function, which fills in the frame buffer.

As a final step, the frame buffer content is displayed on the screen at a predefined framerate, typically 30 FPS on mobile platforms and 60 or more FPS on PC and console platforms.

Modern games and other graphical software use hardware accelerated rendering, where the GPU is used to redraw the entire screen, while my rendering framework doesn't use any hardware acceleration at all. Such a renderer, which uses part of the CPU time instead of the GPU, is called a software renderer. The reason I chose this approach is that I use the Windows Forms framework to initialize the window and create a graphics object to redraw the canvas.

The Windows Forms framework offers a light and very suitable workflow for such a project. Unfortunately, it doesn't support hardware accelerated rendering by default, which is not a problem here since the drawing area of the window is set to a resolution of 160 x 120 (19,200 pixels) and a screen ratio of 4:3.
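A hypothetical sketch of such a Windows Forms setup (the class layout and scale factor are my own choices; only the 160 x 120 internal resolution comes from the text above):

    using System.Drawing;
    using System.Windows.Forms;

    public class Form1 : Form
    {
        public const int RenderWidth = 160;   // internal render resolution (4:3)
        public const int RenderHeight = 120;

        public Form1()
        {
            // Scale the tiny internal resolution up to a usable window size
            ClientSize = new Size(RenderWidth * 4, RenderHeight * 4);
            DoubleBuffered = true; // reduce flicker while redrawing the canvas
        }
    }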

List of Abbreviations and Symbols

It is important to define all abbreviations used throughout the entire blog, which are listed below.

  • FPS – Frames Per Second (Number of discrete frames that will be rendered within a single second)
  • API – Application Programming Interface
  • GDI – Graphics Device Interface
  • FOV – Field of View (Angle defining how much of the world can be seen by a camera)
  • CG – Computer Graphics
  • STR – Rotation, Translation & Scale matrices multiplied together in this order
  • GPU – Graphical Processing Unit
  • CPU – Central Processing Unit
  • RAM – Random Access Memory (Used by CPU)
  • VRAM – Video RAM (RAM used by GPU)
  • PBR – Physically Based Rendering (Realistic light model)
  • AABB – Axis Aligned Bounding Box
  • AI – Artificial intelligence
  • ms – milliseconds


10 December 2019

1 Before We Start

WARNING: This blog describes how to implement pseudo 3D graphics along with simple game engine from scratch. If you don't care about that game engine part then you can skip to part 6: Fundamental graphical operations. You're welcome.

As a game developer I always wanted to learn how to create 3D graphics from scratch without using OpenGL or Direct3D, but I didn't know how. Then I saw Markus 'Notch' Persson (the original creator of Minecraft) working on his Ludum Dare 21 competition game called "Prelude of the Chambered", which was exactly what I was looking for, so I started to copy his code and research as much as I could about this approach.

Prelude of the Chambered gameplay

I then recreated the same game engine for my Bachelor Thesis, which this entire blog describes.

Below you can see my pseudo 3D engine implementation based on Markus's approach, with a custom 'depth only' mode which, when enabled, shows the contents of the depth buffer as a black & white image, as you can see in the gif as well.


My Pseudo3D Engine demo

After all rendering functions were designed and implemented, a testing phase was performed in which I measured and calculated the individual time intervals needed to draw a single pixel.

The functionality of this engine can be further extended by implementing a simple physics model for AABB collision detection & response, as well as AI for in-game entities. Simple games can therefore be created for Windows platforms, though further optimizations will be necessary to accelerate the rendering of very large levels. An input handler as well as two predefined levels are already included to showcase all the rendering capabilities of my pseudo3D engine.

This blog is a detailed tutorial I wish I had back when I was starting to learn how it all worked.