// transmission.log

Data Feed

> Intercepted signals from across the network — tech, engineering, and dispatches from the void.

1689 transmissions indexed — page 74 of 85

[ 2019 ]

20 entries
1462|blog.unity.com

Microsoft and Unity announce HoloLens 2 Development Edition

Releasing later this year alongside the launch of HoloLens 2, the all-new HoloLens 2 Development Edition offers even more value to jump-start your mixed reality development plans by combining the HoloLens 2 mixed reality device with $500 in free Azure credits and 3-month free trials of both Unity Pro and the PiXYZ Plugin.

"By bringing together HoloLens 2, Azure MR services, and the Unity platform, we are making it easier than ever for developers to get started building the real-time 3D experiences that are driving this 3rd wave of computing." - Matt Fleckenstein, Senior Director, Mixed Reality, Microsoft

Since the release of the first-generation HoloLens back in 2016, developers have been experimenting with a variety of mixed reality use cases, but it's the solution-focused apps for industries like architecture, engineering, and construction (AEC), automotive, and transportation that have resonated. Examples of mixed reality solutions include those from companies like Bentley Systems and Trimble, and even Microsoft's own Dynamics 365 apps. All are made with Unity and brought to life with HoloLens 2.

That's why we partnered with Microsoft to offer a 3-month trial of Unity Pro and PiXYZ Plugin as part of the HoloLens 2 Development Edition. Unity Pro's enhanced benefits, paired with the PiXYZ Plugin, enable the use of Computer-Aided Design (CAD) and Building Information Modeling (BIM) design data to create mixed reality applications for businesses that accelerate workflows and reduce costs.

"Pairing HoloLens 2 with Unity's real-time 3D platform enables industrial businesses to create immersive, interactive experiences that accelerate business and reduce costs. The addition of Unity Pro and the PiXYZ Plugin makes it easy to import 3D design data in minutes rather than hours." - Tim McDonough, GM of Industrial, Unity

If you'd like to learn more about Unity's mixed reality solutions for industry, interesting use cases, and PiXYZ Plugin workflows, check out the resources below.

- Accelerating the BIM workflow for AEC with fast and flexible imports (Webinar)
- Design, build, and operate faster with the PiXYZ Plugin for AEC (Blog)
- From CAD to Unity for Auto (Webinar)
- Unlock your CAD Data with Unity and PiXYZ (Video)
- Bringing Retail Stores to Life with XR for AEC (Video)
- Propelling AEC to New Heights by Using XR (Session)

The HoloLens 2 Development Edition will release later this year starting as low as $99 per month (or $3,500 outright) to registered members of Microsoft's Mixed Reality Developer Program. Registration for the Mixed Reality Developer Program is free, and members are among the first to know about the latest news and availability.

We're excited to partner with Microsoft on the release of the HoloLens 2 Development Edition later this year as we continue our collaboration on seamlessly integrating HoloLens 2 into Unity's platform.

Q: What's included in the HoloLens 2 Development Edition?
A: The HoloLens 2 Development Edition includes a HoloLens 2 mixed reality device, $500 in Azure credits, a Unity Pro 3-month trial, and a PiXYZ Plugin 3-month trial.

Q: When will the HoloLens 2 Development Edition release?
A: No release date has been announced, but it will be available alongside other HoloLens 2 editions releasing later this year.

Q: Can I pre-order the HoloLens 2 Development Edition?
A: No. The HoloLens 2 Development Edition is not available for pre-order, but you can register for the Mixed Reality Developer Program to be notified of future availability.

Q: How do I qualify to purchase a HoloLens 2 Development Edition?
A: You must join Microsoft's Mixed Reality Developer Program to be eligible to purchase a HoloLens 2 Development Edition.

Q: What if I've already pre-ordered another HoloLens 2 edition, but want to change to the HoloLens 2 Development Edition?
A: Later this year, Microsoft will contact those who pre-ordered a HoloLens 2, asking which edition they'd like to purchase. At that time, customers can express interest in the HoloLens 2 Development Edition.

Q: Where will I be able to purchase the HoloLens 2 Development Edition?
A: The HoloLens 2 Development Edition will be sold and fulfilled exclusively by Microsoft later this year.

Q: Where can I learn more about Unity and PiXYZ solutions for industry?
A: Visit the Unity Industry Bundle for AEC or the Unity Industry Bundle for automotive & transportation to learn more.

Q: Where can I learn more about HoloLens 2 development with Unity?
A: Visit the "HoloLens 2 is coming: What you need to know" blog to learn about experimenting with HoloLens 2 development. We'll have more details to share later this year.

>access_file_
1466|blog.unity.com

Reality vs illusion

In March, Unity announced real-time ray tracing support for NVIDIA RTX technology. Real-time ray tracing introduces photorealistic lighting qualities to Unity's High Definition Render Pipeline (HDRP), unlocking new potential in Unity's visual capabilities. A preview release of NVIDIA RTX in Unity is planned for 2019.3.

To show off this new lighting technology, we took on the challenge of matching a CG Unity-rendered BMW to a real BMW 8 Series Coupe that we filmed in a warehouse. The final project incorporated both a live interactive demo and a pre-rendered 4K video, cutting between the real car and the CG car. We challenged viewers to distinguish between the two. Our final video featured 11 CG BMW shots.

Both the advertising and automotive industries thrive on fast turnaround. We produced a sample commercial featuring a BMW to demonstrate how decisions made during the advertisement creation process can benefit from the versatility of a real-time engine. We launched the demo and made official announcements at the Game Developers Conference (GDC) as well as at NVIDIA's GPU Technology Conference (GTC). This blog post will take you through our production process.

Previsualization was essential in helping us prepare for the shoot and in creating a polished, professional ad. We worked directly in Unity alongside Director of Photography Christoph Iwanow to create a full CG version of the film. For this initial take, we used simple lighting and shaders and focused on nailing down pacing, camera placement, shot framing, depth of field, and areas of focus.

Using Unity's Physical Camera feature, we were able to match the real-life on-set camera and lenses down to the most precise detail: sensor size, ISO, shutter speed, lens distortion, and more. This allowed us to match the look and feel of the real camera and achieve a 1:1 match on all levels. Cinemachine enabled us to easily create believable camera movements while retaining rapid iteration speeds.

The previsualization stage was also our lighting experimentation playground. The real-life lighting equipment we planned to use was replicated in Unity in every detail: the shapes and dimensions of the lights, temperature, and intensity. We could instantly see what light tubes hanging from the ceiling would look like and refine their placement, intensity, and color; we could generate perfect reflections on the car. This process would otherwise take up several hours of precious on-set time. By digitally replicating the real-world lighting and camera setups, Christoph was able to experiment with more combinations and find the look he wanted in advance of filming, so we could use filming time more effectively.

The day after filming, we could start refining shots right away. Since the previz was done in Unity, we didn't need to migrate assets. We used the lighting references we took on set, as well as the real car itself, to ensure that our asset reacted realistically during look development. The technology now allows us to use image textures on area lights, so we were able to use photos of the actual light sources as textures for more realistic lighting and reflections.

With the power of Unity, we were able to render in 4K ultra high definition from day one. This is an obvious advantage over offline rendering, where in many cases work-in-progress deliveries are rendered at lower resolutions to save costs and time. By delivering in 4K from the very first iterations, we were able to fine-tune small details in look dev and lighting. Many more iterations on renders in a short time allowed us to obtain strong visuals with a small art team. It also meant the results in the final renders were predictable.
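The post doesn't include code for the camera-matching step described above, but Unity's Physical Camera can be driven from a script. Below is a minimal sketch, assuming a Unity version with physical camera support (2018.3 or later); the sensor and lens values are illustrative placeholders, not the ones used on this shoot.

    using UnityEngine;

    // Minimal sketch: configure a Unity camera to mimic a physical on-set camera.
    // The numbers below are placeholders, not the values used on the BMW production.
    public class PhysicalCameraMatch : MonoBehaviour
    {
        void Start()
        {
            var cam = GetComponent<Camera>();
            cam.usePhysicalProperties = true;
            cam.sensorSize = new Vector2(36f, 24f); // full-frame sensor, in mm
            cam.focalLength = 50f;                  // lens focal length, in mm
            cam.gateFit = Camera.GateFitMode.Horizontal;
            // Exposure-related values (ISO, shutter speed, aperture) are exposed
            // through HDRP's physical camera settings rather than this base API.
        }
    }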
For the shots using plate integrations, the footage was tracked outside of Unity, and the camera and track data were then imported into Unity as an FBX file. As for the other cameras that we created directly in Unity, two Cinemachine features were essential in creating believable and realistic camera movements:

- Cinemachine Storyboard extension: Among its many features, the Storyboard extension allows you to align camera angles. It was an essential tool for us in easily replicating the specific camera movements required to recreate a shot completely in CG. We used a frame from the on-set camera footage as an overlay to act as a guide to align the CG camera. This was done for the first, middle, and last frames of some shots.
- Cinemachine noise: Applying procedural noise to our camera moves made it easy to get rid of the unnatural perfection of CG cameras by adding convincing micro-movements to the camera's motion. We could ensure the movements were interesting without being obviously repetitive.

Our original concept featured a CG BMW with a different paint color than the real BMW. As the project evolved, we felt it was a stronger statement to have both cars be the same color so we could cut between them seamlessly. Changing the color of the car was a late-stage decision that could be made organically in Unity, as lighting and look dev artists could work on updates concurrently. A similar project in an offline renderer would have dailies with notes like "rotate the wheel 20 degrees to the right" or "drop the key light down a stop." Instead of needing a full-day turnaround for small technical changes, we could work out these changes interactively and focus our energy on creative decisions.

For the two shots using plate integrations, we used Shader Graph to create a screen-space projected material with the original footage at HD resolution. We used this shader on the ground and walls around the car so that we could have realistic reflections from the plate onto the car. We supplemented with additional lighting, rendered the shots out of Unity, and then finished the final ground integration with the full-res plate using external compositing software.

The usual technique used in game production to simulate reflections relies on a set of reflection probes placed at various locations, combined with screen-space ray tracing. This usually results in various light leaks and a coarse approximation of surface properties. With real-time ray tracing, we can now correctly reflect what is offscreen without any setup from the artists. However, such a reflection effect requires some accommodation from the rendering engine. In traditional game production, everything that isn't within the frustum of the camera tends to be disabled, but now it is possible to reflect objects illuminated and shadowed by sources that are not initially visible onscreen. In addition to accurately simulating metal objects, effective ray tracing requires multiple bounces, which isn't affordable within our performance constraints, so we chose to handle only one bounce. The results of the remaining bounces were approximated by multiplying the color of the metal by the current indirect diffuse lighting.
Traditional offline renderers are good at managing the rendering of large textured area lights. However, using an offline renderer is costly and produces a lot of noise (the more rays you use, the less noise, but that increases the rendering cost per frame). To achieve a real-time frame rate while upholding quality, our Unity Labs researchers developed an algorithm in conjunction with Lucasfilm and NVIDIA (see the paper they produced, Combining Analytic Direct Illumination and Stochastic Shadows). With this approach, the visibility (area shadow) can be separated from the direct lighting evaluation while the visual result remains intact. Coupled with a denoising technique applied separately to these two components, we were able to launch very few rays (just four in the real-time demo) for large textured area lights and achieve our 30 fps target.

Indirect diffuse lighting, or diffuse light bouncing, enhances the lighting of a scene by grounding objects and reacting to changing lighting conditions. The usual workflow for game production is painful and relies on setting up Light Probes in a scene or using lightmaps. For the movie, we used a brute-force one-bounce indirect diffuse approach with ray tracing: several rays are launched, allowing us to get the desired light bleeding effect. Such an approach gives artists immense freedom without them having to set up anything. However, it is costly. For the real-time version, we selected a cheaper approach: with ray tracing, we were able to rebake a set of light probes dynamically each frame, where traditionally we would have used a set of pre-baked light probes or baked lightmaps.

Just as ray-traced reflection can replace the screen-space technique, real-time ray tracing can generate ambient occlusion comparable to that produced by the widely used screen-space technique. The resource-friendly indirect diffuse method mentioned above can be enhanced with ray-traced ambient occlusion to better handle the light leaks implied by the technique. For performance reasons, we chose not to support transparent objects, which would require handling the transmission of light through them.

Real-time ray tracing is the only tool able to achieve the rendering of photorealistic headlights in real time. The shape of a headlight and its multiple lens and reflector optics result in a complex light interaction that is challenging to simulate. We added the interaction of multiple successive smooth reflective and transmissive rays, which allows the light beam to shift as it does in the real world. The fine details can be controlled with texture details that influence the ray direction.

"Invent yourself and then reinvent yourself, don't swim in the same slough. Invent yourself and then reinvent yourself and stay out of the clutches of mediocrity." – Charles Bukowski

Unity's real-time ray tracing is a new reality. We aren't trying to rebuild traditional production pipelines from the ground up, but we are removing some of the pain points that are typically associated with a project like this. Having the power to interactively change shots and get immediate feedback from the creative director and director of photography is invaluable. Thanks to the decision to build this film in Unity, we could potentially migrate this work to other projects with ease and create a diverse yet cohesive campaign across multiple mediums.
Real-time ray tracing affords us the ability to refine the traditional automotive advertising production pipeline to work in a more creative, collaborative, and affordable way.

---

You can explore NVIDIA RTX and Unity today with this experimental release. Please note that this is a prototype and the final implementation of DXR will be different from this version.
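As a companion to the Cinemachine noise technique mentioned above, here is a minimal sketch of adding procedural micro-movements to a virtual camera. It assumes the Cinemachine 2.x API (CinemachineVirtualCamera and CinemachineBasicMultiChannelPerlin); the amplitude and frequency values are illustrative, not the ones used on this production.

    using UnityEngine;
    using Cinemachine;

    // Minimal sketch: add subtle procedural shake to a Cinemachine virtual camera
    // to break up the unnatural steadiness of a CG camera.
    public class HandheldNoise : MonoBehaviour
    {
        void Start()
        {
            var vcam = GetComponent<CinemachineVirtualCamera>();
            var noise = vcam.AddCinemachineComponent<CinemachineBasicMultiChannelPerlin>();
            noise.m_AmplitudeGain = 0.3f; // overall strength of the micro-movements
            noise.m_FrequencyGain = 0.5f; // how quickly the noise evolves
            // Note: for the noise to have any effect, a NoiseSettings asset (such as
            // one of the profiles bundled with Cinemachine) must also be assigned to
            // noise.m_NoiseProfile; that asset reference is omitted in this sketch.
        }
    }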

>access_file_
1467|blog.unity.com

Higher fidelity and smoother frame rates with Adaptive Performance

We recently wrapped up GDC 2019, where we spoke about Adaptive Performance during our keynote. We're excited to let you know that the Preview version and the Megacity mobile sample are now available so you can get started exploring this feature. This blog explains more about Adaptive Performance and how to apply it to your own projects.

Unlike with a PC or console game, harnessing the full power of mobile hardware requires a delicate balance for games to look beautiful and play smoothly. Maxing out a device's capabilities can quickly compromise your game's performance by overtaxing the hardware, which leads to throttling, poor battery life, and inconsistent performance. For developers, this issue becomes even more problematic considering the wide range of low-end to high-end target devices.

Today, developers take different tactics to solve this problem. The two main approaches we've seen are trying to make sure games perform at their best on all target hardware, which means sacrificing graphics fidelity and frame rate, or attempting to anticipate hardware behavior, which is really difficult because there are not many options to precisely measure hardware trends.

Adaptive Performance provides you with a better way to manage the thermals and performance of your games on a device in real time, allowing you to proactively adjust your game's performance and quality settings on the fly, utilizing the hardware without overtaxing the device. The result is a predictable frame rate and a decrease in thermal buildup, enabling longer play times and a much more enjoyable player experience while preserving battery life.

For developers, it means having a new, deeper insight into the hardware, with new tools to make your games more dynamic and flexible, providing your players with the smoothest and best-performing experiences when they're playing on mobile devices. It gives you control over decisions that the operating system usually makes, such as when to run at high clock speeds or what to adjust to avoid throttling.

We gave several talks about this feature during GDC 2019. You can view the slide deck here and watch the Unity GDC Booth Talk, "Megacity on mobile: How we optimized it with Adaptive Performance", below.

We've partnered with Samsung, the world's largest Android mobile device manufacturer, to help bring this solution to fruition. Built on top of Samsung's GameSDK, Adaptive Performance will first be available for Samsung Galaxy devices such as the Samsung Galaxy S10 and Galaxy Fold, followed by additional Samsung Galaxy devices later this year.

These charts (shown during our Unity at GDC 2019 keynote) illustrate how Adaptive Performance helps deliver a steady high frame rate with Megacity running on the new Samsung Galaxy S10. In red, you can see the frame rate in Megacity before we added Adaptive Performance; in blue, you can see the results after. With Adaptive Performance, the demo runs at 30 fps for a much longer time and is much more stable.

Megacity is a futuristic, interactive city featuring millions of entities, demonstrating how Unity can run even the most complex projects on current-gen mobile hardware. It showcases the latest advances in our Data-Oriented Technology Stack (DOTS), the name for all projects under our "Performance by Default" banner, including the Entity Component System (ECS), Native Collections, the C# Job System, and the Burst Compiler.
Megacity was first presented at Unite Los Angeles 2018 and was released for desktop during GDC 2019. Megacity is the right project to demonstrate a sample implementation of Adaptive Performance, as it provides us with the flexibility to adapt the game dynamically and proactively to best utilize the hardware. Adaptive Performance was built with scalability in mind, which works great with the principles of DOTS used to build the foundation of Megacity. The mobile version of the project has 4.5M mesh renderers, 200K building components, 100K audio sources, and more than 6M entities – an ideal candidate for demonstrating Adaptive Performance's capabilities.

After you install Adaptive Performance via the Unity Package Manager, Unity automatically adds the Samsung GameSDK subsystem to your project when you build to a device. During runtime, Unity creates and starts an Adaptive Performance Manager on supported devices, which provides you with feedback about the thermal state of the mobile device. You can subscribe to events or query the information from the Adaptive Performance Manager during runtime to react in real time; otherwise, it will only report the stats to the console.

As an example, you can use the API provided to create applications that react to the thermal trends and events on the device. This ensures constant frame rates over a longer period of time while avoiding thermal throttling, even before throttling begins. In the sample implementation of Adaptive Performance in Megacity, we used three different ways to smooth the frame rate:

- By starting at moderate CPU and GPU levels, and increasing them gradually to eliminate bottlenecks, we were able to keep energy consumption low.
- If we saw that the device was getting close to throttling, we could tune quality settings to reduce thermal load – we decided to lower the LOD levels.
- We also decreased the target frame rate once we were close to throttling.

When the target frame rate is reached and the temperature is in decline, we increase LOD levels, raise the target frame rate, and decrease CPU and GPU levels again. These capabilities enable your game to achieve smoother performance over time. By keeping a close eye on a device's thermal trends, you can adjust performance settings on the fly to avoid throttling altogether. Download the Megacity mobile sample project here to see how we've done this. For feedback or questions about Megacity, please visit this forum thread.

The heart of the package is the Adaptive Performance Manager, which Unity creates during startup, allowing you to easily access and subscribe to thermal and performance event notifications. The example below shows how to access the Adaptive Performance Manager using the IAdaptivePerformance interface in the Start function of your MonoBehaviour.

    private IAdaptivePerformance ap = null;

    void Start()
    {
        ap = Holder.instance;
    }

Unity sends thermal events whenever there are changes in the thermal state of the device. The important states are when throttling is imminent and when throttling is occurring.
In the example below, you subscribe to ThermalEvents to reduce or increase your lodBias, which helps to reduce GPU load.

    using UnityEngine;
    using UnityEngine.Mobile.AdaptivePerformance;

    public class AdaptiveLOD : MonoBehaviour
    {
        private IAdaptivePerformance ap = null;

        void Start()
        {
            if (Holder.instance == null)
                return;
            ap = Holder.instance;
            if (!ap.active)
                return;
            QualitySettings.lodBias = 1;
            ap.ThermalEvent += OnThermalEvent;
        }

        void OnThermalEvent(object obj, ThermalEventArgs ev)
        {
            switch (ev.warningLevel)
            {
                case PerformanceWarningLevel.NoWarning:
                    QualitySettings.lodBias = 1;
                    break;
                case PerformanceWarningLevel.ThrottlingImminent:
                    QualitySettings.lodBias = 0.75f;
                    break;
                case PerformanceWarningLevel.Throttling:
                    QualitySettings.lodBias = 0.5f;
                    break;
            }
        }
    }

Note that if you reduce the lodBias below a value of 1, it will have a visual impact in many cases and LOD object-popping might occur, but it is an easy way to reduce graphics load if full detail is not required for the game experience. In case you want to make even more detailed decisions to fine-tune how your game's graphics and behavior are handled, the bottleneck events are very useful.

CPU and GPU performance levels

The CPU and GPU of a mobile device make up a very large part of its power utilization, especially when running a game. Typically, the operating system decides which clock speeds are used for the CPU and GPU. CPU cores and GPUs are less efficient when running at their maximum clock speed; running at high clock speeds easily overheats the mobile device, and the operating system throttles the frequency of the CPU and GPU to cool it down. You can avoid this situation by limiting the maximum-allowed clock speeds with these properties:

- IAdaptivePerformance.cpuLevel
- IAdaptivePerformance.gpuLevel

The application can configure those properties based on its knowledge of the current performance requirements and decide, based on the scenario, whether the levels should be lowered or raised:

- Did the application reach the target frame rate in the previous frames?
- Is the application in an in-game scene or in a menu?
- Is a heavy scene coming up next?
- Is an upcoming event CPU or GPU heavy?
- Will you show ads that do not require high CPU/GPU levels?

    public void EnterMenu()
    {
        if (!ap.active)
            return;
        // Set low CPU and GPU levels in the menu
        ap.cpuLevel = 0;
        ap.gpuLevel = 0;
        // Set a low target FPS
        Application.targetFrameRate = 15;
    }

    public void ExitMenu()
    {
        // Set higher CPU and GPU levels when going back into the game
        ap.cpuLevel = ap.maxCpuPerformanceLevel;
        ap.gpuLevel = ap.maxGpuPerformanceLevel;
    }

In the Adaptive Performance Manager, you can subscribe to receive performance bottleneck events that let you know if you are GPU, CPU, or frame-rate bound. Frame-rate bound means that the game is limited by Application.targetFrameRate, in which case the application should consider lowering its performance requirements. Running in the background, governing bottleneck decisions – and queryable via the Manager – is the GPU frame-time driver, which monitors the hardware time the GPU spent on the last frame; for the moment, the CPU time is calculated by summing Unity's internal subsystems.
Depending on the game and scenario, you can have it react differently when the game is CPU or GPU bound according to thermal state changes.

    void OnBottleneckChange(object obj, PerformanceBottleneckChangeEventArgs ev)
    {
        switch (ev.bottleneck)
        {
            case PerformanceBottleneck.TargetFrameRate:
                if (ap.cpuLevel > 0)
                    ap.cpuLevel--;
                if (ap.gpuLevel > 0)
                    ap.gpuLevel--;
                break;
            case PerformanceBottleneck.GPU:
                if (ap.gpuLevel < ap.maxGpuPerformanceLevel)
                    ap.gpuLevel++;
                break;
            case PerformanceBottleneck.CPU:
                if (ap.cpuLevel < ap.maxCpuPerformanceLevel)
                    ap.cpuLevel++;
                break;
        }
    }

There are many different ways to optimize games, and the samples above and in Megacity only provide some suggestions for how to do it; ultimately, it depends very much on what works best for your game. For more information, please also check the package documentation.

This is only the beginning! We are going to continue to invest in Adaptive Performance, adding more features and supporting more devices over time. The current package includes a low-level API, but we are already working on a high-level, component-based API compatible with DOTS, which should make it even easier to adapt performance in your Unity projects. Stay tuned for more information.

A Preview version of Adaptive Performance is available now for Unity 2019.1 (beta) via the Unity Package Manager. You can access it here. For up-to-date information on Adaptive Performance, to see how other developers are using it, and to post questions or comments, please visit the forum.
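The Megacity sample also lowered the target frame rate as the device approached throttling, but the post doesn't show that snippet. Here is a minimal sketch of one way to do it, reusing only the APIs shown above (Holder.instance, ThermalEvent, PerformanceWarningLevel); the frame-rate values are illustrative.

    using UnityEngine;
    using UnityEngine.Mobile.AdaptivePerformance;

    // Minimal sketch: drop the target frame rate when throttling is imminent,
    // and restore it once the device cools down again.
    public class AdaptiveFrameRate : MonoBehaviour
    {
        private IAdaptivePerformance ap = null;

        void Start()
        {
            if (Holder.instance == null || !Holder.instance.active)
                return;
            ap = Holder.instance;
            Application.targetFrameRate = 30; // illustrative baseline
            ap.ThermalEvent += OnThermalEvent;
        }

        void OnThermalEvent(object obj, ThermalEventArgs ev)
        {
            switch (ev.warningLevel)
            {
                case PerformanceWarningLevel.NoWarning:
                    Application.targetFrameRate = 30;
                    break;
                case PerformanceWarningLevel.ThrottlingImminent:
                case PerformanceWarningLevel.Throttling:
                    Application.targetFrameRate = 20; // shed load before the OS throttles
                    break;
            }
        }
    }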

>access_file_
1470|blog.unity.com

Isometric 2D environments with Tilemap

With the release of Unity 2018.3, we introduced Isometric Tilemap support, closely following the Hexagonal Tilemap support added in the 2018.2 release. The new Tilemap features provide a fast and performant way to create 2D environments based on isometric and hexagonal grid layouts, the likes of which are seen in many game classics, including the first entries of the Diablo and Fallout franchises, Civilization, Age of Empires, and many more.

Both features build on top of the existing Tilemap system introduced back in Unity 2017.2, and working with them today is just as easy! They are also natively integrated into the Editor, although in future Unity releases they might be moved to the Package Manager. If you're interested in following along and experimenting with the techniques shown, we've created a pre-configured Isometric Starter Kit project with an animated character and multiple environment tilesets, which you can download for free.

Before we start working with Tilemap, it is important to set up our project correctly. Isometric Tilemap works with 2-dimensional sprites, and it relies on correct renderer sorting in order to create the illusion of a top-down isometric view. We need to make sure that tiles further away from the viewer get painted first, and those that are closer get painted on top of them.

To customize the order in which 2D objects are painted on the screen, we can use Unity's Custom Axis Sort feature. You can define this setting either per camera (currently, this is the default way to do it in the Scriptable Render Pipelines, including LWRP and HDRP) or globally at the project level.

To define a Custom Axis Sort at the project level, go to Edit > Project Settings > Graphics. In the Camera Settings section, you will see a Transparency Sort Mode dropdown, as well as the X, Y, and Z value settings for the Transparency Sort Axis.

By default, the Transparency Sort Axis in Unity is set to (0, 0, 1) for XYZ respectively. However, all of our 2D tiles are actually on the same Z plane, so we instead determine which tiles are behind or in front by using their height on screen rather than their depth: tiles positioned higher on the screen are sorted behind those placed lower. To sort the tiles based on height, change the Transparency Sort Mode to Custom and set the Transparency Sort Axis values to (0, 1, 0).

You can read the relevant Unity documentation page for 2D sorting if you want to learn more about how it works. In some cases, you may also want to adjust the Z value of your Transparency Sort Axis; we will cover this in more depth later in this blog post, and a scripted version of this setup is sketched below.
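If you prefer to configure this from code (for example, with a per-camera setup under a Scriptable Render Pipeline), the same settings are exposed on the Camera API via transparencySortMode and transparencySortAxis. A minimal sketch:

    using UnityEngine;

    // Minimal sketch: apply the custom transparency sort axis per camera
    // instead of globally in Project Settings.
    public class IsometricSortSetup : MonoBehaviour
    {
        void Start()
        {
            var cam = GetComponent<Camera>();
            cam.transparencySortMode = TransparencySortMode.CustomAxis;
            // Sort by height on screen. For Z as Y Tilemaps, add a Z component
            // (e.g. -0.26 for a (1, 0.5, 1) grid, as derived later in this post).
            cam.transparencySortAxis = new Vector3(0f, 1f, 0f);
        }
    }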
The Tilemap feature consists of several components working together. The first two are the Grid and Tilemap Game Objects. To create a Grid, simply right-click anywhere in the Hierarchy, go to 2D Object, and select the type of Tilemap you wish to use. By default, each new Grid is created with one child Tilemap Game Object of the corresponding type. The currently available Tilemap types are as follows:

- Tilemap - creates a rectangular Grid and Tilemap. An example of using this Tilemap can be seen in Unity's 2D Game Kit.
- Hexagonal Point Top Tilemap - creates a Hexagonal Grid and Tilemap, where one of the vertices of each hexagon points upwards.
- Hexagonal Flat Top Tilemap - another Hexagonal Grid type, where the top of the hexagon is an edge parallel to the top of the screen.

The last two types, Isometric and Isometric Z as Y, create two different implementations of the isometric grid. The difference between them arises when simulating different tile elevation levels, such as when we have a raised platform in our isometric level.

A regular Isometric Tilemap is best used when you wish to create separate Tilemap Game Objects for each individual elevation level of the tiles. This simplifies the process of creating automatic collision shapes, but you will have less flexibility when it comes to height variation between the tiles, since all the tiles on one layer will have to be on the same 'plane'.

In the case of an Isometric Z as Y Tilemap, the Z position value of each tile works in combination with the custom Transparency Sort Axis setting to make the tiles appear stacked on top of one another. When painting on a Z as Y Tilemap, we dynamically adjust the Z setting on the brush to switch between different heights. The Z as Y Tilemap requires an additional Z value in the custom Transparency Sort Axis to render correctly.

Note: The assets shown here are from the Temple tileset in our Isometric Starter Kit project. Feel free to grab it - completely free - and have some fun creating your own environments!

Think of the Grid as the 'easel' that holds your Tilemap Game Objects, which are essentially the canvases that you paint your tiles onto. To start painting on a Tilemap, you also need a brush and a palette. A Tile Palette holds your tile assets; you pick them with the brush tool and start painting.

To create a Tile Palette, choose Window > 2D > Tile Palette. In the newly opened window, choose "Create New Palette" in the top-left dropdown. Make sure to set the grid type that corresponds to your use case. For this example, I will be using a regular Isometric Tilemap, as well as the assets from our Isometric Starter Kit project. Set the palette cell size to Manual to be able to customize the dimensions of your isometric tiles. In this case, I know that the dimensions of my tiles correspond to a grid of 1 in X and 0.5 in Y; for your use case, it will depend on the resolution, the pixels-per-unit value selected at import, and the dimensions of the assets - essentially, on the isometric angle at which the tiles are rotated.

You might be unsure about the correct import settings and tilemap size for your assets. There is a general rule you can follow based on your initial asset dimensions. First, take a look at the resolution of your tiles. Typically, isometric tiles that are represented as a block are taller than they are wide; 'flat' tiles (ones that appear as a plane rather than a cube) are wider than they are tall. However, the width will always be the same between them. Therefore, if you want your tiles to take up exactly one Unity unit, set the Pixels Per Unit value in the tile import settings equal to their width in pixels.
You may want to adjust this value in some cases - usually by decreasing it (or increasing the actual resolution of your assets); this can be useful if you are trying to produce an effect where some tiles appear to take up more than one grid cell and overlay the neighboring tiles.

To decide on the correct Y grid value for the tiles, take the height of the base (or cap) of a single tile and divide it by the width. This gives you a Y value relative to the X, provided that X is 1. Let's look at some examples. For the pixel art we are using in this project, all tiles have a base height of 32 pixels and are 64 pixels wide; therefore, the grid size we will be using is exactly 0.5 in Y. The second block in the example image comes from an asset pack by Golden Skull Art. The example tile has been scaled down for reference, but the original assets are 128 pixels wide, and the tile base is about 66 pixels tall, giving us a Y grid size of 66/128 - approximately 0.515 units.

Once we have settled on the correct Grid dimensions, let's go ahead and add some tiles to our palette. Simply grab one of your tile sprites and drag it over into the Tile Palette window. This creates a Tile Asset, which contains some information about the tile itself, such as the sprite(s) it uses, a tint color, and the type of collider it will generate. If you want to see the detailed information about a tile on the palette, choose the Select (S) tool at the top of the Tile Palette window and click on that tile. Now, in the Inspector, you should be able to see which Tile Asset it is referencing.

To paint the new Tile onto our Tilemap, select the Brush (B) tool and click the Tile in the Palette. You will now be able to paint with the selected Tile in the Scene view. Other painting tools include the Eraser (D), Box Fill (U), Flood Fill (G), and the Tile Picker (I).

Sometimes, you might also wish to edit the arrangement of the tiles in the palette itself. Just below the toolbar, you will see an Edit button. Clicking it puts you into palette editing mode, during which the tools affect the Tile Palette itself. Don't forget to exit this mode once you've made the desired changes.

In some cases, you might see a situation where tiles of different types are not sorting correctly despite being on the same Tilemap, like in the example below. This is determined by the Mode setting on the Tilemap Renderer component. By default, the Mode is set to Chunk.

Chunk mode is effective at reducing the performance cost of Tilemap. Instead of rendering each tile individually, it batch renders them in one go, as a large block. However, there are two main drawbacks to using it. The first is that it does not support dynamic sorting with other 2D objects in the scene. This means that if your Tilemap is in Chunk mode, it will not be able to dynamically sort behind and in front of other objects, such as characters; only one or the other will be possible at a time, based on the Order in Layer setting. It is still extremely effective when you want to optimize your game, however, and can be used to batch render large areas of the ground.

The second drawback is that, on its own, Chunk mode does not get around the issue of different tiles not sorting with each other. In order to batch render tiles that come from two or more different sprites (i.e. textures), the sprites have to be unified under a single Sprite Atlas asset. To create a Sprite Atlas, choose Assets > Create > Sprite Atlas.
In the Sprite Atlas settings, you will find the list of Objects for Packing. Simply drag all of the tiles that you wish to be batch rendered into this list, and set the correct import settings - usually equivalent to those on your individual sprites. Once you have done that, the tiles will sort correctly; however, they will only be visible this way in Play mode or at runtime.

As such, it is better to set your Tilemap Renderer Mode to Individual while editing. It sorts each tile separately, which means you will see them correctly rendered even outside of Play mode - extremely useful while you are still making changes to your level. Once you have your level structure in place, you can always set the Tilemap Renderer Mode back to Chunk.

Individual Render Mode is also useful when you want to add objects - such as trees, props, and elevated ground - that you wish to sort dynamically with characters, or with each other. For the rest of this blog post, we will stick to using Individual Mode for all of our Tilemaps.

Sometimes, you might want to use more than one Tilemap on the same Grid. In the case of Isometric and Hexagonal Tilemaps, this is useful if you want to add prop objects to the level that also align with the grid, or if you want to add tiles that appear to be higher than the first layer. To attach another Tilemap to the Grid, right-click on the Grid Game Object and create a new Tilemap of the corresponding type. To switch to painting on the new Tilemap, go back to the Tile Palette window and change the Active Tilemap just below the main toolbar.

There are generally two ways to go about adding elevated ground to your levels. The one you will most likely use depends on the type of Tilemap you choose, and we'll go over each of the possible cases. Additionally, we have prepared a short video on the topic, which demonstrates one of these approaches with a regular Isometric Tilemap, as well as adding collision areas to the tiles. Check it out if you want a quick video reference for both of these things.

For normal Isometric Tilemaps, you can simply create a new Tilemap under the same Grid and give it a higher Order in Layer value. You can then change the Tile Anchor setting to make the new layer anchor to a higher point on the grid. My ground-level Tilemap had a Tile Anchor of (0, 0) for X and Y respectively. I want my new layer to paint one unit higher, so I will change the new Tilemap's anchor point to (1, 1). Additionally, I will give it an Order in Layer of 1 - just one unit higher than my base level. I can now change my active Tilemap to the one with the second height level and paint away.

Sometimes it can be useful to simulate different heights using the same Tilemap. For this case, you can use a Z as Y Isometric Tilemap and Grid. With a Z as Y Tilemap, the Z value of each tile has an additional influence on tile rendering order. We can adjust the Z value of tiles while we are painting them, using the Z Position setting on our brush in the bottom part of the Tile Palette (which can also be changed using the '+' and '-' hotkeys). However, in order for our Z value to contribute properly and for the tiles to sort correctly, we need to go back to our Custom Axis Sort value and add a Z influence.
The number we use here is directly connected to the way Unity converts cell positions on an isometric grid to world-space values. For example, a grid with XYZ dimensions of (1, 0.5, 1) - the default for isometric - will have a Z-axis sort value of -0.26. If you are curious how this number is calculated, or you are using a grid with a different cell size, read on to learn how to find the right Z value for your case.

Once you have set the correct Custom Axis Sort value, you can start painting tiles with different Z values. You can also adjust the increments by which the Z value moves the elevated tiles up or down by changing the Grid's Z dimension - set to 1 by default.

There is a general formula you can use to work out the Z value of your axis sort. First, take the Y dimension of your grid (if you haven't worked out your Y dimension yet, take a look at the note on importing assets earlier in this post). Multiply this value by -0.5, then subtract an additional 0.01. Following this formula, a grid with dimensions (1, 0.5, 1) gives us a Z sorting value of 0.5 * -0.5 - 0.01 = -0.26. At this axis sort value, any (1, 0.5, 1) grid will have its tiles sorting correctly. If you want to find out more about where this value and calculation come from, take a look at the documentation here. It explains in great depth how 2D renderers work, and what method is used when converting isometric cells into world-space values.

Now that we have some tiles placed higher than others, we can control the areas the player can go to, and transition between them, using collision. There are many approaches to adding collision, but in our case, we want the player to ascend and descend the level using a ramp, so it is not obvious which objects should or shouldn't have colliders on them. Instead, we can define collision by hand using an additional Tilemap.

In this project, we have created some sprites that correspond to the different shapes we will use to define our collision areas. We can paint these shapes onto our third Tilemap in the areas that we do not want the player to pass over. For example, we want the player to be able to ascend to the cliff only by using the ramp, rather than walking onto it directly. We can also add a custom Material to our Tilemap Renderer component in order to tint these tiles a color that is distinct from the rest of our level.

Once we have placed our collision tiles, we can add a Tilemap Collider component to the collision Tilemap. This auto-generates colliders for each individual tile based on the shape of its sprite. For better performance, we can also add a Composite Collider 2D component and make sure to tick Used by Composite on our Tilemap Collider. This unifies all of our individual colliders into one big shape.

Adding props to the level is quite simple. You can either manually place the prop sprites at any desired point in the scene, or you can attach the props to the Tilemap Grid by making them into individual tiles. You can decide which approach works best for your case. In this project, we've manually placed some trees around the level. The trees and the character have the same Order in Layer, allowing our character to sort behind and in front of them dynamically. We can define the point at which the player can pass the tree by using a collider.
There are several ways we can do this. The first, as demonstrated in the videos, is to attach a child collider to the object and change its shape as needed. The other method is to define a Custom Physics Shape for the object within the Sprite Editor.

To open the Sprite Editor, select the object's sprite and find the Sprite Editor button in the Inspector. In the top-left dropdown, switch to the Custom Physics Shape editor. Here, you can create a polygonal shape to define the bounds of your custom collider. Once you have defined a physics shape, you can attach a Polygon Collider component to your object, and it will correspond to that shape.

If you are using your props as tiles on a Tilemap, you could also use a Grid collider. Select the Tile Asset that corresponds to a prop tile (if you need a refresher on where to find it, take a look at the Basic Tilemap Workflow section). You will see a dropdown setting for the Collider Type. By default, it is set to Sprite, meaning the auto-generated collider will use the Physics Shape we talked about earlier. If you set it to Grid, however, it will always exactly match the shape of the grid cell that the prop is attached to. It may not be the most accurate way of implementing colliders, but it can be useful for certain types of games. To use grid colliders for these tiles, select the Tilemap with your props and add a Tilemap Collider component.

Rule Tiles are extremely useful when it comes to automating the tile painting workflow. A Rule Tile acts as a normal tile, with an additional list of tiling parameters. Using these parameters - or rules - the tile can automatically choose which sprite should be painted based on its neighboring tiles. Rule Tiles are useful when you want to avoid hand-picking differently rotated tiles - for example, when creating a cliff or platform. They can also be used to randomize between different variations of the same tile to avoid obvious patterns, and even to create animated tiles.

Isometric and Hexagonal Rule Tiles are available from Unity's 2D Extras repository on GitHub, which also contains many other handy assets for the Tilemap feature that you may want to explore. We have also included pre-configured Rule Tiles for each of the different tilesets in our Isometric Starter Kit project. Here are some examples of the tiles included in the project for you to experiment with.

Now that you've learned the ins and outs of Isometric Tilemaps in Unity, download the Isometric Starter Kit project here and try it out yourself! It's also possible to interact with Tilemaps via script if you are a programmer, so that might be something you want to try as well; a small example follows below. For instance, you can find out how to implement a simple character controller that works with an Isometric Tilemap by taking a look at this video.

The artwork in this project was created for Unity by @castpixel, and you can see more of her work here. If you are looking for additional 2D assets to experiment with using Tilemaps, you can check out the Unity Asset Store as well.

---

Learn best practices for using Tilemap with beginner and advanced content on the Unity Learn Premium platform.
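As a starting point for the scripting mentioned above, here is a minimal sketch using the public Tilemap API (Tilemap.SetTile with Vector3Int cell coordinates). The tilemap and tile references are placeholders you would assign in the Inspector.

    using UnityEngine;
    using UnityEngine.Tilemaps;

    // Minimal sketch: paint a small patch of ground onto a Tilemap
    // from code instead of the Tile Palette.
    public class ScriptedPainter : MonoBehaviour
    {
        public Tilemap tilemap;     // assign your ground Tilemap in the Inspector
        public TileBase groundTile; // any Tile or Rule Tile asset

        void Start()
        {
            for (int x = 0; x < 4; x++)
            {
                for (int y = 0; y < 4; y++)
                {
                    // Cell coordinates are grid positions, not world positions.
                    tilemap.SetTile(new Vector3Int(x, y, 0), groundTile);
                }
            }
        }
    }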

>access_file_
1473|blog.unity.com

2D Pixel Perfect: How to set up your Unity project for retro 8-bit games

Retro games with simple mechanics and pixelated graphics can evoke fond memories for veteran gamers, while also being approachable to younger audiences. Nowadays, many games are labeled as "retro", but it takes effort and planning to create a title that truly has that nostalgic look and feel. That's why we've invited the folks from Mega Cat Studios to help us talk about the topic. In this blog post, we'll be covering everything you need to create authentic art for NES-style games, including important Unity settings, graphics structures, and color palettes. Get our sample project and follow along!

Mega Cat Studios, out of Pittsburgh, Pennsylvania, has turned the creation of highly accurate retro games into an art form. So much so, in fact, that several of their titles can also be acquired in cartridge form and played on retro consoles like the Sega Genesis.

Recent additions to Unity's workflows have made it a well-suited environment for creating retro games. The 2D Tilemap system has been made even better and now supports grid, hex, and isometric tilemaps! Additionally, you can use the new Pixel Perfect Camera component to achieve consistent pixel-based motion and visuals. You can even go so far as to use the Post Processing Stack to add all sorts of cool retro screen effects. Before any of this work can be done, however, your assets need to be imported and set up correctly.

Our assets first need a correct configuration to be crisp and clear. For each asset you're using, select the asset in the Project view, and then change the following settings in the Inspector:

- Filter Mode changed to 'Point'
- Compression changed to 'None'

Other filter modes result in a slightly blurred image, which ruins the crisp pixel-art style we're looking for. If compression is used, the image data will be compressed, which results in some loss of accuracy relative to the original. This is important to note, as it can cause some pixels to change color, possibly resulting in a change to the overall color palette itself. The fewer colors and the smaller your sprite, the greater the visual difference compression causes. Here's a comparison between normal compression (the default) and no compression.

Another thing to be aware of is the Max Size setting for the image in the Inspector. If your sprite image has a size on any axis greater than the Max Size property (2048 by default), it will be automatically resized to the max size. This usually results in some loss of quality and causes the image to become blurry. Since some hardware will not properly support textures over 2048 on either axis, it is a good idea to try to stay within that limit. Above is a sprite from a sprite sheet that was 2208 pixels on one axis, with Max Size set at 2048. As you can see, increasing the Max Size property to 4096 allows the image to be sized appropriately and avoids the loss of quality.

Finally, when preparing your sprite or sprite sheet, make sure you set the pivot unit mode to 'Pixels' instead of 'Normalized'. This way, the sprite's pivot point is based on pixels rather than a smooth range from 0 to 1 across each axis of the image; if a sprite does not pivot from an exact pixel, we lose pixel-perfectness. Pivots can be set for sprites in the Sprite Editor, which can be opened from the Inspector when you have a sprite asset selected. These import settings can also be applied automatically, as sketched below.
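If you have many assets, applying these settings by hand gets tedious. Below is a minimal sketch of automating them with Unity's AssetPostprocessor (this is an Editor script and must live in an Editor folder); the "Assets/PixelArt" folder path is a hypothetical filter you would adapt to your own project layout.

    using UnityEngine;
    using UnityEditor;

    // Minimal sketch (Editor script): enforce the import settings described
    // above for every texture under a hypothetical "Assets/PixelArt" folder.
    public class PixelArtImporter : AssetPostprocessor
    {
        void OnPreprocessTexture()
        {
            if (!assetPath.StartsWith("Assets/PixelArt"))
                return;

            var importer = (TextureImporter)assetImporter;
            importer.filterMode = FilterMode.Point;
            importer.textureCompression = TextureImporterCompression.Uncompressed;
            importer.maxTextureSize = 4096; // avoid silent downscaling of large sheets
        }
    }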
With assets prepared, we can set our camera up to be "pixel-perfect". A pixel-perfect result will look clean and crisp. Telltale signs of pixel art that isn't displayed pixel-perfect include blurriness (aliasing) and some pixels appearing rectangular when they should be square.

The 2D Pixel Perfect package can be imported through the Package Manager in Unity. Click the 'Window' menu in the toolbar, followed by 'Package Manager'. In the new window, click 'Advanced' and make sure you have enabled 'Show preview packages'. Select 2D Pixel Perfect from the list on the left, and click install at the top right of the window. That's it - now you are ready to begin using the Pixel Perfect Camera component.

The Pixel Perfect Camera component is added to and augments Unity's Camera component. To add it, go to your main camera and add the Pixel Perfect Camera component to it. If the Pixel Perfect Camera component option is not there, follow the instructions above to first import it into the project.

Now let's look at the settings we have available. First, I recommend checking 'Run In Edit Mode' and setting the display aspect ratio in the Game view to 'Free Aspect' so you can resize the Game view freely. The component will display helpful messages in the Game view explaining if the display is not pixel-perfect at any given resolution. Now you can go through each setting to see what it does and how it affects the look of your game!

Assets Pixels Per Unit - This field refers to the setting you can select in the Inspector for each asset. As a general rule of thumb, each asset that will be used in the game's world space should use the same pixels per unit (PPU), and you'd put that value here as well. If your game world exists as a grid of tiles and sprites, with each being 16 pixels by 16 pixels, a PPU of 16 would make sense: each tile of the grid would be 1 unit in world-space coordinates. Make sure you put your chosen PPU here.

Reference Resolution - Set this to the resolution that you intend all of your assets to be viewed at. If you want a retro look, this usually means a very small resolution. For example, the native resolution of the Sega Genesis is 320x224, so when porting a game from the Sega Genesis, we would use a reference resolution of 320x224. For general 16:9 usage, 320x180 should work well, as should 398x224 if you want to keep the vertical resolution instead.

Upscale Render Texture - This causes the scene to be rendered as close to the reference resolution as possible and then upscaled to fit the actual display size. Because this setting results in a filled screen, we recommend it if you want a full-screen pixel-perfect experience with no margins. 'Upscale Render Texture' will also significantly affect how sprites look when rotated.

Pixel Snapping (only available with Upscale Render Texture disabled) - With this enabled, sprite renderers are automatically snapped to a world-space grid, where the grid's size is based on your chosen PPU. Note that this does not actually affect any object's transform position. As a result, you can still smoothly interpolate objects between positions, but the visual movement will remain pixel-perfect and snappy.

Crop Frame (X and Y) - This crops the viewed region of world space to exactly match the reference resolution, and adds black margins to the display to fill the gaps at the edges of the screen.

Stretch Fill - Becomes available if you enable both X and Y for Crop Frame. This causes the camera to scale the game view to fit the screen in a way that preserves the aspect ratio.
Because this scaling won’t happen only in whole number multiples of the reference resolution, it will cause pixel-perfectness to be lost at any resolution which is not a whole number multiple of the reference resolution. The advantage here is that even though you lose pixel-perfectness for many resolutions, you won’t have the black bar margins and will instead have a fully filled screen. Note that although blurring often occurs from stretch fill, the usual alert display message does not show up.If you want a pixel-perfect and snappy display that will work for a variety of use-case, I recommend:Use a reference resolution that will never be bigger than a player’s window resolution (such as 320x180).Enable or Disable Upscale Render TextureEnable it if you will use rotations outside of 90, 180, and 270 and if you prefer the visual effect it has on rotated sprites.Upscaled render texture can result in a non-pixel-perfect image at some resolutions, depending on your reference resolution. Experiment with this and different screen resolutions using ‘Run in Edit Mode’ enabled on the Pixel. Perfect Camera component to determine whether this is an issue for your resolution. If you can get this to produce a pixel-perfect image at all target resolutions, this will result in the best full-screen pixel-perfect experience.Enable or Disable Pixel Snapping as you preferThis is more personal preference than anything. Without snapping, you have much smoother movement, but pixels can be out of alignment.Enable Crop Frame X and/or Y if not using Upscale Render TextureIf you can’t consistently get a pixel-perfect result with upscale render texture, cropping X and/or Y will ensure a pixel-perfect image for any resolution greater than the reference resolution, but creates big margins at the edges of the screen for some resolutions.Disable Stretch FillWe recommend setting the camera to be optimized for 16:9 aspect ratio viewing, including reference resolution if possible. At the time of writing, most gamers play on 16:9 monitors, and in 1920x1080 resolution. For example, 320x180 reference resolution is 16:9, and so it will have no black bar margins when played at 1920x1080 or any resolution which is an even multiple of 320x180, such as 1280x720.In Unity’s toolbar, you can go under Edit > Project Settings > Player and limit the aspect ratios that the game will support. If you find a particular configuration works just as you want in the ratio you’re targeting but looks bad in some particular aspect ratios, you can prevent the window from being at those ratios here. However, keep in mind that not all users will have a display setup that will work well with your limitations, so this is not recommended. Instead, enable cropping so these users will have margins, rather than having to play in a resolution which doesn’t fit their screen.Now that we’ve covered how to set Unity up for pixel-perfect art, let’s look at the basics of creating artwork for games that follow the restrictions of the classic Nintendo Entertainment System. This console generation places a large number of restrictions on the artists trying to create an authentic image. These restrictions include things like palettes used and the size and amount of objects on a screen. Additionally, it is import to keep in mind is the reference resolution of 256x240 when “targeting” this console.When creating artwork that is genuine to the NES, there are a host of restrictions that the artist will have to follow. 
Now that we’ve covered how to set Unity up for pixel-perfect art, let’s look at the basics of creating artwork for games that follows the restrictions of the classic Nintendo Entertainment System (NES). This console generation places a large number of restrictions on artists trying to create an authentic image, covering things like the palettes used and the size and number of objects on screen. Additionally, it is important to keep in mind the reference resolution of 256x240 when targeting this console.

When creating artwork that is genuine to the NES, there is a host of restrictions the artist has to follow. Some of these are consistent no matter what retro console an artist is attempting to emulate, while many others are specific to the NES itself. The first, and possibly the most important, of these restrictions involves the way color palettes are used in an image. The NES is fairly unique when it comes to color because its full color palette is hardcoded into the console. The NES chooses which colors to use in an image by sending a series of values to its graphics processor, which returns the colors associated with those values. (The original post shows an image of the NES’ full color palette here.) These colors cannot be changed because they are part of the console itself. Every game you have ever seen for this console uses combinations of these colors to make its images.

To create the combinations used in a game, sub-palettes are created and assigned to either the in-game sprites or background elements. The NES breaks its palette up into sub-palettes that can be assigned to sprites and backgrounds. Each sub-palette includes one common color that is shared across all of the sub-palettes, plus three unique colors. The console can load four sub-palettes for backgrounds and four sub-palettes for sprites. In the case of sprites, the common color at the beginning of each sub-palette is treated as transparency.

(The original post shows an example here of a series of sub-palettes used in a game.) The top row represents the background sub-palettes and the bottom row represents the sprite sub-palettes. In this example, black is the common color shared across all of the sub-palettes. Because the common color is treated as transparency on sprites, a second black entry needs to be added to the sprite sub-palettes in order to use black as a visible color.

The restrictions on palette use get even tighter as the artist moves on to how the palettes are used in the game. To explain this, we need to discuss further how retro consoles store, use, and display art. The artwork in any retro console is stored in the game as 8x8 px tiles. This tile-based approach lets artists save space by reusing tiles for different things (for example, pieces of a sidewalk can be repurposed to make the ledge on a building). The other important thing to note about tile-based storage is that color information is generally not saved with the graphics. All of the tiles are saved with a monochromatic palette. This way, whenever a tile is displayed in the game, a sub-palette can be assigned to it, allowing the same tile to be displayed on screen simultaneously with different sub-palettes. This is significant when creating artwork that is true to a retro console on a modern platform because it affects how you assign palettes to the artwork.

The NES assigns palettes to sprites and backgrounds differently. It assigns sub-palettes for sprites on a tile-by-tile basis: every 8x8 tile in a sprite can have one of the four sprite sub-palettes assigned to it. Backgrounds, on the other hand, are much more restrictive. Backgrounds assign their palettes in 16x16 chunks. The sub-palette assignments for an entire screen’s worth of background are referred to as Attribute Tables. These Attribute Tables are the reason why most retro artwork involves heavy use of repeating tiled segments. Those segments tend to be composed of 16x16 tiles so that they fit neatly into an Attribute Table. Despite being a response to a hardware restriction, this 16x16 tile-based approach to backgrounds ended up being a defining characteristic of retro artwork, and it is absolutely necessary when trying to recreate the style.
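Circling back to the sub-palette rules above, here is a hedged sketch (our own illustrative types, not an emulator-accurate model) of the structure described: four background and four sprite sub-palettes, each holding one shared common color plus three unique colors, with sprites treating palette index 0 as transparency.

using UnityEngine;

// Our own illustrative model of an NES-style sub-palette.
public struct SubPalette
{
    public Color32 Common;       // shared across all eight sub-palettes
    public Color32 C1, C2, C3;   // the three unique colors

    // Background tiles draw index 0 as the common color...
    public Color32 GetBackgroundColor(int index)
    {
        switch (index)
        {
            case 0: return Common;
            case 1: return C1;
            case 2: return C2;
            default: return C3;
        }
    }

    // ...while sprite tiles treat index 0 as fully transparent.
    public Color32 GetSpriteColor(int index)
    {
        return index == 0 ? new Color32(0, 0, 0, 0) : GetBackgroundColor(index);
    }
}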
Even though artists are free to use different sub-palettes for each 8x8 tile of a sprite, they might find themselves wanting greater color depth in a sprite than what is available. This is where sprite layering comes in. Sprite layering is simply splitting a sprite into two separate sprites and placing them on top of each other, which circumvents the one-sub-palette-per-8x8-tile restriction and essentially doubles the number of colors that can be used in a single 8x8 area. The only major drawback is sprite rendering limits. The NES can only display 64 8x8 sprite tiles on screen at once, and only 8 sprite tiles on the same horizontal line. Once those numbers are reached, any further sprite tiles are not rendered. This is why many NES games would flicker sprites when there were a lot of them on screen at once: that way, certain sprites are only displayed on alternating frames. Artists need to be mindful of these limits when layering sprites on top of each other, because while layering doubles the number of colors, it also doubles the number of sprite tiles on the same horizontal line.

Sprite layering can also be done over the background to get around the Attribute Table limits. This trick is generally used for static images, like story screens and character portraits, to give them much greater color depth. To do this, the artist draws part of the image as the background and then layers sprites on top of it to fill in the rest.

To explain the next major restriction of the NES, we first need to circle back to the fact that graphics are stored in tiles. Graphics tiles are stored in 256-tile pages, and tiles from these pages cannot be loaded into VRAM in different locations, so it is difficult to mix and match tiles from different pages on the fly. The NES’ VRAM is only capable of holding 512 of these tiles at once, and beyond that restriction, it splits the tiles in half between sprites and background: it can only display 256 sprite tiles and 256 background tiles at any given moment. This can become very restrictive if the artist wants to display a large variety of sprites and background elements.

To combat this limitation, the NES has a feature that allows each page to be broken up into partial pages called banks. So while the NES isn’t capable of loading individual tiles from various points in the graphics data, it can load different sections of a page at different times. For most games, these banks are either 1K or 2K banks. A 1K bank equals one-fourth of a page, or 64 tiles, while a 2K bank is half of a page, or 128 tiles. The artist must decide whether to reserve each type of bank for either sprites or background elements, because both types of banks need to be utilized: you cannot have 1K banks for both the sprites and the backgrounds. One page needs to use 1K banks and the other needs to use 2K banks. Generally speaking, most games tend to use 1K banks for the sprites and 2K banks for the backgrounds, because background tilesets tend to be more static and need less on-the-fly variety. (The arithmetic behind these bank sizes is sketched below.)
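As promised, the arithmetic behind those bank sizes is easy to check, assuming the NES’ 2-bits-per-pixel tile format (a sketch; the constant names are ours):

static class NesBankMath
{
    public const int TileBytes = 8 * 8 * 2 / 8;         // an 8x8 tile at 2 bits per pixel = 16 bytes
    public const int TilesPerPage = 256;                // a full page = 256 tiles = 4096 bytes (4K)
    public const int TilesPer1KBank = 1024 / TileBytes; // 64 tiles: one-fourth of a page
    public const int TilesPer2KBank = 2048 / TileBytes; // 128 tiles: half of a page
}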
The usefulness of 1K banks for sprites is significant. If the player sprite has a large range of animations that will not fit in a single page along with all of the other sprites that need to be loaded, individual actions can be saved in 1K banks and then swapped in depending on what action is happening on screen. It also allows for a larger variety of sprites in a single area of a game. For instance, if the player is to encounter six different kinds of enemies in an area, but the sprite page only allows for the player and three other types of sprites, then when one enemy type is cleared off the screen, the game can swap one of the enemy banks out for a new enemy type.

One of the only major drawbacks of using 1K banks for sprites and 2K banks for backgrounds is how the NES handles background animation. To animate a background element in an NES game, the artist has to create duplicate banks of the animated background elements, with each duplicate bank containing the next frame of animation for each animated element. These banks are then swapped in and out one at a time, like a flip-book, to create the animation. If the artist is using half-page banks for the backgrounds, storing all of those duplicate banks can take up a lot of space. One way to circumvent this is to put all of the animated background elements for the entire game into a single bank, but that leaves the artist with only 128 tiles left over for the static elements of each background. It is up to the artist to decide the best course of action when choosing what kinds of banks to use for the art.

Many games from that era employ tricks to create effects like parallax scrolling in the background, but these too present artists and designers with a challenge. While the later 16-bit consoles allowed multiple background layers, this is not an option on the NES: all backgrounds are a single flattened image. To create a sense of depth and layering, different programming tricks were used. To create a parallax background, for instance, the developer can set a register that tells when a certain horizontal line (known as a raster line) is being rendered on the screen, and then use that register to control the speed and direction the screen is scrolling. With this, they can create a horizontal band of the background that scrolls at a different speed than the rest of the background. The trick for artists and designers at this point is to be mindful that the background is still one flat image. If a platform or any other element that is supposed to be “in front” of that slower-moving background is placed in that region, it too will scroll slower than the rest of the image. That means designers need to be mindful of where they place background elements in the scene, and artists need to create the background in a way that makes the effect seamless. (A small Unity sketch of approximating this effect appears at the end of this section.)

There is also another trick for artists who want one of their background elements to appear in the foreground. On the NES, developers are able to set a sprite’s priority to be less than zero. When this is done, the sprite is displayed behind any non-transparent background pixels. Sprite priorities can be modified and triggered on the fly as well, allowing certain elements to change a sprite’s priority as needed.
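As promised above: when recreating the parallax effect in Unity, you don’t have a raster-line register, but you can approximate the result by moving a horizontal strip of the background at a fraction of the camera’s speed. A hedged sketch (class and field names are ours):

using UnityEngine;

public class ParallaxStrip : MonoBehaviour
{
    public Transform cam;                             // the main camera's transform
    [Range(0f, 1f)] public float scrollFactor = 0.5f; // 1 = scrolls with the world, 0 = locked to the camera

    Vector3 _lastCamPos;

    void Start() { _lastCamPos = cam.position; }

    void LateUpdate()
    {
        // Move the strip by part of the camera's horizontal movement, so relative
        // to the camera it appears to scroll at scrollFactor times normal speed.
        float dx = cam.position.x - _lastCamPos.x;
        transform.position += new Vector3(dx * (1f - scrollFactor), 0f, 0f);
        _lastCamPos = cam.position;
    }
}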
When someone is trying to create a project that is authentic to a retro console, there are many technical considerations to keep in mind that modern development doesn’t have to worry about. Because of the way older machines rendered images, and because there was so little room to maneuver with the CPU and GPU, designers had to think creatively to work around the hardware’s limitations. Today, it is important to learn about those limitations, and the techniques used to work around them, in order to truly recreate the look and design of games from that era. In the next post, we will look at the design limitations imposed by the 16-bit era as well as the Unity work needed to get that truly “old TV” feel. The 2D Pixel Perfect guide for 16-bit retro visuals is now available here.

---

First time designing levels with Tilemap? Explore worldbuilding in 2D in this beginner tutorial on Unity Learn.

>access_file_
1475|blog.unity.com

On DOTS: Entity Component System

This is one of several posts about our new Data-Oriented Tech Stack (DOTS), sharing some insights into how and why we got to where we are today, and where we’re going next. In my last post, I talked about HPC# and Burst as low-level foundational technologies for Unity going forward. I like to refer to this level of our stack as the “game engine engine”. Anyone can use this stack to write a game engine. We can. We will. You can too. Don’t like ours? Write your own, or modify ours to your liking.

The next layer we’re building on top is a new component system. Unity has always been centered around the concept of components. You add a Rigidbody component to a GameObject and it will start falling. You add a Light component to a GameObject and it will start emitting light. Add an AudioEmitter component and the GameObject will start producing sound. It’s a very natural concept for programmers and non-programmers alike, and easy to build intuitive UIs for. I’m actually quite amazed at how well this concept has aged. So well that we want to keep it.

What hasn’t aged well is how we implemented our component system. It was written with an object-oriented mindset. Components and GameObjects are “heavy C++” objects. Creating/destroying them requires a mutex lock to modify the global list of id->objectpointers. All GameObjects have a name. Each one gets a C# wrapper object that points to the C++ one. That C# object could be anywhere in memory. The C++ object can also be anywhere in memory. Cache misses galore. We try to mitigate the symptoms as best we can, but there’s only so much you can do.

With a data-oriented mindset, we can do much better. We can keep the same nice properties from a user point of view (add a Rigidbody component, and the thing will fall), but also get amazing performance and parallelism with our new component system. This new component system is our Entity Component System (ECS). Very roughly speaking, what you do with a GameObject today you do with an Entity in the new system. Components are still called components. So what’s different? The data layout.

Let’s look at some common data access patterns. A typical component that you would write in Unity in the traditional way might look like this:

class Orbit : MonoBehaviour
{
    public Transform _objectToOrbitAround;

    void Update()
    {
        // please ignore that this math is all broken, that's not the point here :)
        var currentPos = GetComponent<Transform>().position;
        var targetPos = _objectToOrbitAround.position;
        GetComponent<Rigidbody>().velocity += SomehowSteerTowards(currentPos, targetPos);
    }
}

This pattern comes back over and over: a component has to find one or more other components on the same GameObject and read/write some values on it. There are a lot of things wrong with this:

- The Update() method gets called for a single Orbit component. The next Update() call might be for a completely different component, likely causing this code to be evicted from the cache the next time it has to run this frame for another Orbit component.
- Update() has to use GetComponent() to go and find its Rigidbody. (It could be cached instead, but then you have to be careful about the Rigidbody component not being destroyed.)
- The other components we’re operating on are in completely different places in memory.

The data layout ECS uses recognizes that this is a very common pattern and optimizes memory layout to make operations like this fast. ECS groups all entities that have the exact same set of components together in memory. It calls such a set an archetype. An example of an archetype is: “Position & Velocity & Rigidbody & Collider”. ECS allocates memory in chunks of 16k, and each chunk will only contain component data for entities of a single archetype.
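For comparison, here is a hedged sketch of roughly what the Orbit behaviour could look like in ECS. The interface and component names below are from the Entities preview packages of the time (and have changed since), and OrbitTarget/Velocity are illustrative types of our own, not shipped components:

using Unity.Burst;
using Unity.Collections;
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;

public struct OrbitTarget : IComponentData { public float3 Position; }
public struct Velocity : IComponentData { public float3 Value; }

// Runs over every entity whose archetype contains Translation, OrbitTarget and Velocity.
[BurstCompile]
public struct OrbitJob : IJobProcessComponentData<Translation, OrbitTarget, Velocity>
{
    public void Execute([ReadOnly] ref Translation pos,
                        [ReadOnly] ref OrbitTarget target,
                        ref Velocity vel)
    {
        // As in the MonoBehaviour version, the math is beside the point.
        vel.Value += math.normalizesafe(target.Position - pos.Value) * 0.1f;
    }
}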
Instead of having each Orbit instance’s Update() method search for the other components it operates on at runtime, in ECS land you statically declare “I want to run some operations on all entities that have a Velocity, a Rigidbody, and an Orbit component.” To find all those entities, we simply find all archetypes that match that “component search query”. Each archetype has a list of chunks where entities of that archetype are stored. We loop over all those chunks, and inside each chunk we do a linear loop over tightly packed memory to read and write the component data. This linear loop that runs the same code on each entity also makes for a likely vectorization opportunity for Burst. In many cases, this process can be trivially split up into several jobs, making code that operates on ECS components run at nearly 100% core utilization. ECS does all this work for you; you just need to supply the code that you want to run on each entity. (You can do the chunk iteration manually if you want to, though.)

When you add/remove a component from an Entity, it switches archetype. We move it from its current chunk to a chunk of the new archetype, and swap the last entity of the previous chunk back to “fill the hole”.

In ECS, you also statically declare what you intend to do with the component data: ReadOnly or ReadWrite. By promising (the promise is verified) to only read from the Position component, ECS can schedule its jobs more efficiently, because other jobs that also want to read from the Position component won’t have to wait.

This data layout also lets us deal with a long-standing frustration of ours: load times and serialization performance. Loading/streaming ECS data for a big scene isn’t much more than loading raw bytes from disk and using them as is. This is the reason the Megacity demo loads in a few seconds on a phone.

While entities can do what game objects do today, they can do more because they are so lightweight. In fact, what really is an Entity? In an earlier draft of this post I wrote “we store entities in chunks”, and later changed it to “we store component data for entities in chunks”. It’s an important distinction to make: an Entity is just a 32-bit integer. There is nothing to store or allocate for it other than the data of its components. Because they’re so cheap, you can use them for scenarios that game objects weren’t suitable for, like using an entity for each individual particle in a particle system.

The next layer we need to build is very big. It’s the “game engine” layer, composed of features like the renderer, physics, networking, input, and animation. This is roughly where we are today. We have started to work on these pieces, but they won’t be ready overnight. That might sound like a bummer. In a way it is, but in another way, it’s not. Because ECS and everything built on top of it are written in C#, it can run inside of traditional Unity, and because it runs inside of Unity, you can write ECS components that use pre-ECS functionality. There is no pure ECS mesh drawing system right now, but you can write an ECS MeshRenderSystem that uses the pre-ECS Graphics.DrawMeshInstancedIndirect API as an implementation while you wait for a pure ECS version to ship. This is exactly the technique that our Megacity demo uses: loading, streaming, culling, LODding, and animation are done with pure ECS systems, but the final drawing is not.
So you can mix & match. What’s great about that is that you can already reap the benefits of Burst codegen and ECS performance for your game code, instead of having to wait for us to ship pure ECS versions of all subsystems. What’s not great is that in this transition phase you can see and feel the friction of “using two different worlds that are glued together”.

We will ship all the source code to our ECS HPC# subsystems in packages. You can inspect, debug, and modify each subsystem, as well as have more fine-grained control over when you want to upgrade which subsystem. You could, for example, upgrade the Physics subsystem package without upgrading anything else.

Game objects aren’t going anywhere. People have successfully shipped amazing games on them for over a decade, and that foundation isn’t going anywhere. What will change is that, over time, you will see our energy tilt from going exclusively into the game object world towards the ECS world.

A common, and very valid, point people bring up when looking at ECS is that there’s a lot of typing: a lot of boilerplate code standing between you and what you’re trying to achieve. There are a lot of improvements on the horizon that aim to remove the need for most boilerplate and make it simpler to express your intent. We haven’t implemented many of them yet, as we’ve been focusing on the foundational performance, but we believe there is no good reason for ECS game code to have much boilerplate, or to be particularly more work to write than a MonoBehaviour. Project Tiny has already implemented some of these improvements (like a lambda-based iteration API). Speaking of which…

Project Tiny will ship on top of the same C# ECS this blog post has been talking about. Project Tiny will be a big ECS milestone for us in several ways:

- It will be able to run in a complete ECS-only environment: a new player with no baggage from the past.
- That means it’s also pure ECS and has to ship with all the ECS subsystems a real (tiny) game needs.
- We’ll adopt Project Tiny’s Editor support for Entity editing for all ECS scenarios, not just Tiny.

We have job openings for all the different parts of the DOTS stack, particularly in Burbank and Copenhagen; check out careers.unity.com. Also, make sure to join us on the Unity Entity Component System and C# Job System forum to give feedback and get information on experimental and preview features.

>access_file_
1478|blog.unity.com

SRP Batcher: Speed up your rendering

In 2018, we introduced a highly customizable rendering technology we call the Scriptable Render Pipeline (SRP). Part of this is a new low-level engine rendering loop called the SRP Batcher that can speed up your CPU rendering time by 1.2x to 4x, depending on the Scene. Let’s see how to use this feature at its best!

This video shows the worst-case scenario for Unity: each object is dynamic and uses a different Material (color, texture). The Scene shows many similar meshes, but it would run the same with one different mesh per object (so GPU instancing can’t be used). The speedup is about 4x on PlayStation 4 (the video is PC, DX11). NOTE: when we talk about a 4x speedup, we’re talking about the CPU rendering code (the “RenderLoop.Draw” and “ShadowLoop.Draw” profiler markers). We’re not talking about global framerate (FPS).

The Unity editor has a really flexible rendering engine: you can modify any Material property at any time during a frame. Plus, Unity historically had to support Graphics APIs without constant buffers, such as DirectX 9. However, such nice features have some drawbacks. For example, there is a lot of work to do when a DrawCall uses a new Material. So basically, the more Materials you have in a Scene, the more CPU is required to set up GPU data. During the inner render loop, when a new Material is detected, the CPU collects all properties and sets up different constant buffers in GPU memory. The number of GPU buffers depends on how the Shader declares its CBUFFERs.

When we made the SRP technology, we had to rewrite some low-level engine parts, and we saw a great opportunity to natively integrate some new paradigms, such as GPU data persistence. We aimed to speed up the general case where a Scene uses a lot of different Materials but very few Shader variants. Now, low-level render loops can make material data persistent in GPU memory: if the Material content does not change, there is no need to set up and upload the buffer to the GPU. Plus, we use a dedicated code path to quickly update built-in engine properties in a large GPU buffer. (The original post includes a flow chart of the new scheme here.) The CPU now only handles the built-in engine properties, like the object’s matrix transform. All Materials have persistent CBUFFERs located in GPU memory, ready to use. To sum up, the speedup comes from two things: each Material’s content is now persistent in GPU memory, and dedicated code manages a large “per object” GPU CBUFFER.

Your project must be using either the Lightweight Render Pipeline (LWRP), the High Definition Render Pipeline (HDRP), or your own custom SRP. To activate the SRP Batcher in HDRP or LWRP, just use the checkbox in the SRP Asset Inspector. If you want to enable/disable the SRP Batcher at runtime, to benchmark the performance benefits, you can toggle this global variable from C# code:

GraphicsSettings.useScriptableRenderPipelineBatching = true;

For an object to be rendered through the SRP Batcher code path, there are two requirements:
1. The object must be a regular mesh. It cannot be a particle or a skinned mesh.
2. You must use a Shader that is compatible with the SRP Batcher. All Lit and Unlit Shaders in HDRP and LWRP fit this requirement.

For a Shader to be compatible with the SRP Batcher, all built-in engine properties must be declared in a single CBUFFER named “UnityPerDraw”.
For example, unity_ObjectToWorld or unity_SHAr. All Material properties must be declared in a single CBUFFER named “UnityPerMaterial”. You can see the compatibility status of a Shader in the Inspector panel; this compatibility section is only displayed if your Project is SRP-based. In any given Scene, some objects are SRP Batcher compatible and some are not, but the Scene is still rendered properly: compatible objects use the SRP Batcher code path, and the others use the standard SRP code path.

If you want to measure the speed increase of the SRP Batcher in your specific Scene, you can use the SRPBatcherProfiler.cs C# script. Just add the script to your Scene. When it is running, you can toggle the overlay display with the F8 key, and you can turn the SRP Batcher ON and OFF during play with the F9 key. If you enable the overlay in PLAY mode (F8), you should see a lot of useful information. All time is measured in milliseconds (ms); these timings show the CPU time spent in Unity’s SRP rendering loops. NOTE: timing means the cumulative time of all “RenderLoop.Draw” and “Shadows.Draw” markers called during a frame, regardless of the owning thread. When you see “1.31ms SRP Batcher code path”, maybe 0.31ms is spent on the main thread and 1ms is spread over all of the graphics jobs.

(The original post includes a table here describing each setting in the Overlay visible in PLAY mode, from top to bottom.) NOTE: we hesitated to add FPS at the bottom of the overlay because you should be very careful with FPS metrics when optimizing. First, FPS is not linear, so seeing FPS increase by 20% doesn’t immediately tell you how much you optimized your scene. Second, FPS is global over the frame: FPS (or global frame timing) depends on many things other than rendering, like C# gameplay, physics, and culling. You can get SRPBatcherProfiler.cs from an SRP Batcher project template on GitHub.

Here are some Unity scene shots with the SRP Batcher OFF and ON, to show the speedup in various situations:

Book of the Dead, HDRP, PlayStation 4: x1.47 speedup. Note that FPS doesn’t change, because this scene is GPU bound; you get 12ms back to do other things on the CPU side. The speedup is almost the same on PC.

FPS Sample, HDRP, PC DirectX 11: x1.23 speedup. Note there is still 1.67ms going to the standard code path because of SRP Batcher incompatibility; in this case, skinned meshes and a few particles rendered using Material Property Blocks.

Boat Attack, LWRP, PlayStation 4: x2.13 speedup.

The SRP Batcher works on almost all platforms. (The original post includes a table showing each platform and the minimum Unity version required.) Unity 2019.2 is currently in open alpha. The SRP Batcher fast code path is supported in VR only with “Single Pass Instanced” mode; enabling VR won’t add any CPU time (thanks to Single Pass Instanced mode).

How do I know I’m using the SRP Batcher the best way possible? Use SRPBatcherProfiler.cs, and first check that the SRP Batcher is ON. Then look at the “standard code path” timing. This should be close to 0, with almost all time spent in the “SRP Batcher code path”. It’s normal for some time to be spent in the standard code path if your scene uses a few skinned meshes or particles. Check out our SRP Batcher Benchmark project on GitHub.

SRPBatcherProfiler shows similar timings regardless of whether the SRP Batcher is ON or OFF. Why? First, check that almost all rendering time goes through the new code path (see above). If it does, and the numbers are still similar, then look at the “flush” number.
This “flush” number should decrease a lot when the SRP Batcher is ON. As a rule of thumb, dividing it by 10 is really nice; by 2 is almost good. If the flush count does not decrease much, it means you still have a lot of Shader variants. Try to reduce the number of Shader variants. If you have a lot of different Shaders, try to merge them into one “uber” shader with more parameters: having tons of different material parameters is then free.

Global FPS didn’t change when I enabled the SRP Batcher. Why? Check the two questions above. If SRPBatcherProfiler shows that “CPU Rendering time” is twice as fast but the FPS did not change, then the CPU rendering part is not your bottleneck. That does not mean you’re not CPU bound; maybe you’re using too much C# gameplay or too many physics elements. Anyway, if “CPU Rendering time” is twice as fast, it’s still a win. You probably noticed in the top video that even with a 3.5x speedup, the scene stays at 60FPS: that’s because we have VSYNC turned on. The SRP Batcher really saved 6.8ms on the CPU side; those milliseconds can be used for other tasks, or simply to save some battery life on mobile.

It’s important to understand what a “batch” is in the SRP Batcher context. Traditionally, people tend to reduce the number of DrawCalls to optimize the CPU rendering cost. The real reason is that the engine has to set up a lot of things before issuing a draw, and the real CPU cost comes from that setup, not from the GPU DrawCall itself (which is just a few bytes to push into the GPU command buffer). The SRP Batcher doesn’t reduce the number of DrawCalls; it reduces the GPU setup cost between DrawCalls. You can see that in the following workflow: on the left is the standard SRP rendering loop; on the right is the SRP Batcher loop. In the SRP Batcher context, a “batch” is just a sequence of “Bind”, “Draw”, “Bind”, “Draw”… GPU commands. In standard SRP, the slow SetShaderPass is called for each new Material; in the SRP Batcher context, SetShaderPass is called for each new shader variant. To get maximum performance, you need to keep those batches as large as possible, so you need to avoid any shader variant change, but you can use any number of different Materials as long as they use the same shader.

You can use the Unity Frame Debugger to look at the length of SRP Batcher “batches”. Each batch is an event in the Frame Debugger called “SRP Batch”. Note the size of the batch, which is the number of draw calls (109 in the original screenshot): that’s a pretty efficient batch. You can also see the reason why the previous batch was broken (“Nodes use different shader keywords”): the shader keywords used for that batch differ from the keywords in the previous batch, meaning the shader variant changed and the batch had to be broken. In some scenes, batch sizes can be really low, like a batch size of only 2, which probably means you have too many different shader variants. If you’re creating your own SRP, try to write a generic “uber” shader with a minimum of keywords. You don’t have to worry about how many material parameters you put in the “Properties” section. NOTE: SRP Batcher information in the Frame Debugger requires Unity 2018.3 or higher.

Note: this section is for advanced users writing their own Scriptable Render Loop and shader library. LWRP and HDRP users can skip it, as all the shaders we provide are already SRP Batcher compatible. If you’re writing your own render loop, your shaders have to follow some rules in order to go through the SRP Batcher code path.

First, all “per material” data should be declared in a single CBUFFER named “UnityPerMaterial”. What is “per material” data? Typically, all the variables you declare in the Shader’s “Properties” section, that is, all the variables your artists can tweak in the Material Inspector. If you declare those variables outside of a “UnityPerMaterial” CBUFFER, the Shader Inspector panel will report the shader as incompatible when you compile it. To fix that, just declare all your “per material” data in a single CBUFFER, as in the sketch below.
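The original post showed the shader source and its Inspector status as screenshots; since those don’t survive here, below is a representative ShaderLab/HLSL sketch of the fix, using the CBUFFER_START/CBUFFER_END macros from the SRP shader libraries (the shader and property names are ours):

Shader "Custom/SRPBatcherCompatible"
{
    Properties
    {
        _BaseColor("Base Color", Color) = (1, 1, 1, 1)
        _Smoothness("Smoothness", Range(0, 1)) = 0.5
    }
    SubShader
    {
        Pass
        {
            HLSLPROGRAM
            // ...vertex/fragment pragmas and includes omitted...

            // All "per material" data in one CBUFFER named UnityPerMaterial.
            // Declaring _BaseColor or _Smoothness outside of it would make the
            // shader incompatible with the SRP Batcher.
            CBUFFER_START(UnityPerMaterial)
                float4 _BaseColor;
                float _Smoothness;
            CBUFFER_END
            ENDHLSL
        }
    }
}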
The SRP Batcher also needs a very special CBUFFER named “UnityPerDraw”, which should contain all of the Unity built-in engine variables. The variable declaration order inside the “UnityPerDraw” CBUFFER is also important: all variables should respect a layout we call a “block feature”. For instance, the “Space Position block feature” should contain a specific set of variables, in a specific order. You don’t have to declare a block feature if you don’t need it.

All built-in engine variables in “UnityPerDraw” should be float4 or float4x4. On mobile, you may want to use real4 (a 16-bit encoded floating point value) to save some GPU bandwidth; not all UnityPerDraw variables can use real4. (The original post includes a table here describing all possible block features you can use in the “UnityPerDraw” CBUFFER, with a “Could be real4” column.) NOTE: if one of the variables of a feature block is declared as real4 (half), then all other potential variables of that feature block should also be declared as real4.

HINT 1: always check the compatibility status of a new shader in the Inspector. We check several potential errors (UnityPerDraw layout declaration, etc.) and display why a shader is not compatible.

HINT 2: when writing your own SRP shader, you can refer to the LWRP or HDRP packages and look at their UnityPerDraw CBUFFER declarations for inspiration.

We continue to improve the SRP Batcher by increasing batch sizes in some rendering passes (especially the shadow and depth passes). We’re also working on adding automatic GPU instancing with the SRP Batcher. We started with the new DOTS renderer used in our MegaCity demo. The speedup in the Unity editor is quite impressive, going from 10 to 50 FPS: the difference in performance is so huge that even the global frame rate speeds up by a factor of five. NOTE: to be precise, this massive speedup when enabling the SRP Batcher is editor-only, because the editor currently doesn’t use Graphics Jobs; the speedup in Standalone player mode is around 2x. If you could play the video at 60Hz, you would feel the speedup when enabling the SRP Batcher. NOTE: the SRP Batcher with the DOTS renderer is still experimental and in active development.

>access_file_
1480|blog.unity.com

On DOTS: C++ & C#

This is a brief introduction to our new Data-Oriented Tech Stack (DOTS), sharing some insights into how and why we got to where we are today, and where we’re going next. We’re planning on posting more about DOTS on this blog in the near future.

Let’s talk about C++, the language Unity is written in today. At the end of the day, one problem advanced game programmers face is that they need to provide an executable with instructions the target processor can understand, which, when executed, will run the game. For the performance-critical parts of our code, we know what we want the final instructions to be. We just want an easy way to describe our logic in a reasonable way, and then trust and verify that the generated instructions are the ones we want.

In our opinion, C++ is not great at this task. I want my loop to be vectorized, but a million things can happen that might make the compiler not vectorize it. It might be vectorized today, but not tomorrow if a seemingly innocent change happens. Just convincing all my C/C++ compilers to vectorize my code at all is hard. We decided to make our own “reasonably comfortable way to generate machine code” that checks all the boxes we care about. We could spend a lot of energy trying to bend the C++ design train a little bit more in a direction where it would work a little bit better for us, but we’d much rather spend that energy on a toolchain where we can do all of the design, designed exactly for the problems game developers have.

What checkboxes do we care about?

- Performance is correctness. I should be able to say “if this loop for some reason doesn’t vectorize, that should be a compiler error, not an ‘oh, the code is now just 8x slower but still produces correct values, no biggie!’”
- Cross-architecture. The input code I write should not have to be different when I target iOS than when I target Xbox.
- We should have a nice iteration loop where I can easily see the machine code that is generated for all architectures as I change my code. The machine code “viewer” should do a good job of teaching/explaining what all these machine instructions do.
- Safety. Most game developers don’t have safety very high on their priority list, but we think that the fact that it’s really hard to get memory corruption in Unity has been one of its killer features. There should be a mode in which we can run this code that gives a clear error with a great error message if I read/write out of bounds or dereference null.

Now that we know what we care about, the next step is to decide on the input language for this machine code generator. Say we have the following options: a custom language, some adaptation/subset of C or C++, or a subset of C#.

Say what? C#? For our most performance-critical inner loops? Yes. C# is a very natural choice that comes with a lot of nice benefits for Unity:

- It’s the language our users already use today.
- It has great IDE tooling, for editing/refactoring as well as debugging.
- A C#-to-intermediate-IL compiler already exists (the Roslyn C# compiler from Microsoft), and we can just use it instead of having to write our own.
- We have a lot of experience modifying intermediate IL, so it’s easy to do codegen and postprocessing on the actual program.
- It avoids many of C++’s problems (header inclusion hell, PIMPL patterns, long compile times).
- I quite enjoy writing code in C# myself.

However, traditional C# is not an amazing language from a performance perspective.
The C# language team, standard library team, and runtime team have made great progress in the last two years. Still, when using the C# language, you have no control over where or how your data is laid out in memory, and that is exactly what we need to improve performance. On top of that, the standard library is oriented around “objects on the heap” and “objects having pointer references to other objects”.

That said, when working on a piece of performance-critical code, we can give up on most of the standard library (bye Linq, StringFormatter, List, Dictionary) and disallow allocations (= no classes, only structs), reflection, the garbage collector, and virtual calls, while adding a few new containers that you are allowed to use (NativeArray and friends). Then the remaining pieces of the C# language look really good. Check out Aras’s blog for some examples from his toy path tracer project. This subset lets us comfortably do everything we need in our hot loops. Because it’s a valid subset of C#, we can also run it as regular C#, getting errors on out-of-bounds access with great error messages, debugger support, and compilation speeds you forgot were possible when working in C++. We often refer to this subset as High-Performance C#, or HPC#.

We’ve built a code generator/compiler called Burst. It’s been available since Unity 2018.1 as a preview package. We have a lot of work ahead, but we’re already happy with it today. We’re sometimes faster than C++, and sometimes still slower; the latter cases we consider performance bugs that we’re confident we can resolve.

Comparing performance alone is not enough, though. What matters equally is what you had to do to get that performance. For example, we took the C++ culling code of our current renderer and ported it to Burst. The performance was the same, but the C++ version had to do incredible gymnastics to convince our C++ compilers to actually vectorize; the Burst version was about 4x smaller.

To be honest, the whole “you should move your most performance-critical code to C#” story didn’t result in everybody internally at Unity immediately buying it. For most of us, it feels like “you’re closer to the metal” when you use C++. But that won’t be true for much longer. When we use C#, we have complete control over the entire process, from source compilation down to machine code generation, and if there’s something we don’t like, we just go in and fix it. We will slowly but surely port every piece of performance-critical code that we have from C++ to HPC#. It’s easier to get the performance we want, harder to write bugs, and easier to work with.

Here’s a screenshot of the Burst Inspector, which lets you easily see what assembly instructions were generated for your different Burst hot loops.

Unity has a lot of different users. Some can enumerate the entire arm64 instruction set from memory; others are happy to create things without getting a PhD in computer science. All users benefit as the parts of their frame time spent running engine code (usually 90%+) get faster. The parts running Asset Store package runtime code get faster as Asset Store package authors adopt HPC#. Advanced users will benefit on top of that by being able to write their own high-performance code in HPC#.
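As a flavor of what that looks like in practice, here is a hedged sketch of an HPC#-style hot loop: a Burst-compiled job over NativeArrays, with no classes, no allocations, and no virtual calls (the job itself is our own example, not a Unity API):

using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
public struct AddArraysJob : IJob
{
    [ReadOnly] public NativeArray<float> A;
    [ReadOnly] public NativeArray<float> B;
    public NativeArray<float> Result;

    public void Execute()
    {
        // A tight linear loop over packed memory: the shape Burst can vectorize.
        for (int i = 0; i < Result.Length; i++)
            Result[i] = A[i] + B[i];
    }
}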
In C++, it’s very hard to ask the compiler to make different optimization trade-offs for different parts of your project. The best you have is per-file granularity for specifying the optimization level. Burst is designed to take a single method as input: the entry point to a hot loop. It compiles that function and everything it invokes (which is guaranteed to be known: we don’t allow virtual functions or function pointers). Because Burst operates on only a relatively small part of the program, it sets the optimization level to 11. Burst inlines pretty much every call site and removes if-checks that otherwise would not be removed, because in inlined form we have more information about the arguments of the function.

Neither C++ nor C# does much to help developers write thread-safe code. Even today, more than a decade after consumer game hardware went beyond one core, it is very hard to ship programs that use multiple cores effectively. Data races, nondeterminism, and deadlocks are all challenges that make shipping multithreaded code difficult. What we want are features like “make sure that this function and everything it calls never reads or writes global state”, with violations of that rule being compiler errors, not “guidelines we hope all programmers adhere to”. Burst gives a compiler error.

We encourage both Unity users and ourselves to write “jobified” code: splitting up all the data transformations that need to happen into jobs. Each job is “functional”, as in side-effect free. It explicitly specifies the read-only buffers and read/write buffers it operates on, and any attempt to access other data results in a compiler error. The job scheduler guarantees that nobody is writing to your read-only buffer while your job is running, and that nobody is reading from your read/write buffer while your job is running. If you schedule a job that violates these rules, you get a runtime error every time, not just in an unlucky race-condition case. The error message will explain that you’re trying to schedule a job that wants to read from buffer A, but that you already scheduled a job that will write to A, so if you want to do this, you need to specify that previous job as a dependency. (A sketch of this dependency rule in job code follows below.)

We find this safety mechanism catches a lot of bugs before they get committed, and it results in efficient use of all cores. It becomes impossible to code a deadlock or a race condition, and results are guaranteed to be deterministic regardless of how many threads are running or how many times a thread gets interrupted by some other process.

By being able to hack on all these components, we can make them aware of each other. For example, a common reason vectorization doesn’t happen is that the compiler cannot guarantee that two pointers do not point to the same memory (aliasing). We know two NativeArrays will never alias because we wrote the collection library, and we can use that knowledge in Burst, so it won’t have to give up on an optimization because it’s afraid two array pointers might point to the same memory.

Similarly, we wrote the Unity.Mathematics math library, and Burst has intimate knowledge of it. It will (in the future) be able to make accuracy-sacrificing optimizations for things like math.sin(). Because to Burst, math.sin() is not just any C# method to compile: it understands the trigonometric properties of sin(), understands that sin(x) == x for small values of x (which Burst might be able to prove), and understands that it can be replaced by a Taylor series expansion for a certain accuracy sacrifice. Cross-platform and cross-architecture floating point determinism is also a future goal for Burst that we believe is possible to achieve.
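As promised, here is a hedged sketch of that dependency rule in code (the jobs are our own examples): SumJob reads the buffer that FillJob writes, so it must declare FillJob’s handle as a dependency when it is scheduled, or the safety system raises the error described above.

using Unity.Collections;
using Unity.Jobs;

public struct FillJob : IJob
{
    public NativeArray<float> Output; // declared read/write
    public void Execute()
    {
        for (int i = 0; i < Output.Length; i++) Output[i] = i;
    }
}

public struct SumJob : IJob
{
    [ReadOnly] public NativeArray<float> Input; // declared read-only
    public NativeArray<float> Total;            // single-element result buffer
    public void Execute()
    {
        float sum = 0f;
        for (int i = 0; i < Input.Length; i++) sum += Input[i];
        Total[0] = sum;
    }
}

// Scheduling: the reader passes the writer's handle as a dependency.
//   var write = new FillJob { Output = data }.Schedule();
//   var read  = new SumJob { Input = data, Total = total }.Schedule(write);
//   read.Complete();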
By writing Unity’s runtime code in HPC#, the engine and the game are written in the same language. We will distribute the runtime systems that we have converted to HPC# as source code. Everyone will be able to learn from them, improve them, and tailor them. We’ll have a level playing field where nothing stops users from writing a better particle system, physics system, or renderer than we do. I expect many people will. By having our internal development process be much more like our users’ development process, we’ll also feel our users’ pain more directly, and we can focus all our efforts on improving a single workflow instead of two different ones.

In my next post, I’ll cover a different part of DOTS: the Entity Component System.

>access_file_