
6. Transforming an OpenUSD Xform

This chapter covers

- Why the order of transform operations matters and how to control it
- Representing rotations with Euler angles, quaternions, and rotation matrices
- Obtaining world and local transforms and computing bounding boxes
- Positioning objects relative to one another on an example stage

Let’s kick off part two by focusing on the essential skills needed to dynamically control the elements within your scene. Up until now we’ve used a simple approach to transforming objects in order to introduce the basic concepts. However, there are more complex ways of working with Xforms that can prove both useful and efficient in many types of OpenUSD applications.

For example, a self-driving car needs to understand its position in the world relative to other objects; a design engineer may want to move parts of a model very precisely according to their own dimensions, so that they sit exactly next to other parts; and a digital twin of a robot being trained to pick up a cup needs some way of knowing when it has actually grasped the cup.

The solutions to these situations lie in a deeper understanding of how Xforms operate individually, but crucially, how they can interact and inform each other’s position and movement on a stage.

We’ll start by learning about the importance of the order in which transformations are applied and how it can affect the intended result. Then we’ll take a deep dive into the necessary math and geometry basics, ensuring you have a solid foundation before moving on to more complex tasks.

You’ll learn how to programmatically manage the translation, rotation and scale of prims on a stage, and how to do this in relation to other objects on the stage. We’ll also introduce the essential methods for tasks like collision detection, which allows objects to ‘physically’ interact with each other.

All of these techniques are crucial for precise scene composition and will set the stage for more advanced manipulations in subsequent chapters.

Xforms are fundamental building blocks of OpenUSD scenes, so the following sections will help you to go beyond the basics and attain a deeper understanding of how they operate.

6.1 Revisiting Xforms

In our previous discussion, we introduced the concept of Xforms in OpenUSD and walked through the process of creating a basic Xform using a type-specific function such as UsdGeom.Xform.Define(). Go ahead and set your working directory to Ch05, then create the following stage, which we’ll use throughout this section to explore some important aspects of transformations:

from pxr import Usd, UsdGeom, Gf, Sdf
stage = Usd.Stage.CreateNew("xform.usd") 
xform = UsdGeom.Xform.Define(stage, "/World/Xform")

This foundational knowledge provided a solid starting point for understanding the role of Xforms in the OpenUSD ecosystem. However, creating an Xform is only the first step in harnessing its full potential.

Before we get into the more complex ways of transforming Xforms, let’s introduce another way to create a prim: simply defining its path and type as strings. This approach has some advantages over calling a type-specific function for each kind of prim, and it streamlines prim creation, letting us focus on the more complex and nuanced task of manipulating prims, which is the core topic of this chapter.

You can use the following method to create a prim by defining its path and simply referring to its type name. Program 1 defines a function named ‘create_prim’ that can later be called to create a new prim on the stage.

def create_prim(stage: Usd.Stage, prim_path: str, prim_type: str):   
    return stage.DefinePrim(prim_path, prim_type)

Program 1: Create a Prim by Type

Once defined, the function needs to be called with a statement like create_prim(stage, "/path/to/prim", "PrimType") to actually run the code inside it. In a typical project, the ‘create_prim’ function would likely be called in another part of the code, perhaps when a user or system needs to dynamically create a USD prim on a stage.

For example, you can use this function to create an Xform:

create_prim(stage, "/World/Xform", "Xform") 

Alternatively, to ensure that we have a simple reference to the prim later, we can capture the return value of the create_prim function in a variable as follows:

light_prim = create_prim(stage, "/Lights/DistantLight", "DistantLight")   

As this chapter is all about manipulating Xforms, let’s recall how the transformation of prims (such as translations, rotations, and scaling) is handled through Xforms. When you want to modify the position, orientation, or size of a prim, you need to work with the Xform that is associated with it. Using UsdGeom.Xformable, you can easily retrieve the Xform component of any transformable prim, regardless of its specific type. This makes the process of applying transformations consistent and straightforward across different types of objects. The following snippet accesses the Xform of the light we just created:

# to make the light transformable
xform_light = UsdGeom.Xformable(light_prim) 

Having accessed the Xform of a prim, you can use its methods to add new transform operations or modify existing ones. For example, let’s rotate the distant light’s Xform by 50° around the y axis:

# Add a rotation operation that allows setting rotation around X, Y, and Z axes
rotation_op = xform_light.AddRotateXYZOp()
# Set the rotation to 50 degrees around the Y-axis, with no rotation around the X and Z axes
rotation_op.Set(Gf.Vec3d(0, 50, 0))    

Let’s delve deeper into the details of Xform manipulation, exploring more intricate ways of working with Xforms in OpenUSD.

6.1.1 Understanding Transform Order

When discussing transforms (Xforms), we are referring to the combination of translation, rotation, and scale that define an object’s position, orientation, and size in 3D space. The order of the Xform, in this context, refers to the specific sequence in which these transformations are applied, namely: translation, rotation, and scale.

The order of transforms is significant because each transform is applied sequentially, and the result of each transform is used as the input for the next one. This is known as a “transform pipeline” or “transform chain”.

Understanding the transform pipeline is very important because when you apply multiple transforms to an object, the order in which they are applied can significantly affect the final result. Here’s why:

For example, consider a simple scenario where you want to translate and rotate a cube. If this is done by first rotating the cube, the rotation will alter the direction in which the cube’s y axis is pointing. The result is that the subsequent translation will move the cube in the new direction of the y axis. In contrast, if you apply the translation before the rotation, the cube moves along the y axis, then rotates in that position.

To demonstrate the result of using different transform orders, the following code will create two cubes, then apply the transformations in a different order to each of them. First, ‘cube1’ will be rotated by 45 degrees around its x-axis and then translated by 10 units along its y-axis. Then ‘cube2’ will be translated 10 units along its y axis and rotated 45 degrees around its x axis. The result is that each cube ends up in a different position on the stage. With the xform.usd stage that we made earlier open, let’s add these two cubes to it:

cube1 = UsdGeom.Cube.Define(stage, "/World/Cube1") 
cube1.AddRotateXYZOp().Set(Gf.Vec3d(45, 0, 0)) 
cube1.AddTranslateOp().Set(Gf.Vec3d(0, 10, 0))

cube2 = UsdGeom.Cube.Define(stage, "/World/Cube2")
cube2.AddTranslateOp().Set(Gf.Vec3d(0, 10, 0))
cube2.AddRotateXYZOp().Set(Gf.Vec3d(45, 0, 0))

Figure 1 shows how altering the transform order on each cube results in a different final position. Notice how the initial rotation of ‘cube1’ altered the angle of its y axis, so that when the translation was applied, the cube moved 10 units in a different direction than ‘cube2’.

Two cubes in different positions after applying the same transforms in different orders

Figure 1: Illustrating the impact of using a different Transform Order. The y axis of cube1 was altered by the initial rotation before the translation, whereas the y axis of cube2 remained unchanged before its translation, giving divergent results.

6.1.2 Applying Transform Order

Having established that the order in which multiple transforms are added to an object or data will significantly affect the result, let’s look at how to apply a consistent transform order. Establishing a clear transform order before assigning new transformations will ensure reproducibility, accuracy, and reliability of the results. We’ll present the code generically, showing how it would be applied to any xform, then we’ll go through the same process applying it to one of the two cubes we just created.

First, it may be necessary to wipe the slate clean by removing any previous transform order that has been applied. By doing so, we can avoid unintended consequences, reduce errors, and guarantee that our transformations produce the desired outcome. To clear the transform order of an Xform:

xform.ClearXformOpOrder()

Then you can set out a new transform order, which will establish the order in which future transforms will be applied. Note that this time we are going to use AddOrientOp() instead of AddRotateXYZOp(), because we are shortly going to use a different way of expressing rotation. The OrientOp applies a ‘quaternion’ rotation, which encodes an amount of rotation around a given axis vector. We will explore this in much more depth in the next section. Let’s set the new transform order:

xform.AddTranslateOp()    
# Add the rotation as an OrientOp
xform.AddOrientOp()    
xform.AddScaleOp()    

At this stage, we haven’t assigned values to the transform operations because we’re only defining their order. By specifying the sequence—translate, orient, and scale—we establish how these transformations will be applied to the object. This approach ensures precise control over the final position, orientation, and scale, and allows for flexibility in value assignment later. It provides a clear structure, enabling easier adjustments and management of complex transformations.

We can check the transform order of our Xform using the GetOrderedXformOps() function and print it:

xform_ops = xform.GetOrderedXformOps()    
for op in xform_ops:
    print(op.GetOpName())    

Then, we can assign some specific values to the transform using the new order. Remember, now that we’ve decided to use the OrientOp for our rotation, we will set it with a quaternion value such as Gf.Quatf(1, 0, 0, 0) (more on this in the next section). Note that the number in the square brackets is an index into the transform order we just set, so whenever you want to alter a transform, you can specify which of the xform_ops you intend to change based on that order:

# Given the transform order, this is translate
xform_ops[0].Set(Gf.Vec3d(0,0,0))    
# Use quaternion as rotation
xform_ops[1].Set(Gf.Quatf(1,0,0,0))   
# Add scale
xform_ops[2].Set(Gf.Vec3d(1,1,1))  

Note OpenUSD transform operations can use either float (f) or double (d) precision. Depending on the specific version and the precision an op was created with, you may need to swap between Gf.Quatf and Gf.Quatd, or Gf.Vec3f and Gf.Vec3d, to resolve errors. If you want to specify the precision explicitly, you can pass the precision parameter as UsdGeom.XformOp.PrecisionDouble or UsdGeom.XformOp.PrecisionFloat. For example, AddTranslateOp(precision=UsdGeom.XformOp.PrecisionDouble).
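For instance, here is a minimal sketch of creating transform ops with an explicit precision on a throwaway in-memory stage (the stage and prim are purely illustrative); the value passed to Set() must then use the matching Gf type:

from pxr import Usd, UsdGeom, Gf

# Illustrative in-memory stage and prim, used only to demonstrate precision selection
demo_stage = Usd.Stage.CreateInMemory()
demo_xform = UsdGeom.Xform.Define(demo_stage, "/PrecisionDemo")

# A double-precision translate op is set with a Gf.Vec3d value
translate_op = demo_xform.AddTranslateOp(precision=UsdGeom.XformOp.PrecisionDouble)
translate_op.Set(Gf.Vec3d(0, 5, 0))

# A float-precision scale op is set with a Gf.Vec3f value
scale_op = demo_xform.AddScaleOp(precision=UsdGeom.XformOp.PrecisionFloat)
scale_op.Set(Gf.Vec3f(2, 2, 2))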

Let’s apply all of that to the stage we just created with the two cubes on it. We’ll start by removing the transform order that was previously applied to cube1. Although we did not explicitly set a transform order on cube1 when we created it, the order was set by the order in which we applied the transformations, so we do still need to remove the existing order. This will return cube1 to the World Origin:

# Retrieve the prim and wrap it as an Xformable so we can edit its transform operations
cube1 = UsdGeom.Xformable(stage.GetPrimAtPath('/World/Cube1'))
# Clear the transform operation order
cube1.ClearXformOpOrder()

Next, let’s add a new transform order to it:

# Add a translation operation to the Xform
cube1.AddTranslateOp()

# Add an orientation (rotation) operation to the Xform
cube1.AddOrientOp()

# Add a scale operation to the Xform
cube1.AddScaleOp()

# Get the ordered transform operations on the Xform of cube1
xform_ops = cube1.GetOrderedXformOps()

# Print the names of the operations to check the transform order
for op in xform_ops:
    print(op.GetOpName())

Finally, let’s assign some new values to cube1’s transform. Note that although we are aiming to rotate the cube by 45° around the x axis, the numbers passed to the OrientOp via Gf.Quatf() look very different. We’ll explain why directly after you’ve applied the code:

# Given the transform order, this will translate the cube by 5 units along the y axis
xform_ops[0].Set(Gf.Vec3d(0, 5, 0))

# The AddOrientOp() that was set in the transform order expects a Gf.Quatf() value
xform_ops[1].Set(Gf.Quatf(0.92, 0.38, 0, 0))

# Scales the cube by a factor of 2, just for fun
xform_ops[2].Set(Gf.Vec3d(2, 2, 2))
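One practical note: if you are running these snippets as a standalone Python script and want to reopen xform.usd later (as we will in section 6.3), remember to save the stage. A minimal sketch:

# Persist the edits so the stage can be reopened later
stage.Save()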

If you are viewing your stage, it should now look something like Figure 2, which shows the two cubes in their new positions: Cube2 in the position resulting from a translation of 10 units along the y axis followed by a 45° rotation around the x axis, and Cube1, having had its initial transformation cleared, now in a position resulting from a new translation of 5 units along the y axis followed by an orientation of approximately 45° around the x axis.

New position of cubes

Figure 2: The new positions of the cubes. Cube2 remains where we originally moved and rotated it, whereas Cube1 has had its original transformation cleared and then a new translation, orientation and scale added.

So, why do the numbers in the Gf.Quatf() constructor differ from the 45° values we’ve used with the AddRotateXYZOp() to rotate objects? This difference occurs because there are multiple ways to represent rotation angles. The AddRotateXYZOp() uses Euler angles, which express rotations in terms of pitch, yaw, and roll, while the AddOrientOp() requires a quaternion value. Quaternions represent rotations in a fundamentally different manner compared to Euler angles. In the next section, we’ll explore the mathematics behind quaternions and their advantages over Euler angles, helping you understand how these values translate into 3D rotations.

6.2 Mastering Rotation

OpenUSD programming relies heavily on mathematical concepts to represent and manipulate 3D scenes. The math involved includes linear algebra, geometry, transformations, quaternions, and calculus. These mathematical concepts are used to accurately represent 3D objects and scenes, perform transformations and simulations, and ensure predictable behavior. Mastering these mathematical concepts is vital for effective OpenUSD programming, as it allows developers to build accurate, efficient, and reliable 3D applications.

This section focuses exclusively on rotation because rotation is the most mathematically complex and conceptually challenging of the three fundamental transformations: rotation, translation, and scaling. While translation and scaling involve relatively straightforward manipulations of position and size, rotation requires a deeper understanding of concepts like angles, axes, and quaternions.

Understanding different ways to represent rotations, such as quaternions, Euler angles, and rotation matrices, is crucial for working with 3D graphics and simulations. Each representation has its strengths and weaknesses, and knowing how to convert between them ensures seamless integration and data exchange between different systems and tools.

By understanding different rotation representations, you can choose the most suitable one for your specific use case, write more efficient and reliable code, and work effectively with various systems and algorithms. This knowledge is essential for applications like computer vision, robotics, animation, and game development, where accurate and efficient rotation calculations are critical.

Let’s examine three different ways to represent 3D rotations: Euler angles, quaternions, and rotation matrices.

6.2.1 Euler Angles

Each component of an Euler angle represents a rotation in degrees around one of the three axes in 3D space. Euler angles are used in many fields, including aviation, where they describe pitch, roll, and yaw. Using Euler angles in OpenUSD programming can be convenient for simple rotations and transformations due to their intuitive nature. However, they have a mathematical singularity when the object pitches straight up or down, and they can’t represent certain rotations when two axes align, a situation known as gimbal lock (when two rotation axes line up, a degree of freedom is lost and the math can no longer produce the correct rotation; it’s as if the compass gets stuck and you can’t get the right direction anymore).

In previous chapters we have been mainly using Euler angles to represent a rotation, as follows:

# Remember to clear transform order before assigning a new transform
xform.ClearXformOpOrder()
# Add RotateXYZ as the Euler angle
xform.AddRotateXYZOp().Set(Gf.Vec3d(45, 0, 0))

It can sometimes be helpful to reorder the XYZ axes to change the sequence of rotation operations. For instance, you can use AddRotateYZXOp() or AddRotateZYXOp() to alter the order in which the rotations around the X, Y and Z axes are applied. In the same way that different transform orders can result in different positions, a different order of Euler rotations can result in distinct orientations, so it is important to consider which order will be the least problematic in your use case.

While the issue of axis ordering causing different rotations is a notable limitation of Euler angles, the ability to choose the order of XYZ rotations can be advantageous. It can allow you to avoid gimbal lock, align with external systems or conventions (like those used in certain industries or applications), and achieve more natural or expected movement in complex rotations, especially in animations, by preventing unintended twists or flips.
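As a quick illustration of how the axis order changes the result, here is a sketch on a throwaway in-memory stage (the prim names are our own) that applies the same three Euler angles in X-Y-Z and then in Z-Y-X order:

from pxr import Usd, UsdGeom, Gf

demo_stage = Usd.Stage.CreateInMemory()

# The same angles, applied in X-Y-Z order...
xform_xyz = UsdGeom.Xform.Define(demo_stage, "/OrderXYZ")
xform_xyz.AddRotateXYZOp().Set(Gf.Vec3d(90, 45, 0))

# ...and in Z-Y-X order
xform_zyx = UsdGeom.Xform.Define(demo_stage, "/OrderZYX")
xform_zyx.AddRotateZYXOp().Set(Gf.Vec3d(90, 45, 0))

# Comparing the resulting matrices shows that the two orientations differ
time_code = Usd.TimeCode.Default()
print(xform_xyz.ComputeLocalToWorldTransform(time_code))
print(xform_zyx.ComputeLocalToWorldTransform(time_code))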

Figure 3 shows a diagram of Euler angles in relation to the rotation of a paper plane. Thinking of how a plane moves in 3D space is a useful way to conceptualize Euler angles, with rotation around the x axis representing the plane’s pitch, rotation around the z axis representing roll, and rotation around the y axis representing yaw.

Illustration of Euler angles and paper plane rotation

Figure 3: Illustration of Euler angles in relation to the rotational position of a paper plane. Rotations around the x, y and z axes can be thought of as affecting the pitch, yaw and roll of an object, respectively.

6.2.2 Quaternions

Quaternions are hyper-complex numbers with four components that describe a rotation about an axis. They are more complex mathematically than Euler angles but computationally more efficient. They simplify the composition of complex and continuous rotations and effectively address the gimbal lock issue found in Euler angles. Although they are less intuitive than Euler angles, their ability to handle rotations smoothly and reliably often makes them the preferred choice in 3D graphics and robotics. Additionally, quaternions sometimes require normalization, a process that ensures they maintain the correct form for rotation by adjusting their magnitude to exactly 1, keeping them accurate and reliable during continuous use.

They also offer advantages over rotation matrices by being more compact, computationally efficient, and numerically stable. Even with the additional computation introduced by normalization, quaternions remain more efficient than rotation matrices for most tasks. Normalization is a relatively simple operation (involving calculating the magnitude and adjusting the components), and it typically requires fewer computations than the matrix operations needed for combining and applying rotations.
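For example, here is a small sketch of what normalization looks like with the Gf quaternion classes (the values are illustrative):

from pxr import Gf

# Roughly a 45-degree rotation about the x-axis, but not quite unit length
q = Gf.Quatf(0.92, 0.38, 0, 0)
print(q.GetLength())          # slightly less than 1.0

# GetNormalized() rescales the components so the magnitude is exactly 1
q_unit = q.GetNormalized()
print(q_unit.GetLength())     # 1.0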

Quaternions give a way to encode an axis–angle representation using four real numbers and can be used to calculate the corresponding rotation of a position vector r, representing a point relative to the origin in 3D space. The rotation occurs around an axis defined by a unit vector and an angle θ, with the quaternion expressed as:

$$q = w + xi + yj + zk$$

Put more simply, imagine you have a stick in 3D space. The stick represents an axis that is pointing in one direction, and you can rotate things around this stick. The first number, w, tells you how much to rotate around the stick, and the other three numbers tell you which direction the stick is pointing, or its orientation. The xi, yj and zk terms tell you how much the stick points in the direction of the x axis, y axis and z axis respectively.

Figure 4 is a diagram of a quaternion rotation, showing a vector v relative to the three coordinate axes (the orientation of the stick) and the angle of rotation θ around v (how much to rotate around the stick).

Quaternion rotation around a vector

Figure 4: An illustration of quaternions, where vector v represents a direction relative to the axes x, y, and z, and the rotation angle θ represents the amount of rotation around vector v.

In the previous section, we used Euler angles to represent a 45° rotation around the x axis, and mentioned that its quaternion version would be:

# Remember to clear transform order before assigning a new transform
xform.ClearXformOpOrder()

# Defines a quaternion with components w, x, y, z
w, x, y, z = (0.92, 0.38, 0, 0)

# Use Gf.Quatf as the quaternion defined above
xform.AddOrientOp().Set(Gf.Quatf(w, x, y, z))

To illustrate the mathematical process of converting Euler angles to a quaternion, we use a 45° rotation around the x-axis as an example. While this demonstrates the mapping, we will not delve into the foundational reasoning behind why the conversion works, as it is outside the scope of this book. Essentially, to convert from an axis-angle form to a quaternion, suppose we have a rotation of θ degrees around the normalized axis (i, j, k). Then the quaternion would be:

$$w = \cos\left(\frac{\theta}{2}\right) \quad x = \sin\left(\frac{\theta}{2}\right)i \quad y = \sin\left(\frac{\theta}{2}\right)j \quad z = \sin\left(\frac{\theta}{2}\right)k$$

If you set θ = 45° and (i, j, k) = (1, 0, 0), since the rotation happens around the x-axis, you get the quaternion values used above: w = cos(22.5°) ≈ 0.92 encodes the amount of rotation, and (x, y, z) = sin(22.5°) · (1, 0, 0) ≈ (0.38, 0, 0) encodes the axis of rotation (the x-axis) scaled by 0.38.
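If you want to verify this mapping yourself, the following quick check (plain Python, independent of OpenUSD) reproduces the numbers we used above:

import math

# Convert a 45-degree rotation about the x-axis into quaternion components
theta = math.radians(45)
axis = (1.0, 0.0, 0.0)

w = math.cos(theta / 2)
x, y, z = (math.sin(theta / 2) * component for component in axis)

print(round(w, 2), round(x, 2), round(y, 2), round(z, 2))   # 0.92 0.38 0.0 0.0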

6.2.3 Transform and Rotation Matrices

Transformation matrices are like a Swiss Army knife for 3D space. Using a transform matrix is a powerful way to represent and combine multiple transformations, such as rotations, translations, and scaling, into a single mathematical operation. By representing transformations as matrices, we can take advantage of the properties of matrix multiplication to chain together multiple transformations, allowing us to perform complex transformations in a single step. This approach also enables us to easily invert transformations, compose multiple transformations, and perform other operations that would be difficult or impossible to achieve using other methods.

Moreover, transform matrices provide a unified and consistent way to work with transformations, making it easier to write reliable and efficient code. They also enable you to leverage the power of linear algebra and matrix operations, which are highly optimized and widely supported in most programming languages and libraries. By using transform matrices, developers can write more concise, efficient, and maintainable code, and can focus on the creative and logical aspects of their work, rather than getting bogged down in the details of transformation mathematics.

A transform matrix is a 4x4 matrix used for combining rotation, scaling, and translation, allowing for efficient and convenient manipulation of 3D objects. The upper-left 3x3 section of the matrix represents the rotation matrix. A transform matrix is referred to as a homogeneous transformation matrix because it uses homogeneous coordinates, which add a fourth dimension (usually set to 1) to the traditional 3D coordinates, enabling translation to be represented as a matrix multiplication. Figure 5 shows an example.

Homogeneous transformation matrix structure

Figure 5: An example of a (homogeneous) transformation matrix showing the 3×3 rotation matrix on the upper left, the translate coordinates on the bottom row, and the scale factor on the right.

A rotation matrix is a mathematical representation used to rotate points or vectors in three-dimensional space around the origin. It is a 3x3 matrix that, when multiplied with a coordinate vector, rotates the vector by a specified angle around a specific axis.

The main disadvantage of rotation matrices compared to the other two representations is that they are more memory-intensive, since they require storing 9 values (a 3x3 matrix), whereas Euler angles only require 3 values (one for each rotation axis) and quaternions require only 4 values. They are also computationally expensive because operations like matrix multiplication (used for combining rotations) involve more calculations than quaternion multiplication. Additionally, applying a rotation using a matrix involves multiplying the 3x3 matrix with a 3D vector, which requires more operations than the equivalent quaternion-based method.

The 3x3 grid of a rotation matrix can be used alone, if there is no need to combine translation or scaling. They make it easy to concatenate a series of rotations into a single rotation. They can also be used to reverse a rotation by reversing the order of the rotations and changing the signs of the three rotation angles.
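As a brief sketch of those two points using the Gf classes (the specific angles are arbitrary), two rotations can be concatenated with matrix multiplication, and a rotation can be undone with its transpose, since the inverse of a rotation matrix is its transpose:

from pxr import Gf

# Build two 3x3 rotation matrices: 45 degrees about z and 30 degrees about x
rot_z = Gf.Matrix3d().SetRotate(Gf.Rotation(Gf.Vec3d(0, 0, 1), 45))
rot_x = Gf.Matrix3d().SetRotate(Gf.Rotation(Gf.Vec3d(1, 0, 0), 30))

# Concatenate the two rotations into a single matrix
combined = rot_z * rot_x

# The transpose of a rotation matrix is its inverse, so this reverses the combined rotation
undo = combined.GetTranspose()
print(combined * undo)   # approximately the identity matrix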

Effect of rotation matrices on axes and coordinates

Figure 6: Illustration of how a rotation matrix reorients axes and coordinates by rotating them around the z-axis. The rows of the matrix determine how the original coordinates (x, y, z) influence the new position of a point (x1, y1, z1). The columns of a rotation matrix define the new directions of the coordinate axes after rotation.

Now let’s dive deeper into the math behind the rotation matrix. For illustration purposes, we examine rotation around the z-axis (rather than the previous 45° rotation around the x-axis), since it’s easier to visualize and draw in a 2D plot, as in Figure 6. The plot visualizes the rotation by showing an initial set of axes and a second set illustrating the new orientation after applying the rotation matrix. It also shows that for a rotation around the z-axis by an angle θ, the rotation matrix is

$$R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

To understand the matrix intuitively, each column of the rotation matrix tells us where one of the original coordinate axes ends up after the rotation. Specifically, the first column [cos(θ), sin(θ), 0] tells us where the x-axis unit vector moves, the second column [-sin(θ), cos(θ), 0] tells us where the y-axis unit vector moves, and the third column [0, 0, 1] tells us that the z-axis remains unchanged.

Each row, in turn, tells us how a new coordinate is built from the original coordinates. The first row [cos(θ), -sin(θ), 0] tells us that the new x coordinate is a mix of the original x and y; the second row [sin(θ), cos(θ), 0] tells us that the new y coordinate is a mix of the original x and y; and the third row [0, 0, 1] tells us that the z coordinate stays the same.

This interpretation of the rows and columns applies to more general rotation matrices as well, although a rotation about an arbitrary axis takes a significantly more complex form than rotations around the x-, y-, or z-axes. Readers interested in the details are encouraged to explore further resources.

Now let’s look at how a rotation matrix would be used in OpenUSD. The transform matrix is a superset that contains the rotation matrix. To add a transform matrix to an xform, we first use Gf.Matrix4d() to create the transform matrix and then use the xform’s MakeMatrixXform() API to apply it. Here we consider a rotation of 45° around the z-axis, so we set θ = 45° to obtain the rotation part, add a translation of (100, 200, 0), and keep the scale unchanged, giving the following transform matrix:

# Create a 4x4 transform matrix
matrix = Gf.Matrix4d(
    0.707, -0.707, 0, 0,
    0.707,  0.707, 0, 0,
    0,      0,     1, 0,
    100,   200,    0, 1
)

# Apply the transform matrix to the xform
xform.MakeMatrixXform().Set(matrix)

We will apply a transform matrix in the next section. Before we get to that, let’s summarize the three rotation methods we have introduced above.

6.2.4 Summary of Rotation Methods

In summary, Euler Angles are intuitive and easy to implement but suffer from gimbal lock and non-uniqueness. Quaternions and Rotation Matrices, on the other hand, provide a more reliable and accurate representation of rotation, avoiding gimbal lock and ensuring uniqueness. However, Quaternions require more complex mathematical operations and normalization, while Rotation Matrices are more memory-intensive and computationally expensive.


Table 1 summarizes the comparison of these three rotation representation methods.

Table 1: A Summary of the Pros and Cons of Different Rotation Methods.

| Rotation method | Pros | Cons | Best for |
| --- | --- | --- | --- |
| Euler angle | Intuitive, simple to implement, easy to interpolate | Suffers from gimbal lock, not unique, order-dependent | Simple, intuitive control over rotations, especially when user input involves specifying angles for pitch, yaw, and roll, like in camera controls. |
| Quaternion | Unique representation, efficient interpolation | Less intuitive, more complex to implement | Smooth, continuous rotations and situations requiring precise control without the risk of gimbal lock, such as in 3D animations or robotics. |
| Matrix | Unique representation, easy to compose | Less intuitive, more memory-intensive, more computationally expensive | Combining rotation with other transformations like scaling and translation, particularly in 3D rendering pipelines or graphics shaders. |

The choice of rotation representation ultimately depends on the specific requirements of the project. Euler Angles are suitable for simple applications, Quaternions are ideal for applications requiring smooth rotation and interpolation, and Rotation Matrices are suitable for applications requiring complex rotation compositions and precise control. By understanding the pros and cons of each representation, you can choose the most suitable one for your project and ensure accurate and efficient rotation calculations.

Next let’s start applying what we’ve learned by exploring the methods of programmatically arranging objects in a scene, demonstrating the power of transform matrices, then creating an example stage on which we can apply those methods.

6.3 Playing with Scene Layouts

So far, when discussing the movement of objects around a stage we have been working with prims and Xforms in isolation. However, most stages will have multiple objects on them and it can be helpful, or even necessary, to consider an object’s position in relation to the stage, its parent, or other objects on the stage. In 3D graphics, an object’s position, rotation, and scale relative to its parent object or local coordinate system is called a local transform, while the object’s position, rotation, and scale relative to the World Origin or global coordinate system is called a world transform.

When working with object relationships, it’s important to know when objects touch or collide. This can be determined by assigning objects a ‘Bounding Box’, which provides data on the extent of the space they occupy. In other words, the bounding box represents the solidity of an object and can be used to detect collisions. The ‘Extent’ attribute defines the bounding box, specified by its minimum and maximum corners in 3D space.

Imagine your scene has a character wearing a hat. It is more efficient to calculate the hat’s position in relation to the character’s head than to the World Origin. That way, the hat could share all movements with the head, though it would be given a constant offset from the center of the head, so that it always sits on the top. Further, if the hat were to fall off the character’s head, provided the hat and the floor have bounding boxes, there would be a way of calculating when it hits the floor.

In another example, if you have a car (“/garage/car”) inside a garage (“/garage”), the car’s local transform might position it at the center of the garage. However, the garage’s world transform places it at a specific spot on a city map relative to the World Origin. Figure 7 shows that changing the garage’s position on the map will change the car’s world position, but the car’s local position inside the garage remains the same, as it has not moved in relation to the garage.

Local transform vs World transform

Figure 7: Local transform vs World transform: The local transform describes a child prim’s position relative to its parent prim. As shown in the image, moving the garage with the car inside it will change the car’s world transform, but the car’s local transform remains unchanged.
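To make the distinction concrete, here is a small sketch on a throwaway stage (the prim paths are our own, not the chapter’s assets) mirroring the garage example; it uses methods we’ll meet properly in section 6.3.1:

from pxr import Usd, UsdGeom, Gf

# Illustrative parent/child setup
demo_stage = Usd.Stage.CreateInMemory()
garage = UsdGeom.Xform.Define(demo_stage, "/Garage")
car = UsdGeom.Xform.Define(demo_stage, "/Garage/Car")

car.AddTranslateOp().Set(Gf.Vec3d(2, 0, 0))        # the car's local position inside the garage
garage.AddTranslateOp().Set(Gf.Vec3d(100, 0, 50))  # move the garage somewhere on the "map"

# The car's local transform is unchanged by the garage's move...
print(car.GetLocalTransformation().ExtractTranslation())   # (2, 0, 0)

# ...but its world transform follows the garage
time_code = Usd.TimeCode.Default()
print(car.ComputeLocalToWorldTransform(time_code).ExtractTranslation())   # (102, 0, 50)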

Now imagine our example car is a self-driving vehicle. Its understanding of its 3D environment would require knowledge of its local transform relative to its garage, so it would know how to find it again; its world transform, so it would know its position relative to everything else on the map; and the dimensions of its own bounding box, as well as those of other obstacles, to avoid collisions.

Let’s introduce some essential methods that will allow us to manipulate scene layouts from script by programmatically transforming objects. We’ll explore how to retrieve local and world transforms, and create bounding boxes, giving us the power to automate complex tasks, and create dynamic scenes. The methods laid out in this section will also prepare us for later chapters on animation and physics.

We’ll begin by opening the xform.usd stage that we created in section 6.1 and using it to practice getting an object’s transform and adding a bounding box. Later in this section, we’ll build on these techniques by creating a new stage and referencing some dice objects with letters on each side so that we can manipulate them to make them spell out “USD”.

6.3.1 Obtaining World & Local Transforms

Obtaining world and local transforms is particularly useful when you need to perform calculations or operations that involve multiple prims or objects in the scene, such as computing distances, angles, or collisions between objects. By getting the transformation of a prim in world or local space, you can accurately determine its location and orientation in the scene, allowing you to perform tasks such as camera placement relative to the prim, lighting setup around the prim, or robotics and physics simulations.

Remember to ensure that your working directory is set to ‘Ch05’, then let’s use the xform.usd stage we created earlier to obtain world and local transforms and extract data from them:

from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.Open('<path/to/your/xform/stage>')  # replace with the path to your xform.usd file, e.g. './xform.usd'

Program 2 and Program 3 define self-contained functions for calculating the world and local transformation matrices of a given prim. Later, we will build upon the result of Program 2 by further processing the world transform matrix, breaking it down into its translation, rotation (as a quaternion and a rotation matrix), and scale components.

# Defines a function to retrieve the world transform for a specified prim in a given stage
def get_world_transform(stage: Usd.Stage, prim_path: str):
    prim = stage.GetPrimAtPath(prim_path)

    # Wraps the prim into a UsdGeom.Xformable object, which allows access to transformation-related methods
    xform = UsdGeom.Xformable(prim)

    # Obtain the current time code. The transformation of an object can vary over time in animation.
    time_code = Usd.TimeCode.Default()

    # Get the world transform matrix
    world_transform: Gf.Matrix4d = xform.ComputeLocalToWorldTransform(time_code)

    # Returns the computed world transformation matrix
    return world_transform

Program 2: Get Prim World Transform

The function get_world_transform computes the world transformation matrix of a specific primitive (prim) within a stage. After getting the prim, it wraps it as an Xformable, which provides access to its transform information. It then sets the time code to the default (the non-animated value, since an object’s transformation can vary over time in an animation). Finally, it computes the matrix that transforms the prim’s local coordinates to world coordinates at that time and returns it as a Gf.Matrix4d.

Similarly, we can get the local transform by calling the following function:

def get_local_transform(stage: Usd.Stage, prim_path: str):
    prim = stage.GetPrimAtPath(prim_path)

    # Wraps the prim into a UsdGeom.Xformable object, which allows access to transformation-related methods
    xform = UsdGeom.Xformable(prim)

    time_code = Usd.TimeCode.Default()

    # Get the local transform matrix
    local_transformation: Gf.Matrix4d = xform.GetLocalTransformation()

    # Returns the computed local transformation matrix
    return local_transformation

Program 3: Get Prim Local Transform

These functions return the computed world and local transformation matrices. By returning the matrix, each function provides the caller with the prim's transformation in world or local space, which can then be used for further calculations or operations, such as positioning, orienting, or scaling the prim relative to the global or local coordinate system in the scene.

Now that the get_world_transform function is defined, you can call it with different prim_path values to retrieve the world transform for various prims in the stage.

The following example shows how to get the world transform of a prim given its path; let’s use the ‘Cube1’ prim that we created earlier in the chapter, then extract information from the resulting matrix:
world_transform = get_world_transform(stage, "/World/Cube1")


After getting the world transform matrix of a prim, we may want to know its global translation, orientation, and scale so that we can accurately position, align, and manipulate the prim within the scene.

To extract the translation:

# Extracts the translation vector from the world transform matrix
translation: Gf.Vec3d = world_transform.ExtractTranslation()

To extract the rotation:

rotation: Gf.Rotation = world_transform.ExtractRotation() 

Usually, we extract the quaternion instead of the Euler Angle because the quaternion is unique:

q: Gf.Quatd = world_transform.ExtractRotationQuat()

There is no dedicated function for extracting the scale like there is for translation and rotation above, but we can still use the rotation matrix, as it includes both rotation and scaling information. By measuring the length of each vector in the matrix, it is possible to determine the scaling factors for each axis, assuming the matrix represents a combined affine transformation:

rotation_matrix = world_transform.ExtractRotationMatrix() 
# Computes the scale from the lengths of vectors in the rotation matrix and stores it in a Gf.Vec3d object
scale: Gf.Vec3d = Gf.Vec3d([v.GetLength() for v in rotation_matrix])   

Having extracted this data from the world transform matrix, you may wish to see the results, in which case you can print them to your console, using Python’s f-string formatting to combine multiple values into a single print statement:

print(
    f"Translation: {translation}\n"  # f strings allow for embedding expressions inside string literals, using {} braces
    f"Rotation (Quaternion): {q}\n"  # \n is used to insert a new line between each printed component, keeping the output organized
    f"Rotation Matrix: {rotation_matrix}\n"
    f"Scale: {scale}"
)

This script should produce the following output:

Translation: (0, 5, 0)

Rotation (Quaternion): (1, 0.6003580236492053, 0, 0)

Rotation Matrix: ( (2, 0, 0), (0, 1.385600122833253, 1.44225944254301), (0, -1.44225944254301, 1.385600122833253) )

Scale: (2, 2, 2)

Note Printing transformation results can be extremely useful during development, debugging, analysis, and user feedback. However, in production or performance-sensitive environments, printing should be used judiciously, often replaced by logging or more sophisticated monitoring approaches.

As we have already defined the ‘get_local_transform’ function in Program 3, if we want to extract local transformation data we can do so by calling it to obtain local_transformation, then repeating the steps above, replacing ‘world_transform’ with ‘local_transformation’. For example:

# Calls the function to get the local transformation matrix of the specified prim
local_transformation = get_local_transform(stage, "/World/Cube1")

# Extracts the translation component from the local transformation matrix
translation_local: Gf.Vec3d = local_transformation.ExtractTranslation()

Although we extracted the scale of ‘Cube1’ above, the scale is often not very useful on its own, because it only shows how the final transform differs from the original model size, giving a unitless factor. To find the actual size of the prim within the stage, we should use the bounding box method.

6.3.2 Compute the Bounding Box

The motivation for getting the bounding box in 3D is to understand the physical space that an object occupies within a scene. The bounding box provides the minimum and maximum coordinates that fully enclose the object, giving you precise information about its size, position, and orientation. This is essential for tasks like collision detection, which allows objects to ‘physically’ interact with each other, and for realistic scene layout where ‘solid’ objects don’t overlap, ensuring that they fit within a designated area. By using the bounding box, you can make precise decisions about object placement and interaction within the 3D environment.

Bounding boxes are often used in tasks like scene editing, final rendering, or real-time interactions in applications such as digital twins, simulation, gaming or robotics. However, in complex scenes, they can be computationally intensive. OpenUSD addresses this by allowing you to filter bounding box calculations based on the element’s ‘purpose,’ and providing a cache of the results of previous bounding box calculations called the ‘BBoxCache’. Together these elements optimize performance during editing or for the scene’s final use.

Settings for the Purpose Parameter

‘Purpose’ settings are parameters that define which elements of a scene are considered in various operations, such as rendering or calculations. They allow you to filter scene components based on their intended use. The purpose attribute typically has the following settings:

- default: ordinary geometry with no special purpose, included in most operations
- render: high-quality geometry intended for final rendering
- proxy: lightweight stand-in geometry used for faster interactive display
- guide: helper geometry (such as guides for rigging or alignment), usually shown only when explicitly requested

These options help tailor calculations to specific needs, like optimizing performance or ensuring accuracy in certain contexts.
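For example, here is a minimal sketch of asking a bounding box cache (introduced in Program 4 below) to consider both default and render geometry:

from pxr import Usd, UsdGeom

# Consider geometry whose purpose is "default" or "render" when computing bounds
bbox_cache = UsdGeom.BBoxCache(Usd.TimeCode.Default(), [UsdGeom.Tokens.default_, UsdGeom.Tokens.render])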

The following code, Program 4, defines a function called compute_bounding_box that calculates the bounding box of a prim in the scene. It first sets the purpose filter to ‘default’, then creates a BBoxCache object. The function then computes the world-space bounding box for the given prim. Finally, it extracts the minimum and maximum points of this bounding box and returns them; these represent the corners of the box that fully encloses the prim.

# Define the function ‘compute_bounding_box’ for a specified prim in a given stage
def compute_bounding_box(stage, prim_path):
    prim = stage.GetPrimAtPath(prim_path)

    # Set the purpose of getting the bounding box, "default" for general purposes
    purposes = [UsdGeom.Tokens.default_]

    # Get the box cache
    bboxcache = UsdGeom.BBoxCache(Usd.TimeCode.Default(), purposes)

    # Compute the bounding box
    bboxes = bboxcache.ComputeWorldBound(prim)

    # Get the box vertices
    min_point = bboxes.ComputeAlignedRange().GetMin()
    max_point = bboxes.ComputeAlignedRange().GetMax()

    # Returns the computed min and max points of the bounding box
    return min_point, max_point

Program 4: Compute the Bounding Box

Figure 8 shows the min and max points of a bounding box on a given prim. The bounding box reveals the actual size of the prim on the stage by fully enclosing every vertex of the prim. It places the min_point at the bottom, left and rear of the box and the max_point at the top, right and front of the box.

Illustration of a bounding box

Figure 8: Illustration of a bounding box. Computing the bounding box reveals the actual size of the prim in the stage by identifying the corner points at bottom, left, back (min_point) and top, right, front (max_point).

Having defined the function ‘compute_bounding_box’, let’s call it for ‘Cube1’ on our ‘xform.usd’ stage:

prim_path = "/World/Cube1" 

min_point, max_point = compute_bounding_box(stage, prim_path) 
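Although we won’t need it for the dice example, a common next step with these min and max points is a simple axis-aligned overlap test between two prims’ boxes. The helper below is our own minimal sketch, and it assumes that Cube2 from earlier in the chapter was saved to this stage:

def boxes_overlap(min_a, max_a, min_b, max_b):
    # Two axis-aligned boxes intersect only if their ranges overlap on every axis
    return all(min_a[i] <= max_b[i] and min_b[i] <= max_a[i] for i in range(3))

min_point2, max_point2 = compute_bounding_box(stage, "/World/Cube2")
print(boxes_overlap(min_point, max_point, min_point2, max_point2))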

Next, let’s consolidate the knowledge we’ve gained so far by creating a new stage, where we can translate some objects with a goal in mind.

6.3.3 An Example Stage

We’ve explored advanced methods for manipulating xforms. Now, let’s apply these techniques by creating an example OpenUSD scene. We’ll set up a stage with three dice, allowing us to reinforce our understanding of fundamental transformations in relation to other objects. This focused example will deepen your understanding of how Xforms affect the composition and behavior of objects within a stage.

Figure 9 shows the type of dice we will use for our example stage. As it’s a cube with a different letter on each side, it will make an excellent example for calculating bounding boxes and applying transformations with the aim of spelling out the letters “USD”.

Example dice cube for bounding box and transformation

Figure 9: An example of the dice we’ll use to populate a stage. As a cube, it serves as an excellent example for calculating bounding boxes and applying transformations.

Let’s start by creating a new stage called ‘dice_scene’, importing some packages, and adding the dice as an external reference. Remember to consider the directory structure: set your working directory to the folder where you want ‘dice_scene’ to be created and saved, and from which you wish to reference assets in nested folders, just as we did with the statue scene in Chapter 4:

from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("dice_scene.usd")

dice1 =  UsdGeom.Xform.Define(stage, '/World/Dice1')

dice1.GetPrim().GetReferences().AddReference("<your file path to dice.usd ex: './Assets/Dice.usd'>")    

Next, let’s use the concepts we’ve learned in the previous sections to clear the transformation order and apply new transformations to this dice. These steps will include translating, rotating, and scaling the dice to position it within the scene. Notice that we are using Euler angles with the AddRotateXYZOp(), as we are only doing simple, non-continuous rotations:

# Clear any existing transform order
dice1.ClearXformOpOrder()

# Add translate, rotate, and scale operations with initial values
dice1.AddTranslateOp().Set(Gf.Vec3d(220, 100, 100))
dice1.AddRotateXYZOp().Set(Gf.Vec3d(-180, 0, 90))
dice1.AddScaleOp().Set(Gf.Vec3d(1, 1, 1))

This process not only positions the dice within the scene but also gives us a reference point from which we will position other objects. By adding more dice and applying transformations relative to Dice1, we will use three dice to spell out ‘USD’. This time we’re going to use an internal reference, as we already have an example of the dice on the stage:

# Create another Xform as the second dice
dice2 = UsdGeom.Xform.Define(stage, '/World/Dice2')

# Use internal reference to "copy" the first dice
dice2.GetPrim().GetReferences().AddInternalReference("/World/Dice1")

As the second dice is an exact copy of the first dice, it will appear in the same location on the stage. We’re going to reposition Dice2 by extracting data from Dice1’s transformation matrix, computing Dice1’s bounding box, and then giving Dice2 an offset from Dice1. The first step is to set up the necessary geometric calculations: we need the functions that get Dice1’s world transform and compute its bounding box, so let’s start by revisiting ‘get_world_transform’ from Program 2.

We can call the get_world_transform function to retrieve the translation and rotation values of the first dice:

dice1_transform_matrix = get_world_transform(stage, "/World/Dice1")

dice1_translation = dice1_transform_matrix.ExtractTranslation()

# Get transform information from Dice1
dice1_rotation = dice1_transform_matrix.ExtractRotationQuat().GetNormalized()

Note that here we used the GetNormalized() function after ExtractRotationQuat() to ensure that the quaternion is perfectly normalized and ready to use in further calculations or transformations.

Then let’s call the ‘compute_bounding_box’ function to calculate the bounding box for Dice1:

# Compute the bounding box information from Dice1
box_min, box_max = compute_bounding_box(stage, "/World/Dice1")

# Compute the dice's width along the x-axis from the bounding box corners
dice_size = box_max[0] - box_min[0]

Next, we’ll define the variable ‘dice2_translation’, which will be used later to position Dice2 relative to Dice1. By using the value of ‘dice_size’, we’ll apply an offset along the X-axis with Gf.Vec3d(dice_size, 0, 0) to determine Dice2’s placement.

# Calculate the position of dice2 by applying an offset equivalent to the ‘dice_size’ along the x-axis
dice2_translation = dice1_translation - Gf.Vec3d(dice_size, 0, 0) 

As we are aiming to spell ‘USD’ with these dice, let’s ensure that the correct face of the dice is facing forwards by applying a rotation to the second dice. Here we will define the variable ‘further_rotation’ which, when we apply it, will add additional rotation to Dice2, relative to the rotation of Dice1:

further_rotation = Gf.Rotation(Gf.Vec3d(1, 0, 0), 90).GetQuat()

# Use quaternion multiplication to rotate the dice a further 90° around the x-axis
dice2_rotation = further_rotation * dice1_rotation  

Finally, having prepared the variables for the new transforms, let’s apply them to the second dice:

dice2.ClearXformOpOrder() 

dice2.AddTranslateOp().Set(dice2_translation) 

dice2.AddOrientOp().Set(Gf.Quatf(dice2_rotation))

dice2.AddScaleOp().Set(Gf.Vec3d(1, 1, 1)) 

Figure 10 illustrates Dice2’s translation using an offset from Dice1, followed by a rotation that builds upon Dice1’s existing rotation.

Positioning Dice2 relative to Dice1's bounding box

Figure 10: Showing how Dice2 is positioned relative to Dice1’s bounding box. First, Dice2 is translated along the X-axis by an amount equal to the dice size. Next, it is rotated by 90° relative to Dice1’s existing rotation.

We can repeat the previous process to create another dice, then transform it relative to Dice2 by defining ‘dice3_translation’ and ‘dice3_rotation’, then applying them:

dice3 = UsdGeom.Xform.Define(stage, '/World/Dice3')

dice3.GetPrim().GetReferences().AddInternalReference("/World/Dice1")

dice3_translation = dice2_translation - Gf.Vec3d(dice_size, 0, 0)

dice3_rotation = Gf.Rotation(Gf.Vec3d(0, 1, 0), 90).GetQuat() * dice2_rotation 

dice3.ClearXformOpOrder() 

dice3.AddTranslateOp().Set(dice3_translation) 

dice3.AddOrientOp().Set(Gf.Quatf(dice3_rotation))

dice3.AddScaleOp().Set(Gf.Vec3d(1, 1, 1))

Figure 11 shows the stage after we have introduced three copies of Dice.usd and applied the various transformations to spell out ‘USD’.

Outcome of manipulating the xforms of three dice

Figure 11: Showing the outcome of manipulating the xforms of three dice within the scene, showcasing the effects of translation, rotation, and scaling relative to data derived from the transformation matrix of Dice1.

6.3.4 Enhancing the Look of Your Stage

A well-designed stage not only captures attention but also conveys the intended mood and message more effectively. By carefully curating the backdrop, lighting, and object materials (as learned in the previous chapters), you can create a visually cohesive and immersive environment that draws viewers in. Let’s take this opportunity to build on skills learned in previous chapters to enhance the look of our dice scene.

To enhance the visual appeal of your stage, several common techniques can be employed. Here are some suggestions, but feel free to experiment with other techniques that we’ve covered in the earlier chapters:

Adding a Simple Background

This can set the tone and provide context to your scene, serving as a foundational visual element. We have provided a Backdrop.usd in the ‘Ch05’ folder of assets for this chapter:

backdrop =  UsdGeom.Xform.Define(stage, '/World/Backdrop')

backdrop.GetPrim().GetReferences().AddReference("<your file path to Backdrop.usd ex: './Assets/Backdrop.usd'>")

Designing Thoughtful Lighting

The strategic placement and varying intensities of light sources can highlight key areas, create depth, and evoke specific moods. For example, the following code will add a Distant Light to give a strong directional light over the whole scene. Then we will rotate it so that it throws some shadows across some of the dice to enhance the sense of solidity and depth in the image:

from pxr import UsdLux

distant_light = UsdLux.DistantLight.Define(stage, "/World/Lights/DistantLight")

distant_light.AddRotateXYZOp().Set(Gf.Vec3d(-51.3, 0, -46.1))

distant_light.CreateIntensityAttr(750)

Now, you could experiment by adding additional lights to soften the shadows.
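For instance, a low-intensity dome light can act as a fill light (the path and intensity here are just suggestions):

# A dome light provides soft, even illumination from all directions, softening the distant light's shadows
dome_light = UsdLux.DomeLight.Define(stage, "/World/Lights/DomeLight")
dome_light.CreateIntensityAttr(300)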

Varying the Colors and Materials

Variation in color can be used to add richness and contrast, making the scene more exciting and visually engaging. For example, you can change the color of the second dice to red by editing the properties of the material ‘Dice_Color’, whose shader is located at the path "/World/Dice2/materials/Dice_Color/preview_Principled_BSDF":

from pxr import UsdShade

# Retrieve the shader at the specified path 
shader_path = "/World/Dice2/materials/Dice_Color/preview_Principled_BSDF"
shader = UsdShade.Shader(stage.GetPrimAtPath(shader_path))

# Set the diffuse color of the shader to a red value
shader.GetInput("diffuseColor").Set(Gf.Vec3f(0.8, 0, 0))

If you have already looked at the materials on the dice you may have noticed that there are two materials, one for the dice color and one for the letter color. Therefore, if you want to change the color of the letters on one of your dice, you will need to change the properties of the material called ‘Letter_Color’, whose shader path would be "/World/Dice2/materials/Letter_Color/preview_Principled_BSDF_001".
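Following the same pattern as above, here is a sketch of changing the letter color on Dice2 (the color value is just an example):

# Retrieve the letter-color shader for Dice2 and set its diffuse color
letter_shader_path = "/World/Dice2/materials/Letter_Color/preview_Principled_BSDF_001"
letter_shader = UsdShade.Shader(stage.GetPrimAtPath(letter_shader_path))
letter_shader.GetInput("diffuseColor").Set(Gf.Vec3f(1.0, 1.0, 0.0))  # e.g. yellow letters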

Varying Object Positions

Varying your objects’ positions on stage can create a more visually interesting scene, guiding the audience’s focus and enhancing the overall composition, which involves designing your own preferred object transforms. Let’s try doing that in the following exercise.

Stage with lighting and material enhancements

Figure 12: The stage after being enhanced with thoughtful lighting, dynamic object placement, and vibrant material contrasts, creating a visually striking scene.

Summary