SVG and JavaScript: transform viewport coordinates into element coordinates

A couple of months ago I built a JavaScript application that allows adding points and labels to locations on a building floorplan. The whole canvas (not an HTML <canvas>) is an SVG document inside an HTML document, and points/objects/labels/etc. are added to that canvas as native SVG elements.

Users can add/move objects on the floorplan but also zoom and pan the floorplan itself.

When performing these actions it is important to transform coordinates from the screen or viewport (like the position of your mouse/fingers) into coordinates that make sense in your SVG element’s coordinate system.

In this post I want to share some of my notes and a simplified example of how to achieve this using a group element with a transform attribute and a few objects (circle elements) inside.

Panning, Zooming, Dragging

Let’s say we have a setup as shown in the following GIF:

Zooming, panning and dragging

We can move the canvas, we can move objects, zoom in and then move objects again in a zoomed/panned state.

If you look at the inspector window you see that the zooming and panning is performed by changing the transform attribute on the group element with id main, and that moving an object (a circle element) just changes the respective cx and cy values (as the objects are inside the transformed group).
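
To make the relationship between cx/cy and the group’s transform more concrete, here is a small sketch of how a matrix(a b c d e f) maps a point from the group’s own coordinates into its parent’s coordinates. The helper name toParentCoordinates and the sample numbers are my own assumptions for illustration; they are not part of the application.

```javascript
// Hypothetical helper (not part of the original code): applies an SVG
// transform matrix [a, b, c, d, e, f] to a point given in the group's
// own coordinates and returns where it lands in the parent's coordinates.
function toParentCoordinates([a, b, c, d, e, f], {x, y}) {
  return {
    x: a * x + c * y + e,
    y: b * x + d * y + f
  };
}

// identity transform: the point stays where it is
const untouched = toParentCoordinates([1, 0, 0, 1, 0, 0], {x: 512, y: 284});

// zoomed in 2x and panned by (-100, -50): cx/cy of the circle are unchanged,
// but the circle is drawn at (924, 518) in the parent's coordinates
const zoomedAndPanned = toParentCoordinates([2, 0, 0, 2, -100, -50], {x: 512, y: 284});
```

This is why zooming and panning never touch the circles themselves: only the group’s matrix changes.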

All the movements follow the mouse pointer in a reasonably fast manner.

Implementing the start of the pan

A first (incomplete) approach to implementing the panning or dragging could be to store clientX and clientY from the mousedown or touchstart event (for touch events, from the first entry in the event’s touches list) and then calculate the difference from the clientX and clientY of each mousemove or touchmove event. While this difference represents the distance the mouse moved in the browser’s viewport (window), it cannot be applied directly to the transformation matrix of our main element: the SVG document may not cover the whole viewport, may be rendered at a scaled size, and so on. In any case, we are dealing with at least two different coordinate systems.

The effect of not converting the coordinates is usually that objects move either much faster or much slower than your mouse pointer.
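
A quick back-of-the-envelope sketch shows where the speed difference comes from. It assumes an SVG with a viewBox of 1024 x 768 that is rendered at 512 x 384 CSS pixels and scaled uniformly to fit (the default "meet" behaviour of preserveAspectRatio). The function names and the rendered size are made up for this illustration.

```javascript
// Hypothetical helpers for illustration only.

// How many CSS pixels one viewBox unit occupies when the SVG is
// scaled uniformly to fit into the rendered box.
function pixelsPerUnit(renderedWidth, renderedHeight, viewBoxWidth, viewBoxHeight) {
  return Math.min(renderedWidth / viewBoxWidth, renderedHeight / viewBoxHeight);
}

// Converts a distance in CSS pixels into a distance in viewBox units.
function clientDeltaToUserUnits(deltaPx, scale) {
  return deltaPx / scale;
}

const scale = pixelsPerUnit(512, 384, 1024, 768); // 0.5 pixels per unit
const moved = clientDeltaToUserUnits(10, scale);  // a 10 px mouse move is 20 units
```

Applying the raw 10 pixels to the matrix here would move the canvas only half as far as the pointer; with an SVG rendered larger than its viewBox it would move too far instead.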

Let’s now see how we can transform the coordinates you get from events to coordinates you can use in your SVG document.

A minimal HTML document with an SVG document inside (as seen in the GIF above) could look like this:

<html>
  <head>
    <title>SVG app</title>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>
  <body>
    <div id="canvas-wrapper">
      <svg
        id="canvas"
        viewBox="0 0 1024 768"
        preserveAspectRatio="xMidYMid"
        xmlns="http://www.w3.org/2000/svg"
      >
        <g
          id="main"
          transform="matrix(1 0 0 1 0 0)"
        >
          <image
            x="0"
            y="0"
            width="1024"
            height="768"
            xlink:href="background.png"
          ></image>
          <!-- some "random" points -->
          <circle id="p1" cx="512" cy="284" r="4" fill="blue" stroke="black" />
          <circle id="p2" cx="500" cy="300" r="4" fill="red" stroke="black"/>
          <circle id="p3" cx="490" cy="330" r="4" fill="red" stroke="black" />
          <circle id="p4" cx="430" cy="250" r="4" fill="red" stroke="black" />
        </g>
      </svg>
    </div>
  </body>
</html>

Let’s say we want to start our panning action when the user clicks on the div with the id canvas-wrapper.

Once the document is ready, we register the event listener, for example like this:

document.getElementById('canvas-wrapper').addEventListener('mousedown', onPanStart);

Our function onPanStart could look like this:

function onPanStart(e) {
  const {clientX, clientY} = e;

  // get our own transform matrix from the main element
  // (we will modify this as we move our mouse along)
  const currentTransform = getTransform();

  // get the transform matrix used to convert from the SVG element's
  // coordinates to the screen/viewport coordinates
  const sctm = document.getElementById('main').getScreenCTM();

  const mouseStart = transformFromViewportToElement(clientX, clientY, sctm, currentTransform);

  canvas = {
    mouseStart,
    transform: currentTransform,
    sctm
  }

  document.getElementById('canvas-wrapper').addEventListener('mousemove', onPan);
  document.getElementById('canvas-wrapper').addEventListener('mouseup', onPanEnd);
}

In a nutshell:

  • We get the clientX and clientY representing the point in the viewport where the event (mousedown) occurred.
  • Then we get the transform matrix of our main group (in the beginning [ 1, 0, 0, 1, 0, 0 ]) which we later use to translate/scale. The getTransform function simply reads the transform attribute and creates an array from it:
function getTransform() {
  const matrix = document.getElementById('main').getAttribute('transform');
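  // split on whitespace and/or commas, as both are valid separators
  // in a transform attribute (onPan writes the values comma-separated)
  return matrix.replace(/^matrix\(/, '').replace(/\)$/, '').trim().split(/[\s,]+/).map(parseFloat);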
}
  • Also from our main group element, we get (via getScreenCTM) the matrix that transforms the element’s coordinate system to the viewport’s coordinate system.
  • Now comes the interesting part: we transform our coordinates from the event into coordinates for the element’s coordinate system (transformFromViewportToElement function explained below).
  • Finally, we store the values in a variable named canvas (let canvas = {}) for later access and add the event listeners for the actual panning (mousemove) and end (mouseup).
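
The getTransform function from the list above expects the attribute to exist and to be a matrix(...) value. If that cannot be guaranteed, a slightly more defensive variant could look like the following sketch. The names parseMatrixAttribute and formatMatrixAttribute are hypothetical, and falling back to the identity matrix for anything that is not a six-value matrix(...) is my own assumption, not something the application requires.

```javascript
// Hypothetical, more defensive variant of the attribute parsing.
const IDENTITY = [1, 0, 0, 1, 0, 0];

// Parses a transform attribute value such as "matrix(1 0 0 1 0 0)" or
// "matrix(1, 0, 0, 1, 0, 0)" into an array of six numbers.
// Assumption: anything else (missing attribute, translate(...), wrong
// number of values) falls back to the identity matrix.
function parseMatrixAttribute(value) {
  const match = /^\s*matrix\(([^)]*)\)\s*$/.exec(value || '');
  if (!match) {
    return [...IDENTITY];
  }
  const numbers = match[1].trim().split(/[\s,]+/).map(parseFloat);
  if (numbers.length !== 6 || numbers.some(Number.isNaN)) {
    return [...IDENTITY];
  }
  return numbers;
}

// Serializes the array back into a value for the transform attribute.
function formatMatrixAttribute(numbers) {
  return `matrix(${numbers.join(' ')})`;
}
```

It would be called with the raw attribute, for example parseMatrixAttribute(document.getElementById('main').getAttribute('transform')).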

All the magic happens in the transformFromViewportToElement function:

function transformFromViewportToElement(x, y, sctm=null, elementTransform=null) {
  // Transforms coordinates from the client (viewport) coordinate
  // system to coordinates in the SVG element's coordinate system.
  // Call this, for example, with clientX and clientY from mouse event.

  // create a new DOM point based on coordinates from client viewport
  const p = new DOMPoint(x, y);

  // get the transform matrix used to convert from the SVG element's
  // coordinates to the screen/viewport coordinate system
  let screenTransform;
  if (sctm === null) {
    screenTransform = document.getElementById('main').getScreenCTM();
  } else {
    screenTransform = sctm;
  }

  // now invert it, so we can transform from screen/viewport to element
  const inverseScreenTransform = screenTransform.inverse()

  // transform the point using the inverted matrix
  const transformedPoint = p.matrixTransform(inverseScreenTransform)

  // adjust the point for the currently applied scale on the element
  // (assumes the matrix only scales and translates, without rotation or skew)
  if (elementTransform !== null) {
    transformedPoint.x *= elementTransform[0]; // scale x
    transformedPoint.y *= elementTransform[3]; // scale y
  }

  return {x: transformedPoint.x, y: transformedPoint.y}
}

I commented each step in the function above. Essentially, we create a point with our event coordinates, take the matrix used to convert from the element to the viewport, invert it (since we want the opposite) and then do a matrix transform of our point with said matrix. In case we are working with a transformed element (as is the case for our canvas panning, but not for object dragging) and are in a zoomed-in/out state, we also want to apply that scale to our coordinate.
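
If you are curious what inverse() and matrixTransform() do under the hood, the following sketch performs the same arithmetic on plain arrays, without any DOM. The function names and the sample matrix [2, 0, 0, 2, 115, 55] are assumptions for illustration; in the browser the real values come from getScreenCTM.

```javascript
// Hypothetical plain-array versions of the two DOM operations.

// Inverts a 2D affine matrix given as [a, b, c, d, e, f].
function invertMatrix([a, b, c, d, e, f]) {
  const det = a * d - b * c;
  if (det === 0) {
    throw new Error('matrix is not invertible');
  }
  return [
    d / det,
    -b / det,
    -c / det,
    a / det,
    (c * f - d * e) / det,
    (b * e - a * f) / det
  ];
}

// Applies a matrix [a, b, c, d, e, f] to a point.
function transformPoint([a, b, c, d, e, f], {x, y}) {
  return {
    x: a * x + c * y + e,
    y: b * x + d * y + f
  };
}

// assumed screen CTM: element coordinates are scaled by 2 and
// offset by (115, 55) on their way to the viewport
const screenCTM = [2, 0, 0, 2, 115, 55];

// a click at (315, 255) in the viewport maps back to (100, 100)
// in the element's coordinate system
const local = transformPoint(invertMatrix(screenCTM), {x: 315, y: 255});
```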

Actual panning and ending

To complete our basic example, let’s see what the onPan and onPanEnd functions could look like:

function onPan(e) {
  const {clientX, clientY} = e;
  const client = transformFromViewportToElement(clientX, clientY, canvas.sctm, canvas.transform);

  // calculate how much we have moved from the starting point
  const movement = {
    x: canvas.mouseStart.x - client.x,
    y: canvas.mouseStart.y - client.y
  };

  // set `tx` and `ty` (translate x, y) of matrix with the offset that
  // was set at the beginning of the movement minus the actual movement.
  const startMatrix = [...canvas.transform];
  startMatrix[4] = startMatrix[4] - movement.x;
  startMatrix[5] = startMatrix[5] - movement.y;

  // update the actual transform attribute of the SVG `main` group
  document.getElementById('main').setAttribute(
    'transform', `matrix(${startMatrix.join(', ')})`);
}

function onPanEnd(e) {
  canvas = {};

  document.getElementById('canvas-wrapper').removeEventListener('mousemove', onPan);
  document.getElementById('canvas-wrapper').removeEventListener('mouseup', onPanEnd);
}

In the onPan function we get and transform the new event coordinates (where the mouse pointer is now) in the same way as we did before, then calculate the difference between start and current position, apply the difference to the tx and ty (indices 4 and 5) of the original transform matrix (as it was at the start of the pan), and finally set the transform attribute of main to the updated matrix.

Note that we are changing the matrix of our transform attribute of the group here – this is not the matrix obtained by getScreenCTM!
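
The matrix arithmetic of the pan can also be isolated into a small pure function, which makes it easy to check with plain numbers. This is only a sketch; the name panMatrix is hypothetical, and the points are expected to be already converted with transformFromViewportToElement.

```javascript
// Hypothetical helper: returns a new matrix that is the start matrix
// shifted by the distance between the start point and the current point.
// Both points must already be in the converted (scaled) coordinates.
function panMatrix(startMatrix, startPoint, currentPoint) {
  const next = [...startMatrix]; // copy, so the start matrix stays untouched
  next[4] = startMatrix[4] + (currentPoint.x - startPoint.x); // tx
  next[5] = startMatrix[5] + (currentPoint.y - startPoint.y); // ty
  return next;
}

// zoomed 2x, currently panned by (10, 20); pointer moved 30 right and 10 up
const start = [2, 0, 0, 2, 10, 20];
const panned = panMatrix(start, {x: 100, y: 100}, {x: 130, y: 90});
```

Always computing from the matrix stored at the start of the pan, instead of adding small increments to the current attribute on every mousemove, avoids accumulating rounding errors.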

In onPanEnd we simply reset our global canvas variable and then remove the listeners.
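
The handlers in this post only deal with mouse events. Touch events carry their coordinates in a list of touch points rather than on the event itself, so one possible way to reuse the same handlers is a small helper like the one below. The name getClientPoint is hypothetical, and using only the first touch point is a simplification I am assuming here (it ignores multi-touch gestures such as pinch zoom).

```javascript
// Hypothetical helper: returns clientX/clientY for mouse and touch events.
// Assumption: only the first touch point matters.
function getClientPoint(e) {
  // touchstart/touchmove list active touches in `touches`; on touchend
  // that list is empty and the lifted finger is in `changedTouches`
  const touch =
    (e.touches && e.touches[0]) ||
    (e.changedTouches && e.changedTouches[0]);
  const source = touch || e; // mouse events carry the values directly
  return {clientX: source.clientX, clientY: source.clientY};
}
```

Inside onPanStart and onPan, the first line would then become const {clientX, clientY} = getClientPoint(e);.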

Dragging and dropping objects

The dragging of objects could be implemented in the same way (using transformFromViewportToElement). The only difference here is that we would modify the x and y coordinates (or rather cx and cy) directly (instead of a transform attribute), and also ignore the current scale of the group element’s transform matrix (set the elementTransform parameter of transformFromViewportToElement to null).
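
No extra scaling is needed for objects because the inverted screen matrix already yields coordinates in the group’s own coordinate system, which is exactly the system cx and cy are expressed in. A minimal sketch of the drag logic could look like this. The names startDrag and dragTo are hypothetical, and both points are assumed to come from transformFromViewportToElement(clientX, clientY, sctm, null).

```javascript
// Hypothetical drag helpers; points are assumed to be already converted
// with transformFromViewportToElement(clientX, clientY, sctm, null).

// Remembers the circle's center and the pointer position at mousedown.
function startDrag(element, startPoint) {
  return {
    element,
    startPoint,
    startCenter: {
      cx: parseFloat(element.getAttribute('cx')),
      cy: parseFloat(element.getAttribute('cy'))
    }
  };
}

// Moves the circle by the distance the pointer has travelled since mousedown.
function dragTo(drag, currentPoint) {
  const cx = drag.startCenter.cx + (currentPoint.x - drag.startPoint.x);
  const cy = drag.startCenter.cy + (currentPoint.y - drag.startPoint.y);
  drag.element.setAttribute('cx', cx);
  drag.element.setAttribute('cy', cy);
  return {cx, cy};
}
```

In a mousedown listener on a circle you would call startDrag(e.target, point) once, and then dragTo(drag, point) from every mousemove until mouseup.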
