Hackspace Project 2: Revolution

"Revolution" is a technology mediated, interactive art installation, exploring (and allowing an audience to play with) the actions available to citizens when faced with some oppressive or undesirable controlling force. It plays with the balance of power between people and the forces they reject. Participants use poses, detected by a camera and a machine learning model, to act as a leader for a tribe, and they are free to do nothing, to rally their tribe, to influence others across tribes, or to act in unison. Revolution is the ultimate, unified action across tribal beliefs, and having the greatest power against the oppressive force.

This is not about winners and losers, or political statements; it is simply a playground for individuals to explore the power of actions at different scales and different levels of unity, and of many individuals acting with a single purpose.

A video demonstration of the project

The Project files for the installation can be found on GitHub

Theme setting and ideation for the project: hackathon 2

On 21 October 2019 we held a theme setting day to collectively agree a set of themes for the upcoming hackathon.

The ideation exercises, and my own subsequent exploration of the agreed themes, are fully documented in this note

To summarise, we explored several ideation techniques to generate themes and narrow them down by consensus. Many techniques were discussed, but those employed included:

  • Reverse brainstorming

  • Generating constraint words (literally words/themes that could act as creative constraints to an activity)

  • Observe a different place (walk around a novel environment and pay attention)

  • Embrace absurdity

These generated a long list of possible key words and constraints. As a group, we combined, refined and distilled them to a shortlist, which was reduced by popular vote to just three.

  • Revolution

  • Kinetic

  • Reflect

Through my own ideation and exploration of these themes, documented in the linked note above, I generated several alternative, creative responses to each theme. I settled on exploring Revolution through an interactive installation that encouraged participants to learn how joined-up action is more powerful than individual action.

I looked at the different ways in which any project can deliver value and aligned this project as follows:

Value Realm - where are we aiming to deliver value?

My personal aims for the project (why?)

  • Create a digital artefact of value (see value realm above)

  • Maximise opportunities to gain marks on the hack-space mark scheme

  • Create an experience for users/participants that makes something abstract, difficult or uncomfortable become real, experienced, sensory, visceral.

  • Gain some creative development/stretch

  • Gain some technical development/stretch

  • Create something that contributes to my portfolio

The technologies I will use, at least initially, to start the hackathon are:

  • p5.js for prototyping

  • Web as a device interface

  • Processing for scale and performance

Expanding the idea

I created this mood board, exploring themes of revolution and the way the project might develop

Planning the project

With mixed success, I used a Trello board to try and use the available time effectively.

Hackathon 2: 7 November 2019

I began the hackathon having developed a response to the chosen theme, and with some specific ideas for prototyping. I could not be sure how far I would get during the day, or whether explorations might throw up obstacles and new possibilities.

What did I want out of the hack day?

  1. Learn what the requirements are for the media wall. How do you distribute workload?

  2. Re-acquaint with Processing and managing lists of objects. Can I use P3D to get depth and use lighting/materials, or should I keep it flat and use 2.5D effects?

  3. Model possible user interactions

  4. Think about the relationship between user actions and visual effects

I finished the day with refreshed confidence in Processing object orientation and 3D geometry, plus some confidence around real-time data sharing between web clients using Node.js and Socket.IO. The 3D rotating cylinder seemed a good visual structure for the project. Some misgivings were emerging around how to interface web and Processing effectively.

Image of the basic vision and possible technology architecture

Development Streams

After the hackathon, development work carried on along a few separate but parallel streams.

The progress along each of these streams is documented separately.

Technology

From the start I planned to use Processing for visuals, as it is pretty fast, I know it (or can reacquaint myself after a three-year break), and it can render in 3D space (which is part of the vision). There are also many ways to add interaction.

The original vision included a real-time web interface with visitors’ mobile phones, so they can interact on their screens by touch and witness the effect in the large-screen narrative. This would use a JavaScript client, with p5.js handling visuals and touch interaction, and Socket.IO communicating in real time with a Node.js backend.

The Node server would need to communicate the current player state to the Processing sketch.

There is a Socket.IO implementation for Java, but to use it we must run Processing as a library in full Java mode, using Eclipse as the editor and adding libraries through the Maven package manager. In this mode, both Processing and Socket.IO are added as dependencies.

Processing seems to work in Java mode only if you install a deprecated Java virtual machine (version 8, as part of JDK 1.8); the current JVM is version 13.

However, in this mode we could not get Maven to add the Socket.IO library, as its required modern dependencies are not met under this deprecated JVM.

This technology stack was looking increasingly fragile. That brought me back to using JavaScript and p5.js for the large-screen work, but this came with the following limitations:

  • 2D only, as the WEBGL renderer in p5.js is too limited; and

  • performance limits - the number of objects in the render space, or the complexity of those objects (from a rendering/compute viewpoint), is much more restricted than in Processing.

Something like Three.js would overcome the efficiency and 3D limitations, but was judged to have too steep a learning curve to gain creative mastery of in the time available.

Sticking with Processing, then, I examined my other options for interfacing. WebSockets is a low-level interface for which both JavaScript and Java libraries exist.

Briefly, I looked at Spacebrew, a promising-looking middleware for communicating data between web and Processing clients. It seemed pretty viable.

Another possibility was to abandon web clients and use other, more direct interfaces to Processing, such as pressure pads or a Microsoft Kinect. Pressure pads are pretty limited in subtlety for interaction. The Kinect is great, and I have some experience of using it with Processing in a previous project, SeePilgrims.

By chance, I came across RunwayML, a framework for running machine learning (ML) models for creative projects, with documented Processing access using OSC (Open Sound Control) as the protocol. One of the popular ML models featured was PoseNet, a human pose detection model. An introductory video on using Runway and OSC with Processing was made by the Coding Train YouTube channel.

Although Spacebrew was looking quite viable as a way to keep JavaScript-enabled device interfaces in the mix, I really liked the idea of pose detection from viewers’ physical presence as the interface, so the new technology stack became as follows.

Pose Detection

Running RunwayML locally, with the PoseNet ML model installed, pose detection works pretty effectively with an external webcam, with resolution down to 300x200 for speed (a few frames a second is sufficient). Pose data takes the form of x,y coordinates for key body joints/features, measured against the resolution of the input image. This gives us a simplified skeleton of points for each detected body in the image.

I can receive the pose data from the model over OSC into a Processing sketch as JSON data. Processing has methods to decompose JSON and extract the data as more Java-ready data types.
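As a sketch of what this looks like in practice, the fragment below receives Runway's OSC messages with the oscP5 library and decodes the JSON with Processing's built-in parser. The port (57200), the /data address and the payload field names are assumptions based on Runway's documented defaults at the time; the exact structure depends on the Runway and PoseNet versions in use.

```
import oscP5.*;
import netP5.*;

OscP5 oscP5;
JSONObject lastData;   // most recent pose payload from Runway

void setup() {
  size(640, 480);
  oscP5 = new OscP5(this, 57200);   // listen on Runway's default OSC output port (assumption)
}

void oscEvent(OscMessage msg) {
  if (msg.checkAddrPattern("/data")) {
    // Runway sends the model output as a single JSON string argument
    lastData = parseJSONObject(msg.get(0).stringValue());
  }
}

void draw() {
  background(0);
  fill(255);
  noStroke();
  if (lastData == null) return;
  // assume "poses" is an array of skeletons, each a list of [x, y] keypoints;
  // scale the coordinates to your canvas depending on how your Runway version reports them
  JSONArray poses = lastData.getJSONArray("poses");
  for (int p = 0; p < poses.size(); p++) {
    JSONArray keypoints = poses.getJSONArray(p);
    for (int k = 0; k < keypoints.size(); k++) {
      JSONArray pt = keypoints.getJSONArray(k);
      ellipse(pt.getFloat(0), pt.getFloat(1), 6, 6);
    }
  }
}
```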

Runway with the PoseNet model running

Although throughout this project we run Runway on the same machine as the Processing sketch (with OSC effectively pointing back to the local IP address), the webcam and Runway pose detection could potentially run on a separate machine, leaving more CPU capacity for both Processing and Runway to use, without competition.

Target poses

Turning skeleton points into body part angles, then into poses

The next challenge was to process raw skeleton data into recognisable poses, as input for the project, free from ambiguity and with sufficient reliability. One experiment used the relative height of one joint with respect to another on each side, for example whether the left wrist was higher than the left elbow, which was in turn higher than the left shoulder. This was partially successful, but discerning multiple poses was less effective.

Instead, I used simple trigonometry to determine the angle of the lower and upper arms on each side (using atan2). This allows the angle of each segment to be compared with an exemplar for the pose, within a defined tolerance.
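The snippet below sketches this atan2 approach: the angle of each arm segment is measured and compared against an exemplar angle within a tolerance. The keypoints, the exemplar angles and the tolerance value are illustrative, not the exact values used in the project.

```
float segmentAngle(PVector from, PVector to) {
  // angle of a limb segment in radians, relative to the positive x axis
  return atan2(to.y - from.y, to.x - from.x);
}

boolean withinTolerance(float angle, float target, float tolerance) {
  // compare angles on a circle, so that values near -PI and PI count as close
  float diff = abs(atan2(sin(angle - target), cos(angle - target)));
  return diff < tolerance;
}

boolean matchesRaisedArm(PVector shoulder, PVector elbow, PVector wrist) {
  float upper = segmentAngle(shoulder, elbow);
  float lower = segmentAngle(elbow, wrist);
  // exemplar: upper arm roughly horizontal, forearm pointing upwards
  // (screen y increases downwards, so "up" is around -HALF_PI)
  float tolerance = radians(25);
  return withinTolerance(upper, 0, tolerance) && withinTolerance(lower, -HALF_PI, tolerance);
}

void setup() {
  // quick check with made-up keypoints: shoulder-to-elbow horizontal, elbow-to-wrist straight up
  PVector shoulder = new PVector(100, 200);
  PVector elbow = new PVector(150, 200);
  PVector wrist = new PVector(150, 150);
  println(matchesRaisedArm(shoulder, elbow, wrist));   // expected: true
}
```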

A simple Processing sketch using data to guess poses, with mixed success

This is successful enough that, with some refinement, it could be encapsulated into a Processing module which has a defined area of the camera space in which to check for poses, for each potential player. This module is passed the raw pose data from Runway, performs its calculations, and then returns either a matching pose or "NONE".
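A possible shape for that module is sketched below, building on the angle helpers above. The class name, the keypoint ordering and the pose names are hypothetical rather than taken from the project code.

```
// Keypoint indices here follow PoseNet's usual ordering
// (0 = nose, 5 = left shoulder, 7 = left elbow, 9 = left wrist) - an assumption.
class PoseDetector {
  float regionLeft, regionRight;   // slice of camera space owned by this player

  PoseDetector(float left, float right) {
    regionLeft = left;
    regionRight = right;
  }

  // Given the decoded keypoints for one detected body, return the name of a
  // matching pose, or "NONE" if the body is outside this player's region or
  // no exemplar matches within tolerance.
  String detect(PVector[] keypoints) {
    PVector nose = keypoints[0];
    if (nose.x < regionLeft || nose.x > regionRight) return "NONE";
    if (matchesRaisedArm(keypoints[5], keypoints[7], keypoints[9])) return "RALLY";
    // ...further exemplar checks for the other actions would go here...
    return "NONE";
  }
}
```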

At this point the pose detection was merged with an early version of the visual/narrative project code. Up to this point, a mouse-driven proxy for poses had been used to drive the user actions and influence the visuals. This interface was retained, as it was often useful for testing the visuals without starting the CPU-intensive Runway.
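The mouse proxy can be as simple as the hypothetical fragment below, which maps mouse state onto the same pose names the detector would return (the actual mapping in the project may differ).

```
// Hypothetical fallback: mouse input stands in for detected poses during testing.
String proxyPose() {
  if (!mousePressed) return "NONE";
  // left half of the window stands in for one action, right half for another
  return (mouseX < width / 2) ? "RALLY" : "REVOLT";
}
```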

Performance

I always wanted to squeeze the richest possible visual experience out of Processing. This means lots of objects being updated, and lots of vertices for the code to calculate and render, which puts quite a load on the CPU. With the media wall magnifying every pixel, I needed to work hard to keep the visuals subtle.

However, it was also clear that Runway, running the PoseNet model, was competing for the CPU. Both pieces of software are capable of using the GPU to offload some of the work. It turns out that PoseNet can only do this when running on Linux, although this could change in the future. I also think that Processing does not offload as much to the GPU as you might expect unless you write your render code with this in mind. For these reasons you really do need a powerful, games-spec PC to run the project without lag.

Possible future versions of the project could work to ensure the GPU is employed more fully, improving performance.
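One general Processing technique for this, not taken from the project code, is to build static geometry once as a retained PShape, so the renderer can cache the vertex data instead of being handed every vertex again each frame. A minimal sketch:

```
PShape ring;

void setup() {
  size(800, 600, P3D);            // the OpenGL renderer is needed for retained shapes
  ring = createShape();
  ring.beginShape(TRIANGLE_STRIP);
  ring.noStroke();
  ring.fill(80, 80, 120);
  for (int i = 0; i <= 360; i += 5) {
    float a = radians(i);
    ring.vertex(cos(a) * 180, -20, sin(a) * 180);
    ring.vertex(cos(a) * 180,  20, sin(a) * 180);
  }
  ring.endShape();                // geometry is built once, not rebuilt every frame
}

void draw() {
  background(0);
  translate(width / 2, height / 2, -200);
  rotateY(frameCount * 0.01);
  shape(ring);                    // the renderer can keep this vertex data cached between frames
}
```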

Target display and aspect ratio

The target display for this project has always been the media wall. At 7.35m x 3.75m, the scale gives an altogether different relationship between the viewer and visuals, and really allows the menace of the dark actor to be felt.

For development and testing work, the laptop screen and a reasonable projector were used, although the impact was absent.

The aspect ratio of the media wall was used as the target throughout in any case.

Media wall tech details are published on the Bath Spa University website.

Media wall Dimensions

7.35m x 3.75m (AR 0.51)

Pixels: 1920 x 3600 (Aspect ratio 0.53)
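During development, one simple way to keep the wall's aspect ratio as the target is to author coordinates in wall pixels and scale uniformly into whatever window is available. The sketch below assumes the wall is 3600 px wide by 1920 px high (matching its landscape physical dimensions); the development window size is arbitrary.

```
// Assumption: the wall is 3600 px wide by 1920 px high (landscape, like its physical dimensions).
final int WALL_W = 3600;
final int WALL_H = 1920;

void setup() {
  size(1200, 640);                // any development window with roughly the same aspect ratio
}

void draw() {
  background(0);
  float s = min((float) width / WALL_W, (float) height / WALL_H);
  scale(s);                       // everything below is drawn in wall-pixel coordinates
  stroke(255);
  noFill();
  rect(0, 0, WALL_W, WALL_H);     // outline of the full wall
  ellipse(WALL_W / 2, WALL_H / 2, 400, 400);
}
```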

Some thoughts on the scale of the media wall, the use of the space in front, and using all of the display space appropriately (action at the bottom, menace from above)

On 4th December I tried, unsuccessfully, to carry out some testing on the media wall, to see how the software would run and to get some sense of how the visuals worked at that scale. Unfortunately the media wall had faults and these were not addressed until the Christmas break.

The media wall receives some TLC

A look at alternative display targets, when the media wall was out of action

Technology Specification

The demo-ready specification for hardware and software can be found in this note

Creative Development

Visual Design

On the hack day, I just experimented with a simple swirling column of primitive elements in a 3D space, to test the visual theme, and to check my understanding of object orientation and 3D geometry in Processing was sound. This basic use of 3D space became the motif for the whole creative vision.

My interpretation of the ‘Revolution’ theme demands a representation of a dark actor, a sinister force, coming from above to add to the menace (especially at media wall scale). This is met by a representation of many individual good actors, each of which has little presence on screen on its own, but which collectively, as different visual treatments are applied, become more powerful, reflecting the growing power of unified citizens.

The visual motif for the dark actor was established quite early on, based on a sketch I had developed for my creative coding experiments.

An early version of the dark actor visuals

The good actor design was harder. We needed to track a point rotating in 3D space for each citizen/actor, be able to associate them into tribes, and then have a sequence of visual effects that developed with each citizen action, of increasing drama and presence.

There are many actors in each tribe, and each tribe is effectively "led" by a player of this installation, through their posed actions.

These developed separately, some only really emerging quite late on.

Some of the early visual effects in development

The final effect for Revolt!

Processing Visual Techniques

Processing uses a frame-based draw cycle as the paradigm for creating graphic effects. You need to explicitly recalculate and render every element each frame (30-60 frames per second), as there is no separate engine to do this for you.

I have used a number of techniques in Processing and 3D geometry. Most of the effects are based around a vertical cylinder in 3D space, with elements rotating around it.

There is some use of primitive shapes (ellipses, lines) for some effects, but most are created with custom shapes built out of vertices which are created, moved and rendered generatively. Most of these effects are based on the idea of objects that are spawned, have a set lifespan over which they alter according to some algorithm, and then remove themselves. There is extensive use of Perlin noise to give a natural, organic feel to the shapes and their movements.
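A condensed sketch of this spawn/age/remove pattern, with Perlin noise driving the motion, is shown below. The class name and the constants are illustrative rather than taken from the project code.

```
import java.util.Iterator;

ArrayList<Mote> motes = new ArrayList<Mote>();

void setup() {
  size(800, 600);
  noStroke();
}

void draw() {
  background(10);
  if (frameCount % 3 == 0) motes.add(new Mote(width / 2, height));   // spawn
  Iterator<Mote> it = motes.iterator();
  while (it.hasNext()) {
    Mote m = it.next();
    m.update();
    m.render();
    if (m.dead()) it.remove();                                       // objects remove themselves
  }
}

class Mote {
  PVector pos;
  float life = 255;              // fades over the object's lifespan
  float seed = random(1000);     // per-object noise offset

  Mote(float x, float y) {
    pos = new PVector(x, y);
  }

  void update() {
    // Perlin noise gives an organic wander rather than a mechanical path
    pos.x += map(noise(seed, frameCount * 0.01), 0, 1, -2, 2);
    pos.y -= 1.5;
    life -= 1.5;
  }

  void render() {
    fill(200, 180, 120, life);
    ellipse(pos.x, pos.y, 6, 6);
  }

  boolean dead() {
    return life <= 0;
  }
}
```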

The code is organised through extensive use of object orientation, both as a means of encapsulating larger code structures, and as a way of containing and spawning many instances of independent visual elements that calculate their own shape, position, motion, and life span.

Narrative

With the visual motifs established, I then had the challenge of tying them together into some sort of narrative that responded to sustained citizen action, provoked a response from the dark actor, and created a sense of a tense battle for dominance.

This was quite tricky, and with technical and visual development coming together late on, there was not much time for tuning and testing this with others.

Some narrative ideas in a storyboard

The response to citizen actions took the following path.

  • Citizens only have the lowest set of actions available to them until they earn the next level and unlock it, through sustained invocation of the current action.

  • This creates a kind of stepped crescendo in citizen power.

  • The power of the current action grows visually over time until it hits a threshold and unlocks the next action.

  • The visual effects are tied to the currently invoked action. When no longer invoked, the current visual effect subsides slowly back to a neutral state.

  • The current strength of citizen action, building on top of the last, gives this stepped growth up to a maximum good-actor score.

I decided that the dark actor’s strength should respond directly to this. The dark actor is always stronger, but only needs to expend just enough strength to be one step ahead. In this way the dark actor tracks the good actors’ current aggregate "score". But when a new action is unlocked by reaching the maximum effect of the current action, the dark actor releases a surge of angry energy, enough to temporarily overpower the citizens and shake the world, as a kind of warning.
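A minimal sketch of this interplay is shown below: citizen power builds while an action is invoked, unlocks the next level at a threshold, and the dark actor tracks one step ahead, surging at each unlock. All names, thresholds and rates are illustrative stand-ins for the tuned values in the project.

```
int currentLevel = 0;            // index of the highest unlocked citizen action
int maxLevel = 3;                // the top level is Revolt
float citizenPower = 0;          // grows while the current action is being invoked
float darkPower = 0;
float surge = 0;                 // temporary burst of dark-actor energy after an unlock

void updateNarrative(boolean actionInvoked) {
  if (actionInvoked) {
    citizenPower = min(citizenPower + 0.4, 100);
    if (citizenPower >= 100 && currentLevel < maxLevel) {
      currentLevel++;            // unlock the next action...
      citizenPower = 0;
      surge = 60;                // ...and the dark actor answers with a warning surge
    }
  } else {
    citizenPower = max(citizenPower - 0.2, 0);   // subside back towards neutral
  }

  // the dark actor spends only enough strength to stay one step ahead;
  // only at the top level does sustained revolt start to erode it
  float target = currentLevel * 100 + citizenPower + 20;
  if (currentLevel == maxLevel) target -= citizenPower * 0.5;
  darkPower = lerp(darkPower, target + surge, 0.05);
  surge = max(surge - 1, 0);
}

void draw() {
  updateNarrative(mousePressed);   // mouse press stands in for a held pose here
  println(currentLevel, citizenPower, darkPower);
}
```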

Only at the highest level, when the citizens revolt, do they have enough power to actually weaken the dark actor.

While I believe this whole interplay would benefit from more fine tuning, the basic model is a sound one for a prototype demo. Further tuning and testing would be needed to create the desired audience experience.

User Experience, User interface

We want visitors to this installation to take a journey and understand the powers available to citizens facing an oppressive force, and to understand that those actions must be earned and sustained, and come at the cost of compromising individual and tribal identity.

The visual style is deliberately abstract: this is not a history lesson or a reenactment, but a creative experiment exploring a theme. It should be enjoyable, but have gravity and a sombre tone. I want visitors to understand the context, but without needing a lengthy briefing or scene setting.

While this remains a prototype, there is a fair amount of intrusive UI visible. Some of this is for debug and development purposes (and can be hidden), but some will need to remain, and should therefore be developed into something that is present, giving users feedback on their actions, without detracting from the visual effect of the piece itself.

The testing of pose detection has been pretty successful, with different people achieving the recognised poses with minimal intervention from me. Pose detection using the webcam does, however, suffer from lighting sensitivity. Whereas the Microsoft Kinect uses infrared distance estimation (which is not much affected by visible light) combined with a camera, the PoseNet model has only the RGB camera. It is therefore vulnerable to variations in foreground and background lighting, users’ clothing, and background elements that the model may mistake for body parts.

There is also the problem that, when using a projector for development instead of the intended media wall, we have a bright projector shining just above the webcam. The conditions that are good for pose detection (good daytime or artificial light levels) are bad for projection, especially the projection of something that uses dark shades and subtle motifs.

All of this does compromise the user experience somewhat. It would be good to test the project on the media wall to see what the scale and good daylight contrast brings to the experience.

Critical reflection and evaluation

Throughout the development of this project I have applied reflective practice techniques (Graham Gibbs’ reflective cycle) at points where the project hit a milestone, a roadblock, or an unexpected turn.

These can be seen in detail in this [note](https://www.evernote.com/l/ADomcXtIeCZF2KGzY0m7rLcSL8ekLCKQeeI), but the key lessons learned are listed below.

Key learnings

  • Make an early evaluation of engineering feasibility, before investing in creative development.

  • When I hit a block, consider what possibilities exist for alternative approaches; the best one may not be the obvious one.

  • Involve other people with diverse skills and thoughts in ideation, to broaden my own ideas pool. And again when I hit a block.

  • I don’t need to avoid difficult elements, just find a different perspective. Keep it simple and just start, do something, rather than ruminating on a deadlock. Find another starting point (Oblique Strategies?)

  • Site-specific creative and technical focus is a very valid way to design a creative piece, and there are ideas here that come from the vision for that specific space (more than just the media wall), but consider how it can be made more general/portable. Remain excited about responding to location/space/site artistically, but be open to alternatives and find something new/positive in the constraints presented.

  • Use setback or forced change as a new constraint and iterate on design themes rather than forcing the old design into an ill-fitting box.

  • Before getting stuck into the artistic and technical development work, I should mockup and prototype the effect, the intention of the piece, using whatever is available.

  • The timeline should have had intermediate deadlines to create space for integration, performance and UX work during the last two weeks. Some artistic development might have had to be sacrificed.

  • Take a view as to which elements I will personally be able to work on, and establish how other roles will proceed (either static assumptions, placeholder/boilerplate content, other people/roles involved, or time limited attention to each role). Give myself permission to not develop all aspects personally.

Critical evaluation of the project at this prototype stage

Where are we with respect to the vision?

All the ingredients are there to meet the brief. We have a narrative embodied within the user interactions and visual effects. We have visual storytelling through the progression between abstract yet distinct phases. The entire piece is driven by users making choices and acting on them. And we have a working set of technology that delivers all of this in a reliable and performant way.

Next Steps

What is missing seems to be orchestration: making the narrative really tell the story, really motivating users to take action and feel the consequences. I feel this is 80% about testing and fine tuning now that we have a very workable prototype. Work is also needed on the user interface and context setting for the viewer.

I actually think that a separate ideation and development stream for this, apart from the technical work, would be worthwhile. I have heard artists describe the need to see your piece from an audience perspective, in its intended place. What do they expect? What do they know already? What do you want them to know? How might you want them to feel during and after? This would inform the wrapping and placing of the project for exhibiting, and also the practical needs for a user interface and for the contextual information provided.

Value goals

Notwithstanding the orchestration/experience flaw mentioned above, I believe that this project would deliver against the identified value goals, were it to be exhibited. These were cultural, societal, artistic, and public engagement. Were this a grant-funded project in an exhibition space, I would survey visitors to gauge their feelings against these value measures.

My own personal goals

To conclude the reflection section, I am happy with how most elements of the project have progressed. While I have ended the module with a fully working prototype, I would like to spend time polishing the overall experience in the light of narrative and experience playtesting. This would ensure that all the hard creative and technical work delivers a satisfying, illuminating experience.