A Collection of Common Patterns and Conventions for Designing Musical Grid User Interfaces
Applications for musical grid interfaces are designed without any established guidelines or defined design rules. However, across the applications of different manufacturers, musicians, and designers, common patterns and conventions can be observed that might be developing into unofficial standards. We analyzed 40 applications, instruments, and controllers and collected 18 types of recurring UI elements, which are clustered, described, and interactively presented in this survey. We further postulate 3 theses that standard UI elements should meet and propose novel UI elements derived from WIMP standards.
Musical Grid Interface; Design Standards
•Human-centered computing → Interaction design theory, concepts and paradigms;
Musical grid interfaces are a widespread standard in the music-tech industry [1][2] and among independent musicians, artists, and developers [3][4]. This is due, among other things, to the technological freedom and flexibility associated with these generic, updatable interfaces, and to their relation to the DIY movement: the availability of parts and the openness of protocols make it possible to create and repurpose grids as individual interfaces for musical performance and practice [5][6].
As a result, user interfaces (UIs) for these devices are designed by a variety of entities with different goals, intentions, aesthetic ideas, and musical mindsets. In the absence of uniform design guidelines, we wondered whether standards have nonetheless emerged. Such standards could have developed from a collective memory or from similar references. We consider such standards to be recurring patterns of design language, even if they have not been centrally formalized and defined.
This paper investigates the following questions:
RQ1: Do standards in the design of musical grid applications exist?
RQ2: What are unconsidered options?
In the following, we present the condensed insights from our analysis of independent applications for the monome norns platform and of commercial products. We (1) discuss recurring patterns which point to implicitly accepted and adopted conventions, (2) point out inconsistencies and problematic choices from a UI heuristic perspective, (3) present 3 theses which mitigate some of the limitations inherent to the medium, and (4) present a collection of old and new UI elements which are adaptations of classic WIMP elements to low-resolution technology.
This paper presents the first overview of UI elements and UI design conventions in the musical grid interface context. We use the term grid synonymously with musical grid, following the definition given by Rossmy et al. [5].
Before presenting the survey results, we first need a common language to define: (1) terms for the grid hardware, (2) terms for the interactions performed on grids.
To distinguish the grid hardware from the visual content it displays, we use the term grid-interface for the device and application for the depicted UI. Further, grid-buttons refers to the physical hardware buttons, not to the UI elements that depict buttons.
With buttons (grid-buttons) as the central input element, all interactions with grids are composed of one fundamental interaction: a button click. A click consists of a "press", "hold" and "release" phase. Composite actions are button-click combinations of arbitrary complexity, which, using regular expressions, can be noted as:
click (pause, click)*
With increasing complexity of composite actions, the latency between their start and end increases. This adds an undesired delay to the interaction flow. The total time t of an action can be noted as:
where t_c is the time of a click, t_p the time between clicks, and t_g the time between composite interactions.
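One form consistent with these definitions, assuming a composite action that consists of n clicks and therefore n - 1 pauses (the symbol n is our assumption and is not named in the text), would be:

```latex
t = n \cdot t_c + (n - 1) \cdot t_p + t_g
```

Under this reading, a double click (n = 2) costs two click times, one pause, and the guard time t_g that separates it from the next composite interaction.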
Such composite interactions are known from the computer mouse or from touch screens [7][8], and combinations of composite interactions such as "Hold + Click" (cf. the "Shift" key on a computer keyboard) are conceivable.
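The regular-expression notation for composite actions can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which a gesture is a list of the tokens "click" and "pause"; the function names are illustrative and not part of any grid library.

```python
import re

# Hypothetical encoding: a gesture is a list of "click" / "pause" tokens,
# joined with single spaces so the pattern click (pause, click)* can be
# expressed as an ordinary regular expression.
COMPOSITE = re.compile(r"click(?: pause click)*")


def is_composite_action(tokens):
    """Return True if the token list matches click (pause, click)*."""
    return COMPOSITE.fullmatch(" ".join(tokens)) is not None


def click_count(tokens):
    """Number of clicks in a valid composite action, 0 if it is invalid."""
    return tokens.count("click") if is_composite_action(tokens) else 0
```

A single click and a double click (`["click", "pause", "click"]`) are accepted, while sequences that start or end with a pause are rejected.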
In the following, we present the UI elements used in grid applications. When possible, we use the names of common UI elements to follow a consistent nomenclature.
As the data basis we considered grid applications from the following resources. We used 30 independent user scripts from the norns community which use grids as their interface. We also included 10 commercial devices, which are more standardized interfaces as companies follow an internal design language across device generations and sibling products.
Category | Applications |
---|---|
Commercial Products | Novation Circuit, Ableton Push 2, Novation Launchpad Pro Mk3, Synthstrom Audible Deluge, Dreadbox Medusa, Empress Zoia, Polyend Seq, OXI One, Arturia MatrixBrute, Yamaha Tenori On |
Grid Applications | Kria, Cheatcodes, Gridstep, colorwheel, buoys, Arcologies, Animators, awake, boingg, cheatcodes, compass, corners, cranes, cyrene, earthsea, foulplay, granchild, gridstep, initere, kolor, less concepts 3, loom, mangl, meadowphysics, metrix, mlr, molly the poly, moonraker, oooooo, plonky |
We analyzed all UIs based on the used (1) UI elements, (2) interactions, and (3) general design strategies and usability heuristics.
For the applications reviewed, we found that the majority build on simple click and hold-and-click interactions. While even more complex interactions are conceivable, we only found: (I1) click, (I2) hold, (I3) hold and click, (I4) double click, (I5) double hold, and (I6) pattern click.
While I1, I2, and I3 are used in almost all instances, I4, I5, and I6 occurred sporadically. Typically, I1 is the default interaction with grid content, that is, activating steps, toggling states, or selecting values. I2 is either used to extend the time of a selection (which leads to I5) or to trigger alternative actions. I3 is used to access alternative functions (cf. "Shift"), e.g. to mark the selections to which values should be assigned, the values to be assigned, or ranges of values. I4 is used to clear cell data, I5 is used to show or access alternative values and functions of elements, and I6 is used to trigger user-defined actions and can be compared to the function keys on a PC.
ID | Interaction | Notation | Typical actions |
---|---|---|---|
I1 | click | [*] | activate, deactivate, play note, trigger action, select value, iterate values |
I2 | hold button | [_] | select, show value, hold state |
I3 | hold and click | [_][*] | select range, assign value, activate alternative function, create connections |
I4 | double click | [**] | clear, reset, empty |
I5 | double hold | [*_] | show alternative value |
I6 | pattern press |  | trigger action |
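To illustrate how the single-button interactions could be told apart, the following is a minimal sketch that classifies the press and release timestamps of one grid-button. The thresholds `HOLD_MIN` and `DOUBLE_GAP_MAX` and the function name are assumptions chosen for illustration; none of the surveyed applications is known to use these values. I3 and I6 involve several buttons and are not covered.

```python
HOLD_MIN = 0.4        # seconds a press must last to count as a hold (assumed)
DOUBLE_GAP_MAX = 0.3  # max. pause between the two presses of I4/I5 (assumed)


def classify(presses):
    """Classify chronological (press_time, release_time) pairs of ONE button.

    Returns "I1 click", "I2 hold", "I4 double click", "I5 double hold",
    or None if the sequence matches none of these interactions.
    """
    def is_hold(press):
        return (press[1] - press[0]) >= HOLD_MIN

    if len(presses) == 1:
        return "I2 hold" if is_hold(presses[0]) else "I1 click"
    if len(presses) == 2:
        first, second = presses
        gap = second[0] - first[1]
        # Both I4 [**] and I5 [*_] start with a short click.
        if gap <= DOUBLE_GAP_MAX and not is_hold(first):
            return "I5 double hold" if is_hold(second) else "I4 double click"
    return None
```

Note that such a classifier can only report I1 after `DOUBLE_GAP_MAX` has elapsed without a second press, which is one source of the added delay discussed for composite actions.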
We assume that applications will focus mainly on I1-I3, as these can be explored intuitively. The more complex the interactions become, the less likely they are to be executed correctly or discovered by chance during exploration.
Most applications draw their inspiration from the GUI context known from desktop computers or smart devices. This direct transfer includes UI elements from the musical context and replicates the graphical language established there (cf. piano roll, step-sequencer, mixers).
Almost all applications use some variation of buttons. We define buttons as UI elements that represent binary or discrete values (states). Their activation (state change) can be either momentary (as long as the button is pressed) or switching between states (alternating). The simplest forms are single-cell momentary buttons (35%), toggle buttons (40%), or cycle buttons (17.5%), which iterate through multiple states. Multiple buttons can form logical entities such as radio buttons (50%) or value sequences (55%), e.g., used to program sequences.
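The behavioral differences between these button types can be sketched as small state machines. The class and method names below are hypothetical and chosen for illustration only; they do not correspond to the API of any surveyed application.

```python
class MomentaryButton:
    """Active only while the grid-button is pressed."""
    def __init__(self):
        self.on = False

    def press(self):
        self.on = True

    def release(self):
        self.on = False


class ToggleButton:
    """Alternates between two states on every press."""
    def __init__(self):
        self.on = False

    def press(self):
        self.on = not self.on

    def release(self):
        pass  # releasing does not change the state


class CycleButton:
    """Iterates through a fixed list of states on every press."""
    def __init__(self, states):
        self.states = list(states)
        self.index = 0

    def press(self):
        self.index = (self.index + 1) % len(self.states)

    @property
    def value(self):
        return self.states[self.index]


class RadioGroup:
    """Several cells forming one entity; exactly one cell is selected."""
    def __init__(self, size):
        self.size = size
        self.selected = 0

    def press(self, cell):
        if 0 <= cell < self.size:
            self.selected = cell
```

A value sequence, as used in step sequencers, could be modeled in the same spirit as a list of toggle or cycle buttons, one per step.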
To represent continuous values, sliders are normally used. Variations are bipolar sliders (15%), which allow values below and above a neutral center position, polar sliders (20%), which allow values on one side of the spectrum, and ranges (30%), which determine an interval within a value range. Sliders are often used to provide explicit access to variables, or are implicitly part of other UI elements, for example to define lengths or ranges. Bipolar sliders are used for octave transpose, panning, or other options. Further, we found a color wheel used to select color hues. Its special feature is that it not only enables the selection of an abstract value, but also visually represents the value to be selected.
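How the three slider variations differ on a single row of cells can be sketched as follows. The brightness levels `INACTIVE`, `ACTIVE`, and `THUMB` are assumed values on a 0 to 15 scale, and the function names are hypothetical; the sketch only shows which cells belong to the thumb and to the active or inactive part of the track.

```python
# Assumed brightness levels on a 0..15 scale (illustrative only).
INACTIVE, ACTIVE, THUMB = 2, 6, 15


def polar(width, value):
    """Unipolar slider: track is active from cell 0 up to the thumb."""
    return [THUMB if x == value else ACTIVE if x < value else INACTIVE
            for x in range(width)]


def bipolar(width, value, center):
    """Bipolar slider: track is active between the center and the thumb."""
    lo, hi = min(center, value), max(center, value)
    return [THUMB if x == value else ACTIVE if lo <= x <= hi else INACTIVE
            for x in range(width)]


def range_slider(width, start, end):
    """Range: two thumbs with the active track in between."""
    return [THUMB if x in (start, end) else ACTIVE if start < x < end
            else INACTIVE for x in range(width)]
```

For example, `bipolar(8, 1, 4)` lights the thumb at cell 1 and the track up to the center cell 4, which corresponds to a negative setting such as a pan to the left.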
For dedicated note input we identified: (1) piano-style keyboards, (2) isomorphic keyboards, and (3) drum-matrices. Piano-style keyboards (5%) mimic the key layout of a piano in a skeuomorphic way [9], whereas isomorphic keyboards (30%) [10][11][12] reference the note layout of stringed instruments. Within an isomorphic matrix, pitch rises chromatically in the X direction and in interval increments along the Y direction. If the matrix is wider than the interval increment, the same pitch can reoccur within the matrix. While drum-matrices (12.5%) also provide chromatic steps in the X direction, they seamlessly advance into the next row, so no note repetitions are included. Further, their layout is typically square, as found in samplers and drum machines. As a less generic concept, we identified a chord matrix which separates the root-note and chord type selection (OXI One). This simplifies the playing of more complex chords.
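The difference between the isomorphic and the drum-matrix layout comes down to the row offset. The following minimal sketch maps a cell to a note number, assuming MIDI-style note numbers, rows counted upward from y = 0, and a hypothetical base note of 36 with a row interval of 5 semitones; these values are illustrative defaults, not taken from the surveyed devices.

```python
def isomorphic_note(x, y, base=36, row_interval=5):
    """Chromatic along X, fixed interval per row along Y."""
    return base + x + y * row_interval


def drum_matrix_note(x, y, base=36, width=4):
    """Chromatic along X, continuing seamlessly into the next row."""
    return base + x + y * width


# On an 8-wide isomorphic matrix with a 5-semitone row interval,
# pitches repeat: cell (5, 0) sounds the same note as cell (0, 1).
iso_notes = [isomorphic_note(x, y) for y in range(8) for x in range(8)]

# A 4x4 drum matrix contains every note exactly once.
drum_notes = [drum_matrix_note(x, y) for y in range(4) for x in range(4)]
```

Because the drum matrix uses its own width as the row offset, it is the special case of the isomorphic formula in which the row interval equals the matrix width.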
To structure the user interface, nearly half of the applications (47.5%) use multiple pages that contain the user interface elements. These are logically grouped by function, focus, or instrument. For navigation, either external controls are used (cf. Ableton Push 2, Launchpad), or radio buttons for multi-page and toggle buttons for two-page applications within the grid. This concept mimics tab-based applications such as web browsers or text editors. It indicates the availability of modes and shows the currently active mode.
In addition, we identified UI elements that have an emphasizing rather than interactive character. In note sequences, for example, not only the data but also the playback position is indicated. Such indicators (70%) can either iterate through the sequence or move spatially within the grid. Another form of indicator are signals (10%), which are sent across the application's area (cf. arcologies), interacting with each other or with predefined objects. Further, audio data was drawn as waveforms (2.5%) to allow the selection of start and end loop-points (cf. Deluge). This representation is restricted by the resolution, but can depict the rough content. To explain actions, pop-up text (5%) is used to spell them out (cf. Novation Circuit). An alternative are pictograms (e.g. depictions of waveforms) or glyphs (e.g. basic geometric shapes), which provide a visual cue to the users (cf. earthsea). Such mnemonics (7.5%) can be user-defined "memory glyphs" which function as labels to "name" presets (cf. kria).
Category | UI elements |
---|---|
Buttons | momentary, toggle, cycle, radio, array, matrix, sequence |
Sliders | polar, bipolar, range, color wheel |
Note Input | keyboard, isomorphic, drum matrix |
Structural | pages |
Misc. | indicators, signals, images, pictograms, glyphs, text |
While the previous section considered the UI elements from a conceptual standpoint, the following section highlights the design strategies used in the graphical representation of information. We illustrate the strategies using the example of a typical step sequencer.
The most basic and common design strategy is the clear visual representation of a cell state. This means, for example, that two different states of a cell (empty vs. activated step) are indicated by two different illumination values.
Further, selections are indicated in addition to the represented data. Thus, as long as steps are held, the cells are highlighted and thereby differentiable from empty and activated steps (Novation Circuit). To indicate the playback position, the indicator is highlighted either by inverting the color (Polyend Seq), by an increase in brightness, or by pulsating (Novation Circuit). The latter helps when the sequencer is not running. In this case the indicator is stationary and thus harder to identify. The pulse emphasizes the moving character of the playhead.
To cope with the limited resolution, color transitions between adjacent cells are used to display continuous values. For example, to indicate playhead positions that are finer than the current resolution, the upcoming position is illuminated proportionally. Another form of animation is the transition (Deluge or TouchGrid [13]), which creates a spatial relationship between parts of an application. This relation can be planar, e.g. the horizontal arrangement of pages, or perpendicular, such as zoom-in and zoom-out transitions.
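The proportional illumination of the upcoming playhead position can be sketched as follows. This is a minimal illustration, assuming a fractional playhead position measured in cells, a brightness scale of 0 to `MAX_LEVEL` (an assumed value), and a sequence that wraps around at the end of the row; the function name is hypothetical.

```python
import math

MAX_LEVEL = 15  # assumed maximum brightness of a cell


def playhead_row(width, position):
    """Brightness levels of one row for a fractional playhead position.

    The current cell is fully lit; the upcoming cell is lit in
    proportion to how far the playhead has advanced towards it.
    """
    row = [0] * width
    current = math.floor(position) % width
    fraction = position - math.floor(position)
    upcoming = (current + 1) % width  # wrap around at the end of the row
    row[current] = MAX_LEVEL
    row[upcoming] = max(row[upcoming], round(fraction * MAX_LEVEL))
    return row
```

For example, `playhead_row(4, 1.5)` fully lights cell 1 and lights cell 2 at roughly half brightness, so the playhead appears to fade into the next step instead of jumping.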
Within the investigated applications we identified the following design decisions with the potential for usability issues. They complicate uninstructed learning, or stand in contrast to UI design principles [14].
First, we found UI elements that did not visually reflect a state change. For example, we found a single-cell toggle button used to switch between two pages. Clicking the button executed the switch, but the button did not change its appearance. From a usability heuristics perspective, this is a break with "visibility of the system status". At its core, this heuristic is based on two factors: (1) clear communication and (2) direct and immediate feedback.
Second, we found that within some applications multiple UI element types had the same visual appearance. While this issue is exacerbated by the inherent limitations of the medium, it shows a lack of "consistency and standards". In many instances, the design of radio buttons, sliders, and sequences is identical, even though they may represent different things conceptually and from a data perspective (Ansible Kria).
Third, we observed that, due to grid restrictions, basic paradigms of GUI design (gestalt principles) are not followed. Most apparently, missing whitespace creates unintended proximity between UI elements, which leads to misconceptions: the human eye starts to see patterns and to continue, close, or interpret shapes that are not actually related or coherent.
Fourth, some of the UI elements do not allow users to anticipate the associated interactive area. This is problematic when multiple such elements are arranged seamlessly on the same axis. For example, if multiple radio buttons are arranged in the same row and only the selected value is highlighted, users cannot easily distinguish whether adjacent cells belong to one element or the other.
Based on the previously presented observations we formulate the following three theses, which ideally should apply to all grid UI elements, and which we include in the designs of the following design collection. Thus, UI elements should be (1) visible, (2) separable, and (3) distinguishable.
Visible means that the interactive areas of the grid are marked and, as a result, are visible to the user. In concrete terms, a radio button should highlight not only the selected cell but also the other available options.
Separable means that, when it is not possible to space out UI elements and thereby establish a separation between them, the interaction with the elements should make them implicitly separable. This can be achieved by highlighting the whole area of the UI element for the duration of the interaction.
Distinguishable means that the designs of different UI elements should differ as much as possible. The visual language used should help to establish clear and distinct designs which, in the best case, can be identified without any interaction.
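The first two theses can be illustrated with a radio button rendered onto a row of cells. This is a minimal sketch; the brightness levels and the function name are assumptions chosen for illustration and are not taken from the interactive mockups.

```python
# Assumed brightness levels on a 0..15 scale (illustrative only).
OPTION, OPTION_PRESSED, SELECTED = 4, 8, 15


def render_radio(size, selected, pressed=False):
    """Render a radio button as a list of brightness levels.

    Visible:   unselected options are dimly lit instead of dark.
    Separable: while a cell of the element is pressed, the whole
               element is raised in brightness.
    """
    option = OPTION_PRESSED if pressed else OPTION
    return [SELECTED if i == selected else option for i in range(size)]


# Two radio buttons arranged seamlessly in one row of eight cells;
# the left one is currently being pressed.
row = render_radio(4, 1, pressed=True) + render_radio(4, 2)
```

In the combined row `[8, 15, 8, 8, 4, 4, 15, 4]`, the brighter left half shows which cells belong to the element under interaction, even though no empty cell separates the two radio buttons.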
Based on the presented observations and postulates, we will present in the following a comprehensive listing of grid UI elements. These contain the common elements adapted to our postulates and further include UI elements known from the GUI world, which are not present in our data set.
Further, we provide interactive examples in the form of JavaScript mockups to illustrate the fine-grained differences and existing variations.
Buttons and button-based UI elements are used to represent binary or distinct values. The associated interactions are: momentary activation (active while held), toggle behavior (inverted when clicked), and cycling behavior (iterating through a set of values when clicked). Multiple buttons can form logical entities, such as radio buttons or groups of check-boxes.
Sliders represent values in a cardinal range. They are associated with polar or bipolar ranges. Sliders and ranges consist of (1) thumbs, which indicate the selected values, and the (2) track (active and non active).
Keys are used to input notes. The values represent scales such as the western twelve-tone scale. UI elements which are specially designed to enter, play, and select notes are: (1) piano-style keyboards, (2) isomorphic keyboards, and (3) drum-matrices. To provide orientation, piano-style and isomorphic keyboards highlight octaves, or the "black" or "white" keys.
Containers are used to group UI elements and to provide a way to selectively access these groups. Commonly, such groups are structured as pages and selected via radio buttons (tabs). The UI elements proposed by the authors are pop-ups and accordions. Animated transitions are useful to emphasize the relation between the groups and the associated control elements.
Indicators, such as playheads for sequencers, visualize a system state which is not directly interactive but depicts valuable information to the user. Values in between distinct cells can be displayed by color fades and previous values such as passed positions can be illuminated.
We started this survey with the questions of whether standards in the design of grid UIs exist and whether unconsidered UI concepts from the WIMP paradigm can be identified. For this purpose, we focussed on conventionally shaped grid interfaces [5], as opposed to more experimental ones [15]. We further focussed on graphical UI design and, for this survey, ignored the incorporation of velocity or other expressive modalities [16][17]. We found within our data set that, while many applications had a very individual focus and a highly specialized use case and design, the majority of applications still reused and replicated consistent conventions.
The recurring UI elements are fundamentally intertwined with GUI design in general and represent an adaptation of these WIMP concepts to a low-resolution domain. While the underlying concepts are standardized, we found that the specific design often varied and that some implementations can cause usability issues due to this inconsistency. We are aware that most of the examined examples were created by artistic minds who primarily strive for musical mastery with their implementation. Thus, musical expression is the measure by which such applications should be judged in the first place. However, with the increasing standardization of the grid interface [5], the luthiers of grid applications have to face the same problems as in musical app development [18]. This means that a complicated balance between the ease of use of tools and the associated musical mastery [19] has to be found.
In addition to the examined conventions in UI design, we also identified that typical container UI elements were still unconsidered in the grid UI context. This is especially interesting as grids struggle with limited interactive area, and many container concepts such as pop-ups or accordions work with compressed and decompressed space to enable access to UI elements only when required. We proposed such designs in this paper to point out their potential, but at the same time see the challenge of offering immediate access to functionality, which tends to favor wide rather than deep interface structures.
This survey has shown that UI conventions in grid interfaces exist, and as a consequence we have tried to formalize some of these standards and the design ideas they contain. Although such UI designs are highly dependent on the application context, we hope that this collection can at least be inspirational and highlight some design principles that could improve the usability of grid UIs. We hope to stimulate a discussion within the NIME community about UI design standards as they exist in the GUI world. We see this paper as a starting point for defining such conventions, which need to be progressively refined and considered from multiple perspectives.
Within the survey only publicly available resources were used. The data that was collected was handled according to the requirements of the German Data Protection Act and stored anonymously.