Rebecca Fiebrink Perry Cook, Advisor Pre-FPO, 6/14/2010

##### Presentation Transcript

1. Real-time Human-Computer Interaction with Supervised Learning Algorithms for Music Composition and Performance. Rebecca Fiebrink; Perry Cook, Advisor. Pre-FPO, 6/14/2010

3. 
```matlab
function [x flag hist dt] = pagerank(A, optionsu)
[m n] = size(A);
if (m ~= n)
    error('pagerank:invalidParameter', 'the matrix A must be square');
end;
options = struct('tol', 1e-7, 'maxiter', 500, 'v', ones(n,1)./n, ...
    'c', 0.85, 'verbose', 0, 'alg', 'arnoldi', ...
    'linsys_solver', @(f,v,tol,its) bicgstab(f,v,tol,its), ...
    'arnoldi_k', 8, 'approx_bp', 1e-3, 'approx_boundary', inf, ...
    'approx_subiter', 5);
if (nargin > 1)
    options = merge_structs(optionsu, options);
end;
if (size(options.v) ~= size(A,1))
    error('pagerank:invalidParameter', ...
        'the vector v must have the same size as A');
end;
if (~issparse(A))
    A = sparse(A);
end;
% normalize the matrix
P = normout(A);
switch (options.alg)
    case 'dense'
        [x flag hist dt] = pagerank_dense(P, options);
    case 'linsys'
        [x flag hist dt] = pagerank_linsys(P, options);
    case 'gs'
        [x flag hist dt] = pagerank_gs(P, options);
    case 'power'
        [x flag hist dt] = pagerank_power(P, options);
    case 'arnoldi'
        [x flag hist dt] = pagerank_arnoldi(P, options);
    case 'approx'
        [x flag hist dt] = pagerank_approx(P, options);
    case 'eval'
        [x flag hist dt] = pagerank_eval(P, options);
    otherwise
        error('pagerank:invalidParameter', ...
            'invalid computation mode specified.');
end;
```

4. ? Effective Efficient Satisfying

5. ? Effective Efficient Satisfying Machine learning algorithms

6. Outline
   - Overview of computer music and machine learning
   - The Wekinator: A new interface for using machine learning algorithms
   - Live demo + video
   - Completed studies
   - Findings
   - Further work for FPO and beyond
   - Wrap-up

7. computer music

8. Interactive computer music (diagram): sensed action → interpretation → response (music, visuals, etc.), within the computer

9. Computer as instrument (diagram): sensed action → interpretation → sound generation, within the computer

10. Computer as instrument (diagram): human + control interface → sensed action → interpretation (mapping) → sound generation, within the computer

11. Computer as collaborator (diagram): microphone and/or sensors → sensed action → interpretation (model of meaning) → sound generation, within the computer

12. A composed system (diagram): sensed action → mapping/model/interpretation → response

13. Supervised learning

14. Supervised learning (diagram): Training: training data (inputs + outputs) → algorithm → model

15. Supervised learning (diagram): Training: training data (inputs with outputs “C Major”, “F minor”, “G7”) → algorithm → model. Running: new input → model → output “F minor”
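The train/run cycle on slides 14–15 can be sketched with a toy classifier. The chord labels come from the slide; the feature vectors and the 1-nearest-neighbor rule are illustrative stand-ins, not the Wekinator's actual pipeline.

```python
import math

def train(examples):
    # "Training" for 1-nearest-neighbor is just storing the labeled examples.
    return list(examples)

def run(model, features):
    # Return the label of the stored example closest to the input features.
    return min(model, key=lambda ex: math.dist(ex[0], features))[1]

# Training: feature vectors (e.g., extracted from audio) paired with chord labels.
training_data = [
    ([0.9, 0.1, 0.0], "C Major"),
    ([0.1, 0.8, 0.1], "F minor"),
    ([0.0, 0.2, 0.9], "G7"),
]
model = train(training_data)

# Running: a new, unseen input is mapped to the nearest known chord.
print(run(model, [0.2, 0.7, 0.2]))  # → "F minor"
```

The point of the diagram survives in miniature: once trained, the same model object answers arbitrary new inputs without any hand-written mapping rules.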

16. Supervised learning is useful
   - Models capture complex relationships from the data and generalize to new inputs. (accurate)
   - Supervised learning circumvents the need to explicitly define mapping functions or models. (efficient)

   So why isn’t it used more often?

17. A lack of usable tools for making music

| | Weka | Existing computer music tools |
|---|---|---|
| 1. General-purpose: many algorithms & applications | ✓ | ✗ |
| 2. Runs on real-time signals | ✗ | ✓ |
| 3. Appropriate user interface and interaction support | ✗ | ✓ |

   Weka (Witten and Frank, 2005): many standard algorithms, applicable to any dataset, graphical interface + API, > 10,000 citations (Google Scholar). Existing computer music tools: built by engineer-musicians for specific applications.

18. Outline
   - Overview of computer music and machine learning
   - The Wekinator: A new interface for using machine learning algorithms
   - Live demo + video
   - Completed studies
   - Findings
   - Further work for FPO and beyond
   - Wrap-up

19. The Wekinator
   - A general-purpose, real-time tool with appropriate interfaces for using and constructing supervised learning systems
   - Built on the Weka APIs
   - Downloadable at http://code.google.com/p/wekinator/
   - (Fiebrink, Cook, and Trueman 2009; Fiebrink, Trueman, and Cook 2009; Fiebrink et al. 2010)

20. A tool for running models in real-time (diagram): feature extractor(s) emit feature vectors over time (e.g., .01, .59, .03, ...) → model(s) → parameter streams over time (e.g., 5, .01, 22.7, ...) → parameterizable process

21. A tool for real-time, interactive design Wekinator supports user interaction with all stages of the model creation process.

22. Under the hood
   - Learning algorithms:
     - Classification: AdaBoost.M1, J48 decision tree, support vector machine, k-nearest neighbor
     - Regression: MultilayerPerceptron
   - Diagram: inputs (joystick_x, joystick_y, webcam_1, …) → Feature1, Feature2, Feature3, …, FeatureN → Model1, Model2, …, ModelM → Parameter1, Parameter2, …, ParameterM (e.g., volume, pitch 3.3098, Class24)
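Slide 22's architecture, one model per output parameter over a shared feature vector, can be sketched as follows. The feature names match the slide, but the trivial lambda "models" are placeholders, not the actual Weka learners (AdaBoost.M1, J48, SVM, k-NN, MultilayerPerceptron) that the Wekinator wraps.

```python
# Each synthesis parameter gets its own model over the shared feature vector.
# Real Wekinator models are trained Weka learners; these lambdas are stand-ins.
def make_parameter_bank():
    return {
        # Regression-style mapping: continuous value (e.g., volume in [0, 1]).
        "volume": lambda f: max(0.0, min(1.0, f["joystick_y"])),
        # Regression-style mapping: continuous pitch as a MIDI note number.
        "pitch": lambda f: 60 + 12 * f["joystick_x"],
        # Classification-style mapping: a discrete class from webcam input.
        "preset": lambda f: "bright" if f["webcam_1"] > 0.5 else "dark",
    }

def run_bank(bank, features):
    # Apply every per-parameter model to the same feature vector.
    return {name: model(features) for name, model in bank.items()}

bank = make_parameter_bank()
params = run_bank(bank, {"joystick_x": 0.5, "joystick_y": 0.9, "webcam_1": 0.2})
print(params)  # → {'volume': 0.9, 'pitch': 66.0, 'preset': 'dark'}
```

Keeping one model per parameter means classification and regression outputs can coexist in the same mapping, which is what lets a single gesture drive both discrete and continuous synthesis controls.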

23. Tailored but not limited to music
   - Built-in feature extractors for music & gesture
   - ChucK API for feature extractors and synthesis classes
   - Open Sound Control (UDP) control messages connect other feature extraction modules and other modules for sound synthesis, animation, …
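The OSC-over-UDP link on slide 23 can be sketched as a minimal message encoder. The byte layout follows the OSC 1.0 specification (null-padded address, a `,f…` type-tag string, big-endian float32 arguments); the address `/features` and port `6448` in the usage comment are illustrative assumptions, not confirmed Wekinator conventions.

```python
import struct

def osc_pad(b: bytes) -> bytes:
    # Null-terminate, then pad with zeros to a 4-byte boundary, per OSC 1.0.
    b += b"\x00"
    while len(b) % 4:
        b += b"\x00"
    return b

def osc_message(address: str, *floats: float) -> bytes:
    # Encode one OSC message whose arguments are all float32s.
    packet = osc_pad(address.encode("ascii"))
    packet += osc_pad(("," + "f" * len(floats)).encode("ascii"))
    for x in floats:
        packet += struct.pack(">f", x)  # big-endian float32
    return packet

# A custom feature extractor could send one frame of features like this
# (address and port are hypothetical):
# import socket
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(osc_message("/features", 0.01, 0.59, 0.03), ("localhost", 6448))
```

Because the transport is plain UDP datagrams, any language or environment that can emit these packets can feed features in, which is what makes the tool "not limited to music."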

24. Outline
   - Overview of computer music and machine learning
   - The Wekinator: A new interface for using machine learning algorithms
   - Live demo + video
   - Completed studies
   - Findings
   - Further work for FPO and beyond
   - Wrap-up

25. Wekinator in performance

26. Recap: what’s new?
   - Runs on real-time signals and general-purpose
   - A single interface for building and running models
   - Comprehensive support for interactions appropriate to computer music tasks

27. Outline
   - Overview of computer music and machine learning
   - The Wekinator: A new interface for using machine learning algorithms
   - Live demo + video
   - Completed studies
   - Findings
   - Further work for FPO and beyond
   - Wrap-up

28. Study 1: Participatory design process with 7 composers
   - Fall semester 2009
   - 10 weeks, 3 hours / week
   - Group discussion, experimentation, and evaluation
   - Iterative design
   - Final questionnaire
   - (Fiebrink et al., 2010)

29. Study 2: Teaching interactive systems building in PLOrk
   - COS/MUS 314, Spring 2010
   - Focus on interactive music systems building
   - Wekinator midterm assignment: master the process of building a continuous and discrete gestural control system, and use it in a performance
   - Logging + questionnaire
   - Final projects

30. Study 3: Bow gesture recognition
   - Winter 2010
   - Work with a composer/cellist to build a gesture recognizer for a commercial sensor bow
   - Classify standard bowing gestures, e.g., up/down, legato/marcato/spiccato (Fiebrink, Schedel, and Threw, 2010)
   - Outcomes: classifiers, improved software, written notes on the engineering process

31. Study 4: Composition/composer case studies
   - Completed: Winter 2010 to present
   - CMMV (Dan Trueman, faculty)
   - Martlet (v 1.0) (Michelle Nagai, graduate student)
   - G (Raymond Weitekamp, undergraduate)
   - Blinky; nets0 (Rebecca Fiebrink)
   - Interviews completed with Michelle and Raymond

32. Outline
   - Overview of computer music and machine learning
   - The Wekinator: A new interface for using machine learning algorithms
   - Live demo + video
   - Completed studies
   - Findings
   - Further work for FPO and beyond
   - Wrap-up

33. Findings to date
   - Interacting with supervised learning
   - Training the user
   - Supervised learning in a creative context
   - Usability summary

34. Interactively training
   - Primary means of control: iteratively edit the dataset, retrain, and re-evaluate
   - A straightforward way of affecting the model:
     - Add data to make a model more complex
     - Add or delete data to correct errors
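The edit-retrain-re-evaluate loop on slide 34 can be sketched as follows. The dataset operations mirror the interactions described (add data, delete erroneous data, retrain), but the class names, gesture labels, and the 1-nearest-neighbor "model" are illustrative assumptions, not the Wekinator's interface.

```python
import math

class EditableDataset:
    # A training set the user iteratively edits, retraining after each change.
    def __init__(self):
        self.examples = []  # (feature_vector, label) pairs

    def add(self, features, label):
        self.examples.append((features, label))

    def delete_where(self, label):
        # Remove examples the user judges to be errors for a given label.
        self.examples = [e for e in self.examples if e[1] != label]

    def train(self):
        # Retrain: for 1-nearest-neighbor, snapshot the current examples.
        data = list(self.examples)
        return lambda f: min(data, key=lambda e: math.dist(e[0], f))[1]

ds = EditableDataset()
ds.add([0.0, 0.0], "rest")
ds.add([1.0, 1.0], "strike")
model = ds.train()

# Add data to make the model more complex (a new gesture class)…
ds.add([0.6, 0.6], "swipe")
# …or delete mislabeled examples to correct errors, then retrain.
model = ds.train()
print(model([0.55, 0.65]))  # → "swipe"
```

Editing the dataset, rather than the algorithm or its parameters, is the user's primary lever: each add/delete plus retrain is one turn of the loop.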

35. Exercising control via the dataset: N = 21; students re-trained an average of 4.64 times per task (std. dev. 4.91)

36. The interface to the training data is important
   - Real-time example recording and a single interface improve efficiency
   - Supports embodiment and higher-level thinking: several composers used play-along learning as the dominant method
   - Supports different granularities of control:
     - K-Bow: visual label editing interface
     - Spreadsheet editor is still used

37. Interactive evaluation • Evaluation of models is also an interactive process in Wekinator

38. “Traditional” evaluation (e.g., Weka) (diagram): available data is split into a training set (→ train model) and an evaluation set (→ evaluate)

39. Evaluation in Wekinator (diagram): training set → train model → evaluate
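The contrast between slides 38 and 39 can be sketched as two evaluation styles: score on held-out data versus train on everything and let the user run the model live. The toy majority-label model, the split ratio, and the function names are illustrative assumptions.

```python
import random

def holdout_eval(data, train_fn, split=0.7, seed=0):
    # "Traditional" evaluation: hold out part of the data, score on the rest.
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * split)
    train, test = shuffled[:cut], shuffled[cut:]
    model = train_fn(train)
    correct = sum(model(x) == y for x, y in test)
    return correct / len(test)

def interactive_eval(data, train_fn):
    # Wekinator-style: train on all available data, then hand the model to
    # the user, who runs it on live input and judges it subjectively.
    return train_fn(data)

# Toy model: always predict the majority label seen in training.
def train_majority(data):
    labels = [y for _, y in data]
    top = max(set(labels), key=labels.count)
    return lambda x: top

data = [([i], "a") for i in range(7)] + [([i], "b") for i in range(3)]
print(holdout_eval(data, train_majority))
model = interactive_eval(data, train_majority)
print(model([42]))  # → "a"
```

Skipping the holdout split is defensible here because, as the next slide notes, running the model is itself the primary evaluation, and the user controls the future input space.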

40. Interactive evaluation
   - Running models is the primary mode of evaluation
   - In the PLOrk study:
     - Model run & used: 5.3 times per task (std. dev. 5.3); on average, 4.0 minutes (out of 19 minutes) spent running
     - Cross-validation computed: 1.4 times per task (std. dev. 2.6)
   - Traditional metrics are also useful:
     - Compare different classifiers quickly (K-Bow)
     - Validation (of the user’s model-building ability)

41. When is this interaction feasible?
   - It is appropriate and possible for a human to provide and/or modify the data
   - The user has knowledge of (and possibly control over) the future input space
   - The training process is fast
     - Training time in PLOrk: median 0.80 seconds; 71% of trainings under 5 seconds
     - PLOrk # of training examples in the final round: mean 692, std. dev. 610

42. Related approaches to interactive learning
   - Building models of the user: standard in speech recognition systems
   - Using human experts to improve a model of other phenomena:
     - Vision: Fails and Olsen, 2003
     - Document classification: Baker, Bhandari, and Thotakura, 2009
     - Web images: Amershi 2010
   - Novel in music, novel for a general-purpose tool

43. Findings to date
   - Interacting with supervised learning
   - Training the user
   - Supervised learning in a creative context
   - Usability summary

44. Interaction is two-way (diagram): the user exerts control over the machine learning algorithms, and receives feedback from running & evaluation

45. Training the user to provide better training examples
   - Minimize noise and choose easily differentiable classes

46. PLOrk students learned:
   - “In collecting data, it is crucial, especially in Motion Sensor, that the positions recorded are exaggerated (i.e., tilt all the way, as opposed to only halfway). Usually this will do the trick…”
   - “I tried to use very clear examples of contrast in [input features]... If the examples I recorded had values that were not as satisfactory, I deleted them and rerecorded… until the model understood the difference…”