<?xml version='1.0' encoding='UTF-8'?>

<?xml-stylesheet href="./_c74_tut.xsl" type="text/xsl"?>
<chapter name="Tutorial 25: Tracking the Position of a Color in a Movie">
<setdocpatch name="25jColorTracking" patch="25jColorTracking.maxpat"/>

<previous name="jitterchapter24">QuickTime Effects</previous>
<next name="jitterchapter26">MIDI Control of Video</next>
<parent name="jitindex">Jitter Tutorials</parent>



<h1>Tutorial 25: Tracking the Position of a Color in a Movie</h1>
<h2>Color Tracking</h2>
<p>There are many ways to analyze the contents of a matrix. In this tutorial chapter we demonstrate one very simple way to look at the color content of an image. We'll consider the problem of how to find a particular color (or range of colors) in an image, and then how to track that color as its position changes from one video frame to the next. This is useful for obtaining information about the movement of a particular object in a video or for tracking a physical gesture. In a more general sense, this technique is useful for finding the location of a particular numerical value (or range of values) in any matrix of data.</p>

<h2>jit.findbounds</h2>
<p>The object that we'll use to find a particular color in an image is called <o>jit.findbounds</o>. Since we're tracking color in a video, we'll be analyzing&#x2014;as you might expect&#x2014;a 4-plane 2-dimensional matrix of <i>char</i> data, but you can use <o>jit.findbounds</o> for matrices of any data type and any number of planes.</p>

<p>Here's how <o>jit.findbounds</o> works. You specify a minimum value and a maximum value you want to look for in each plane, using <o>jit.findbounds</o>'s <m>min</m> and <m>max</m> attributes. When <o>jit.findbounds</o> receives a matrix, it looks through the entire matrix for values that fall within the range you specified for each plane. It sends out the cell indices that describe the region where it found the designated values. In effect, it sends out the indices of the <i>bounding</i> region within which the values appear. In the case of a 2D matrix, the bounding region will be a rectangle, so <o>jit.findbounds</o> will send out the indices for the top-left and bottom-right cells of the region in which it found the specified values.</p>


<p>In this example we use the <o>jit.movie</o> object to play a movie (actually an animation) of a red ball moving around. This is obviously a simpler situation than you will find in most videos, but it gives us a clear setting in which to see how <o>jit.findbounds</o> works. Notice that we've used typed-in arguments to initialize the <m>min</m> and <m>max</m> attributes of <o>jit.findbounds</o>.</p>

<illustration><img src="images/jitterchapter25a.png"/>Minimum and maximum values specified for each of the four planes</illustration>

<p>There are four arguments for these attributes&#x2014;one value for each of the four planes of the matrix that <o>jit.findbounds</o> will be receiving. The <m>min</m> attribute sets the minimum acceptable value for each plane, and the <m>max</m> attribute sets the maximum acceptable value. These arguments cause <o>jit.findbounds</o> to look for any value from 0 to 1 in the alpha plane, any value from 0.75 to 1 in the red plane, and any value from 0 to 0.1 in the green and blue planes. Since the data in the matrix will be of type <i>char</i>, we must specify the values we want to look for in terms of a decimal number from 0 to 1. (See <i>Tutorials</i> <i>5</i> and <i>6</i> for a discussion of how <i>char</i> values are used to represent colors.) We want to track the location of a red ball, so we ask <o>jit.findbounds</o> to look for cells that contain very high values in the red plane and very low values in the green and blue planes. (We'll accept any value in the alpha plane.)</p>

<bullet>Click on the <o>toggle</o> to start the <o>metro</o>. As the red ball moves around, <o>jit.findbounds</o> reports the cell indices of the ball's bounding rectangle. Stop the <o>metro</o>, and examine the numbers that came out of <o>jit.findbounds</o>. You'll see something like this:</bullet>

<illustration><img src="images/jitterchapter25b.png"/><o>jit.findbounds</o> reports the region where the specified color appears</illustration>

<p>The <o>jit.findbounds</o> object will report the region where it finds the desired values <i>in all planes of the same cell</i>. In this picture, the <o>jit.findbounds</o> object found the values we asked for somewhere in columns 120 through 159 and somewhere in rows 50 through 89 inclusive. This makes sense, since the red ball is exactly 40 pixels in diameter. Those cell indices describe the 40x40 square region of cells where the ball is located in this particular frame of the video.</p>
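<p>The search described above can be approximated in ordinary code. The following Python sketch is only an illustration, not the actual implementation of <o>jit.findbounds</o>: the function name <i>find_bounds</i>, the list-of-rows matrix representation, the helper <i>make_frame</i>, and the 320x240 frame size are all assumptions made for this example. It builds a synthetic frame containing a red disc 40 cells in diameter and reports the bounding rectangle of the cells whose four planes all fall within range.</p>

```python
def find_bounds(matrix, mins, maxs):
    """Return ((left, top), (right, bottom)) for cells whose planes are ALL in range.

    matrix is a list of rows; each cell is a tuple of plane values in 0..1.
    Returns ((-1, -1), (-1, -1)) when no cell matches.
    """
    left = top = right = bottom = -1
    for y, row in enumerate(matrix):
        for x, cell in enumerate(row):
            # A cell counts only if every plane lies inside its own min/max range.
            if not all(hi >= v >= lo for v, lo, hi in zip(cell, mins, maxs)):
                continue
            if left == -1:
                left, top, right, bottom = x, y, x, y
            else:
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
    return (left, top), (right, bottom)


def make_frame(width=320, height=240):
    """Hypothetical test frame: white background with a red disc, in ARGB order."""
    white = (1.0, 1.0, 1.0, 1.0)
    red = (1.0, 1.0, 0.0, 0.0)
    frame = []
    for y in range(height):
        row = []
        for x in range(width):
            # Disc of diameter 40 covering columns 120-159 and rows 50-89.
            inside = 400 >= (x - 139.5) ** 2 + (y - 69.5) ** 2
            row.append(red if inside else white)
        frame.append(row)
    return frame


# Same ranges as the tutorial patch: any alpha, high red, low green and blue.
MINS = (0.0, 0.75, 0.0, 0.0)
MAXS = (1.0, 1.0, 0.1, 0.1)

frame = make_frame()
print(find_bounds(frame, MINS, MAXS))  # ((120, 50), (159, 89))
```

<p>Note that the white background is rejected even though its red plane is high, because its green and blue planes fall outside the requested range. That is the sense in which the values must be found in all planes of the same cell.</p>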

<p>Note that the output of <o>jit.findbounds</o> from its first two outlets is in the form of two lists. The first outlet reports the starting cell where the values were found in each dimension and the second outlet reports the ending cell of the region in each dimension. (Since it's a 2D matrix, there are only two values in each list, and we use the <o>unpack</o> objects to view them individually.)</p>

<p>If we wanted to know a <i>single point</i> that describes the location of the ball in the video frame, we could take the center point of that rectangular region reported by <o>jit.findbounds</o> and call that the location of the ball. That's what we do with the <o>expr</o> objects.  For each dimension, we take the difference between the starting cell and the ending cell, divide that in half to find the center between the two, and then add that to the starting cell index to get our single location point.</p>

<illustration><img src="images/jitterchapter25c.png"/>Calculating the center point of the rectangle</illustration>

<p>Notice that for the vertical dimension we actually subtract the vertical location coordinate from 239. That's because the cell indices go from top to bottom, but we would like to think of the height of the object going from bottom to top. (That's also how the <o>slider</o> object behaves, so since we're going to display the vertical coordinate with the <o>slider</o>, we need to express the coordinate as increasing from bottom-to-top.)</p>
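<p>As a rough illustration of this arithmetic, here is a short Python sketch. The function name <i>center_point</i> is hypothetical, and the use of integer division is an assumption about how the <o>expr</o> objects in the patch round their results; the 240-row frame height follows from the subtraction from 239 described above.</p>

```python
def center_point(start, end, height=240):
    """Return the (x, y) center of a bounding rectangle.

    start and end are (column, row) pairs as reported by the first two outlets.
    The returned y is measured from the bottom of the frame, like a slider.
    """
    # Half the distance between the start and end cells, added to the start cell.
    x = start[0] + (end[0] - start[0]) // 2
    y_from_top = start[1] + (end[1] - start[1]) // 2
    # Cell rows count downward from the top, so flip the vertical coordinate.
    y_from_bottom = (height - 1) - y_from_top
    return x, y_from_bottom


print(center_point((120, 50), (159, 89)))  # (139, 170)
```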

<p>We send the results of our location calculation to a pair of <o>slider</o> objects to demonstrate that we are successfully tracking the center of the ball, and we show the coordinates in the <o>number</o> boxes. We also scale the coordinates into the range 0 to 1, to show how easily the horizontal and vertical location of the ball could potentially be used to modify some activity or attribute elsewhere in a Max patch. For example, we could use the vertical location to control the volume of a video or an MSP sound, or we could use the horizontal coordinate to affect the rotation of an image.</p>

<illustration><img src="images/jitterchapter25d.png"/>Scale the location coordinates into the range 0-1, for use elsewhere in the patch.</illustration>
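<p>The scaling step is a simple linear mapping. The Python sketch below shows one way to express it; the function name <i>scale</i> and the assumed input ranges (columns 0 to 319 and rows 0 to 239) are illustrative choices and may not match the exact numbers used in the patch.</p>

```python
def scale(value, in_lo, in_hi, out_lo=0.0, out_hi=1.0):
    """Linearly map value from the range in_lo..in_hi to the range out_lo..out_hi."""
    return out_lo + (value - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)


# Assumed frame size: 320 columns by 240 rows.
x_unit = scale(139, 0, 319)  # horizontal location in the range 0 to 1
y_unit = scale(170, 0, 239)  # vertical location in the range 0 to 1

# The same mapping can target any other range, for example degrees of rotation.
rotation = scale(139, 0, 319, 0.0, 360.0)
print(x_unit, y_unit, rotation)
```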

<h2>Tracking a Color in a Complex Image</h2>
<p>Well, that all worked quite nicely for the simple example of a plain red ball on a plain white background. But tracking a single object in a real-life video is a good deal tougher. We'll show some of the problems you might encounter, and some tricks for dealing with them.</p>

<bullet>Make sure the <i>redball</i> movie is stopped. Now double-click on the <o>patcher</o> <m>bballtracking</m> object to see <i>A More Detailed Example</i>. Click on the <o>toggle</o> labeled <i>Start/Stop</i> in the upper-left corner of the <i>[bballtracking]</i> subpatch to start the video.</bullet>
<p>This movie has objects with distinct colors: a red shirt, green pants, and a yellow-and-blue ball. Potentially it could be useful for color tracking. However, there are a few factors that make tracking this ball a bit harder than in the previous example.</p>

<p>First of all, the top few scan lines of the video (the top few rows of the matrix) contain some garbage that we really don't want to analyze. This garbage is an unfortunate artifact of the imperfect digitization of this particular video. Such imperfections are common, and can complicate the analysis process. Second, the image is not highly saturated with color, so the different colors are not as distinct as we might like. Third, the ball actually leaves the frame entirely at the end of the four-second clip. (When <o>jit.findbounds</o> can't find any instance of the values being sought, it reports starting and ending cell indices of <m>-1</m>.) Fourth, if we want to track the color yellow to find the location of the ball in the frame, we need to recognize that the ball is not all one shade of yellow. Because of the texture of the ball and the lighting, it actually shows up as a range of yellows, so we'll need to specify that range carefully for <o>jit.findbounds</o>.</p>

<p>Let's try to solve some of these problems. As we demonstrated in <i>Tutorial 14</i>, some Jitter objects allow us to designate a source rectangle of an image that we want to view that's different from the full matrix. In <i>Tutorial 14</i> we demonstrated the <m>srcdimstart</m>, <m>srcdimend</m>, and <m>usesrcdim</m> attributes of <o>jit.matrix</o>, and we mentioned that <o>jit.movie</o> has comparable attributes called <m>srcrect</m> and <m>usesrcrect</m>. Let's use those attributes of <o>jit.movie</o> to crop the video image, getting rid of some parts we don't want to see.</p>

<bullet>Click on the <o>message</o> box labeled <i>Crop Source Image</i>. This sends <o>jit.movie</o> the cell indices of a new source rectangle that we want to view, and tells <o>jit.movie</o> to use that source rectangle instead of the full matrix. Notice that by starting at row <m>4</m> (that is, starting with the fifth row of the matrix), we crop out the garbage at the top of the image. We also chop 20 pixels off the left side of the source image, so that the first bounce of the ball occurs exactly in the lower-left corner. Now we've focused on the part of the video we want to analyze.</bullet>
<bullet>Next we'll deal with our other problems. Click on the small <o>preset</o> object labeled <i>Setup</i> in the lower-right corner of the window. This sets all the user interface objects to just the settings we desire.</bullet>
<p>This sets the <m>loop</m> attribute of <o>jit.movie</o> to 2 for back-and-forth playback, and it sets a loop endpoint at time <m>2160</m> (just at the moment when the 54th frame would occur) so that the movie now plays back and forth from frame 0 to frame 53 and back. The movie now plays just up to the moment of the first bounce of the ball on the pavement, then reverses direction.</p>

<p>We have also sent some values to the <o>jit.brcosa</o> object (discussed in detail in <i>Tutorial 7</i>) to set its <m>brightness</m>, <m>contrast</m>, and <m>saturation</m> attributes just the way we want them. This doesn't exactly result in the best-looking image, but it does make the different colors more distinctive, and compresses them into a smaller range of values, making them easier for <o>jit.findbounds</o> to track.</p>

<p>And we've turned on the <m>usesrcdim</m> attribute of the <o>jit.matrix</o> object (in the center of the patch) so that it is now using the output of <o>jit.findbounds</o> to determine its source rectangle. You can see the tracked region displayed in the <o>jit.pwindow</o> labeled <i>Show Tracked Region</i>.</p>

<illustration><img src="images/jitterchapter25e.png"/>Using the output of <o>jit.findbounds</o> to set the <m>srcdimstart</m> and <m>srcdimend</m> attributes of <o>jit.matrix</o></illustration>

<p>The basic yellow of the ball has nearly equal amounts of red and green in it, so we set the <m>min</m> and <m>max</m> attributes of <o>jit.findbounds</o> to look for cells containing high values in the red and green planes and a low value in the blue plane. You can see that with careful settings of <o>jit.brcosa</o> and careful settings of the <m>min</m> and <m>max</m> attributes of <o>jit.findbounds</o>, we've managed to get very reliable tracking of the yellow part of the ball.</p>

<div>
<techdetail>Note: One fairly important detail that we haven't really discussed here is how to set the <m>min</m> and <m>max</m> attributes of <o>jit.findbounds</o> most effectively to track a particular color in a video. A certain amount of trial-and-error adjustment is needed, but you can get some pretty specific information about the color of a particular pixel by using the <o>suckah</o> object demonstrated in <i>Tutorial 10</i>. You can place the <o>suckah</o> object over the <o>jit.pwindow</o> of the video you want to analyze, click on the color you want to track, and use the output of <o>suckah</o> to get the RGB information of that cell. (The values from <o>suckah</o> are in the range 0-255, but you can divide them by 255.0 to bring them into the 0-1 range.)</techdetail>
</div>
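<p>The conversion mentioned in the note above can be sketched in Python as follows. The function name <i>color_range</i>, the tolerance of 0.1, and the sampled color values are all hypothetical; the idea is simply to divide each 0-255 value by 255.0 and then widen the result into a range that could serve as a starting point for the <m>min</m> and <m>max</m> attributes. Some trial-and-error adjustment would still be needed.</p>

```python
def color_range(rgb_255, tolerance=0.1):
    """Build 4-plane (alpha, red, green, blue) min and max lists around a color.

    rgb_255 holds red, green and blue values in the range 0-255.
    The alpha plane is left fully open (0 to 1) so any alpha value is accepted.
    """
    unit = [v / 255.0 for v in rgb_255]
    mins = [0.0] + [max(0.0, u - tolerance) for u in unit]
    maxs = [1.0] + [min(1.0, u + tolerance) for u in unit]
    return mins, maxs


# Hypothetical yellow sampled from the ball: high red and green, low blue.
mins, maxs = color_range((230, 215, 40))
print("min", [round(v, 3) for v in mins])
print("max", [round(v, 3) for v in maxs])
```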
<h2>Using the Location of an Object</h2>
<p>So, at least in this particular situation, we've managed to overcome the difficulties of tracking a single object in a video. But now that we've accomplished that, what are we going to do with the information we've derived? We'll show a couple of ways to use object location to control sound: by playing MIDI notes or by playing MSP tones. Neither example is very sophisticated musically, but they should serve to demonstrate the basic issue of mapping location information to sound information.</p>

<p>We'll send the location data to two subpatches located in the part of the patch marked <i>Use Tracking Info</i>. We use a <o>pack</o> object to pack all of the output of <o>jit.findbounds</o> together into a single 4-item list, and then we use a <o>gate</o> object to route that information to the <o>patcher</o> <m>playnotes</m> subpatch (to play MIDI notes) or the <o>patcher</o> <m>playtones</m> subpatch (to play MSP tones) or neither (to produce no sound).</p>

<illustration><img src="images/jitterchapter25f.png"/>Send the location information to one of two subpatches</illustration>

<h3>Playing Notes</h3>
<bullet>In the <o>umenu</o> labeled <i>Use Tracking Info</i>, choose the menu item <m>1 = Play MIDI Notes</m>. Double-click on the <o>patcher</o> <m>playnotes</m> object to see the contents of the <i>[playnotes]</i> subpatch. If you are not hearing any notes being played (and you've verified that the movie is still playing), try double-clicking on the <o>noteout</o> <m>a</m> object and choosing a different MIDI synthesizer in the device dialog box.</bullet>

<illustration><img src="images/jitterchapter25g.png"/>The contents of the [playnotes] subpatch</illustration>

<p>In the <i>[playnotes]</i> subpatch we use the same sort of mapping formulae as we used in the first example to calculate the location coordinates of the ball and place the information in a usable range. We calculate the horizontal location and divide by 16 to get numbers that will potentially range from 0 to 19. We use the <o>change</o> object to ignore duplicate numbers (i.e. repeated notes), and then we look up the note we want to play in the <o>table</o>.</p>

<div>
<techdetail>Note: The basketball player's motion has no relationship to any particular musical scale, so taking the raw location data as MIDI key numbers would result in an atonal improvisation. (Not that there's anything wrong with that!) If we want to impart a tonal implication to the pitch choices, we can use the numbers generated by the horizontal motion of the ball as index numbers to look up notes of the scale in a lookup <o>table</o>. If you want to see (or even alter) the contents of the <o>table</o>, just double-click on the <o>table</o> object to open its graphic editing window.</techdetail>
</div>
<p>We use the vertical location of the ball&#x2014;which we've mapped into the range 0-119&#x2014;to determine the velocity values. The <o>makenote</o> object assigns the duration (200ms) to the notes and takes care of providing the MIDI note-off messages. The underlying pulse of the music (20 pulses per second) is determined by the speed of the <o>metro</o> that's playing the movie, but because the <o>change</o> object suppresses repeated notes, not every pulse gets iterated as a MIDI note.</p>
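<p>The mapping from location to notes can be summarized with a short Python sketch. Everything specific in it is an assumption made for illustration: the class name <i>NotePicker</i>, the contents of the lookup table (here a C major scale with 20 entries), and the choice to halve the vertical coordinate to reach the 0-119 velocity range. The actual patch may differ in its details.</p>

```python
# Hypothetical lookup table: 20 MIDI pitches of a C major scale starting at C3.
SCALE_TABLE = [48, 50, 52, 53, 55, 57, 59, 60, 62, 64,
               65, 67, 69, 71, 72, 74, 76, 77, 79, 81]


class NotePicker:
    """Turn ball coordinates into (pitch, velocity) pairs, skipping repeats."""

    def __init__(self, table):
        self.table = table
        self.last_index = None

    def pick(self, x, y_from_bottom):
        # Horizontal location 0-319 divided by 16 gives a table index 0-19.
        index = x // 16
        # Like the change object, ignore an index equal to the previous one.
        if index == self.last_index:
            return None
        self.last_index = index
        pitch = self.table[index]
        # Vertical location 0-239 halved gives a velocity in the range 0-119.
        velocity = y_from_bottom // 2
        return pitch, velocity


picker = NotePicker(SCALE_TABLE)
print(picker.pick(139, 170))  # (62, 85)
print(picker.pick(140, 100))  # None, because the table index has not changed
```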

<bullet>Close the <i>[playnotes]</i> subpatch window. Click on the <o>message</o> box labeled <i>Crop and Flip Source Image</i>. This sends a new source rectangle to <o>jit.movie</o> to flip the image horizontally, which reverses the high-low musical effect of the <i>[playnotes]</i> subpatch.</bullet>
<h3>Playing Tones</h3>
<bullet>In the <o>umenu</o> labeled <i>Use Tracking Info</i>, choose the menu item <m>2 = Play MSP Tones</m>. Double-click on the <o>patcher</o> <m>playtones</m> object to see the contents of the <i>[playtones]</i> subpatch.</bullet>

<illustration><img src="images/jitterchapter25h.png"/>Use location of a color as frequency control information for MSP oscillators</illustration>
<p>Here we're using the horizontal and vertical location coordinates of the basketball as frequency values for MSP oscillators. The equations we use to calculate those values are somewhat arbitrary, but they've been devised so as to map both coordinates into similar frequency ranges. The horizontal coordinate is used to control the oscillator in the left audio channel, and the vertical coordinate controls the frequency of the oscillator in the right channel.</p>

<p>We use the presence of incoming messages to turn MSP audio on (and fade the sound up), and if the messages are absent for more than 200 ms, we fade the sound down and turn the audio off.</p>

<illustration><img src="images/jitterchapter25i.png"/>First message starts and fades in audio; lack of messages for more than 200 ms fades out and turns off audio.</illustration>
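<p>The on/off logic can be thought of as a small timeout mechanism. The Python sketch below models that idea only; the class name <i>AudioGate</i>, its method names, and the use of explicit millisecond timestamps are assumptions for illustration and do not correspond to particular objects in the subpatch.</p>

```python
class AudioGate:
    """Track whether audio should be on, based on how recently a message arrived."""

    def __init__(self, timeout_ms=200):
        self.timeout_ms = timeout_ms
        self.last_message_ms = None
        self.audio_on = False

    def message(self, now_ms):
        """Call when location data arrives. Returns "fade in" if audio was off."""
        self.last_message_ms = now_ms
        if self.audio_on:
            return None
        self.audio_on = True
        return "fade in"

    def tick(self, now_ms):
        """Call periodically. Returns "fade out" once messages have stopped."""
        if self.audio_on and now_ms - self.last_message_ms > self.timeout_ms:
            self.audio_on = False
            return "fade out"
        return None


gate = AudioGate()
print(gate.message(0))   # fade in
print(gate.message(50))  # None, audio is already on
print(gate.tick(250))    # None, exactly 200 ms have passed
print(gate.tick(251))    # fade out
```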

<bullet>Close the <i>[playtones]</i> subpatch window. Flip the video image horizontally by clicking on the <o>message</o> boxes labeled <i>Crop Source Image</i> and <i>Crop and Flip Source Image</i> to hear the difference in the effect on the MSP oscillators.</bullet>
<h3>Deriving More Information</h3>
<p>In this tutorial we've shown a pretty straightforward implementation in which we use the location coordinates of a color region directly to control parameters of sound synthesis or MIDI performance. With a little additional Max programming, we could potentially derive further information about the motion of an object.</p>

<p>For example, by comparing an object's location in one video frame with its location in the preceding frame, we could use the Pythagorean theorem to calculate the distance the object traveled from one frame to the next, and thus calculate its velocity. We could also calculate the <i>slope</i> of its movement (the change in vertical position divided by the change in horizontal position), and thus (with the arctangent trig function) figure out its angle of movement. By comparing one velocity value to the previous one, we can calculate acceleration, and so on. By comparing an object's apparent size from one frame to the next, we can even make some crude guesses about its movement toward or away from the camera in the "z axis" (depth).</p>
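<p>These calculations are shown here as a brief Python sketch rather than as a Max patch. The function name <i>motion</i> and the sample coordinates are hypothetical, and the speed is expressed in cells per frame. The sketch uses the two-argument arctangent so that purely vertical movement (where the slope would be undefined) still produces an angle.</p>

```python
import math


def motion(prev, curr, frames=1.0):
    """Return (distance, speed, angle_degrees) between two (x, y) locations.

    frames is the number of video frames separating the two locations.
    """
    dx = curr[0] - prev[0]
    dy = curr[1] - prev[1]
    # Pythagorean theorem: straight-line distance traveled.
    distance = math.hypot(dx, dy)
    speed = distance / frames
    # Angle of movement from the horizontal and vertical change (slope = dy / dx).
    angle = math.degrees(math.atan2(dy, dx))
    return distance, speed, angle


# Hypothetical center points from three consecutive frames.
p0, p1, p2 = (100, 50), (103, 54), (109, 62)
d1, s1, a1 = motion(p0, p1)
d2, s2, a2 = motion(p1, p2)
acceleration = s2 - s1  # change in speed from one frame to the next
print(d1, s1, round(a1, 2))
print(d2, s2, acceleration)
```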

<h2>Summary</h2>
<p>The <o>jit.findbounds</o> object detects values within a certain range in each plane of a matrix, and it reports the region in the matrix where it finds values within the specified range of each plane. This is useful for finding the location of any range of numerical data in any type of matrix. In particular, it can be used to find the location of a particular color in a 4-plane matrix, and thus can be used to track the movement of an object in a video.</p>

<p>Cropping the video image with the <m>srcrect</m> attribute of <o>jit.movie</o> helps to focus on the desired part of the source image. The <o>jit.brcosa</o> object is useful for adjusting the color values in the source video, making it easier to isolate and detect a specific color or range of colors.</p>

<p>We can use the output of <o>jit.findbounds</o> to track the location of an object, and from that we can calculate other information about the object's motion such as its velocity, direction, etc. We can use the derived information to control parameters of a MIDI performance, MSP synthesis, or other Jitter objects.</p>

	<seealsolist>
		<seealso name="jit.brcosa">Adjust image brightness/contrast/saturation</seealso>
		<seealso name="jit.findbounds">Calculate bounding dimensions for a range of values</seealso>
		<seealso name="jit.matrix">The Jitter Matrix!</seealso>
		<seealso name="jit.pwindow">In-Patcher Window</seealso>
		<seealso name="jit.movie">Play or edit a movie</seealso>
	</seealsolist>
	</chapter>
