authorDennis Peeten <dpeeten@onsneteindhoven.nl>2008-06-11 15:21:07 (GMT)
committerDennis Peeten <dpeeten@onsneteindhoven.nl>2008-06-11 15:21:07 (GMT)
commit808a7c14fdb11325ebdea0296c92c063a7c49cd1 (patch)
tree93571a56841bff859720c69549614cb187fa9d06
parent6059a52daa590419b3c601d127c0a87c47a5a638 (diff)
Almost done.
-rw-r--r--report/chapter1.tex12
-rw-r--r--report/chapter2.tex31
-rw-r--r--report/chapter4.tex47
-rw-r--r--report/headtracking.tex6
-rw-r--r--report/wiimote_ir.tex52
5 files changed, 93 insertions, 55 deletions
diff --git a/report/chapter1.tex b/report/chapter1.tex
index 2cad507..e524989 100644
--- a/report/chapter1.tex
+++ b/report/chapter1.tex
@@ -1,11 +1,15 @@
\section{Introduction}
-For the master course Interactive Virtual Environments (2IV55), group 7 decided to create an application that can be used in an experimental setup. The application uses different techniques from computer graphics to create some kind of virtual environment. The goal of this program was to draw some conclusions whether or not these techniques contribute to the accuracy of the user when utilizing a "3D mouse" with 3DOF\footnote{Degrees of freedom: The amount of motion supported in a robotics or virtual reality system. 3DOF provides X, Y and Z (horizontal, vertical and depth) while 6DOF provides X, Y and Z and pitch, yaw and roll}. \\
+For the master course Interactive Virtual Environments (2IV55), group 7 decided to do an experiment using the Nintendo Wii controller. The reason behind this decision was that the few games that utilise the Wii remote as a 3D mouse were rather awkward to control. One very likely reason for the awkwardness was the lack of depth perception when interacting in a 3D environment projected on a 2D screen. We therefore devised an experiment that would hopefully lead to some insights into the reasons behind the awkward feel of the 3D mouse control scheme. Our goal was to create an application in which the user was required to use a Wii remote as a 3D mouse to perform tasks that require great precision. For comparison, we implemented head tracking and anaglyph stereo vision to see to what degree enhanced depth perception would influence the results of the experiment. \\
+
+% The application uses different techniques from computer graphics to create some kind of virtual environment. The goal of this program was to draw some conclusions whether or not these techniques contribute to the accuracy of the user when utilizing a "3D mouse" with 3DOF\footnote{Degrees of freedom: The amount of motion supported in a robotics or virtual reality system. 3DOF provides X, Y and Z (horizontal, vertical and depth) while 6DOF provides X, Y and Z and pitch, yaw and roll}. \\
+
+And thus, the "MatchBlox" project was born. It would challenge the user to take a wooden block in 3D and put it in the correct hole. These small experiments could then be repeated with the assistance of head tracking or stereo vision. The exact goals of the project and the problems encountered are discussed in chapter 2 of this report. \\
+
+
-And thus, the "MatchBlox" project was born. It would challenge the user to take a wooden block in 3D and put it in the correct position. These small experiments could then be repeated with the assistance of head tracking or stereo vision. The exact goals of the project and the problems encountered are discussed in chapter 2 of this report. \\
-
Head tracking is a relatively old technique in the field of virtual environments. It basically keeps track of the head's position and/or rotation. Using the information about the head's position, the virtual world can be rendered according to these values. This turns an ordinary computer screen into a virtual window which the user can look through from different angles. \\
Stereo vision is another well-known technique mainly because it's very easy to apply. By rendering the virtual world twice and merging the two images one can accomplish the illusion of depth. The stereo vision and the head tracking techniques are explained in more detail in chapter 3. \\
-Finally, in chapter 4, a conclusion is drawn from the execution of the project. \ No newline at end of file
+Finally, in chapter 4, a conclusion is drawn from the execution of the project.
diff --git a/report/chapter2.tex b/report/chapter2.tex
index bcd9e55..50e33ac 100644
--- a/report/chapter2.tex
+++ b/report/chapter2.tex
@@ -1,6 +1,6 @@
\section{Project definition}
-This chapter gives an insight in the MatchBlox project. First of all, the goals of the project and how they changed during the execution of the project are listed and explained. Then, in the next paragraph, the problems that were encountered are discussed. This chapter concludes with a short project evaluation.
+This chapter gives an insight into the MatchBlox project. First of all, the goals of the project and how they changed during the execution of the project are listed and explained. %Then, in the next paragraph, the problems that were encountered are discussed. This chapter concludes with a short project evaluation.
\subsection{Goals}
@@ -17,32 +17,7 @@ The first two factors, head tracking and stereo vision, were researched before i
The depth of the field was implemented by altering the size of the box in which the user puts the blocks. Three different sizes were supported: small (2x2), medium (3x3) and large (4x4). \\
-As for the shadow projection, the idea was dropped because the number of test cases would grow to large. Without the addition of shadow projection the number of test cases were already $ 2 \times 2 \times 3 = 12 $ (instead of 24 cases). The implementation of the current application as well as the supporting database does support the addition of an extra factor such as shadow projection.
+As for the shadow projection, the idea was dropped because the number of test cases would grow too large. Without the addition of shadow projection the number of test cases was already $ 2 \times 2 \times 3 = 12 $ (instead of 24 cases). The implementation of the current application as well as the supporting database does support the addition of an extra factor such as shadow projection. \\
-\subsection{Problems}
+Our initial intention was to have some people participate in the experiment by playing the game in several different configurations. From those results we were going to draw conclusions on the usability of the Wii remote as a 3D mouse. However, during the development of the application we came across some problems that delayed its completion. At the time of writing no human experiments have been carried out, but the encountered problems have led to an important insight into the usability of the Wii remote as a 3D mouse device.
-The number of test cases to implement was the first small problem the group had encountered. However, it was not the only one. The main problem was due to the chosen input device; the Wiimote\footnote[1]{Wiimote is a nickname for the Wii remote which is the primary controller (6DOF) used with the Nintendo Wii.}. As the Wiimote supports 6DOF only 3DOF were used in the project. This was enough to be able to place a block inside the correct hole.
-
-\subsubsection{System latency}
-
-In order to support a Wiimote in a program like MatchBlox, a number of development libraries are available on the web. The library that was initially used for the application suffered a lot of lag (or latency). It took some time to determine the error and switch to another library.
-
-\subsubsection{Depleted batteries}
-
-A second Wiimote is used to track the head using the infrared camera in front of the Wiimote. The user places two infrared LED's which the Wiimote can keep track of. A wireless Wii sensorbar\footnote[1]{The Wiimote senses IR light from the sensor bar. The light emitted from each end of the sensor bar is focused onto the image sensor which sees the light as two bright dots separated by a distance "mi" on the image sensor. The second distance "m" between the two clusters of light emitters in the sensor bar is a fixed distance. From these two distances m and mi, the distance between the Wiimote and the sensor bar can be calculated.} was used for this. This has as great advantages that it's wireless (as opposed the the wired variant shipped with the Wii console) and no home made infrared LED's need to be used. \\
-
-A great disadvantage however, is that it need batteries and it's not visible when the batteries are almost depleted. When the batteries are running low, it is harder for the Wiimote to detect the IR dots. With the use of a camera from a mobile phone, the infrared LED's can be seen. This is a good way to check if the sensor bar is still emitting IR light.
-
-\subsubsection{Cross talk}
-
-Cross talk is a common problem in all implementations of stereo vision. It occurs when left images reach the right eye and right images reach the left eye. In other words, you see things with an eye you're only supposed to see with the other eye. In the case of red and blue stereo vision it means that the colors are not properly filtered out. \\
-
-There exist a technique that reduces this effect. What it basically does is subtract the leakage from the displayed intensity. This technique was not applied to the MatchBlox application. Instead, a more detailed and moving background is used. This makes the crosstalk less visible and overall very useable.
-
-\subsection{Evaluation}
-
-The conclusions of the MatchBlox project are explained in chapter 4 of this report. This paragraph deals with the evaluation of the project and how it was executed. \\
-
-A lot of new techniques during the implementation were applied; head tracking, stereo vision, a Wiimote as input device, etc. To fit this all into one big project was a bit over the top and therefor unfeasible. The troubles with the infra red LED's and the Wiimote in general consumed a lot of time and energy of the project. \\
-
-Although the project was able to complete the application, it's not useable for an actual experiment. This doesn't mean all was in vein and no conclusion can be drawn from the results of the project. The conclusions will be explained in the last chapter. \ No newline at end of file
diff --git a/report/chapter4.tex b/report/chapter4.tex
index d070360..dc9fd6d 100644
--- a/report/chapter4.tex
+++ b/report/chapter4.tex
@@ -1,3 +1,45 @@
+
+\subsection{Problems}
+
+The number of test cases to implement was the first small problem the group encountered. However, it was not the only one. The main problem was due to the chosen input device: the Wiimote\footnote[1]{Wiimote is a nickname for the Wii remote, which is the primary controller (6DOF) used with the Nintendo Wii.}. Although the Wiimote supports 6DOF, only 3DOF were used in the project. This was enough to be able to place a block inside the correct hole.
+
+%\subsubsection{System latency}
+
+%In order to support a Wiimote in a program like MatchBlox, a number of development libraries are available on the web. The library that was initially used for the application suffered a lot of lag (or latency). It took some time to determine the error and switch to another library.
+
+\subsubsection{Depleted batteries}
+
+A second Wiimote is used to track the head, using the infrared camera in the front of the Wiimote. The user wears two infrared LEDs which the Wiimote can keep track of. A wireless Wii sensor bar\footnote[1]{The Wiimote senses IR light from the sensor bar. The light emitted from each end of the sensor bar is focused onto the image sensor, which sees the light as two bright dots separated by a distance "mi" on the image sensor. The second distance "m", between the two clusters of light emitters in the sensor bar, is a fixed distance. From these two distances m and mi, the distance between the Wiimote and the sensor bar can be calculated.} was used for this. This has the great advantages that it is wireless (as opposed to the wired variant shipped with the Wii console) and that no home-made infrared LEDs are needed. \\
+
+A great disadvantage, however, is that it needs batteries and there is no indication when the batteries are almost depleted. When the batteries are running low, it is harder for the Wiimote to detect the IR dots. With the camera of a mobile phone the infrared LEDs can be seen, which is a good way to check whether the sensor bar is still emitting IR light.
+
+\subsubsection{Cross talk}
+
+Cross talk is a common problem in all implementations of stereo vision. It occurs when left images reach the right eye and right images reach the left eye. In other words, you see things with an eye you're only supposed to see with the other eye. In the case of red and blue stereo vision it means that the colors are not properly filtered out. \\
+
+There exists a technique that reduces this effect. What it basically does is subtract the leakage from the displayed intensity. This technique was not applied to the MatchBlox application. Instead, a more detailed and moving background is used. This makes the crosstalk less visible and keeps the application very usable overall.
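+
+As an illustration, such a subtractive compensation could look like the following minimal sketch, assuming a single known leakage factor per channel (the factor and names are illustrative; we did not implement this):
+\begin{verbatim}
+def compensate_crosstalk(left, right, leak=0.1):
+    """Subtract the expected leakage of the other eye's image
+    from each displayed intensity (intensities in [0, 1])."""
+    out_left = max(0.0, left - leak * right)
+    out_right = max(0.0, right - leak * left)
+    return out_left, out_right
+\end{verbatim}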
+
+\subsubsection{Stability of the 3D mouse cursor}
+
+During the implementation of the 3D mouse control scheme described in section 3.1, we ran into some problems, or actually one problem with multiple causes: the cursor was impossible to keep steady when holding the Wii remote in hand. The cursor seemed to oscillate around a point, while frequently jumping to another position. Only when the controller lay perfectly still on a flat surface did the on-screen cursor seem steady.
+
+One cause for the cursor instability was the method with which we selected the coordinates for $ g_l $ and $ g_r $ from the points returned by the wiimote. As mentioned, the wiimote can track up to four infra-red sources. If the wiimote is close enough to the sensor bar, it will recognize the positions of the individual LEDs within the LED groups. When this happens, more than two coordinates are returned by the wiimote. In our original implementation we selected the two points with the greatest distance between them. As a solution, we implemented an algorithm that sorts the points returned by the wiimote into two groups based on their proximities, one group per LED group. The algorithm returns the average coordinates per group as the coordinates for $ g_l $ and $ g_r $. This solved the large jumps in the cursor position, but did not help the stability of the cursor very much.
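+
+A sketch of such a grouping, under the simplifying assumption that the two LED groups are separated mainly along the camera's x-axis (our actual implementation may differ in detail):
+\begin{verbatim}
+def group_ir_points(points):
+    """Split up to four reported IR points into two clusters at
+    the largest x-gap and return each cluster's average point."""
+    pts = sorted(points)                  # sort by x-coordinate
+    gaps = [pts[i + 1][0] - pts[i][0] for i in range(len(pts) - 1)]
+    split = gaps.index(max(gaps)) + 1
+    groups = pts[:split], pts[split:]
+    return tuple((sum(p[0] for p in g) / len(g),
+                  sum(p[1] for p in g) / len(g)) for g in groups)
+\end{verbatim}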
+
+Another cause, which is more apparent, is the fact that it is very hard for humans to keep their arm perfectly still when holding a wiimote. As a solution we implemented an exponential smoothing algorithm to calculate a weighted average over $ g_l $ and $ g_r $ before they were processed. The smoothing stabilized the cursor in the x- and y-direction while still being quite responsive. In the z-direction, however, the cursor remained unsteady, albeit to a lesser extent.
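+
+A minimal sketch of the exponential smoothing, with an illustrative smoothing factor:
+\begin{verbatim}
+def smooth(previous, measured, alpha=0.3):
+    """Exponentially weighted average per coordinate; a smaller
+    alpha smooths more heavily at the cost of extra lag."""
+    return tuple(alpha * m + (1 - alpha) * p
+                 for p, m in zip(previous, measured))
+\end{verbatim}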
+
+The reason for the stability problem in the z-direction is the limited resolution of the mapping in the z-direction compared to the x- and y-directions. For the latter two, the resolution corresponds to the resolution of the camera. For the z-direction the resolution is limited by the chosen values of $ d_{min} $ and $ d_{max} $ and the distance between the LED groups in the sensor bar. Suppose we define $ d_{min} $ as 1 meter and $ d_{max} $ as 2 meters and use a standard sensor bar such that $ \Delta $ in the distance calculation is 0.0205 meters (see Figure \ref{fig:wiimote_dist_calc}). The distance in pixels between $ g_l $ and $ g_r $ at distance $ d_{min} $ is then:
+\[ \frac{2\,\mathsf{arctan}(\frac{\frac{1}{2}\Delta}{d_{min}})}{\theta_{pix}} = 29.83 \approx 30 \text{ pixels} \]
+Compared to the distance of approximately $ 15 $ pixels at $ d_{max} $, this gives a resolution in the z-direction of a mere $ 15 $ camera pixels. The resolution can be increased by using a wider sensor bar or by sitting closer to the sensor bar, reducing $ d_{min} $ and $ d_{max} $. Choosing a bounding box $ \mathcal{B} $ with as small a depth as possible can also make vibrations in the z-direction less visible. In our implementation we severely smoothed the z-coordinate of the cursor position to finally stabilize the cursor. The downside of this smoothing is a noticeable lag in movements in the z-direction.
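+
+The numbers above can be verified with a short computation, using the camera angle and resolution assumed earlier and the value of $ \Delta $ used in this example:
+\begin{verbatim}
+import math
+
+theta_pix = math.radians(40.0) / 1016   # assumed angle per pixel
+delta = 0.0205                          # value used in the text above
+
+def pixel_distance(d):
+    """Pixel distance between g_l and g_r at wiimote distance d."""
+    return 2 * math.atan(0.5 * delta / d) / theta_pix
+
+print(pixel_distance(1.0))   # ~29.83 pixels at d_min
+print(pixel_distance(2.0))   # ~14.92 pixels at d_max
+\end{verbatim}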
+
+
+%\subsection{Evaluation}
+
+%The conclusions of the MatchBlox project are explained in chapter 4 of this report. This paragraph deals with the evaluation of the project and how it was executed. \\
+
+%A lot of new techniques during the implementation were applied; head tracking, stereo vision, a Wiimote as input device, etc. To fit this all into one big project was a bit over the top and therefor unfeasible. The troubles with the infra red LED's and the Wiimote in general consumed a lot of time and energy of the project. \\
+
+%Although the project was able to complete the application, it's not useable for an actual experiment. This doesn't mean all was in vein and no conclusion can be drawn from the results of the project. The conclusions will be explained in the last chapter.
+
\section{Conclusions}
In this section we'll give our conclusions of the project, although
@@ -18,4 +60,7 @@ and your head cannot be tracked anymore.
\subsection{Stereo Vision}
-The applied red-blue stereo vision is one of the least realistic alternatives of stereo vision. Yet, it is the easiest to implement and still very effective. The created images provide the user the illusion of depth in a very usable way. The distance between objects can easily been seen from a still image. This is also a big disadvantage of head tracking; without moving, the image provides no additional "depth info". \ No newline at end of file
+The applied red-blue stereo vision is one of the least realistic alternatives of stereo vision. Yet, it is the easiest to implement and still very effective. The created images provide the user the illusion of depth in a very usable way. The distance between objects can easily be seen from a still image. This is also a big disadvantage of head tracking; without moving, the image provides no additional "depth info".
+
+\subsection{Wii remote as 3D mouse}
+We can conclude that the Wii remote is not fit to be used as a 3D mouse. The ridiculously low resolution in depth measurements when using standard Nintendo hardware makes the wiimote unusable as a 3D mouse. However, because the resolution of the camera in the wiimote is relatively high compared to the resolution of a standard-definition television set, the wiimote is well suited for use as a 2D mouse on the Wii console. The third dimension should only be used for tasks that do not require great precision or resolution, for example simulating the push of a button or switching between zoomed and normal view in a 3D shooting game.
diff --git a/report/headtracking.tex b/report/headtracking.tex
index cf2280a..f4954bb 100644
--- a/report/headtracking.tex
+++ b/report/headtracking.tex
@@ -9,14 +9,14 @@ truncated at the near plane, hence the name frustum. Only the things
inside the pyramid between the near and far planes are visible from
the viewpoint. In the figure below you see a scheme of the frustum.
\begin {center}
- \includegraphics[width=99.7mm]{img/frustumscheme.png} \\
+ \includegraphics[width=99.7mm]{img/frustumscheme.PNG} \\
Figure \#\#: Frustum \\
\end {center}
In the next figure you see the objects which can be viewed by the
view point. The green object can be seen totally, the yellow
partially and the red object can't be seen.
\begin {center}
- \includegraphics[width=81.5mm]{img/frustumobjects.png} \\
+ \includegraphics[width=81.5mm]{img/frustumobjects.PNG} \\
Figure \#\#: Frustum \\
\end {center}
If the viewpoint is moving, the frustum
@@ -57,6 +57,6 @@ near clipping plane closer to the eye position, without changing the
frustum. Below you see the demo application of the head tracking
technique.
\begin {center}
- \includegraphics[width=100mm]{img/HeadTrackScreenShot.png} \\
+ \includegraphics[width=100mm]{img/HeadTrackScreenShot.PNG} \\
Figure \#\#: The head track demo application \\
\end {center}
diff --git a/report/wiimote_ir.tex b/report/wiimote_ir.tex
index b17c955..845f6ba 100644
--- a/report/wiimote_ir.tex
+++ b/report/wiimote_ir.tex
@@ -1,19 +1,19 @@
\subsection{Wiimote IR input}
-As mentioned in the introduction, we used the Nintendo Wii controller for both head tracking and 3D (mouse) input. In both applications we made use of the infra-red sensing ability of the controller. A Wiimote has a build in infra-red camera that is used to track up to four infra-red sources at a time. An on board image processing chip processes the images acquired by the camera and outputs the coordinates of the infra-red sources in camera coordinates, which are reported to the client (pc) via Bluetooth at a frequency of 100Hz. Several sources report that the infra-red camera has a horizontal and vertical resolution of $ 1024 $ by $ 768 $ pixels with a viewing angle of $ 40^{\circ} $ and $ 30^{\circ} $ respectively. However, the author of the wiimote interface library that we used, claimed that his measurements indicated that the reported coordinates never exceeded the range $ [0,1015]\times[0,759] $. Since there are no official specifications of the hardware inside the Wii controller we assumed the resolution of the camera to be $1016 \time 760$.
+As mentioned in the introduction, we used the Nintendo Wii controller for both head tracking and 3D (mouse) input. In both applications we made use of the infra-red sensing ability of the controller. A Wiimote has a built-in infra-red camera that is used to track up to four infra-red sources at a time. An on-board image processing chip processes the images acquired by the camera and outputs the coordinates of the infra-red sources in camera coordinates, which are reported to the client (PC) via Bluetooth at a frequency of 100Hz. Several sources report that the infra-red camera has a horizontal and vertical resolution of $ 1024 $ by $ 768 $ pixels with a viewing angle of $ 40^{\circ} $ and $ 30^{\circ} $ respectively. However, the author of the wiimote interface library that we used claimed that his measurements indicated that the reported coordinates never exceeded the range $ [0,1015]\times[0,759] $. Since there are no official specifications of the hardware inside the Wii controller, we assumed the resolution of the camera to be $1016 \times 760$. \\
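+
+Under these assumptions the angle per camera pixel, a quantity used in the distance calculations below, is nearly identical horizontally and vertically:
+\[ \theta_{pix} \approx \frac{40^{\circ}}{1016} \approx \frac{30^{\circ}}{760} \approx 0.039^{\circ} \text{ per pixel} \]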
-In order to use the wiimote as a 3D mouse, two infra-red sources/beacons are required to be positioned a certain distance apart, above or below the display. The camera coordinates of these beacons are used to calculate world coordinates (in $ \mathbb{R}^3 $) of the mouse cursor. The beacon shipped with the Wii console is misleadingly called the 'sensor bar'. It is a plastic bar that houses two groups of five infra-red LED's positioned at the tips of the bar. The distance between the LED groups is approximately $ 20.5 $ cm. Other, after market sensor bars, that we used during testing and development of the software, contained two groups of three LED's each.
+In order to use the wiimote as a 3D mouse, two infra-red sources/beacons are required to be positioned a certain distance apart, above or below the display. The camera coordinates of these beacons are used to calculate world coordinates (in $ \mathbb{R}^3 $) of the mouse cursor. The beacon shipped with the Wii console is misleadingly called the 'sensor bar'. It is a plastic bar that houses two groups of five infra-red LEDs positioned at the tips of the bar. The distance between the LED groups is approximately $ 20.5 $ cm. Other, aftermarket, sensor bars that we used during testing and development of the software contained two groups of three LEDs each. \\
-To control the 3D mouse cursor, the user points the Wii remote at the display such that both LED groups are registered by the wiimote's camera. The xy position of the mouse cursor is set by pointing the wiimote at the desired location. The position of the cursor in the z direction can be controlled by moving the remote toward or away from the screen.
+To control the 3D mouse cursor, the user points the Wii remote at the display such that both LED groups are registered by the wiimote's camera. The xy position of the mouse cursor is set by pointing the wiimote in the direction of the desired location. The position of the cursor in the z-direction can be controlled by moving the remote toward or away from the screen. \\
% In order to use the wiimote as a 3D mouse input device, the user points the wii remote at the display such that the infra-red camera registers the beacons. The camera coordinates of the beacons determine the position of the mouse pointer in screen coordinates. The depth of the mouse pointer is changed by moving the remote toward or away from the display, the distance between the two beacons in camera coordinates is used to calculate the distance of the wiimote relative to the sensor bar.
\subsubsection{Mapping to 2D}
-First we will discuss the mapping of the wiimote infrared output to a 2D mouse cursor position, from there we extend the mapping to 3D. In the ideal situation the wiimote reports the 2D camera coordinates of the two LED groups of the sensor bar. From these coordinates $ g_l $ and $ g_r $, we calculate a cursor position $ c_{pix} $ in pixel coordinates. Because the screen resolution is likely to be different from the camera resolution, we actually calculate a cursor position $ c_{rel} \in [0,1]^2 $ relative to the camera resolution and use that result to get the actual pixel coordinates simply by multiplying $ c_{rel} $ with the screen resolution.
+First we will discuss the mapping of the wiimote's infrared output to a 2D mouse cursor position; from there we extend the mapping to 3D. In the ideal situation the wiimote reports the 2D camera coordinates $ g_l $ and $ g_r $ of the left and right LED group respectively. From these coordinates we calculate a cursor position $ c_{pix} $ in pixel coordinates. Because the screen resolution is likely to be different from the camera resolution, we actually calculate a cursor position $ c_{rel} \in [0,1]^2 $ relative to the camera resolution and use that result to get the actual pixel coordinates simply by multiplying $ c_{rel} $ with the screen resolution. \\
-Figure \ref{fig:2d_mapping} shows a diagram of a camera image of $ g_l $ and $ g_r $ and the corresponding screen with cursor position $ c_{pix} $. In our mapping we first calculate $ g_c $ as the average point in camera coordinates between $ g_l $ and $ g_r $. Then $ g_c $ is inverted and counter-rotated before it is mapped to $ c_{rel} $ and finally converted $ c_{pix} $.
+Figure \ref{fig:2d_mapping} shows a diagram of a camera image of $ g_l $ and $ g_r $ and the corresponding screen with cursor position $ c_{pix} $. In our mapping we first calculate $ g_c $ as the average point in camera coordinates between $ g_l $ and $ g_r $. Then $ g_c $ is inverted and counter-rotated before it is mapped to $ c_{rel} $ and finally converted to $ c_{pix} $. \\
\begin{figure}[h!]
\begin{center}
@@ -25,13 +25,13 @@ Figure \ref{fig:2d_mapping} shows a diagram of a camera image of $ g_l $ and $ g
%are mapped to 2D screen coordinates. We take the average $ g_c = \frac{g_l + g_r}{2} $ to be the point in camera coordinates where the user is actually pointing to. Before this point can be converted to screen/pixel coordinates it needs to undergo some corrections. First the coordinates are inverted, secondly $ g_c $ is compensated for roll of the wiimote. Because the resolution of the camera and the display are likely to be different, we first compute a relative cursor position $ c_{rel} $ and multiply that by the screen resolution to get the actual cursor position in pixel coordinates.
-Because we require only one reference point for the 2D mapping, we choose $ g_c $ to be the average of the two LED groups in camera coordinates, this corresponds with the approximate center of the sensor bar.
+Because we require only one reference point for the 2D mapping, we choose $ g_c $ to be the average of the two LED groups in camera coordinates; this corresponds to the approximate center of the sensor bar. \\
-The inversion of the coordinates of $ g_c $ required to compensate for the fact the camera is being moved instead of the sensor bar. When the camera is pointing upwards, the sensor bar is visible in the bottom of the image, when the camera is pointing downwards the sensor bar is visible in the top of image. The same holds for the left and right.
+The inversion of the coordinates of $ g_c $ is required to compensate for the fact that the camera is being moved instead of the sensor bar. When the camera is pointing upwards, the sensor bar is visible in the bottom of the image; when the camera is pointing downwards, the sensor bar is visible in the top of the image. The same holds for left and right. \\
-The next step is to counter-rotate $ g_c $ to compensate for the rotation of the wiimote over its z-axis. Note that the user moves the wiimote with respect to the screen's axis and expects the cursor to follow those movements. Because the camera coordinates of $ g_c $ are inverted and scaled to screen resolution, the cursor will only follow the wiimote's movements when the camera's axis are aligned with the screen axis. To align the camera axis with the screen axis we counter-rotate the coordinates of $ g_c $ around the center of the camera. The angle with which we rotate is the angle between the camera's x-axis and the screen's x-axis. Because we assume the sensor bar to be aligned with the screen's x-axis we can simply calculate the angle of $ \overline{g_l g_r} $ with the camera's x-axis. Using this method we can only detect an angle in the range $ [0^{\circ}, 180^{\circ}) $ because the wiimote will only report the coordinates of $ g_l $ and $ g_r $ and not whether they are the left or right LED group. As a solution one can use the accelerometer data to check whether the wiimote is held upside down and set the appropriate sign of the angle.
+The next step is to counter-rotate $ g_c $ to compensate for the rotation of the wiimote over its z-axis. Note that the user moves the wiimote with respect to the screen's axes and expects the cursor to follow those movements. Because the camera coordinates of $ g_c $ are inverted and scaled to screen resolution, the cursor will only follow the wiimote's movements when the camera's axes are aligned with the screen axes. To align them we counter-rotate the coordinates of $ g_c $ around the center of the camera. The angle with which we rotate is the angle between the camera's x-axis and the screen's x-axis. Because we assume the sensor bar to be aligned with the screen's x-axis, we can simply calculate the angle of $ \overline{g_l g_r} $ with the camera's x-axis. Using this method we can only detect an angle in the range $ [0^{\circ}, 180^{\circ}) $ because the wiimote only reports the coordinates of $ g_l $ and $ g_r $ and not whether they belong to the left or right LED group. As a solution one can use the accelerometer data to check whether the wiimote is held upside down and set the appropriate sign of the angle. \\
-The rest of the mapping quite trivial, let $ g'_c $ be the corrected camera coordinates, $ c_{rel} $ is computed by dividing $ g'_c $ by the camera resolution and $ c_{pix} $ is calculated by multiplying $ c_{rel} $ by the screen resolution.
+The rest of the mapping is quite trivial: let $ g'_c $ be the corrected camera coordinates; $ c_{rel} $ is computed by dividing $ g'_c $ by the camera resolution, and $ c_{pix} $ is calculated by multiplying $ c_{rel} $ by the screen resolution. \\
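+
+A minimal sketch of this 2D pipeline (average, invert, counter-rotate, scale); the screen resolution and sign conventions are illustrative assumptions, not our exact implementation:
+\begin{verbatim}
+import math
+
+def map_2d(g_l, g_r, cam_res=(1016, 760), screen=(1280, 1024)):
+    """Map the LED group coordinates to screen pixel coordinates."""
+    # g_c: average point between the LED groups
+    gcx = (g_l[0] + g_r[0]) / 2.0
+    gcy = (g_l[1] + g_r[1]) / 2.0
+    # inversion: the camera moves, not the sensor bar
+    gcx, gcy = cam_res[0] - gcx, cam_res[1] - gcy
+    # counter-rotate around the camera center by the roll angle
+    roll = math.atan2(g_r[1] - g_l[1], g_r[0] - g_l[0])
+    cx, cy = cam_res[0] / 2.0, cam_res[1] / 2.0
+    dx, dy = gcx - cx, gcy - cy
+    gcx = cx + dx * math.cos(-roll) - dy * math.sin(-roll)
+    gcy = cy + dx * math.sin(-roll) + dy * math.cos(-roll)
+    # c_rel in [0,1]^2, then scale to screen pixels
+    return (gcx / cam_res[0] * screen[0],
+            gcy / cam_res[1] * screen[1])
+\end{verbatim}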
%When the x and y-axis of the screen and camera are alligned, all horizontal and vertical movement of the wiimote with respect to the screen, will result in the expected cursor movements. However, rotating the wiimote will also rotate the when the wiimote is rotated at angle of $ 90^{\circ} $ over its z-axis,
%The roll correction compensates for the rotation of the wiimote over its z-axis, which causes the coordinates of $ g_c $ to be rotated. If the user is holding the wiimote on its side (rotated $90^{\circ}$) and points to the right, the cursor will move down, instead of the expected direction. As a solution, the angle of the line $ \overline{g_l g_r} $ with the x-axis is calculated and is used to counter rotate $ g_c $ over the center of the camera. This rather simplistic solution works when the sensor bar is assumed to be aligned with the screen's x-axis. In order to determine whether the Wii remote is rotated over $ 180^{\circ} $ one can use the accelerometer data, but note that this will only give a reliable orientation estimate when the wiimote is not accelerating.
@@ -44,7 +44,7 @@ The rest of the mapping quite trivial, let $ g'_c $ be the corrected camera coor
For the 3D mapping we convert $ g_l $ and $ g_r $ to a coordinate $ c_{rel} \in [0,1]^3 $, relative to an axis aligned box $ \mathcal{B} $ in world coordinates. The box restricts the movement of the 3D cursor and is defined by two corner points $ p_{min} $ and $ p_{max} $ with the minimum and maximum coordinates respectively. From $ c_{rel} $ we compute the world coordinates $ c_{world} $ of the cursor by:
\[ c_{world} = p_{min} + c_{rel} \cdot (p_{max}-p_{min})\]
-The relative x and y coordinates are computed in the same way as in the 2D case. The relative z coordinate is calculated from the measured distance $ d $ between the sensor bar and the wiimote as illustrated in Figure \ref{fig:z_mapping}.
+The relative x and y coordinates are computed in the same way as in the 2D case. The relative z coordinate is calculated from the measured distance $ d $ between the sensor bar and the wiimote as illustrated in Figure \ref{fig:z_mapping}. \\
\begin{figure}[!h]
\begin{center}
@@ -61,7 +61,7 @@ z_{rel} & = 1 && \text{iff} \; d \geq d_{max} \\
z_{rel} & = \frac{d - d_{min}}{d_{max} - d_{min}} && \text{otherwise}
\end{align*}
-The distances $ d_{min} $ and $ d_{max} $ depend on where the user is standing and have to be initialized according to the user's initial position. The distance between them determines the amount a user has to reach in order to touch the far end of $ \mathcal{B} $.
+The distances $ d_{min} $ and $ d_{max} $ depend on where the user is standing and have to be initialized according to the user's initial position. The distance between them determines how far a user has to reach in order to touch the far end of $ \mathcal{B} $. \\
The calculation of the relative z-coordinate requires us to calculate the distance between the Wii remote and the sensor bar. The distance is calculated by measuring the angle between the lines of sight of the left and right LED groups from the camera's point of view and calculating the length of a side of a right triangle. Let $ \Delta $ be the distance between the LED groups in the sensor bar. The distance $ d $ between the wiimote and the sensor bar can be calculated as shown in Figure \ref{fig:wiimote_dist_calc}. The angle $ \theta $ is computed as the distance between $ g_l $ and $ g_r $ multiplied by the angle per pixel:
\[ \theta = |g_l-g_r| \cdot \theta_{pix} \]
@@ -78,19 +78,33 @@ Assuming that the wiimote is positioned perpendicular to the sensor bar, the dis
\label{fig:wiimote_dist_calc}
\end{figure}
-Because the wiimote never is perpendicular to the sensor bar, the distance will be an estimate. If the angle between the wiimote and the sensor bar deviates from $ 90^{\circ} $, being an angle $ \alpha $ in the range $ (0, 180) $, then the measured distance $ |g_l-g_r| $ is a factor $ \mathsf{sin}(\alpha) $ from the actual distance that would be perceived from a perpendicular viewing angle where $ \alpha = 90^{\circ} $. The computed distance $ d $ will therefore be larger then the actual distance. However, the angle $ \alpha $ can be measured when using customized sensor bar with three LED groups with equally space in between. The angle can then be computed from the ratio between $ | g_l - g_c | $ and $ | g_r - g_c | $, where $ g_c $ are de camera coordinates of the center LED group. %a third beacon positioned at equal distances in between the other two. %Our 3D mouse implementation does not require such precise distance calculation, because the
+Because the wiimote is never exactly perpendicular to the sensor bar, the distance will be an estimate. If the angle between the wiimote and the sensor bar deviates from $ 90^{\circ} $, being an angle $ \alpha $ in the range $ (0, 180) $, then the measured distance $ |g_l-g_r| $ is a factor $ \mathsf{sin}(\alpha) $ smaller than the distance that would be perceived from a perpendicular viewing angle where $ \alpha = 90^{\circ} $. The computed distance $ d $ will therefore be larger than the actual distance. However, the angle $ \alpha $ can be measured when using a customized sensor bar with three equally spaced LED groups. The angle can then be computed from the ratio between $ | g_l - g_c | $ and $ | g_r - g_c | $, where $ g_c $ are the camera coordinates of the center LED group. \\%a third beacon positioned at equal distances in between the other two. %Our 3D mouse implementation does not require such precise distance calculation, because the \\
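+
+The distance estimate and z-mapping described above could be sketched as follows, taking $ \Delta = 0.205 $ m for a standard sensor bar as stated earlier (other values and names are illustrative):
+\begin{verbatim}
+import math
+
+THETA_PIX = math.radians(40.0) / 1016    # assumed angle per pixel
+
+def map_z(g_l, g_r, d_min, d_max, delta=0.205):
+    """Estimate the wiimote-to-sensor-bar distance d from the
+    pixel distance |g_l - g_r| and map it to z_rel in [0,1]."""
+    pix = math.hypot(g_r[0] - g_l[0], g_r[1] - g_l[1])
+    theta = pix * THETA_PIX                # viewing angle of the bar
+    d = (0.5 * delta) / math.tan(0.5 * theta)
+    # clamp to [d_min, d_max] and normalize
+    return min(1.0, max(0.0, (d - d_min) / (d_max - d_min)))
+
+def to_world(c_rel, p_min, p_max):
+    """c_world = p_min + c_rel * (p_max - p_min), per component."""
+    return tuple(a + r * (b - a)
+                 for a, r, b in zip(p_min, c_rel, p_max))
+\end{verbatim}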
-\subsubsection{Problems}
+\subsubsection{Mapping to head position}
+
+For the head tracking implementation we use a wiimote mounted on top of the display. This wiimote is used to track the position of the user's head with respect to the display. We use the position of the head to set a perspective projection such that it originates from the user's point of view. As a result the 3D scene can be observed from different angles and distances, and if done right the display seems to be a window to the 3D world behind it. For the wiimote to be able to track the position of the head, the user is required to have two infrared sources mounted on his or her head. \\
+
+The head position is first calculated in real-world measurements, relative to the center of the display, and then converted to world coordinates. For the best results, the wiimote has to be aligned with the axes of the display such that the wiimote's z-axis is perpendicular to the display. \\
+
+Figure \ref{fig:headpos_mapping} illustrates the mapping of the camera coordinates to a 3D position in real-world measurements. Let $ d $ be the distance from the wiimote to the user's head, measured as described in the previous section. The point $ p_{cam} $ is the point in camera coordinates that is mapped to 3D; this point could be the center of the user's head or the camera coordinates of the left or right eye in the case of stereo vision. \\
+
+\begin{figure}[!h]
+\begin{center}
+\includegraphics[scale=1]{img/headpos_mapping}
+\end{center}
+\caption{The diagram for the computation of the position of $ p $ in real world measurements.}
+\label{fig:headpos_mapping}
+\end{figure}
+
+The angles $ \theta_x $ and $ \theta_y $ are measured from the distance of $ p_{cam} $ to the center of the camera coordinates, multiplied by the angle per pixel, as seen in (b). With these angles we compute the distances $ d_{xy} $ and $ d_{yz} $; these distances represent the length of the line segment from the wiimote to $ p $ projected onto the xy- and yz-planes. With the projected distances and the angles we compute the x- and y-coordinate of $ p $ as $ d_{xy} \cdot \mathsf{sin}(\theta_x) $ and $ d_{yz} \cdot \mathsf{sin}(\theta_y) $ respectively. The z-coordinate is computed using Pythagoras' theorem for the sides of a right triangle. \\
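+
+A simplified sketch of this computation, approximating the projected distances $ d_{xy} $ and $ d_{yz} $ by $ d $ itself (reasonable for small angles; names and default values are illustrative, not our exact implementation):
+\begin{verbatim}
+import math
+
+THETA_PIX = math.radians(40.0) / 1016    # assumed angle per pixel
+
+def head_position(p_cam, d, cam_res=(1016, 760)):
+    """Map camera point p_cam and measured distance d to a head
+    position (x, y, z) in meters, relative to the camera."""
+    theta_x = (p_cam[0] - cam_res[0] / 2.0) * THETA_PIX
+    theta_y = (p_cam[1] - cam_res[1] / 2.0) * THETA_PIX
+    x = d * math.sin(theta_x)    # approximating d_xy by d
+    y = d * math.sin(theta_y)    # approximating d_yz by d
+    z = math.sqrt(max(0.0, d*d - x*x - y*y))
+    return x, y, z
+\end{verbatim}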
+
+At this point in the calculation we have computed the position of $ p $ in real-world measurements relative to the camera. In order to get the position relative to the center of the screen, we only need to subtract the offset in the y-direction between the center of the screen and the camera from the y-coordinate of $ p $. \\
+
+Now that we have the position of the head in real world measurements, we only have to convert $ p $ to world coordinates in order to be able to set a perspective projection that corresponds to the user's point of view. In our implementation we simply used the ratio between the actual screen height and a predefined screen height in world coordinates as a conversion factor.
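+
+In symbols, with $ h_{real} $ the actual (measured) screen height and $ h_{world} $ the predefined screen height in world coordinates, the converted head position is simply
+\[ p_{world} = \frac{h_{world}}{h_{real}} \cdot p \]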
-During the implementation of the control scheme described here, we ran into some problems, or actually one problem with multiple causes: the cursor was impossible to keep steady when holding the wii remote in hand. The cursor seemed to oscillate around a point, while frequently jumping to another position. Only when the controller lay perfectly still on a surface, the cursor on screen seemed to be steady.
-One cause for the cursor instability was the method with which we selected the coordinates for $ g_l $ and $ g_r $ from the points returned by the wiimote. As mentioned the wiimote can track up to four infra-red sources. If the wiimote is close enough to the sensor bar it will recognize the positions of the individual LED's withing the LED groups. When this happens more than two coordinates are returned by the wiimote. In our original implementation we selected the two points with the greatest distance between them. As a solution, we implemented an algorithm that sorts the points returned by the wiimote into two groups based on their proximities, one group per LED group. The algorithm returns the average coordinates per group as the coordinates for $ g_l $ and $ g_r $. This solved the large jumps in the cursor position, but the did not help the stability of the cursor very much.
-Another cause, which is more apparent, is the fact that it is very hard for humans to keep their arm perfectly still when holding a wiimote. As a solution we implemented an exponential smoothing algorithm to calculate a weighted average over $ g_l $ and $ g_r $ before they were processed. The smoothing stabilized the cursor in the x and y-direction while still being quite responsive. In the z-direction however the cursor remained unsteady, albeit to some lesser extend.
-The reason for the stability problem in the z-direction is the limited resolution of the mapping in the z-direction as opposed to the x and y directions. For the latter two, the resolution corresponds to the resolution of the camera. For the z-direction the resolution is limited by the chosen values of $ d_{min} $ and $ d_{max} $ and the distance between the LED groups in the sensor bar. Suppose we define $ d_{min} $ as 1 meter and $ d_{max} $ as 2 meters and use a standard sensor bar such that $ \Delta $ in the distance calculation is 0.0205 meters (see Figure \ref{fig:wiimote_dist_calc}). The distance in pixels between the $ g_l $ and $ g_r $ at distance $ d_{min} $ is:
-\[ \frac{2\mathsf{arctan}(\frac{\frac{1}{2}\Delta}{d_{max}})}{\theta_{pix}} = 29.832867646 \approx 30 \text{pixels} \]
-Compared to the distance in pixels at $ d_{max} $ which is approximately $ 15 $ pixels, gives a resolution in the z-direction of a mere $ 15 $ camera pixels. The resolution can be increased by using a wider sensor bar or by sitting closer to the sensor bar, reducing $ d_{min} $ and $ d_{max} $. Choosing a bounding box $ \mathcal{B} $ with a depth as small as possible can also make vibrations in the z-direction less visible. In our implementation we severely smoothed the z-coordinate of the cursor position to finally stabilize the cursor. The downside of the smoothing is that we are left with a noticeable lag in movements in the z-direction.