YosrON Report v1.4
Touchscreen Add-On
Official Website: YosrON.com
July, 2012
A Graduation Project Report Submitted to the Faculty of Engineering at Cairo University
In Partial Fulfillment of the Requirements for the Degree of
Bachelor of Science in Electronics and Communications Engineering
Faculty of Engineering, Cairo University
Giza, Egypt
July 2012
Table of Contents
List of Figures
Acknowledgments
Abstract
Chapter 1: Introduction
    1.1 Why is it important?
    1.2 Other related projects
    1.3 YosrON is built on the 2nd version of EverScreen
        1.3.1 The hardware
        1.3.2 The software
        1.3.3 The advantages of YosrON
        1.3.4 The challenges we expected
        1.3.5 The skills we needed
        1.3.6 The plan
Chapter 2: YosrON structure
    2.1 System Description
    2.2 Scanlines
    2.3 Noise reduction
    2.4 Fast pointer detection
    2.5 Positioning the cameras
    2.6 Calibration phase
    2.7 Tracking algorithm
    2.8 Resolution accuracy
    2.9 Algorithm complexity
    2.10 Settings and system performance
Chapter 3: Notes on the code
    3.2 Notes on main.cpp
        3.2.1 Defining the two webcams we need
        3.2.2 Smoothing neighborhood in averaging
        3.2.3 The color model to be used
        3.2.4 Debugging
        3.2.5 Luminance
        3.2.6 Control the code while running
    3.3 Notes on constants.h
        3.3.1 Threshold of colors difference in each pixel
        3.3.2 Consecutive pixels threshold
        3.3.3 Calibration touch offset
        3.3.4 Consecutive detections to locate a corner
        3.3.5 Limit of attempts to locate a corner
        3.3.6 Calibration scanlines distances
        3.3.7 Picture format, resolution, fps and grab method
Chapter 4: Challenges
    4.1 The environment
        4.1.1 OpenCV on Windows
        4.1.2 C/C++ programming on Linux/Ubuntu
        4.1.3 Libraries that must be installed for the code
    4.2 The cameras
    4.3 The fisheye lenses
Chapter 5: Conclusions and Future Work
Chapter 6: Appendix
    6.1 Installing Ubuntu 11.10
    6.2 Installing the required libraries and packages
        Installing "build-essential" using the "Terminal"
        Installing libraries/packages using "Ubuntu Software Center"
    6.3 Check webcam supported formats and UVC compliance
        UVC compliance check
        Supported configurations and formats
        Troubleshooting webcams
List of Figures
Figure 1.1: Survey results on UniMasr.com website.
Figure 1.2: Survey results on YosrON page on facebook.
Figure 1.3: Touchscreen add-on by TouchMagic.
Figure 2.1: Visual representation of scanlines.
Figure 2.2: The buffer used for the analysis of the green row shows a clear peak.
Figure 2.3: The system correctly detects only the pointer coming from above.
Figure 2.4: The vertical contiguity constraint of a hand holding a pen.
Figure 2.5: Example of a simple but inefficient configuration.
Figure 2.6: Suggested configuration to optimize the use of the cameras.
Figure 2.7: Resolution accuracy of W1 in t.
Figure 2.8: A4Tech webcam, PK 720G model.
Figure 4.1: Full-frame fisheye image.
Figure 4.2: Remapped full-frame fisheye image into rectilinear perspective.
Figure 4.3: Circular fisheye image.
Figure 4.4: The image of circular fisheye after remapping (defisheye).
Figure 4.5: Fisheye for home doors.
Figure 6.1: Windows Disk Management Tool.
Figure 6.2: Shrink dialog box.
Figure 6.3: Windows partitions after successfully freeing space.
Figure 6.4: Don't allow any updates.
Figure 6.5: Install Ubuntu alongside your current OS.
Figure 6.6: Disabling automatic updates of Ubuntu.
Figure 6.7: Using Ubuntu Software Center to install required libraries/packages.
Figure 6.8: Checking supported configurations and formats using guvcview.
Acknowledgments
We would like to thank those who helped us make this dream come true. No matter how big or small the help they offered, we would like to mention them all as much as we can, in the order in which they helped us.
Thanks to Dr. Ibrahim Qamar for accepting us and our idea, and for his valuable time, understanding and kindness in discussing many problems with us and leading us to solutions. Thanks to Eng. Abdel-Mohsen for advising us on which programming language to use (Matlab is easy but slow; C++ has good toolboxes and is very fast for image processing). Thanks to Eng. Khaled Yeiha and Eng. Ahmad Ismail for giving us useful guidelines for the algorithm. Thanks to Eng. E. Rustico (from Italy) for supporting us with documents, code and instructions that helped us very much, as we built our project on his work, EverScreen. Thanks to Dr. Essam, the glasses maker, for helping us with the fisheye lenses. Thanks to Eng. Shaimaa Mahmoud and Eng. Dina Zeid for helping us with the OpenCV toolbox and with some translations (from Italian to English). Thanks to Muhammad Sherif and Sherif Medhat for helping us with programming on Ubuntu. Thanks to Eng. Muhammad Hosny for helping us debug some code and solve many problems we faced with the OS and the software. Thanks to Mr. Muhammad Reda for helping us find compatible webcams. Thanks to Eng. Sherbeeny Hasan, Muhammad's father, for helping us with the webcams and the fisheye lenses. Thanks to our families for supporting us in every way, all the time.
Abstract
The entire world is heading towards designing all operating systems and programs to work with touch technology. But most Egyptians, and others around the world, can't afford the cost of a touchscreen for their computers. That's why we came up with YosrON. YosrON is meant to be a touchscreen add-on that can be put on any computer screen, PC or laptop, to add the "touch" feature to the computer screen using a USB connection and software. It has been built on a complete and inexpensive system to track the movements of a physical pointer on a flat surface. Any opaque object can be used as a pointer (fingers, pens, etc.), and it is possible to discriminate whether the surface is being touched or just pointed at. The system relies on two entry-level webcams and uses a fast scanline-based algorithm. A calibration wizard helps the user during the initial setup of the two webcams. No markers, gloves or other hand-held devices are required. Since the system is independent of the nature of the pointing surface, it is possible to use a screen or a projected wall as a virtual touchscreen. The complexity of the algorithms used by the system grows less than linearly with resolution, making the software layer very lightweight and suitable also for low-powered devices like embedded controllers. We were planning to make a resizable plastic frame as housing for the webcams and the added wide-angle (fisheye) lenses, but we ran out of time and faced many problems, so we postponed it for future work, together with adding the multi-touch feature. For now, YosrON is just two webcams fixed far away from the touching surface, plus software used for calibration and moving the mouse.
Chapter 1:
Introduction
Among the existing graphical input devices, computer users especially love touchscreens. The reason is that they reflect, as no other device does, the way we naturally get in touch and interact with the reality around us: we point at and touch directly with our hands what we see around us, and touchscreens allow us to do the same with our fingers on computer interfaces. This preference is confirmed by a strong trend in the industry of high-end platforms (e.g. Microsoft Surface and TouchWall) and in the market of mobile devices: Apple, Samsung and Nokia, to cite only a few examples, finally chose a touch-sensitive display for their leading products, while interest in this technology is also growing in design studios, industrial environments and public information points like museums and ATMs. Unfortunately, touchscreen flexibility is low: finger tracking is impossible without physical contact; it is not possible to use sharp objects on them; and large touch-sensitive displays are expensive because of their manufacturing cost and damage-proneness. YosrON is made of low cost devices, without any kind of equipment that cannot be found in any computer shop for less than 300 EGP, which is a reasonable price for the Egyptian market and other similar markets. It is important to offer such an add-on at a low price because the upcoming Microsoft Windows 8, the next version of the most common OS in Egypt, is mainly designed for touchscreens. Of course it can be used without a touchscreen, but that would be a great loss for the user experience. We made a simple survey asking many computer users and resellers whether they would buy such an add-on and how much they would pay for it. The results are in fig. 1.1 and fig. 1.2.
We focus on finger tracking systems that do not require lasers, markers, gloves or hand-held devices [SP98, DUS01, Lee07]. The main application of finger tracking is to move a digital pointer over a screen, enabling the user to replace the pointing device (e.g. the mouse) with his hands. While for eye or head tracking we have to direct the camera(s) towards the user's body, finger tracking leaves us a wider range of choices. The first possibility is to direct the camera towards the user's body, as for head tracking, and to translate the absolute or relative position of the user's finger to screen coordinates. In [WSL00] an empty background is needed; in [IVV01] the whole arm position is reconstructed; and in [Jen99] a combination of depth and color analysis helps to robustly locate the finger. Some works tried to estimate the position of the fingertip relative to the view frustum of the user; this was done in [CT06] with one camera and in [pHYssCIb98] with stereovision, but both had strong limits in the accuracy of the estimation. The second possibility is to direct the camera towards the pointing surface, which may be static or dynamic. Some works require a simple black pad as pointing surface, making it easy to locate the user's finger with only one camera [LB04]; however, we may need additional hardware [Mos06] or stereovision [ML04] to distinguish whether the user is just hovering the finger over the surface or there is physical contact between the finger and the surface. A physical desktop is an interesting surface to track a pointer on. Some works are based on the DigitalDesk setup [Wel93], where an overhead projector and one or more cameras are directed downwards on a desk and virtual objects can interact with physical documents [Ber03, Wil05]; others use a similar approach to integrate physical and virtual drawings on vertical or horizontal whiteboards [Wil05, vHB01, ST05], and one integrates visual information with acoustic triangulation to achieve better accuracy [GOSC00]. These works use differencing algorithms to segment the user's hands from the background, and then shape analysis or finger template matching to locate the fingertips; they rely on the assumption that the background surface is white, or in general of a color different from skin. Other approaches work also on highly dynamic surfaces. It is possible to robustly suppress the background by analyzing the screen color space [Zha03] or by applying polarizing filters to the cameras [AA07]; in the first the mouse click has to
be simulated with a keystroke, while in the latter a sophisticated mathematical finger model allows detecting the physical contact with stereovision. Unfortunately, these two techniques cannot be applied to a projected wall. Directing the camera towards the pointing surface implies, in general, the use of computationally expensive algorithms, especially when we have to deal with dynamic surfaces. A third possible approach, which may drastically reduce the above problems, is to have the cameras watching sidewise, i.e. lying on the same plane as the surface; from this point of view we do not have any problem with dynamic backgrounds either behind the user or on the pointing surface, and this enables us to set up the system also in environments that are otherwise problematic (e.g. large displays, outdoors, and so on). Among the very few works using this approach, in [QMZ95] the webcam is on the top of the monitor looking towards the keyboard, and the finger is located with a color segmentation algorithm. The movement of the hand along the axis perpendicular to the screen is mapped to the vertical movement of the cursor, and a keyboard button press simulates the mouse click. However, the position of the webcam has to be calibrated and the vertical movement is mapped in an unnatural way. Also in [WC05] we find a camera on the top of a laptop display directed towards the keyboard, but the mouse pointer is moved according to the motion vectors detected in the grayscale video flow; a capacitive touch sensor enables and disables the tracking, while the mouse button has to be pressed with the other hand. In [Mor05], finally, the lateral approach is used to embed four smart cameras into a plastic frame that can be overlapped on a traditional display. The above approaches need to process the entire image as it is captured by the webcam. Thus, each of the above algorithms is at least quadratic with respect to resolution (or linear with respect to image area). Although it is possible to use smart region finding algorithms, these would not resolve the problem entirely. In [FR08] the authors proposed the 1st version of EverScreen, a different way to track user movements while keeping the complexity low: they drastically decreased the scanning area to a discrete number of pixel lines from two uncalibrated cameras. Their system requires a simple calibration phase that is easy to perform also for non-experienced users. The proposed technique only regards the tracking of a pointer; it is not about gesture recognition. The output of the system, at present, is directly translated into mouse movements, but it may instead be interpreted by gesture recognition software.
For production, we will need to write drivers for different OSs and to implement the software on a microprocessor.
Chapter 2:
YosrON structure
2.2 Scanlines
We focus the processing only on a small number of pixel lines from the whole image provided by each webcam; we call these lines scanlines. Each scanline is horizontal and ideally parallel to the pointing surface; we call the lowest scanline (the nearest to the pointing surface) the touching scanline, and every other one a pointing scanline. The calibration phase requires grabbing a frame before any pointer enters the tracking area; these reference frames (one per webcam) will be stored as reference backgrounds, and will be used to look for runs of consecutive pixels different from the reference background. We will see later how we detect such scanline interruptions (fig. 2.1). The detection of a finger only in pointing scanlines means that the surface is only being pointed at, while a detection in all the scanlines means that the user is currently touching the surface. To determine whether a mouse button press has to be simulated, we can just look at the touching scanline: we assume that the user is clicking if the touching scanline is occluded in at least one of the two views.
During the calibration phase the number of scanlines of interest may vary from a couple to tens; during the tracking, three or four scanlines will suffice for an excellent accuracy. A detailed description of the calibration will be given later.
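A minimal sketch of this scanline bookkeeping (names and buffer layout are illustrative, not the project's actual main.cpp):

#include <cstddef>
#include <cstdint>
#include <vector>

// Copy one horizontal scanline (row y) out of a W*H single-channel frame.
std::vector<uint8_t> grab_scanline(const uint8_t* frame, int W, int y) {
    return std::vector<uint8_t>(frame + y * W, frame + (y + 1) * W);
}

enum class Detection { None, Pointing, Touching };

// interrupted[i] is true when scanline i shows a run of pixels differing
// from the reference background; the last entry is the touching scanline
// (the one nearest to the surface).
Detection classify(const std::vector<bool>& interrupted) {
    if (interrupted.back()) return Detection::Touching;   // simulate a click
    for (std::size_t i = 0; i + 1 < interrupted.size(); ++i)
        if (interrupted[i]) return Detection::Pointing;   // hovering only
    return Detection::None;
}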
The first strategy is to store, as a reference background, not just the first frame but the average of the first b frames captured (in the current implementation, b = 4). The average root mean square deviation of a frame from the reference background, after this simple operation, decreases from ~1.52 to ~1.26 (about 17%). The second strategy is to apply a simple convolution to the scanlines we focus on. The kernel we use is $[\,1\ \ 1\ \ 1\,]$ with divisor 3. This is equivalent to saying that we replace each pixel with the average of a 1-pixel neighborhood on the same row; it is not worth increasing the neighborhood of interest, because by increasing it we decrease the tracking accuracy.
Finally, we keep track of the Root Mean Square Error (RMSE) with respect to the reference frames; if the RMSE gets higher than a threshold, this is probably due to a disturbing entity in the video or to a movement of the camera rather than to systematic noise. In this case, the system automatically stops tracking and informs the user that a new reference background is about to be grabbed.
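A minimal sketch of these three countermeasures, under the same assumptions (single-channel rows; all names are illustrative):

#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average the first b captured scanlines into a reference (b = 4 in the text).
std::vector<float> build_reference(const std::vector<std::vector<uint8_t>>& rows) {
    std::vector<float> ref(rows[0].size(), 0.f);
    for (const auto& r : rows)
        for (std::size_t x = 0; x < r.size(); ++x) ref[x] += r[x];
    for (auto& v : ref) v /= rows.size();
    return ref;
}

// Kernel [1 1 1] / 3: replace each pixel with the average of its
// 1-pixel neighborhood on the same row.
std::vector<float> smooth3(const std::vector<uint8_t>& row) {
    std::vector<float> out(row.size());
    for (std::size_t x = 0; x < row.size(); ++x) {
        int l = x > 0 ? row[x - 1] : row[x];
        int r = x + 1 < row.size() ? row[x + 1] : row[x];
        out[x] = (l + row[x] + r) / 3.f;
    }
    return out;
}

// RMSE watchdog: above the threshold, stop tracking and grab a new
// reference background.
bool needs_new_reference(const std::vector<float>& row,
                         const std::vector<float>& ref, float threshold) {
    double sq = 0;
    for (std::size_t x = 0; x < row.size(); ++x) {
        double d = row[x] - ref[x];
        sq += d * d;
    }
    return std::sqrt(sq / row.size()) > threshold;
}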
The first goal is to find a run of consecutive pixels significantly different from the reference; what we care about is the X coordinate of the center of such an interruption. We initialize to zero a buffer of the same size as one row, and then we start scanning the selected line (say $l$). For each pixel $p = (p_x, p_l)$, we compute the absolute difference $d_p$ from the corresponding reference value; then, for each pixel $q = (q_x, q_l)$ in a neighborhood of length $n$, we add to the buffer this $d_p$ multiplied by a factor $m$ inversely proportional to $|p_x - q_x|$. Finally, we read in the buffer a peak value corresponding to the X coordinate of the center of the interruption (fig. 2.2); if no interruption occurred in the row (i.e. pixels different from the reference were not close to each other), we will have only low peaks in the buffer. To distinguish between a high and a low peak we can use a fixed or a relative threshold; in our tests, a safe threshold was about 20 times greater than the neighborhood length.
Figure 2.2: The buffer used for the analysis of the green row shows a clear peak.
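A minimal sketch of this accumulate-and-peak search; the exact weighting factor m is not specified above, so a 1/(1 + distance) weight is assumed for illustration:

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Returns the x coordinate of the scanline interruption, or -1 if none.
// 'n' is the neighborhood length; the detection threshold is about 20 * n,
// as suggested in the text.
int find_interruption(const std::vector<uint8_t>& row,
                      const std::vector<float>& ref, int n) {
    std::vector<float> buf(row.size(), 0.f);
    for (int x = 0; x < (int)row.size(); ++x) {
        float dp = std::fabs(row[x] - ref[x]);
        // Spread dp over a neighborhood of length n, with a weight that
        // decreases with the distance |px - qx| (assumed 1/(1+d) here).
        int lo = std::max(0, x - n), hi = std::min((int)row.size() - 1, x + n);
        for (int q = lo; q <= hi; ++q)
            buf[q] += dp / (1 + std::abs(x - q));
    }
    int peak = 0;
    for (int x = 1; x < (int)buf.size(); ++x)
        if (buf[x] > buf[peak]) peak = x;
    return buf[peak] > 20.f * n ? peak : -1;
}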
Now we have a horizontal proximity check, but not a vertical one yet. Each webcam always sees the pointer breaking into the view frustum from the upper side. The pointer silhouette may be straight (like a stick) or curved (e.g. a finger); in both cases, the interruptions found on scanlines close to each other should not differ by more than a given threshold. This vertical proximity constraint gives a linear upper bound to the curvature of the pointer, and helps discard interruptions caused by noise or other objects entering the view frustum; in other words, the system detects only pointers coming from above, and keeps working correctly if other objects appear in the view frustum from a different direction (e.g. the black pen in fig. 2.3).
Figure 2.3: The system correctly detects only the pointer coming from above.
These two simple proximity checks make the recognition of the pointer an easier task. Fig. 2.4 shows the correct detection of the pointer (a hand holding a pen) over a challenging background. The lower end of the vertical sequence of interruptions is marked with a little red cross.
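A minimal sketch of the vertical contiguity check (the threshold name is illustrative):

#include <cstdlib>
#include <vector>

// Accept a column of interruptions (one x per scanline, top to bottom,
// -1 = no interruption) only if consecutive detections stay within
// 'max_dx' pixels of each other: the pointer must enter from above
// with a limited curvature.
bool vertically_contiguous(const std::vector<int>& xs, int max_dx) {
    for (std::size_t i = 1; i < xs.size(); ++i) {
        if (xs[i] < 0 || xs[i - 1] < 0) return false;          // broken column
        if (std::abs(xs[i] - xs[i - 1]) > max_dx) return false; // too curved
    }
    return true;
}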
The wider the view field of the webcams, the closer they can be placed to the surface, at the price of resolution accuracy on the opposite side of the surface. On the other hand, the narrower the view field of the webcams, the farther we have to put them to capture the entire surface.
Figure 2.5: Example of a simple but inefficient configuration.
In fig. 2.5, for example, the webcam along the Y axis of the surface has a wide view field, but this brings resolution loss on segment DC; on the other side, the webcam along the X axis of the surface has a narrow view field, but it has to be positioned far from the pointing surface to cover the whole area. If the surface is a 2×1.5 m projected wall and the webcam has a 45° view field, we have to put the camera ~5.2 meters away to catch the whole horizontal size. A really usable system should not bother the final user with webcam calibration, view angles and so on. A way to minimize the calibration effort is to position the webcams near two non-opposite corners of the pointing surface, far enough to catch it whole, and oriented so that the surface diagonals are approximately the bisectors of the respective view fields (fig. 2.6). With this configuration there is no need to put the webcams far away from the surface; this reduces the accuracy loss on the far sides.
Figure 2.6: Suggested configuration to optimize the use of the view frustums of the cameras.
In the rest of this project we will assume, for the sake of clarity, that the webcams are in the same locations and orientations as in fig. 2.6. However, the proposed tracking algorithm works with a variety of configurations without changes in the calibration phase: the cameras may be positioned anywhere around the surface, and we only need that they do not face each other.
Since P is determined up to a proportional factor there is no loss of generality in setting one of the elements of M to an arbitrary non-zero value. In the following we set the element l33 = 1. To obtain all the other elements of M, in principle the correspondence between four pairs of points must be given. The proposed application only needs to look at horizontal scanlines; for this reason there is no need to know the coefficients l21,l22,l23 of M and we only have to determine the values of l11,l12,l13,l31,l32. The number of unknown matrix elements has been decreased to five, so we only need the x coordinate of five points (instead of the x and y of four points).
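In other words, once $l_{33} = 1$ is fixed, the map from a surface point $(x, y)$ to its projected x coordinate takes the standard homography form (written out here for reference):

$$x_p = \frac{l_{11}x + l_{12}y + l_{13}}{l_{31}x + l_{32}y + 1}.$$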
During the calibration phase, we ask the user to touch the four vertices of the pointing surface and its center. This setup greatly simplifies the computation of the unknown coefficients. Indeed, the points A, B, C, D and the center E (see fig. 2.6) have screen coordinates respectively

$$A = (0, 0),\quad B = (W, 0),\quad C = (W, H),\quad D = (0, H),\quad E = (W/2, H/2)$$

when the display resolution is $W \times H$. If $Q$ is a point on the surface, let $Q_{xp}$ be the x coordinate of the corresponding projected point. The final linear system to solve is

$$l_{11}x_Q + l_{12}y_Q + l_{13} - Q_{xp}\,(l_{31}x_Q + l_{32}y_Q) = Q_{xp}, \qquad Q \in \{A, B, C, D, E\},$$
which makes it easy to obtain $l_{11}, l_{12}, l_{13}, l_{31}, l_{32}$ for each camera. During the tracking phase we face a somewhat inverse problem: we know the projected x coordinate in each view, and from these values (let them be $X_l$ and $X_r$) we would like to compute the x and y coordinates of the corresponding unprojected point (that is, the point the user is touching). Let $l_{ij}$ be the transformation values for the first camera, and $r_{ij}$ for the second one; the linear system we have to solve in this case is

$$\begin{cases} l_{11}x + l_{12}y + l_{13} = z_l X_l \\ l_{31}x + l_{32}y + 1 = z_l \\ r_{11}x + r_{12}y + r_{13} = z_r X_r \\ r_{31}x + r_{32}y + 1 = z_r \end{cases}$$
It is convenient to divide the first two equations by $z_l$ and the latter two by $z_r$; eliminating $z_l$ and $z_r$ and renaming the unknown variables, the system reduces to

$$\begin{cases} (l_{11} - X_l l_{31})\,x + (l_{12} - X_l l_{32})\,y = X_l - l_{13} \\ (r_{11} - X_r r_{31})\,x + (r_{12} - X_r r_{32})\,y = X_r - r_{13} \end{cases}$$
This is a determined linear system, and it is possible to prove that in the setting above there is always one and only one solution. By solving this system in x and y we find the absolute coordinates of the point that the user is pointing at/touching on the surface. We can solve this system very quickly by computing an LU factorization of the coefficient matrix once and using it to compute x and y for each pair of frames; we could also use numerical methods, such as Singular Value Decomposition, or direct formulas. In the previous version of the system direct formulas were used, while now an LU factorization is implemented.
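A minimal sketch of this unprojection step for the reduced two-equation system above (names are ours; the actual code uses an LU factorization, while plain Cramer's rule is shown here for brevity):

#include <cmath>
#include <stdexcept>

struct Calib { double m11, m12, m13, m31, m32; };  // one per camera (l.. or r..)

// Solve (m11 - X*m31)x + (m12 - X*m32)y = X - m13, one row per camera.
void unproject(const Calib& l, double Xl, const Calib& r, double Xr,
               double& x, double& y) {
    double a11 = l.m11 - Xl * l.m31, a12 = l.m12 - Xl * l.m32, b1 = Xl - l.m13;
    double a21 = r.m11 - Xr * r.m31, a22 = r.m12 - Xr * r.m32, b2 = Xr - r.m13;
    double det = a11 * a22 - a12 * a21;
    if (std::fabs(det) < 1e-12) throw std::runtime_error("degenerate setup");
    x = (b1 * a22 - b2 * a12) / det;   // absolute surface coordinates
    y = (a11 * b2 - a21 * b1) / det;
}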
Let $t = (x_t, y_t)$ be a point on the pointing surface, $X_D \times Y_D$ the display resolution (i.e. the resolution of the projector for a projected wall) and $X_{W_1} \times Y_{W_1}$ the resolution of a webcam $W_1$; let $\beta_{W_1}$ be the bisector of the view frustum of $W_1$, and let the upper left corner of the surface be the origin of our coordinate system (with Y pointing downwards, like in fig. 2.7). We assume for simplicity that the view frustum of the camera is centered on the bisector of the coordinate system, but the following considerations keep their validity also in slightly different configurations. The higher the number of pixels detected by the webcam for each real pixel of the display, the more accurate the tracking will be. Thus, if we want to know how accurate the detection of a point on the pointing surface is, we can consider the ratio between the number of pixels detected by the webcam $W_1$ and the length in pixels of the segment $X_t$ passing by $t$ and perpendicular to $\beta_{W_1}$. We define the resolution accuracy of $W_1$ in $t$ as

$$\rho(W_1, t) = \frac{X_{W_1}}{|X_t|},$$

where $X_{W_1}$ is the horizontal resolution of $W_1$, which is constant in the whole view frustum of the camera (fig. 2.7).
Figure 2.7: We define the resolution accuracy of W1 in t as the ratio between the number of pixels detected by W1 and the length of Xt.
Because pixels are approximately square, the number of pixels along the diagonal of a square is equal to the number of pixels along an edge of the square; thus, the length of $X_t$ will be equal to the distance from the origin of one of the two points that $X_t$ intercepts on the X and Y axes. Every point $p$ on $X_t$ satisfies $x_p + y_p = k$; then its length will be equal to the y-intercept of the line passing by $t$ and perpendicular to $\beta_{W_1}$. So we have $|X_t| = x_t + y_t$; hence, the resolution accuracy of $W_1$ in $t$ is

$$\rho(W_1, t) = \frac{X_{W_1}}{x_t + y_t}.$$
One of the most interesting applications of the system is to projected walls, so that they become virtual blackboards. A very common projector resolution is nowadays $1024 \times 768$ pixels, while one of the maximum resolutions that recent low-cost webcams support is $1280 \times 1024$ pixels at 15 frames per second. In this configuration, the resolution accuracy in $t = (1024, 768)$ is

$$\rho(W_1, t) = \frac{1280}{1024 + 768} \approx 0.7.$$
This is the lowest resolution accuracy we have with $W_1$ in the worst orientation; if we invert the X axis to get the accuracy for $W_2$ (supposing that $W_2$ is placed on the upper right corner of the surface), $\rho(W_2, t) \approx 1.7$. In the central point $u = (512, 384)$ of the display we have $\rho(W_1, u) = \rho(W_2, u) \approx 1.4$; it is immediate that, in the above configuration, the average resolution accuracy is higher than 1:1 (sub-pixel).
Their 2012 price was about 110 EGP each. There is a mature Video4Linux2-compliant driver (uvcvideo) available for GNU/Linux. Our prototype has good resolution accuracy and excellent time performance: less than 10 milliseconds are needed to process a new frame and compute the pointer coordinates. Two USB webcams connected to the same computer can usually send less than 20 frames per second simultaneously, while the software layer could process hundreds more. The tracking system is written in C++ in a GNU/Linux environment; in the relatively small source code, all software layers are strictly separated, so that it is possible to port the whole system to different platforms with very little change in the source.
Chapter 3:
Notes on the code
The code consists of separate files. Most of them are standard header files or contain many standard functions. Most of our coding effort went into the files constants.h, main.cpp and the makefile.
[Flowchart: program flow of main.cpp. Grab 4 frames per webcam, then average them to set a reference image for each webcam. For each corner, compare the live frames of each webcam with its reference image (calibration), exiting on failure. Then enter the tracking loop: if a pointer is detected inside the tracking area, move the mouse; otherwise keep tracking.]
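In code, the flow can be outlined as follows (stubs stand in for the real routines; the names are ours, not the project's):

// Illustrative outline of the program flow, not the actual main.cpp.
struct Point { int x, y; };

static void open_webcams() {}                   // /dev/video0 and /dev/video1
static void build_reference_backgrounds() {}    // average of 4 frames/webcam
static void calibrate() {}                      // 4 corners + center
static bool exit_requested() { return true; }   // stub: run zero iterations
static void grab_frames() {}
static bool locate_pointer(Point*) { return false; }
static bool inside_tracking_area(const Point&) { return false; }
static void move_mouse(const Point&) {}

int main() {
    open_webcams();
    build_reference_backgrounds();
    calibrate();
    while (!exit_requested()) {                 // tracking loop
        grab_frames();
        Point p;
        if (locate_pointer(&p) && inside_tracking_area(p))
            move_mouse(p);   // click simulated when the touching scanline
                             // is occluded in at least one view
    }
    return 0;
}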
If the host computer doesn't have any other webcams (i.e. no built-in webcam), then these lines should look like this:
const char *videodevice1 = "/dev/video0"; const char *videodevice2 = "/dev/video1";
In general, we used an application called Cheese to test the webcams and determine their IDs. After installing Cheese using the Ubuntu Software Center, go to Edit -> Preferences and you will see a list of all connected webcams and their IDs.
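Alternatively, the same information can be queried programmatically through the V4L2 VIDIOC_QUERYCAP ioctl; a minimal sketch (assuming at most 8 video devices):

#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>

// Print the name of each /dev/videoN device present on the system.
int main() {
    for (int i = 0; i < 8; ++i) {
        char dev[32];
        std::snprintf(dev, sizeof dev, "/dev/video%d", i);
        int fd = open(dev, O_RDWR);
        if (fd < 0) continue;                       // no such device
        v4l2_capability cap;
        std::memset(&cap, 0, sizeof cap);
        if (ioctl(fd, VIDIOC_QUERYCAP, &cap) == 0)
            std::printf("%s: %s (driver %s)\n", dev,
                        (const char*)cap.card, (const char*)cap.driver);
        close(fd);
    }
    return 0;
}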
3.2.4 Debugging
There are two debugging modes. debug_one is for debugging one webcam only (the first one): we see a live stream from the first webcam with a single horizontal line across the image marking the scanline, and a histogram below the live stream showing interruptions, as in fig. 2.2. The other mode is an overall debugging mode. They are activated using the following lines:
debug = false; debug_one = false;
If debug_one is activated (by setting it to true), it will prevent the rest of the code from running.
3.2.5 Luminance
The value of the following variable should be set depending on the luminance of the surroundings.
norm_luminance = false;
The grab method used is mmap.
const int format = V4L2_PIX_FMT_YUYV; // Better quality, lower framerate //const int format = V4L2_PIX_FMT_MJPEG; // Lower quality, higher frame rate
Note that selecting an unsupported option leads to error 22 (EINVAL, invalid argument). Selecting a higher resolution without lowering the fps or using the MJPEG format leads to error 28 (ENOSPC), which is due to the USB 2.0 bandwidth limitation. More details about errors 22 and 28 can be found in section 4.2.
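A minimal sketch of where these errors surface, using the standard V4L2 calls (the device path is an example; in the project the same negotiation happens inside the grabbing code):

#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>

// Try to apply a format; an unsupported combination surfaces as EINVAL
// (error 22). ENOSPC (error 28) shows up later, when the second camera
// cannot reserve enough USB bandwidth to start streaming.
int main() {
    int fd = open("/dev/video0", O_RDWR);
    if (fd < 0) { std::perror("open"); return 1; }
    v4l2_format fmt;
    std::memset(&fmt, 0, sizeof fmt);
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    fmt.fmt.pix.width = 320;
    fmt.fmt.pix.height = 240;
    fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;
    if (ioctl(fd, VIDIOC_S_FMT, &fmt) < 0)
        std::printf("VIDIOC_S_FMT failed: errno %d (%s)\n",
                    errno, std::strerror(errno));
    else
        std::printf("driver accepted %ux%u\n",
                    fmt.fmt.pix.width, fmt.fmt.pix.height);
    close(fd);
    return 0;
}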
To remove older compilation files, type: make clean To make new compilation files, type: make
Chapter 4:
Challenges
installation all over again using another method. We had to remove the entire installation and install it again from a boot CD, alongside Windows 7. Details about this process are available in the appendix.
Libc dev: It provides headers from the Linux kernel. These headers are used by the installed headers for GNU glibc and other system libraries.
SDL dev (libsdl1.2-dev): Simple DirectMedia Layer is a cross-platform multimedia library designed to provide low-level access to audio, keyboard, mouse, joystick, 3D hardware via OpenGL, and 2D video framebuffer. It is used by MPEG playback software, emulators, and many popular games, including the award-winning Linux port of "Civilization: Call To Power." SDL supports Linux, Windows, Windows CE, BeOS, MacOS, Mac OS X, FreeBSD, NetBSD, OpenBSD, BSD/OS, Solaris, IRIX, and QNX. The code contains support for AmigaOS, Dreamcast, Atari, AIX, OSF/Tru64, RISC OS, SymbianOS, and OS/2, but these are not officially supported. SDL is written in C, but works with C++ natively, and has bindings to several other languages, including Ada, C#, D, Eiffel, Erlang, Euphoria, Go, Guile, Haskell, Java, Lisp, Lua, ML, Objective C, Pascal, Perl, PHP, Pike, Pliant, Python, Ruby, Smalltalk, and Tcl.
GSL dev (libgsl0-dev): The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.
Xorg XTest (libxtst-dev): The X window system (commonly X Window System or X11, based on its current major version being 11) is a computer software system and network protocol that provides a basis for graphical user interfaces (GUIs) and rich input device capability for networked computers. It creates a hardware abstraction layer where software is written to use a generalized set of commands, allowing for device independence and reuse of programs on any computer that implements X.
V4L2 dev: Video4Linux is a video capture programming interface for Linux. Many USB webcams, TV tuners, and other devices are supported. Video4Linux is closely integrated with the Linux kernel. V4L2 is the second version of V4L. The original V4L was introduced late into the 2.1.X development cycle of the Linux kernel. Video4Linux2 fixed some design bugs and started appearing in the 2.5.X kernels. Video4Linux2 drivers include a compatibility mode for Video4Linux1 applications, though in practice the support can be incomplete, and it is recommended to use V4L2 devices in V4L2 mode.
It is an API that provides unified access to various video capturing devices, such as TV tuners, USB web cameras, etc.
UVC drivers: The USB video device class (also USB video class or UVC) is a USB device class that describes devices capable of streaming video like webcams, digital camcorders, transcoders, analog video converters, television tuners, and still-image cameras.
The latest revision of the USB video class specification carries the version number 1.1 and was defined by the USB Implementers Forum in a set of documents describing both the basic protocol and the different payload formats. Webcams were among the first devices to support the UVC standard, and they are currently the most popular UVC devices. It can be expected that in the near future most webcams will be UVC compatible, as this is a logo requirement for Windows. Since Linux 2.6.26, the driver has been included in the kernel source distribution.
luvcview: luvcview is a camera viewer for UVC-based webcams. It includes an MJPEG decoder and is able to save the video stream as an AVI file.
guvcview: It provides a simple GTK interface for capturing and viewing video from devices supported by the Linux UVC driver, although it should also work with any v4l2-compatible device. The project is based on luvcview for video rendering, but all controls are built using a GTK2 interface. It can also be used as a control window only.
So, the total required bandwidth for two webcams = 2 x 294.912 = 589.824 Mbps, which is higher than the 480 Mbps total bandwidth supported by USB 2.0. Overcoming this problem was supposed to be easy by setting the configuration of the webcams to fewer frames (15 fps) or a lower resolution (320x240), but that didn't work. After spending more than a week investigating this problem and trying all the suggested solutions, we suspected that the 2B webcams support only one bandwidth setting regardless of the configuration, which means that each webcam reserves a fixed USB bandwidth much larger than it really needs, no matter what the configuration is. Error 16 (EBUSY) is closely related to error 22, as it means that the device is hung and can't be accessed. When a webcam starts streaming, it reserves the bandwidth. When the other webcam starts to work on the same bus, it requests the needed bandwidth, which is not available because of the first webcam. So, both webcams hang and stop responding while the system keeps their ports (i.e. /dev/video1) reserved, forcing us to unplug and plug them again. Our final solution for these errors was to buy another two webcams that support either the MJPEG format or a variable bandwidth depending on the configuration. We didn't find webcams in the Egyptian market that support the MJPEG format, but we found A4Tech webcams that support variable bandwidth depending on the configuration. The A4Tech webcams don't support MJPEG and support only 30 frames per second, so we had to work with a configuration of 320 x 240 resolution, which is acceptable for our needs.
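As a sanity check on the quoted figure, 294.912 Mbps per webcam is consistent with an uncompressed 640×480 stream at 30 fps and 32 bits per pixel (our reading; the per-pixel depth is not stated on this page):

$$640 \times 480 \times 32 \times 30 = 294{,}912{,}000 \text{ bit/s} = 294.912 \text{ Mbps}, \qquad 2 \times 294.912 = 589.824 \text{ Mbps} > 480 \text{ Mbps}.$$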
Then, to remap it into a rectilinear perspective (defisheye), use any of the available scripts, such as Panorama Tools, as in fig. 4.2.
We searched in many places and asked many photographers and glass makers to help us find a single lens that serves as a full-frame fisheye with a very small size for our webcams, but all the attempts failed.
We also couldn't find a circular fisheye lens that would produce an image as in fig. 4.3.
Our final hope was to use the only available fisheye lens small enough for YosrON: the fisheye for home doors, as in fig. 4.5. We removed its metallic housing, as we don't need it and we needed to make the lens smaller to fit in the plastic frame.
Figure 4.5: Fisheye for home doors.
After removing the housing of the webcams and fixing the fisheye lenses on them, we faced a problem that we couldn't overcome due to the lack of time and available support in Egypt: the fisheye lens produced some internal reflections in the image (i.e. the lighting would be repeated in other parts of the image), increasing the noise to unacceptable levels. Another problem was the difficulty of finding two exactly identical fisheye lenses. We thought it would be simple if we bought them both from the same brand and the same shop but, believe it or not, they weren't identical! Although the "identical" problem can be overcome in software, the killer problem was the internal reflections, which made us postpone the fisheye addition and the plastic frame to future work.
Chapter 5:
5.1 Conclusions
We presented a low cost system for bare finger tracking able to turn LCD displays into touchscreens, as well as a desk into a design board, or a wall into an interactive whiteboard. Many application domains can benefit from the proposed solution: designers, teachers, gamers, interface developers. The proposed system requires a simple calibration phase.
References
[Figure 1.1] Survey from the UniMasr.com website: http://unimasr.com/community/viewtopic.php?t=87470.
[Figure 1.2] Survey from the YosrON page on facebook (http://fb.com/yosronx): http://fb.com/questions/242871132427684/.
[Figure 1.3] Image and price details from http://www.magictouch.com and local resellers: http://www.magictouch.com/middleeast.html.
[Figure 2.8] A4Tech webcam, PK 720G model: http://a4tech.com/product.asp?cid=77&scid=167&id=693.
E. Rustico. "Low cost finger tracking for a virtual blackboard": http://www.dmi.unict.it/~rustico/docs/Low%20cost%20finger%20tracking%20for%20a%20virtual%20blackboard.pdf.
[AA07] A. Agarwal, S. Izadi, M. Chandraker, and A. Blake. High precision multi-touch sensing on surfaces using overhead cameras. In Horizontal Interactive Human-Computer Systems, 2007 (TABLETOP '07), Second Annual IEEE International Workshop on, pages 197-200, 2007.
[Ber03] F. Bérard. The magic table: Computer vision based augmentation of a whiteboard for creative meetings. IEEE International Conference on Computer Vision, 2003.
[CT06] Kelvin Cheng and Masahiro Takatsuka. Estimating virtual touchscreen for fingertip interaction with large displays. In OZCHI '06: Proceedings of the 20th conference of the computer-human interaction special interest group (CHISIG) of Australia on Computer-human interaction: design: activities, artefacts and environments, pages 397-400, New York, NY, USA, 2006. ACM.
[DUS01] Klaus Dorfmüller-Ulhaas and Dieter Schmalstieg. Finger tracking for interaction in augmented environments. Augmented Reality, International Symposium on, 0:55, 2001.
[FR08] G.M. Farinella and E. Rustico. Low cost finger tracking on flat surfaces. In Eurographics Italian Chapter 2008, 2008.
[GMR02] D. Gorodnichy, S. Malik, and G. Roth. Nouse "use your nose as a mouse": a new technology for hands-free games and interfaces, 2002.
[GOSC00] Christophe Le Gal, Ali Erdem Ozcan, Karl Schwerdt, and James L. Crowley. A sound magicboard. In ICMI '00: Proceedings of the Third International Conference on Advances in Multimodal Interfaces, pages 65-71, London, UK, 2000. Springer-Verlag.
[IVV01] Giancarlo Iannizzotto, Massimo Villari, and Lorenzo Vita. Hand tracking for human-computer interaction with gray level visual glove: turning back to the simple way. In PUI '01: Proceedings of the 2001 workshop on Perceptive user interfaces, pages 1-7, New York, NY, USA, 2001. ACM.
[Jen99] Cullen Jennings. Robust finger tracking with multiple cameras. In Proc. of the International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, pages 152-160, 1999.
[LB04] Julien Letessier and François Bérard. Visual tracking of bare fingers for interactive surfaces. In UIST '04: Proceedings of the 17th annual ACM symposium on User interface software and technology, pages 119-122, New York, NY, USA, 2004. ACM.
[Lee07] Johnny Chung Lee. Head tracking for desktop VR displays using the Wii remote. http://www.cs.cmu.edu/~johnny/projects/wii, 2007.
[ML04] Shahzad Malik and Joe Laszlo. Visual touchpad: a two-handed gestural input device. In ICMI '04: Proceedings of the 6th international conference on Multimodal interfaces, pages 289-296, New York, NY, USA, 2004. ACM.
[MmM01] Lionel Moisan and Jean-Michel Morel. Edge detection by Helmholtz principle. Journal of Mathematical Imaging and Vision, 14:271-284, 2001.
[Mor05] Gerald D. Morrison. A camera-based input device for large interactive displays. IEEE Computer Graphics and Applications, 25(4):52-57, 2005.
[Mos06] Tomer Moscovich. Multi-finger cursor techniques. In GI '06: Proceedings of the 2006 conference on Graphics Interface, pages 1-7, 2006.
[pHYssCIb98] Yi-ping Hung, Yao-strong Yang, Yong-sheng Chen, and Ing-bor Hsieh. Freehand pointer by use of an active stereo vision system. In Proc. 14th Int. Conf. Pattern Recognition, pages 1244-1246, 1998.
[QMZ95] F. Quek, T. Mysliwiec, and M. Zhao. Fingermouse: A freehand computer pointing interface, 1995.
[SP98] Joshua Strickon and Joseph Paradiso. Tracking hands above large interactive surfaces with a low-cost scanning laser range finder. In Proceedings of CHI '98, pages 231-232. ACM Press, 1998.
[ST05] Le Song and Masahiro Takatsuka. Real-time 3D finger pointing for an augmented desk. In AUIC '05: Proceedings of the Sixth Australasian conference on User interface, pages 99-108, Darlinghurst, Australia, 2005. Australian Computer Society, Inc.
[vHB01] Christian von Hardenberg and François Bérard. Bare-hand human-computer interaction. In PUI '01: Proceedings of the 2001 workshop on Perceptive user interfaces, pages 1-8, New York, NY, USA, 2001. ACM.
[WC05] Andrew D. Wilson and Edward Cutrell. FlowMouse: A computer vision-based pointing and gesture input device. In Interact '05, 2005.
[Wel93] Pierre Wellner. Interacting with paper on the DigitalDesk. Communications of the ACM, 36:87-96, 1993.
[Wil05] Andrew D. Wilson. PlayAnywhere: a compact interactive tabletop projection-vision system. In Patrick Baudisch, Mary Czerwinski, and Dan R. Olsen, editors, UIST, pages 83-92. ACM, 2005.
[WSL00] Andrew Wu, Mubarak Shah, and N. Da Vitoria Lobo. A virtual 3D blackboard: 3D finger tracking using a single camera. In Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pages 536-543, 2000.
[Zha03] Zhengyou Zhang. Vision-based interaction with fingers and papers. In Proc. International Symposium on the CREST Digital Archiving Project, pages 83-106, 2003.
Details about guvcview package from: http://guvcview.sourceforge.net. Details about luvcview package from: http://packages.ubuntu.com/hardy/luvcview. Details about V4L2 library from: http://en.wikipedia.org/wiki/Video4Linux. Details about SDL library from: http://www.libsdl.org. Details about GSL library from: http://www.gnu.org/software/gsl.
Details about Xorg XTest from: http://en.wikipedia.org/wiki/X_Window_System. Details about the build-essential package from: http://packages.ubuntu.com/lucid/build-essential. Details about UVC drivers from: http://en.wikipedia.org/wiki/USB_video_device_class. Details about the Libc dev package from: http://packages.debian.org/sid/linux-libc-dev. Details about fisheye lenses from: http://en.wikipedia.org/wiki/Fisheye_lens. Details about defisheye scripts from: http://www.fmwconcepts.com/imagemagick/defisheye/index.php. How to install Ubuntu 11.10 from a CD or USB flash memory: http://blog.sudobits.com/2011/09/11/how-to-install-ubuntu-11-10-from-usb-drive-or-cd/
How to free space on your hard disk and make it unallocated using the Windows Disk Management Tool: http://technet.microsoft.com/en-us/magazine/gg309169.aspx.
How to disable automatic updates in Ubuntu. From: http://www.garron.me/linux/turn-off-stop-ubuntu-automatic-update.html. How to install build-essential from: https://help.ubuntu.com/community/CompilingEasyHowTo. How to check UVC compliance of a webcam and troubleshoot it from: http://www.ideasonboard.org/uvc/faq.
Chapter 6:
Appendix
Preparing for installation: First of all, back up your important data. This step is very important, especially for beginners, as some mistakes could lead to reformatting the entire hard disk and losing data. So, before starting the installation procedure, you are strongly recommended to back up your data (using a backup disk or an online backup program). Although you aren't going to lose anything if you have multiple partitions on your drive and go for the custom installation procedure, you are still supposed to have a backup of all your critical data before starting any experiments.
Step 1: Download the Ubuntu 11.10 ISO file First, download the Ubuntu 11.10 ISO (http://releases.ubuntu.com/oneiric), then select the archive file (ISO) depending on your computer architecture, such as Intel x86 or AMD64. If you are not sure, go for the first one. When the download is completed, move on to the next step.
Step 2: Create a bootable media (USB/CD) You can create a bootable USB stick/drive or a CD/DVD from the ISO file you've just downloaded. If you want to create a bootable CD/DVD, then it's pretty easy: you just need to burn the ISO image to the CD. In Windows 7 you can burn ISO files directly in a few simple steps: insert a CD into the tray, right-click on the ISO file and select to burn this ISO, and finally you will get a bootable CD. If you want to install Ubuntu from a USB flash memory (pendrive), then use the free program called Universal USB Installer: download it from http://www.pendrivelinux.com/universal-usb-installer-easy-as-1-2-3 and run it, then locate the ISO file, choose your USB drive as a target, and you will be done in a minute.
Step 3: Free enough space Explore your partitions and make sure that one of them has at least 20 GB free. Then use the Windows 7 Disk Management tool, which provides a simple interface for managing partitions and volumes. Here's an easy way to shrink a volume: 1. Open the Disk Management console by pressing "Windows key + R" and typing diskmgmt.msc. 2. In Disk Management, right-click the volume that you want to shrink, and then click Shrink Volume.
3. In the field provided in the Shrink dialog box, enter the amount of space by which to shrink the disk.
Total Size Before Shrink In MB: lists the total capacity of the volume in MB. This is the formatted size of the volume.
Size Of Available Shrink Space In MB: lists the maximum amount by which you can shrink the volume. This doesn't represent the total amount of free space on the volume; rather, it represents the amount of space that can be removed, not including any data reserved for the master file table, volume snapshots, page files, and temporary files.
Enter The Amount Of Space To Shrink In MB: lists the total amount of space that will be removed from the volume. The initial value defaults to the maximum amount of space that can be removed from the volume. For optimal drive performance, you should ensure that the volume has at least 10 percent free space after the shrink operation.
Total Size After Shrink In MB: lists what the total capacity of the volume in MB will be after you shrink the volume. This is the new formatted size of the volume.
4. After clicking "Shrink", you should see the free space as a green partition.
That free unallocated space will be automatically used by the Ubuntu installer.
Step 4: Insert the USB disk (or CD) and restart Now restart your computer and enter the BIOS to make sure that your computer is configured to boot first from CD or USB drives. The steps of this configuration are not the same for every computer, so you have to do it yourself; you can search online according to your motherboard model, or ask whatever technical support is available to you. When your computer is booting, if you have set any BIOS passwords, enter the supervisor password, as your system may not boot from CD if you enter the user password. Your computer should boot automatically from the bootable media, and Ubuntu will be loaded into RAM. If the option appears, select "Try Ubuntu without installing" if you want to take a look before installing it on your hard drive. Then click on the "Install Ubuntu 11.10" icon on the desktop to begin, and select the language "English" to continue.
Step 5: Select Installation Type For YosrON, we should not allow any updates to the environment, especially kernel updates or an entire upgrade, as this might lead to incompatibilities between the software headers and the kernel headers. So, make sure to uncheck "Download Updates". You can check "Install third party software", but you must be connected to the Internet (if the wireless network doesn't seem to work, it's recommended to use a wired connection). There is no hurry though, as you can always install it later, so it's optional.
Then click on "Continue", and a new window will appear where you need to select the installation type.
You may get different options depending on your computer configuration. The above snapshot has been taken while installing Ubuntu 11.10 on a computer with Ubuntu 10.04 and Windows 7 pre-installed as dual boot.
Install Ubuntu alongside them: it will install Ubuntu 11.10 alongside the existing operating systems, such as Windows 7.
Erase Entire Disk and Install Ubuntu: it's going to erase your whole hard drive, and everything will be deleted (your files as well as other operating systems). This is useful only if your hard drive doesn't have any important files, or if you just bought a new computer and want to keep only one OS, i.e. Ubuntu.
Something Else: create, allocate and choose the partition to which you want to install Ubuntu, using the advanced partition manager. At first look it may seem a little difficult, but it's better, as it gives you more options/control.
However, we will go with the first option: select "Install Ubuntu alongside them" and continue.
Step 6: Finishing the installation The rest of the steps are easy for any user, and they are standard as documented online. But it's important to select the correct keyboard layout to avoid problems later; most keyboards in Egypt have the "Arabic 101" layout. Also, it's very important to set a password for Ubuntu and remember it well, as we will use it when installing the required libraries and packages for YosrON.
Step 7: Disabling automatic updates As we mentioned before, it's very important for YosrON to disable the automatic updates feature of Ubuntu. From the menu on the left of the screen, open the Ubuntu Software Center, then go to Edit -> Software Sources and make sure to set the option "Automatically check for updates:" to "Never".
Then click "close". This will disable automatic update on you Ubuntu box.
Step 2: Resolving Dependencies One nice thing about modern Linux distributions is that they take care of dependencies for the user. That is to say, if you want to install a program, the apt program will make sure it installs all needed libraries and other dependent programs, so installing a program is never more difficult than just specifying what you need, and it does the rest. Unfortunately, with some programs this is not the case, and you'll have to do it manually. It's this stage that trips up even some fairly experienced users, who often give up in frustration at not being able to figure out what they need.
You probably want to read about the possibilities and limitations of auto-apt (https://help.ubuntu.com/community/AutoApt) first, which will attempt to take care of dependency issues automatically. The following instructions are for fulfilling dependencies manually:
To prepare, install the package "apt-file", and then run sudo apt-file update. This will download a list of all the available packages and all of the files those packages contain, which as you might expect can be a very large list. It will not provide any feedback while it loads, so just wait. The "apt-file" program has some interesting functions; the two most useful are apt-file search, which searches for a particular file name, and apt-file list, which lists all the files in a given package. (Two explanations: http://debaday.debian.net/2007/01/24/apt-file-search-for-files-in-packages-installed-or-not/ and http://www.debianhelp.co.uk/findfile.htm.) To check the dependencies of your program, change into the directory you created in step two (cd /usr/local/src). Extracting the tarball or downloading from cvs/subversion will have made a sub-directory under "/usr/local/src" that contains the source code. This newly-created directory will contain a file called "configure", which is a script to make sure that the program can be compiled on your computer. To run it, run the command ./configure. This command will check whether you've got all the programs needed to install the program; in most cases you will not, and it will error out with a message about needing a program.
If you run ./configure without any options, you will use the default settings for the program. Most programs have a range of settings that you can enable or disable; if you are interested in this, check the README and INSTALL files found in the directory after decompressing the tar file. You can also check the developer documentation, and in many cases ./configure --help will list some of the key configurations you can do. A very common option is ./configure --prefix=/usr, which will install your application into "/usr" instead of "/usr/local" as these instructions do. If configure fails, the last line of output will be something like configure: error: Library requirements (gobbletygook) not met, blah blah blah stuff we don't care about. But right above that it will list a filename that it cannot find (often a filename ending in ".pc", for instance). What you need to do then is to run apt-file search missingfilename.pc, which will tell you which Ubuntu package the missing file is in. You can then simply install the package using sudo apt-get install requiredpackage. Then try running ./configure again, and see if it works. If you get to a bunch of text that finishes with "config.status: creating Makefile" followed by no obvious error messages, you're ready for the next steps.
Step 3: Build and install If you got this far, you've done the hardest part already. Now all you need to do is to make sure you are inside the program folder (for example, a folder on the desktop called YosrON): cd Desktop/YosrON Then run the command make
which does the actual building (compiling) of the program. (You can use make clean to remove older compilation files after any edits you make to the code, then use make again.) Make sure you have installed all the libraries/packages needed for YosrON before running this command; check the following sections. When it's done, install the program. You probably want to use sudo checkinstall, which puts the program in the package manager for clean, easy removal later. This replaces the old sudo make install command. See the complete documentation at CheckInstall (https://help.ubuntu.com/community/CheckInstall). Note: if checkinstall fails, you may need to run the command as sudo checkinstall --fstrans=0, which should allow the install to complete successfully. Then the final stage of the installation will run; it shouldn't take long. When finished, if you used checkinstall, the program will appear in the Synaptic Package Manager. If you used sudo make install, your application will be installed to "/usr/local/bin" and you should be able to run it from there without problems. Finally, it would be better to change the group of "/usr/local/src/" to admin and give it rwx privileges, since anyone adding and removing software should be in the admin group.
Check the box "Canonical Partners" as in fig. 6.7, then click "Close".
Then, type the name (code) of what you want to install in the search box found in the upper right of the window. For example, type "guvcview" and it will appear in the results; just click "Install". Some libraries/packages can't be installed from the "Ubuntu Software Center", which leads us to the "Terminal". For example, to install the SDL library, type: sudo apt-get install libsdl1.2-dev
2. Use the lsusb tool again to look for video class interfaces like this (in this example, the VID is 046d and the PID is 08cb): lsusb -d 046d:08cb -v | grep "14 Video" If the webcam is a UVC device, you should see a number of lines that look like this:
bFunctionClass    14 Video
bInterfaceClass   14 Video
bInterfaceClass   14 Video
bInterfaceClass   14 Video
In this case the Linux UVC driver should recognize your camera when you plug it in. If there are no such lines, your device is not a UVC device.
(http://www.ideasonboard.org/uvc/#devices), dump its USB descriptors: lsusb -d VID:PID -v > lsusb.log (replace VID and PID with your device's VID and PID)