Research Journal
Week One : June 9 - June 13, 2003
The first week started off pretty hectic, but calmed down towards the end. On Monday, I had initially
planned to meet Margrit sometime in the afternoon and get an informal overview of the project and other introductions.
However, my plane was late and I had a lot of running around to do for BU housing. By the time I was ready to go
meet her, I figured I should call over to her office to make sure she was still there. However, I found out that I had just missed her.
Tuesday, I met Jessie, the other CRA student matched with Boston University. She managed to meet with Professor Betke on
Monday. Maybe that was a good thing, because she really helped me with setting up my computer accounts and ID card stuff. One
would think it would take 2 minutes and voila! you have your accounts. However, we had to go from building to building, creating
several different accounts, and waiting in between for activation times. If it weren't for Jessie having gone through the whole
routine by herself on Monday, I think I would have spent the entire day running around campus like a lost puppy.
I spent the rest of the week reading research papers and a thesis given to me by Professor Betke. This was a good thing. I have absolutely no research background nor do I possess any familiarity with computer vision/imaging/video. Two strikes
against me. Not good. So, these papers really helped me out by giving me a preliminary understanding as to what I'll be delving into as the weeks go on. Papers I've read:
- OpenSesame: System Support of Real-Time Activity Interpretation in a Distributed Video Sensor Environment
- The Camera Mouse: Visual Tracking of Body Features to Provide Computer Access for People with Severe Disabilities by Margrit Betke, James Gips, and Peter Fleming
- Counting Fingers in Real Time: A Webcam-Based Human-Computer Interface with Game Applications by Stephen Crampton and Margrit Betke
- Communication via Eye Blinks and Eyebrow Raises: Video-Based Human-Computer Interfaces by Kristen Grauman, Margrit Betke, Jonathan Lombardi, James Gips, and Gary Bradski
Towards the end of the week, I was introduced to a few PhD students (Stephen Crampton and John Magee). They gave me an overview of their current
projects. I was given code to become familiar with and experiment with. I also attended some informal meetings and a
Sensorium meeting, and went to lunch with Professor Betke, Jessie, and Diane (another undergraduate research student).
Week Two : June 16 - June 20, 2003
Things are starting to pick up -- Jessie and I had a meeting with Professor Betke on Tuesday, where we further
defined our research project and goals. So... check the research page for an updated version of the computer vision timeline/agenda
I'll be working with this summer. We were initially going to focus on timing and scheduling as it pertained to the Finger Counter
program, but we came to the conclusion that it would be more beneficial to the computer vision community if we applied such techniques
to modular vision functions / algorithms (for example, in OpenCV) and compiled statistical data on them, so that other developers could get an
idea of how much processing time and resources their programs require. Anyhow, visit the research link for the
specifics.
On Wednesday I experimented with some code that would detect skin color in static images. I utilized the concept of
thresholding, a technique that creates a filtered image by comparing each pixel against some threshold color value. It is a simple form of segmentation,
which groups pixels that satisfy a given criterion (here, flesh tones) into regions. I'm a little rusty on my C++ programming, so this gave me a good
chance to get the wheels turning again. I know it doesn't seem like much, but I'm happy that I've gotten past the preliminary workings of the
skin detection program -- the simple idea behind the program was to identify "skin" color and produce an output image in which every pixel
that falls within the defined parameters is turned black (though some rough patches, or noise, get caught up in the output images). But all in all,
it didn't throw up a bazillion error messages, which
tends to be the standard procedure when I get behind a computer with even an intent to program. The fact that
the program even compiled is beyond me. There is a God. And I think he likes me.
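To make the thresholding idea concrete, here is a minimal sketch in Python (not the C++ program from the lab). The RGB rule and its cutoff values are assumptions picked for illustration; the journal does not record which thresholds were actually used.

```python
# Illustrative skin test. The cutoffs below are hypothetical,
# not the values used in the actual program.
def is_skin(pixel):
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and r > g and r > b
            and (r - min(g, b)) > 15)


def threshold_skin(image):
    """Return a copy of the image with every 'skin' pixel turned black.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    black = (0, 0, 0)
    return [[black if is_skin(p) else p for p in row] for row in image]


# Tiny 2x2 test image: two flesh-toned pixels, one blue, one near-white.
sample = [
    [(200, 140, 120), (30, 60, 200)],
    [(210, 150, 130), (250, 250, 250)],
]
result = threshold_skin(sample)
for row in result:
    print(row)
```

A fixed per-pixel rule like this is also why noise shows up: any background pixel that happens to fall inside the color range gets blacked out too.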
The last two goals were accomplished towards the end of the week. The Logitech 4000 cam now works on the Linux machine.
As for the completion of the OpenCV installation, Nahur left the following instructions for me in email:
- go to the root directory
- go to ffmpeg directory
- ./configure --enable-shared --prefix=/home/home8/jburger/lib/
- make
- make install
- rebuild and test the motempl application
- go to ~jburger/OpenCV(version)/samples/c
- build_wv4l.sh
- pray :)
- ./motempl
Apparently, I didn't pray hard enough and the installation threw an error or two at me. Good thing Nahur came back to our
lab and now, two hours later, we're set to go.
Week Three : June 23 - June 27, 2003
On Tuesday, Professor Betke gave us an introductory class to computer vision and some common techniques utilized in binary image processing.
I got a more in-depth view of thresholding, connected components, filtering noise out of images, identifying regions of interest, shape/size analysis,
and template matching. I also learned about the math behind the sum of squared differences (SSD) measure. But don't ask me to
repeat it because I'm not entirely sure that I can. The mini-lecture ended with a little more direction in this project: to get an idea about
face recognition and create a (very) simple program that utilizes these concepts (thresholding, labeling, size analysis, etc.) for finding faces.
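For reference, the SSD measure mentioned above compares a template T with an image patch by summing (I(y+j, x+i) - T(j, i))^2 over all template pixels; the position with the smallest sum is the best match. The following is a small sketch of that idea on grayscale values stored as nested lists. The function names are made up for this example and are not from the lecture or from OpenCV.

```python
def ssd(patch, template):
    """Sum of squared differences between two equally sized 2-D grids."""
    return sum((p - t) ** 2
               for patch_row, templ_row in zip(patch, template)
               for p, t in zip(patch_row, templ_row))


def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the lowest SSD."""
    th, tw = len(template), len(template[0])
    best = None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ssd(patch, template)
            if best is None or score < best[0]:
                best = (score, y, x)
    return best


image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
]
template = [
    [9, 8],
    [7, 9],
]
print(best_match(image, template))  # (0, 1, 1): exact match at row 1, column 1
```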
On Wednesday we had our weekly Sensorium meeting and discussed the timing aspect of the project in more detail. Jessie and I got the
timing code, which measures the program of interest (or selected parts) in clock cycles / ticks. We were told to start this project with small functions
(skin detection), time the function, and then later build upon each part to create a simple face detection program (i.e. skin detection, bounding boxes,
connectedness, moments, orientation, rotation, etc.) Although this order of operations seems logical for accomplishing the timing aspect of the project, I
wonder if anyone has taken into account the fact that our code may not necessarily be optimized, which, of course, skews the results of a timing
report. Although Jessie and I get good grades and are computer majors, we've had no prior training in computer vision. This leads me to surmise that
our code won't be as efficient as, say, a grad student's or a professor's when trying to accomplish the same task.
For the rest of the week, Jessie did her own skin detection program, and it turned out to be much simpler than the one I worked on last week,
so I think we'll use what she has -- she's a better, more efficient programmer than I am. I was able to take that program and convert its output
into a binary image, changing whatever wasn't identified as skin pixels to white and leaving the skin pixels black. On
Thursday and Friday, I spent a lot of time researching connected components and how to label them. I get the theory down pretty well,
but the whole recursive vs. sequential method throws me off when actually trying to implement it in code. Maybe it will all come together
next week.
Week Four : June 30 - July 4
This week I've been doing a lot of research on how connected components work. I explored different methods of identifying
and labeling objects in a binary image. I get the theory behind it really well, but the implementation is harder for me. I've been working
on a recursive algorithm that checks each pixel in a binary image and searches out its neighbors to see which ones it's connected to.
Once it finds the whole "blob", the blob gets a label and is eventually colored its own color to identify it in the picture as a separate
entity. Did that make sense?
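The labeling idea described above can be sketched roughly as follows. This is not the lab's code: it is a small Python illustration that assumes 4-connectivity (up, down, left, right) and swaps the recursion for an explicit stack, which visits the same neighbors without running into recursion depth limits on large blobs.

```python
def label_components(binary):
    """Label 4-connected blobs of nonzero pixels.

    Returns (labels, count), where labels has the same shape as `binary`,
    0 marks background, and blobs are numbered 1..count in scan order.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] and labels[y][x] == 0:
                # Found an unlabeled foreground pixel: start a new blob.
                count += 1
                labels[y][x] = count
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return labels, count


binary = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_components(binary)
for row in labels:
    print(row)
print("blobs found:", count)
```

Coloring each blob is then just a lookup from label number to a color of your choice. Marking a pixel as labeled at the moment it is pushed (rather than when it is popped) keeps the same pixel from being queued twice.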
Anyhow, my program compiled without exploding but I get random colored chunks everywhere that do not look even close
to the binary picture that was read into it. Oh well, back to the drawing board. I talked to Margrit and she said that she'll give me
some code to go on, but I haven't gotten that yet. In other news, I got the timing code running in our very basic program. The only
problem is that its output (in clock ticks) still needs to be divided by the CPU clock speed of the particular computer it's running on to get actual time. I know that my PC is
running on dual AMDs, but it seems as if no one knows exactly what AMDs they are. Guess it's time to visit the BIOS.
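The conversion itself is a single division: elapsed seconds = ticks / clock frequency in Hz. A tiny sketch, with a made-up tick count and a hypothetical 1.5 GHz clock (the actual CPU speed was still unknown at this point):

```python
def ticks_to_seconds(ticks, clock_hz):
    """Convert a CPU tick count to seconds for a clock running at clock_hz."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return ticks / clock_hz


# Hypothetical numbers, for illustration only.
elapsed = ticks_to_seconds(3_000_000, 1.5e9)
print(f"{elapsed * 1000:.3f} ms")  # 2.000 ms
```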
Week Five : July 7 - July 11
I took an extended 4th of July weekend because I had out of town visitors, so I wasn't in the lab on Monday and
Tuesday ... but made up the time the following weekend.
This week was testing week. Stephen needed people to test his Finger Counter and we were the people to scout out the
testing subjects. We ran into quite a few problems finding people to try the program out on. Initially, we figured that it would
be quite easy if we set up the system (laptop, webcam, tripod, and light) at some coffee shop like Starbucks or Espresso Royale.
However, we learned that management considers our testing scheme "solicitation" and quickly shut us down. Jessie even got booted
out of several places. Instead, Diane (another undergrad research assistant) suggested the GSU link (student union). We really won't
be able to start testing until next week because apparently, we need the Computer Science department to call into the BU reservations
office in order to reserve a table for us. It seems as if nowhere will just let us set up shop on a whim.
In other news, Stephen gave us some of his connected component code, which I easily integrated into our already
expanding program for timing. So far the basic timing program tracks the CPU clock ticks of binary image processing, component labeling,
and coloring. Perhaps next week, I'll be able to implement the morph functions that Stephen included with the labeling code.
Week Six : July 14 - July 18
On Monday and Tuesday, Jess and I set up shop in the GSU. Our mission was to recruit testers for the Finger Counter
Program. We thought it would be really easy: a laptop, a tripod, a webcam ... after the initial set up, we had all sorts of cool
looking gadgets. Who wouldn't want to stop by our table and check out the nifty program, or at the very least, inquire what the hell
we were doing with all this equipment. The answer? No one. Some people stopped by and inquired about the project, but when told
that testing takes 15 minutes, we were quickly turned down. We even bought some candy to try and lure potential testing subjects, but
that only resulted in little old ladies coming by to steal the candy without partaking in our test. After two days, we ended up with
only an additional four testers, bringing our grand total up to 10. Stephen really wanted 20 - 30 people. Heh, not likely with the
way things were going.
Slow testing... so what? You'd think things couldn't get worse. Oh, but they do. Apparently Jess' Linux partition became
cranky and started acting up, so she ended up re-installing Linux. Over our test logs. Our data. Our coveted 10 people.
So now we have to start over. Yay. The joys of conducting usability studies. Anyhow, Thursday was spent trying to recover
Linux and fix whatever was broken. There are more technical terms for it, but let's just say Jess learned how to recompile her Linux
kernel and all the rest of that geeky stuff that makes people proud.
Friday was speed testing day. We asked every single person in the grad lab to test. It didn't matter if we knew them
or even talked to them before. We bothered people in the middle of their coding, pulled them away from web surfing, and barged in on
their reading. Yes, I know... rude. But when you have 3 days to make up testing in a process that literally crawls, you'll do just about
anything. So we got 10 people on Friday and in conjunction with the 5 questionnaires we saved from the previous study, we have 15. Jess
is going up to Vermont this weekend and will be able to get 5 more people there. We're also going to (hopefully) continue this
accosting process well into the Tuesday deadline.
Stephen asked me to write up a memo about testing so far. So click there if you want
the details. Oh yeah, and here's my second memo.
Week Seven : July 21 - 25
Thank God for people with connections. Tom Castelli, a grad student who's really helped us out big time, hooked us
up with one of his friends who is working as a counselor for a high school math and science camp visiting BU. I set up a
meeting with his friend and his students for Wednesday afternoon. We'll get (at minimum) 5 more testers.
This is really good for us because high school kids will help supplement the strictly grad student test data that we've compiled thus
far.
On Monday, I took the test data that we've accumulated so far and did some statistical analyses with it. I wrote up a
quick SAS program to give me the mean, median, standard deviation, histograms, scatterplots and other various statistical evaluations
that I don't quite remember from my DePaul CSC 323 class. I'm going to have to review my notes from two quarters ago to make any sense
of half the stuff that the SAS UNIVARIATE procedure spit back at me.
For the rest of the week, I spent a lot of time trying to discern what the raw data in the Finger Counter
log files really meant. One would think that log files would log whatever it is you're looking to record/measure/track, right? And that, if
anything, you could just seamlessly extract whatever specific tidbits you wanted out of the nifty little text files. Wrong. After much scrutiny,
raised eyebrows, and blurbs of 'WTF?', I finally realized that there was no rhyme or reason to the log files. Whatever code was being used to record the
testing information was wildly and dramatically off. I was informed by Stephen that if I just threw everything into Excel and made use of its functions,
everything would be cleared up. Wrong again. Jessie and I spent hours on end trying to make sense of the data and when we finally did, we realized
that there was no way that Excel would be able to use its pretty functions to fix everything. To make sense of the contents of the files
would require a person... a human being to sit there and sort through the raw data and re-do the computations and logically discern
what needs to be utilized and what doesn't. So I spent the rest of the week sifting through numerous text files, creating a spreadsheet,
tabulating statistics, analyzing them, and making some preliminary graphs.
Week Eight : July 28 - August 1
Now that the early statistical analyses have been done and all the data has been formatted into a spreadsheet, Jessie and
I spent some time trying to figure out what (if any) correlations there may be between certain categories. For example, does the number
of years of experience and/or hours of use of a computer have any effect on the drawing scores and/or the voice interactive test? As it
turns out, the correlation coefficient was near 0.27, a weak correlation, which means that although we hoped to draw some conclusion about the two -- there
really wasn't anything to go on.
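For context on why 0.27 is so little to go on: squaring the coefficient gives 0.27^2, or about 0.07, so a linear relationship would account for only about 7% of the variation in the scores. Below is a small sketch of how a Pearson correlation coefficient is computed. The numbers in the example are synthetic and are not the study data, and the function name is just for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Synthetic data only: a perfectly linear pair and a perfectly inverse pair.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # close to 1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # close to -1.0
```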
We met with Stephen and discussed what kind of analyses he was looking for and exactly what types of graphical displays would best
illustrate the information to an audience. I suggested that we utilize a boxplot to showcase the average overall and individual times
for the Finger Paint scores. I thought this would be best because a boxplot is an efficient way to show a lot of
information in one chart. It would also give an audience a clear depiction of where the outliers fall
relative to the rest of the data.
On Friday, Jessie, Stephen, Margrit, and I had a meeting to iron out specific details concerning the further implementation of the
testing process/protocol. We went over the spreadsheet that Jessie and I made, our rough draft analyses, and graphs. We decided to leave
the success rates of the Finger Counter out of the report and instead focus more on changing the way the Finger Counter records and
detects fingers -- the "thinking" process it goes through before the actual affirmative detection is made. We also thought it would be
useful to do some research on average hand circumference, rotation, finger length, and other measurements so that perhaps the program
could be tweaked to suit a more general audience. Initially, the program was based only on Stephen's hand.
Week Nine : August 4 - August 8
This week we revised our goals for the prototype with regard to additional tests that need to be performed.
In particular, we homed in on the "limitations" test.
For this test, we plan to place a webcam in settings similar to the
previous experiments, for example, in the graduate computer lab and
perhaps also in other locations. We will run the interface and have it
capture frames as well as report poses recognized. A hand will be
placed over the camera so that the system properly recognizes the hand
pose. Then the hand will be moved slowly until recognition
fails. Frames will be captured of the hand just before the point
of failure and just afterwards.
The hand will be moved in the following ways:
- Toward the camera
- Away from the camera
- Rotation in the direction of each Euler angle on axes going
through the center of the palm:
  - "Pitch": The hand is rotated so the fingertips are closer
  to the camera than the palm and vice versa.
  - "Roll": The hand is rotated so that the side of the hand
  including the base of the little finger is closer than the side
  including the base of the thumb and vice versa.
  - "Yaw": The hand is rotated in the plane parallel to the
  image plane, so that, from the camera's perspective, the fingertips
  move from side to side while the palm remains more or less fixed.
To quantify results for motion toward and away from the camera, the
distance from hand to camera will be measured with a tape
measure. To estimate the angle of pitch, the distance from the
camera to the palm and to the most prominent fingertip will be measured
and compared. To estimate roll, the distance from the camera to
the base of the little finger will be compared with the distance from
the camera to the base of the thumb. To estimate yaw, an image of
the hand upright will be compared with images of the hand yawed to
either side.
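One way the pitch and roll estimates could be turned into angles is sketched below. This is an assumption on my part, not part of the written protocol: it supposes the two distances are measured along the camera's viewing direction and that the straight-line span between the two measured points on the hand (palm center to fingertip for pitch, base of little finger to base of thumb for roll) is also known. The tilt is then asin(difference / span). The function and parameter names are hypothetical.

```python
import math


def tilt_angle_deg(dist_far, dist_near, span):
    """Estimate tilt in degrees from two camera distances and the hand span.

    dist_far, dist_near: distances from the camera to the two points on the
        hand, assumed to be measured along the camera's viewing direction.
    span: straight-line distance between those two points on the hand.
    All three values must be in the same unit.
    """
    diff = dist_far - dist_near
    if span <= 0 or abs(diff) > span:
        raise ValueError("span must be positive and at least |dist_far - dist_near|")
    return math.degrees(math.asin(diff / span))


# Hypothetical measurements in centimeters: palm at 30, fingertip at 25,
# palm-to-fingertip span of 10.
print(round(tilt_angle_deg(30.0, 25.0, 10.0), 1))  # 30.0
```

A positive result here means the second point (for pitch, the fingertip) is closer to the camera than the first.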
Week Ten : August 11 - August 15
My last week. This week we re-did some of the hand poses for the Euler angles -- we found that it would be
easier to use two cameras to take pictures of the roll, pitch, and yaw tests. We used the webcam for the Finger Counter and another
digital camera on a tripod, parallel to the hand above the webcam, in order to get a more evenly distributed axis for analyses. Jessie used
GIMP to calculate the angles based on the pictures we took and I put all the data into a spreadsheet. We decided that there weren't any
statistical calculations that really needed to be made since only three of us were the testing subjects. Rather, we thought it would be more
useful to make a graph showcasing the functional range in which each finger/hand pose would work.