
Duck's review

Paper number:  103  

Title:   Application and device effects on the usability of mobile phones and PDAs 

Submission type:   Research paper 

Relevance to magazine:  4 
/* Timely and Relevant */

Presentation and clarity:  4 
/* Easily Understandable */

Reviewer expertise:  2 
/* I'm somewhat familiar with this stuff */
   
Depth of contribution:  2 
/* Ho Hum */

Overall recommendation:  1  
/* Reject */

Detailed comments for the authors:


Summary: The paper presents two studies.  One study tests whether
there are significant usability differences between mobile phones and
PalmOS PDAs for mobile, client-server applications.  The second study
investigates issues in converting desktop client-server applications
to mobile client-server applications.

The first study deploys two sample applications (movie ticket buying
and stock broking) to two different devices (a mobile phone and a
PalmOS PDA).  Participants using the interface were asked to complete
a few tasks and were rated on time and "percentage correct."  A
questionnaire was deployed to gauge user satisfaction and TLX score
(workload analysis).  The study found no significant differences
except in workload between the two applications, which is to be
expected as the two applications require different cognitive loads.

The second study catalogues the usability issues in converting a
desktop application to a mobile application as discovered by workers
who would use the interface in real-world situations.  Issues to
consider were use of drag and drop, use of color and sound, feedback
and status messages, scrolling through lists and list representations,
and handedness issues.

The authors conclude that it is possible to design mobile applications
to be run on a variety of platforms without making new code, although
converting desktop applications to mobile applications requires
serious design reconsiderations and that tailoring applications to
specific devices can yield nice advantages.

Main criticism:

1. Both experiments are described as pilot studies.  Pilot studies are
   generally not suitable for publication unless the application area
   (or device area) is completely new.

In the introduction, the authors state,

"We investigated whether unmodified applications could be deployed on
both a mobile phone and PDA without any significant difference in
usability... the first study concentrated on using common interface
components for two different devices" and then proceed to discuss a
study involving two applications deployed on two devices.

Major criticisms of the first study:

1. While it is clear what the differences in *input* capabilities are
   between the two devices, it is not clear how different the *output*
   capabilities of PalmOS PDA and the mobile phone are, which should
   have a dramatic influence on the usability of a common application.
   My understanding is that the display capabilities of each are
   equivalent, although the PDA may have more screen real estate.
   Figures and more discussion should be included.

2. It is not clear whether the usability of the specific applications
   or the usability of the devices themselves is being measured.
   These two particular applications were deployed to different
   devices, but all of the discussion of usability seems to hover
   around the applications (picking items from lists) and not the
   devices.

3. There is discussion of 'error rate' and 'correctness'.  Error rate
   does not seem to be defined very well, and moreover is an inappropriate
   measure for monitoring tasks (as seen in the stock brokering
   application).  A 'correctness' percentage was assigned to each part
   of the movie ticket purchasing task, but I would argue that the
   issue of purchasing the tickets is binary: either you make the
   accurate selection or you don't.  If I buy 4 tickets instead of 2
   (or worse, 2 instead of 4), then my experience at the event
   location is likely to be unpleasant.

4. The authors state, "It is slightly puzzling that the ticketing
   application on the phone had the highest satisfaction rating yet
   had a lower accuracy score than on the PDA."  This could point to a
   number of conclusions other than unfamiliarity with the PDA (such
   as novelty, which is slightly different, or a bug in the interface,
   which is a major penalty against the validity of the experiment).

5.  In 2.4.3, you accept hypothesis H8.  Then in 2.4.4, you reject
    hypothesis H8.  You should be clearer about your understanding of
    H8.

Minor Criticisms of the first pilot

1. All subjects were computer science students, which is a very
   limited pool of participants.

2. You introduce "TLX task-load rating" but do not define or describe
   it until later; a note indicating more description in a later
   section would be helpful.

3.  It is unclear how much application training participants had; if
    none, one sentence would be helpful.

4.  Both applications were specifically designed for mobile devices,
    if not one particular device, which weakens the results because it
    is unclear if the application could be deployed to devices other
    than mobile devices.

Major criticisms of the second pilot

1.  The study of people is very informal.  I think it was a good idea
    to test the interface with the people who will actually use it,
    but it is unclear what the reader can take away from the study as
    general conclusions.  This case study would be much more effective
    if the interface was at least deployed for the people to test in
    an actual situation (for example at the end of the workday for a
    more thorough walkthrough) and was then critiqued by the workers
    (or even during the walkthrough).  As presented, it seems that
    each section is just an item in a "laundry list" of features of
    the specific application.

Minor criticisms of the second pilot

1.  Two alternatives for displaying information were discussed, with
    one showing progress but being slower and one not showing progress
    but being faster.  The authors go on to say that this is a
    consideration designers should make and that designing for novices
    and experts is a concern.  However, in this particular situation,
    it seems that workers would be trained with the interface, so that
    the fast interface would clearly be better (in other words, there
    are no novice users in this case study).  A more thorough
    evaluation of the interface in this situation might also shed some
    light onto this subject.

Final Criticism:

In the conclusion, the authors state, "...simply converting a desktop
application to run on a mobile device by interface markup techniques
will not work well."  I do not think that this conclusion can be
drawn.  In the first study, the applications were designed
specifically for mobile devices.  In the second, the application was
designed for a PDA, and the paper does not appear to give any
information about deploying the desktop application directly.
A more appropriate statement would be, "... in some, and perhaps many
instances, simply converting..."



/******************************************************
Information below this line will NOT be seen by authors
******************************************************/

Comments for EIC and AEICs only: 



Reviewer:  quack