Re: Automated GUI testing
- From: Joseph M. Newcomer <newcomer@xxxxxxxxxxxx>
- Date: Fri, 01 Jul 2005 13:47:35 -0400
See below...
On 1 Jul 2005 03:57:43 -0700, idesilva@xxxxxxxxxxxxxx wrote:
>Hi Joe,
>
>First of all, thanks a lot for sharing your experience.
>
>Well... it seems that we (I'm part of a team doing this) are up to a
>difficult task. But I have some questions regarding your comments. Hope
>you would consider them too.
***
Actually, "difficult" is not the characterization that I would use. "Impossible" is a good
adjective. Let's put it this way: you could not hire me, at any rate of pay you choose, to
try to build such a system. I do not think the problem is solvable. I can't even figure
out how I would write test drivers to test specific applications I have written, let alone
solve the general problem.
Note that a testing harness actually requires full Win32 API Scripting support as well
(read: VB), because I would need, in my testing code, to actually execute real system
calls.
***
>
>From the literature I've read so far, there seem to be two approaches
>to GUI testing: position-based and object-based. Position-based is
>where you define the (x,y) coordinates and object-based uses control IDs
>etc.
>
>You were mainly referring to the difficulty of using position-based
>testing, due to problems in reliably defining coordinates.
>
>But we have a question whether using control IDs to recognize output
>will actually verify what the human eye sees. Because Windows will
>return us the data contained in a component irrespective of whether it has been
>properly drawn on the screen or not. Right? That's the reason why we
>thought of defining screen coordinates in a script. (This, however, is a
>separate question)
****
"Recognize output" already is a problem. What do you mean by this? For example, in an
owner-draw listbox without LBS_HASSTRINGS, what you get "returned" is an address in the
context of the process, which does you no good at all. You can't really use the address
effectively, but even if you figure out how to use the debugger calls, you then have to
understand the class/struct to figure out how the data is represented, and this could
change from release to release. Think about the problems of invoking a virtual method to
obtain information.
Example: take a look at my logging listbox control, on my MVP Tips site, and explain
exactly how you would analyze its output. And this is a SIMPLE example compared to, say,
the Outlook 2003 Tools>Create Rules dialog that comes up.
So Windows CAN'T "tell you the data" because it is impossible to do this except for a few
trivial built-in controls. Explain exactly how you are going to pass an LVITEM or TVITEM
structure across a process boundary so it can be filled in with the information about the
list control or tree control state, for example. (Hint: think "DLL Injection". Now figure
out how, even with DLL injection, you can interpret the data obtained without knowing
details of the structure that is contained in the LPARAM field of the TVITEM. I can send
you a program that will tax any algorithm you propose)
The only way to figure out what is going on is to analyze the pixels, and this means
something like OCR. And pray that the application doesn't do an owner-draw listbox that
uses icons to convey information. I have one app that has three icons on the left, and
those three icons are selected from sets of icons, so the combination of three icon
positions represent about 40 or so states. How do you determine that I'm displaying the
correct state as encoded in the icons? Oh yes, one way is to use a pre-canned database
(the icon selections depend on the database contents, which are modified by incoming input
data), precanned input data, which means you will have to, for this particular
application, create a simulation script that sends network messages in a particular
syntax, using a particular IP address and socket, and then, before sending the next
message, make sure the response to the first message is correct.
Now deal with the fact that some messages may change the behavior based upon whether or
not they are received during or after certain actions with respect to the first message
(if during the processing, a new message for the same remote device could terminate
processing of the earlier message; but if already processed, it sets a different state).
And the simulator must be able to properly handle the response I send out, and send the
appropriate response back.
Test to make sure I respond properly to syntactically incorrect messages, by putting up
the correct information on the screen, and enabling/disabling appropriate menu items.
Make sure that, given a particular selection in a listbox, list control, combo box, etc.,
I properly enable/disable the appropriate controls, menu items, etc.
Make sure that copy, paste, cut, and delete options are properly selected, just as a
beginning example. Note that what I implement is that if multiple selections are made in a
listbox, a "copy" operation will place, in the clipboard, the proper text (which might not
be the actual screen contents).
Here's one from a liquid CO2 analyzer: the data in the structure was represented by a
temperature converter that returned temperature in 1/64 degree C. But the end user might
choose F, C or K as the display. If a row of data from the grid control was placed in the
clipboard, the temperature data was in 1/64C units, although the text displayed was a
conversion to integral degrees F, C or K. So I would do a copy, the user could change the
display mode from C to K, and then do a paste, so the numbers pasted were not the numbers
copied, although (within the limits of the conversion to an integer value, which was
unimportant to this application, because external constraints did not care about the small
errors) the values themselves were "identical". Write a test script that proves that
copy-change representation-paste produces the correct result.
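To make the copy-change-paste case concrete, here is a minimal sketch in Python. It is not the analyzer's code: the names `Reading`, `display`, `copy_row` and `paste_row` are hypothetical, and rounding to the nearest integer is an assumption (the post only says the display is "integral degrees"). It shows why a test that compares the text on screen before and after the paste reports a failure even though the stored value is unchanged.

```python
from dataclasses import dataclass

UNITS_PER_DEGREE_C = 64  # the converter reports temperature in 1/64 degree C


@dataclass(frozen=True)
class Reading:
    raw: int  # temperature in 1/64 degree C; this is what goes to the clipboard


def display(reading: Reading, mode: str) -> int:
    """Integral value shown on screen for display mode 'C', 'F' or 'K'.

    Rounding to nearest is an assumption made for this sketch.
    """
    celsius = reading.raw / UNITS_PER_DEGREE_C
    if mode == "C":
        value = celsius
    elif mode == "F":
        value = celsius * 9 / 5 + 32
    elif mode == "K":
        value = celsius + 273.15
    else:
        raise ValueError(f"unknown display mode: {mode!r}")
    return round(value)


def copy_row(reading: Reading) -> int:
    """Hypothetical clipboard format: the raw 1/64 C value, not the screen text."""
    return reading.raw


def paste_row(clipboard_value: int) -> Reading:
    """Rebuild a reading from the raw clipboard value."""
    return Reading(raw=clipboard_value)


original = Reading(raw=25 * UNITS_PER_DEGREE_C)    # 25 degrees C
shown_before = display(original, "C")              # user copies while in C mode
pasted = paste_row(copy_row(original))             # user switches to K, then pastes
shown_after = display(pasted, "K")

# Comparing screen text says "different"; comparing raw data says "identical".
screen_text_matches = shown_before == shown_after
raw_data_matches = pasted.raw == original.raw
```

A script that can only see the screen has to re-implement the unit conversion to decide whether the paste was correct, which means the test now duplicates the logic it is supposed to check.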
I've built scripting systems to test programs such as Windows control programs for
embedded systems; I build an embedded system simulator (often because the client does not
yet have working devices). These are hard to write. How do you propose to build a testing
system that would interface to a simulator component to provide the input data stream? (The
input stream is not just user interaction. It is often user interaction in the context of
live data streams.)
Since you can't define screen coordinates effectively, it is not clear how you would
maintain correspondence between your debugging script and each release.
Examples: two bitmap buttons. I decide to exchange their positions. How does your script
cope with this? Version 1.1 had two icon buttons. Version 2.0 has three icon buttons. How
do you determine their new arrangement? I move the listbox from the left to the right of
the dialog? What about situations in which I dynamically rearrange controls, so their size
and/or positions change as the window is resized? (Actually, my FIRST Windows app, back in
1990 or so, had a dialog with two columns, with the usual "Add", "Remove", "Add All" and
"Remove All" buttons. As you resized the dialog, the sizes of the two listboxes changed,
while the buttons kept a fixed size and always remained between the two listboxes. Hence,
static analysis not only fails because it is unmaintainable, it fails because dialogs,
form views, etc. dynamically resize).
One app I can't send you (but you can get by buying a $15,000 controller...) has edit
controls, combo boxes, check boxes, and radio buttons as child controls of CListBox. These
controls are created dynamically based upon reading a controller configuration file,
where it defines abstract properties of the controller, and I create controls on the fly.
Note that since these controls are child controls of a listbox, they scroll with the
listbox! In a case like this, you cannot do a simple two-level enumeration to obtain the
controls, you cannot predict the control IDs (which I assign dynamically), and the
positions are non-constant. Oh yes, there are somewhere between 20 and 100 list boxes,
each in a tabbed dialog. How do you handle tab controls with child dialogs? How do you
handle the case where the tabs are scrolled? How do you handle the case where the tabs are
stacked? How do you handle the case where the number of tabs depends upon runtime
information? Where the number of tabs changes each time the program executes, and you can
only determine the correct number of tabs by reading the same configuration file I read
and figuring out what I did (which, I might add, is nontrivial).
>
>We found some 3rd party components for input simulation (AutoIt) and
>screen output recognition (ScreenOCR). Given a control ID, AutoIt can
>send commands to it, and given coordinates, ScreenOCR reads it
>correctly.
****
Interesting. Will ScreenOCR handle rich-edit controls, owner-draw CListCtrl with multiple
fonts, and owner-draw controls? (How about one with rotated text? I've got one of these
right now.) And what about my controls that are now displaying Hebrew, Japanese, Chinese,
Arabic, Korean and a dozen other scripts I also cannot read myself? Will it check my
owner-draw pushbutton to make sure the arrow is pointing in the correct direction? What
does it do with the app I have that displays circle-slashed icons in the tabs for those
tabs that are illegal in the current context? Will it verify that I have enabled/disabled
the correct tabs? Will it check my owner-draw combo box that uses non-textual output, such
as line shapes with a radio button indicating the one selected?
What about constraint management? For example, how will you express rules of the form "If
the thus-and-such field is blank, disable OK", "If the checkbox so-and-so is checked, the
listbox should be enabled", "If the combo box selection is thus-and-such, the following
controls should be visible, and these other controls should be invisible". How do you
impose rules that state the contents of controls that are sensitive to the context the
user is running in? "If the code page of the end user is thus-and-such, use this API to
get the correct character to use for this purpose"? "For locales in Europe, make sure the
correct digit separator is used"? (And test this out for a German user who is running
Windows in Indiana, but wants to see familiar information representations). What about "If
this item is selected in the rich edit control A, then the following text in rich edit
control B should also be selected"? (That is a piece of code sitting on the screen next to
me right this moment). "The output should be underlined in groups computed by the contents
of the text" (the bug I'm working on right now...there's an error in my highlighting code,
which I just wrote yesterday). What about parsing pictures (for example, making sure that
the values selected result in the correct part of the picture being highlighted)? What
about making sure that listbox items in a draggable listbox are properly dragged and end
up in the right place?
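The simplest of these rules, the enable/disable and visibility ones, can at least be written down declaratively. The following is a minimal sketch in Python under stated assumptions: the state dictionary and every key in it (`name_text`, `ok_enabled`, `combo_selection`, and so on) are hypothetical stand-ins for whatever a harness would manage to read from the running dialog. It does nothing for the locale, rich-edit selection, picture or drag-and-drop cases.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]                    # snapshot of the dialog (hypothetical keys)
Rule = Tuple[str, Callable[[State], bool]]   # (description, predicate that must hold)

RULES: List[Rule] = [
    # "If the thus-and-such field is blank, disable OK"
    ("blank field disables OK",
     lambda s: s["name_text"] != "" or not s["ok_enabled"]),
    # "If the checkbox so-and-so is checked, the listbox should be enabled"
    ("checked box enables listbox",
     lambda s: not s["option_checked"] or bool(s["listbox_enabled"])),
    # "If the combo box selection is thus-and-such, these controls are visible
    #  and those are invisible"
    ("combo selection controls visibility",
     lambda s: s["combo_selection"] != "advanced"
     or (bool(s["advanced_visible"]) and not s["basic_visible"])),
]


def violated(state: State) -> List[str]:
    """Return the descriptions of all rules that do not hold for this snapshot."""
    return [name for name, holds in RULES if not holds(state)]


good = {"name_text": "", "ok_enabled": False,
        "option_checked": True, "listbox_enabled": True,
        "combo_selection": "advanced",
        "advanced_visible": True, "basic_visible": False}

# Same snapshot, but OK was left enabled although the field is blank.
bad = dict(good, ok_enabled=True)
```

Each rule is an implication ("if A then B") written as `not A or B`, so a rule whose condition does not apply is trivially satisfied. The hard part remains obtaining a trustworthy snapshot in the first place.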
>
>> The best you can deal with is control IDs; given the control ID, you
>> can locate the control, and determine, for the currently running
>> instance, what its coordinates are, and what its type is.
>
>This comment signals a possibility of using the above components,
>provided that we know the control ID of each GUI component we need to
>test. What if we query the coordinates at run-time rather than scripting
>them? (We also can remove certain variables such as
>internationalization, different windows versions etc. from the
>equation)
****
You HAVE to query the components the INSTANT BEFORE YOU TRY TO USE THEM. You cannot even
enumerate them at startup time, because between two tests, their position will change.
Between any two activations of a dialog, the layout could change (for example, I have an
application that has one dialog box that rearranges the controls based on the size of the
picture that is displayed).
Here's one scenario from an application I have right now: right click on part of the
screen. Get a menu item that says "Maximize". The window is maximized, and all the
contents are rearranged for the new geometry. Now send an activation request to a
particular control. If you don't do it via the control ID, AT THAT INSTANT, you would not
know where the control was.
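The difference between cached and just-in-time lookup can be shown with a toy model. This is a sketch in Python, not Win32 code: `FakeDialog`, its control ID 1001 and the rectangle values are all invented for illustration, and `rect_of` stands in for whatever "find the control by ID and ask for its rectangle" call the harness really uses.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom


class FakeDialog:
    """Toy stand-in for a window whose layout changes when it is maximized."""

    def __init__(self) -> None:
        self._layout: Dict[int, Rect] = {1001: (10, 10, 90, 34)}

    def maximize(self) -> None:
        # All contents are rearranged for the new geometry.
        self._layout = {1001: (700, 500, 780, 524)}

    def rect_of(self, control_id: int) -> Rect:
        """Current rectangle of a control, looked up by control ID."""
        return self._layout[control_id]

    def control_at(self, x: int, y: int) -> Optional[int]:
        """ID of the control under the point, or None if the point hits nothing."""
        for control_id, (left, top, right, bottom) in self._layout.items():
            if left <= x < right and top <= y < bottom:
                return control_id
        return None


def center(rect: Rect) -> Tuple[int, int]:
    left, top, right, bottom = rect
    return ((left + right) // 2, (top + bottom) // 2)


def click_by_id(dialog: FakeDialog, control_id: int) -> Optional[int]:
    """Resolve the position at the instant of use, then report what was hit."""
    x, y = center(dialog.rect_of(control_id))
    return dialog.control_at(x, y)


dialog = FakeDialog()
stale_point = center(dialog.rect_of(1001))      # coordinates captured "at startup"
dialog.maximize()                               # the user picks "Maximize"
hit_with_stale = dialog.control_at(*stale_point)
hit_with_fresh = click_by_id(dialog, 1001)
```

The stale point lands on nothing after the layout changes, while the lookup by ID still finds the control. In a real application there is an additional race: the layout can change between the lookup and the click.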
Oh yes, here's another case: a dialog-based application for which there is only one
control ID. I use TextOut to draw the text in that one-and-only control. In a variety of
languages and fonts. Some of the text is conditional based on program state. How do you
test that I'm displaying the correct information relative to the internal program state?
And the program state changes dynamically based on internal computations, such as
timeouts. Can you verify that my program is working correctly? Can you test it?
What about a user-defined control that does its own selection highlighting? Can you tell
where the text is to highlight? Suppose it is as powerful as rich edit, so the text is in
different sizes. How do you check that my code is working correctly? Did I really put the
error message in red? Did I really put the warning message in italic? Did I use boldface
in the proper places?
Verify the proper tooltip message pops up. What is the control ID of a tooltip message?
What if it is a window that I created that simulates a tooltip, but is more powerful?
How do you deal with situations where the number of controls depends upon external state,
and the control IDs, layout, etc. are all determined at runtime based upon input data? How
do you test that the correct configuration has been constructed?
What about the "Do not enable this control if the operating system does not support this
feature"? I have controls that disable on XP but enable on Server 2003.
(Note that I've actually built and delivered code that contains all these features I'm
discussing)
I have a piece of pre-Alpha code--I'm still heavily involved in writing and debugging
it--that would be a serious test of any scripting engine design. If you agree to keep the
current version confidential and not distribute it, I'll even send you the existing code.
(I will be releasing it as open source as soon as it all works). Now your problem would be
to determine how you would check it for correctness, so having once created a testing
script, you could test the final release (or at least the subset of the final release that
corresponds to what I've sent). This represents perhaps 25% of the techniques I have
actually used, so it wouldn't be a full test of all the complex cases, but it would
represent the minimum you would have to achieve.
How about a couple scenarios based on time? For example, one product I deliver requires
that a script be executed at, say, 8am. Can you test to make sure that this script is
indeed executed at 8am? Can you even FIND OUT that such a script exists, given there may
be no display currently active on the screen that indicates this should be so?
And what about the speech output from that app? Can you verify that it actually says
"oh-three-hundred" for 3:00? (Yes, I admit that this program is highly ethnocentric. But
it is a 16 bit Windows program designed for the English-only marketplace). Note that the
Microsoft Text-to-speech engine translates 03:00 as "three" <short pause> "zero". (Oh, you
ask, how do you do TTS in a 16-bit program? The answer is, I don't. I do the TTS interface
in a 32-bit co-process which has its OWN GUI; I prep the text in the 16-bit app and send
the rendered TTS text to the 32-bit app, which also handles network traffic).
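Even the "expected value" side of such a speech test needs code. Below is a minimal sketch in Python of the text preparation step, covering only whole hours. The post confirms a single data point ("oh-three-hundred" for 3:00); the function names and the phrasing for every other hour, midnight in particular, are assumptions made for illustration.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]


def hour_words(hour: int) -> str:
    """Spell out an hour 0..23; single-digit hours get a leading 'oh-'."""
    if hour < 10:
        return "oh-" + ONES[hour]      # phrasing for hour 0 is a guess
    if hour < 20:
        return ONES[hour]
    return "twenty" if hour == 20 else "twenty-" + ONES[hour - 20]


def spoken_time(hhmm: str) -> str:
    """Turn 'HH:MM' into the text handed to the speech engine (whole hours only)."""
    hour_text, minute_text = hhmm.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not 0 <= hour <= 23:
        raise ValueError(f"hour out of range: {hhmm!r}")
    if minute != 0:
        # The post gives no example for non-zero minutes, so this sketch refuses them.
        raise ValueError("only whole hours are covered by this sketch")
    return hour_words(hour) + "-hundred"


three_am = spoken_time("03:00")
```

This only checks the string sent to the engine. Verifying what actually comes out of the speaker is a separate problem.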
For that matter, take a situation where I receive a message via the network. This message
triggers a sequence of actions, some of which have GUI output, but most of which don't. Or
the output is optional depending upon what view is being displayed. Can you test that my
program is responding correctly to input across the network?
For that matter, consider the case of a two-view splitter window. Can you verify that a
change to the contents of one view is properly represented in the other view? Suppose the
other view is a graphical view?
The approach that you can actually read the contents of the text, even assuming that you
can capture the bits, is naive.
Here's one: I have a vertically-scrollable, horizontally-scrollable grid control. You
can't see some of the bits because they are off-window. Can you test my program?
Oh yes, this particular grid control highlights illegal values in red. Can you test that
it is properly highlighting illegal values? Write the constraint equation in your
scripting language. Assume that in order to see the information involved in seeing cell
(to use Excel terminology) L27, I have to use data from cells A3, B7..B9, C22, C23, and
L1..L6. This is a custom third-party grid control (e.g., Stingray, Dundas). Explain
exactly how you plan to test this program under a variety of inputs. How do you write the
script to test that it is behaving correctly? How do you write the script to enter the
data in the first place? Note that entering some data will cause some of the columns to
widen to accommodate the text, while entering other data will create a
horizontally-scrolling edit control. When the data is displayed, if the field is too
short, it might be displayed as "#######", and expect the user to manually widen the
column to make it visible. Or it might just truncate it, perhaps displaying only half a
letter (see what your OCR does with a partial letter). What about the case of ellipses?
Can you test that the data behind the ellipses representation is valid in spite of the fact
you can't see it on the screen?
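Two pieces of that grid scenario can be sketched in Python: expanding the dependency list for a cell, and the three ways a too-narrow column can hide data. Everything here is hypothetical. The `A3, B7..B9` range syntax is simply the notation used above, the function names are invented, and no real grid library (Stingray, Dundas or otherwise) is being modeled.

```python
import re
from typing import List, Tuple

CELL = re.compile(r"^([A-Z]+)(\d+)$")


def parse_cell(name: str) -> Tuple[str, int]:
    """Split a cell name such as 'B7' into its column letters and row number."""
    match = CELL.match(name)
    if match is None:
        raise ValueError(f"not a cell name: {name!r}")
    return match.group(1), int(match.group(2))


def expand(spec: str) -> List[str]:
    """Expand 'A3, B7..B9' into individual cells (ranges within one column only)."""
    cells: List[str] = []
    for part in (piece.strip() for piece in spec.split(",")):
        if ".." in part:
            first, last = part.split("..")
            column, first_row = parse_cell(first)
            last_column, last_row = parse_cell(last)
            if column != last_column:
                raise ValueError(f"range spans columns: {part!r}")
            cells.extend(f"{column}{row}" for row in range(first_row, last_row + 1))
        else:
            parse_cell(part)  # validate only
            cells.append(part)
    return cells


def render_cell(text: str, width: int, overflow: str) -> str:
    """What the user sees when `text` is drawn in a column `width` characters wide."""
    if len(text) <= width:
        return text
    if overflow == "hashes":
        return "#" * width
    if overflow == "truncate":
        return text[:width]
    if overflow == "ellipsis":
        return text[:max(width - 3, 0)] + "..."
    raise ValueError(f"unknown overflow mode: {overflow!r}")


depends_on = expand("A3, B7..B9, C22, C23, L1..L6")   # inputs needed to judge L27
value = "1234567.89"
views = [render_cell(value, 7, mode) for mode in ("hashes", "truncate", "ellipsis")]
```

In all three overflow modes the stored value is intact but cannot be recovered from the rendered text, so a pixel or OCR based check has nothing to compare against.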
*****
>
>I would really appreciate if you can reconsider the changed approach of
>using 'control IDs' with the above 3rd party components. Also, give
>some guidance as to where I should start if I'm using 'control IDs' to
>query the coordinates, type etc. of GUI components.
>
****
GetWindowRect is the way to get the coordinates; it returns them in screen coordinates.
ScreenToClient, called on the parent window (use GetParent), will convert them to client
coordinates relative to that parent.
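For a normal left-to-right window, that conversion amounts to subtracting the screen position of the parent's client-area origin from each corner of the rectangle. The sketch below, in Python, models only that arithmetic with plain tuples. It makes no Win32 calls, the function name is hypothetical, and mirrored (right-to-left) layouts are not handled.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]   # left, top, right, bottom
Point = Tuple[int, int]            # x, y


def screen_rect_to_parent_client(control_rect: Rect,
                                 parent_client_origin: Point) -> Rect:
    """Convert a control's screen rectangle to its parent's client coordinates.

    control_rect:         what a GetWindowRect-style call reports (screen coordinates)
    parent_client_origin: screen position of the parent's client-area (0, 0)
    """
    left, top, right, bottom = control_rect
    origin_x, origin_y = parent_client_origin
    return (left - origin_x, top - origin_y,
            right - origin_x, bottom - origin_y)


# Invented numbers: a button at (300, 250)-(380, 274) on screen, inside a dialog
# whose client area starts at (200, 180) on screen.
client_rect = screen_rect_to_parent_client((300, 250, 380, 274), (200, 180))
```

The width and height are unchanged by the conversion. Only the origin moves, which is why the result is valid only until the parent itself is moved or resized.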
(If you have to ask this question, it strongly suggests that you really have no idea what
you're getting into! One other observation I've made is the less experience a Windows
programmer has, the more confident said programmer is that he or she knows how to write a
GUI scripting language).
****
>Thanks,
>Ishan.
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm