Thursday, June 30, 2011
Tuesday, June 28, 2011
Friday, June 24, 2011
Android Apps .apk file notes
An .apk file is just a .zip file.
The apps are kept in /data/app on the device.
7-Zip can view and extract the contents.
Dedexer (http://dedexer.sourceforge.net) disassembles the .dex files into a Jasmin-like assembly listing; it does not produce .class files.
To copy an .apk off the device:
Open Astro
click tools
use it to back up the application of choice. It stores the .apk file in /sdcard/backups.
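A quick way to convince yourself that an .apk really is a .zip is to look at its first four bytes. The sketch below is only an illustration: the function names has_zip_signature and file_is_zip are made up for these notes, and it assumes the archive begins with a ZIP local file header ("PK" followed by bytes 3 and 4), which is what a normal non-empty archive starts with.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the buffer starts with the ZIP local-file-header
   signature 'P' 'K' 0x03 0x04, otherwise 0. */
int has_zip_signature(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 'P', 'K', 0x03, 0x04 };
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Hypothetical helper: returns 1 if the file starts with the ZIP
   signature, 0 if it does not, and -1 if the file cannot be opened. */
int file_is_zip(const char *path)
{
    unsigned char head[4];
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    got = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return has_zip_signature(head, got);
}
```

Calling file_is_zip() on an .apk copied from /sdcard/backups should return 1 if the assumption above holds.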
Thursday, June 23, 2011
objdump
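Notes on options (the slash-style switches below are those of Microsoft's dumpbin, the Windows counterpart of objdump):
/dependents - DLLs this DLL needs
/imports - routines this DLL needs
/exports - routines this DLL provides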
Windows Calling convention Summary
__cdecl (the default)
- caller pushes parms on stack
- calls function
- caller pops parms off stack
- The decorated name in Windows just has a leading underscore "_".
__stdcall
- also known as WINAPI and PASCAL
- Win32 uses this convention, hence WINAPI
- callee pops parms off stack. This results in smaller code: if a program calls a routine ten times, the stack-pop code appears once in the called function rather than at each of the ten call sites.
- Routines with variable arguments (e.g. printf) are hard to clean up.
- The decorated name has a "_" prefix and an "@n" suffix, where "n" is the number of bytes of parameters passed.
__fastcall
- Uses some registers to pass parameters.
- Time used to save registers cuts down on the benefits.
Windows Calling convention
http://unixwiz.net/techtips/win32-callconv.html
Argument push order?
Calling Conventions
Traditionally, C function calls are made with the caller pushing some parameters onto the stack, calling the function, and then popping the stack to clean up those pushed arguments.
    /* example of __cdecl */
    push arg1
    push arg2
    push arg3
    call function
    add  esp,12    // effectively "pop; pop; pop"
It turns out that Microsoft compilers on Windows (and probably most others) support not just this convention, but two others as well. The technical details are found at Microsoft's MSDN site, but we'll touch on them here as well.
The default convention — shown above — is known as __cdecl.
The other most popular convention is __stdcall. In it the parameters are again pushed by the caller, but the stack is cleaned up by the callee. It is the standard convention for Win32 API functions (as defined by the WINAPI macro in <windows.h>), and it's also sometimes called the "Pascal" calling convention.
    /* example of __stdcall */
    push arg1
    push arg2
    push arg3
    call function
    // no stack cleanup - callee does this
This looks like a minor technical detail, but if there is a disagreement on how the stack is managed between the caller and the callee, the stack will be destroyed in a way that is unlikely to be recovered.
A mismatch in calling convention is catastrophic for a running program.
At first this seems like a not-that-interesting distinction (to many it is in fact not-that-interesting), but there are several implications that arise when considering one or the other.
There is no single "correct" order in which a compiler pushes arguments on the stack: some go left to right, while others go right to left. This is determined by the author of the compiler, often influenced by the underlying architecture.
Our general observation is that machines with a stack that grows down push from right to left, while up-growing stacks push left to right.
But unlike calling conventions, which the user controls, argument push order is defined by the compiler and not subject to the kinds of mismatches we discuss here. Portable code never relies on argument push order.
We'll note that there is also a __fastcall convention that uses registers, but we don't believe it's really that useful in the general case — the save and restore of the registers often removes any speed benefit of using the register for arg passing. We'll only touch on it in passing.
- Since __stdcall does stack cleanup, the (very tiny) code to perform this task is found in only one place, rather than being duplicated in every caller as it is in __cdecl. This makes the code very slightly smaller, though the size impact is only visible in large programs.
- Variadic functions like printf() are almost impossible to get right with __stdcall, because only the caller really knows how many arguments were passed in order to clean them up. The callee can make some good guesses (say, by looking at a format string), but the stack cleanup would have to be determined by the actual logic of the function, not the calling-convention mechanism itself. Hence only __cdecl supports variadic functions so that the caller can do the cleanup.
- There isn't really a "right or wrong" with respect to which one is best, but it is positively fatal to "mix and match". The general principle is "the stack-cleanup must match the arg-pushing", and this only happens when caller and callee know what the other is doing. Calling a function with the "wrong" convention will destroy the stack.
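To make the variadic point concrete, here is a minimal sketch in standard C. The function sum_ints is hypothetical and exists only for illustration. The callee has no built-in knowledge of how many arguments arrived; it trusts the count the caller passes, much as printf() trusts its format string. This is why cleanup is left to the caller, which always knows what it pushed.

```c
#include <stdarg.h>

/* Hypothetical variadic function: 'count' is the callee's only clue
   about how many int arguments follow it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i;
    int total = 0;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* trusts the caller's count */
    va_end(ap);
    return total;
}
```

If a caller passed a count larger than the number of arguments it actually supplied, the function would read garbage, but with caller cleanup the stack pointer would still be restored correctly.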
Linker symbol name decorations
As mentioned in a bullet point above, calling a function with the "wrong" convention can be disastrous, so Microsoft has a mechanism to avoid this from happening. It works well, though it can be maddening if one does not know what the reasons are.
They have chosen to resolve this by encoding the calling convention into the low-level function names with extra characters (which are often called "decorations"), and these are treated as unrelated names by the linker. The default calling convention is __cdecl, but each one can be requested explicitly with the /G? parameter to the compiler.
- __cdecl (cl /Gd ...)
- All function names of this type are prefixed with an underscore, and the number of parameters does not really matter because the caller is responsible for stack setup and stack cleanup. It is possible for a caller and callee to be confused over the number of parameters actually passed, but at least the stack discipline is maintained properly.
- __stdcall (cl /Gz ...)
- These function names are prefixed with an underscore and appended with the number of bytes of parameters passed. By this mechanism, it's not possible to call a function with the "wrong" type, or even with the wrong number of parameters.
- __fastcall (cl /Gr ...)
- These function names start with an @ sign and are suffixed with @ and the number of bytes of parameters, much like __stdcall.
Examples:
Declaration | decorated name |
---|---|
void __cdecl foo(void); | _foo |
void __cdecl foo(int a); | _foo |
void __cdecl foo(int a, int b); | _foo |
void __stdcall foo(void); | _foo@0 |
void __stdcall foo(int a); | _foo@4 |
void __stdcall foo(int a, int b); | _foo@8 |
void __fastcall foo(void); | @foo@0 |
void __fastcall foo(int a); | @foo@4 |
void __fastcall foo(int a, int b); | @foo@8 |
We'll note that the decorated names are never visible to a C program: they are strictly a linker facility, and the linker will never resolve one kind of reference with the "wrong" one.
We can see this in action with a simple program that declares — but does not define — several functions that are not found by the linker.
    C> type testfile.c
    extern void __stdcall func1(int a);
    extern void __stdcall func2(int a, int b, double d);
    extern void __cdecl func3(int b);
    extern void __cdecl func4(int a, int b, double d);

    int __cdecl main(int argc, char **argv)
    {
        func1(1);
        func2(2, 3, 4.);
        func3(5);
        func4(6, 7, 8.0);
        return 0;
    }

    C> cl /nologo testfile.c
    testfile.c
    testfile.obj : error LNK2001: unresolved external symbol _func1@4 ... __stdcall
    testfile.obj : error LNK2001: unresolved external symbol _func2@16 ... __stdcall
    testfile.obj : error LNK2001: unresolved external symbol _func3 ... __cdecl
    testfile.obj : error LNK2001: unresolved external symbol _func4 ... __cdecl
    testfile.exe : fatal error LNK1120: 4 unresolved externals
Note that since a double variable takes eight bytes (not four like an int), the three-parameter func2() is ...@16 instead of ...@12. But both of the __cdecl functions are undecorated in this manner.
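The decoration rules in the table and the @16 arithmetic can be sketched as a small function. This is an illustration only, not how any linker is implemented: the names conv_t, stack_bytes and decorate are invented here, and the sketch assumes 32-bit x86, where each argument occupies a multiple of four bytes on the stack.

```c
#include <stddef.h>
#include <stdio.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } conv_t;

/* Sum the argument sizes, rounding each up to a multiple of 4 bytes
   (assumption: 32-bit x86 stack slots). */
static size_t stack_bytes(const size_t *sizes, size_t count)
{
    size_t i;
    size_t total = 0;

    for (i = 0; i < count; i++)
        total += (sizes[i] + 3) / 4 * 4;
    return total;
}

/* Write the decorated form of 'name' into 'out', following the table:
   __cdecl -> _name, __stdcall -> _name@n, __fastcall -> @name@n. */
int decorate(char *out, size_t outlen, conv_t conv, const char *name,
             const size_t *sizes, size_t count)
{
    unsigned long n = (unsigned long)stack_bytes(sizes, count);

    switch (conv) {
    case CONV_STDCALL:
        return snprintf(out, outlen, "_%s@%lu", name, n);
    case CONV_FASTCALL:
        return snprintf(out, outlen, "@%s@%lu", name, n);
    default:
        return snprintf(out, outlen, "_%s", name);
    }
}
```

For func2(int, int, double) the sizes are 4, 4 and 8, so the __stdcall name comes out as _func2@16, matching the linker error above.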
But doesn't the compiler catch this?
Yes, but the calling-convention decorations are solving a somewhat narrower problem than function prototypes do.
C++ has always supported, and ANSI C introduced, "function prototypes", which allow one to describe the parameters of a function in a declaration (previously, only the return type was part of a declaration). When a function is actually called, it's compared with the declaration, and a warning issued:
    /* somefile.c */
    extern int foo(int a);    // prototype
    ...
    n = foo(1, 2, 3);         // mismatch! bad parameter count!
Here, the compiler expects the foo() function to take just one parameter, and when it sees a few more (or arguments of the wrong type), it objects. The Microsoft calling conventions would add nothing to this.
But when the linker enters the picture, it's possible to see cases where this will arise. Consider two files, one that uses a function and the other that defines it:
    /* in file1.c */
    extern int __cdecl foo(int);
    ....
    n = foo(1);

    /* in file2.c */
    int __stdcall foo(int a)
    {
        ...
    }
Since the compiler never looks at the two source files together, it could never detect that there has been a mismatch in the calling conventions used. The resulting code, if linked, would destroy the stack.
Before one lambasts the programmer for making such a foolish mistake, consider that the default calling convention is usually __cdecl, so even if the file1.c example omitted the declaration for foo(), it would still default upon first use to __cdecl. This is a different (but very common) oversight.
In a small project, the example shown is highly contrived, but as systems get larger, this situation arises more often. It's common to use a third-party library (which exports many functions), and one cannot always tell which compiler flags were used by the library builder.
It's at this point where we get to the real reason for the calling-convention decorations. It's not to keep a programmer from calling a function with the wrong number (or type) of arguments:
Important!
Calling-convention symbol decorations exist only to maintain stack discipline
When does it matter?
In most cases, it makes no difference either which calling convention is used by default throughout the program, or what the convention is on any particular function, but there are a few exceptions of note when using other than __cdecl for the default.
- The function main() (and the wide version wmain()) must always be __cdecl.
- The function WinMain() — the starting point for GUI programs — is always __stdcall, though this is pre-declared by the <windows.h> include file to make this more or less automatic.
- Variadic ("printf-like") functions are __cdecl even if declared otherwise (e.g., the calling convention keyword is ignored). We are surprised that the compiler does not issue a warning against this misuse:
int __stdcall myprintf(const char *fmt, ...); // it's really __cdecl
- Some library functions take addresses of other functions as parameters, and these must be matched properly. A common example is qsort, which takes a "compare" function as the last parameter, and this function must be __cdecl.
    extern void __cdecl qsort(
        void *base,
        size_t num,
        size_t width,
        int (__cdecl *compare)(const void *, const void *)
    );
    ....
    int __stdcall mycompare(const void *p1, const void *p2)
    {
        // compare here
    }
    ....
    qsort(base, n, width, mycompare);    // ERROR - mismatch
We'll note here that the calling convention of the qsort function itself doesn't enter into this; it's the convention of the parameter to qsort that does. This function-address-as-parameter issue also comes up with signal handlers.
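For contrast, here is a minimal sketch of a compare function whose convention does match what qsort expects. The macro CMP_CALL and the functions compare_ints and sort_ints are hypothetical names chosen for this example. The macro expands to __cdecl only under Microsoft's compiler, so the callback stays __cdecl there even if the rest of the program is built with a different default, and the keyword disappears elsewhere.

```c
#include <stdlib.h>

/* Hypothetical macro: force __cdecl on the callback under MSVC,
   expand to nothing on other compilers. */
#if defined(_MSC_VER)
#  define CMP_CALL __cdecl
#else
#  define CMP_CALL
#endif

/* Compare two ints without the overflow risk of "a - b". */
static int CMP_CALL compare_ints(const void *p1, const void *p2)
{
    int a = *(const int *)p1;
    int b = *(const int *)p2;

    return (a > b) - (a < b);
}

/* Sort an array of ints in ascending order. */
void sort_ints(int *base, size_t n)
{
    qsort(base, n, sizeof base[0], compare_ints);
}
```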
Calling functions exported from DLLs
When one builds a system from scratch, it's usually straightforward to coordinate the calling conventions (often at the Makefile level), but an added twist arises when using DLLs provided by third parties, especially if they're written in a different language.
If the only item provided is the .DLL itself, without an associated import library and header file, one must associate the calling convention with the function pointer itself, and the compiler will generally be powerless to provide any real help.
For illustration, we'll work with a hypothetical barcode library BARCODE.DLL, and it provides two functions that we fetch by name, and then call via a pointer:
    typedef BOOL (__stdcall *INITFUNCTION)(BOOL);
    typedef int  (__stdcall *DRAWFUNCTION)(int x, int y, const char *label);

    HINSTANCE hInst = LoadLibrary("barcode.dll");

    INITFUNCTION pfInit = (INITFUNCTION)GetProcAddress(hInst, "Init");
    DRAWFUNCTION pfDraw = (DRAWFUNCTION)GetProcAddress(hInst, "Draw");

    (*pfInit)(TRUE);
    (*pfDraw)(1, 1, "12345");
    (*pfDraw)(1, 2, "67890");
It's important to note that when using a calling convention in the typedef, one must match the convention used by the actual code; a mismatch here will be both undetectable by the compiler and fatal at runtime.
There is no real substitute for checking the documentation provided by the supplier of the library.
Building bigger systems
As mentioned, smaller programs really just don't care much about this, but when systems get larger, or when third-party libraries enter the picture, it becomes necessary to be aware of calling-convention issues (particularly on an inter-module basis). This is further complicated if the software in question must be ported to non-Windows platforms that have no notion of calling conventions.
Even when one has the source to a third-party package (say, the excellent NET-SNMP library), one may still not be too excited about diving into the build system. Though UNIX build systems are almost always created based on "Makefiles", Windows builds often use "project files" that are somewhat less transparent and more ad hoc.
Our preference is to use __stdcall when possible, but it's less important "which convention to use" than it is "all conventions match". We also don't like to insist that all parts be built the same way, so a library could be built mainly with one while the application uses another.
We'll start with the library, and with the first guideline:
Rule #1:
Library headers should explicitly name a calling convention everywhere — Do not rely on the default.
When a library header includes the calling convention on every function, the default value won't ever be considered, so the client application can use whatever it likes for its convention.
For a Win32-only library, it's straightforward enough to simply note the calling convention on every function:
    /* mylibrary.h */
    extern void * __stdcall circalloc(size_t n);
    extern char * __stdcall circdup(const char *s);
    extern char * __cdecl circfmt(const char *fmt, ...);
    extern BOOL __stdcall set_inherit_handle(BOOL bInherit, HANDLE h);
    extern void __stdcall init_timestamp(void);
    extern size_t __stdcall sprintf_timestamp(char *obuf);

    typedef void __stdcall FAILHANDLER(int, const char *, const char *);
    extern FAILHANDLER * __stdcall set_fail_handler(FAILHANDLER *pHdlr);
    ...
Typedefs carry the calling convention (implicit or explicit) along with the type information for function pointers, and in our simple library, we've done this with the FAILHANDLER typedef.
In practice it's not strictly necessary to mark the function definitions with the calling-convention keywords, because if the definition is seen in the presence of the keyword-endowed declaration from the header file, it overrides the default.
    /* somefile.c */
    extern void __stdcall foo1(void);
    ..
    void foo1(void)             // OK - __stdcall taken from the declaration just seen
    {
        ...
    }

    extern void __stdcall foo2(void);
    ...
    void __cdecl foo2(void)     // ERROR - clashes with __stdcall above
    {
        ...
    }

    extern void foo3(void);     // presume __cdecl
    ...
    void __stdcall foo3(void)   // ERROR - clashes with presumed __cdecl
    {
        ...
    }
Most libraries also have internal functions that are not exported or visible to the users, and these need not have the calling conventions noted. The idea is that since the library is built as a whole in one big step, all the parts will share the same default convention even if it's different from how other unrelated modules are built. As long as these functions are not visible to the outside, their conventions are a private matter.
Where this gets trickier is when the library will be used by non-Win32 platforms, almost all of which treat __stdcall and related keywords as syntax errors.
Rule #2:
Use the C preprocessor and a portability-related header file to make this work seamlessly on non-MSVC platforms.
We generally create a "portable.h" header file that contains this (and many other) definitions related to portability. Since the calling conventions are meaningless on non-Windows compilers, all that's required to support them is to make them go away.
    #ifndef _WIN32
    #  define __cdecl    /* nothing */
    #  define __stdcall  /* nothing */
    #  define __fastcall /* nothing */
    #endif /* _WIN32 */
This elides these keywords anytime they are seen on a non-Windows platform, though developers using compilers other than Microsoft Visual C may need to tune the definitions a bit.
We've found it most helpful to define our libraries such that they all have a "portable.h" header file that all others can include to help iron out some of these compiler and platform differences. "Portability" extends to far beyond just the calling conventions, though this is a habit that only "experience" can inform.
As an aside, we'll encourage those taking this approach to add support for the GNU C compiler __attribute__ facility, which can be used to provide high-level information tagging that the compiler uses to perform better error checking. Though the details of __attribute__ itself are not pertinent to our discussion, the "making it work for a non-GNU compiler" fits squarely in our "portable.h" scheme:
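A minimal sketch of what such a "portable.h" fragment might look like follows. The helper format_label is hypothetical and is there only to show the attribute in use. On GNU-compatible compilers the format attribute lets the compiler check the arguments against the format string; on anything else the macro makes the whole annotation disappear.

```c
#include <stdarg.h>
#include <stdio.h>

/* portable.h fragment: compilers that do not define __GNUC__ are
   assumed not to understand __attribute__, so make it vanish. */
#ifndef __GNUC__
#  define __attribute__(x) /* nothing */
#endif

/* Hypothetical printf-like helper. Argument 3 is the format string
   and the values to check start at argument 4. */
int format_label(char *obuf, size_t n, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int format_label(char *obuf, size_t n, const char *fmt, ...)
{
    va_list ap;
    int rc;

    va_start(ap, fmt);
    rc = vsnprintf(obuf, n, fmt, ap);
    va_end(ap);
    return rc;
}
```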
An important goal is that the calling convention applied to the functions at library compile time must be the same that the client application sees when the header file is used. A contorted "portable.h" file that allows for (say) __stdcall to be defined sometimes and not other times on the same platform is asking for trouble.
Note: We've been taken to task for the use of this portable.h technique, suggesting that trying to preprocess out a reserved word may break things. We've never seen this in spite of years of portable coding, but it does seem prudent to consider this if using compilers outside the mainstream.
- It's been suggested that we should key off of _MSC_VER instead of _WIN32
- Rather than #define away reserved words, it's been suggested that portable code should instead invent new symbols and use those in the code, relying on the macros to either insert the calling convention keywords, or not. We're not convinced, but we're not prepared to dismiss this out of hand.
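The second suggestion, combined with the first, might be sketched as follows. All names here (MYLIB_CALL, the simplified two-argument FAILHANDLER, record_failure) are invented for illustration and loosely follow the mylibrary.h example above; this is one possible shape under those assumptions, not a recommendation.

```c
#include <stddef.h>

/* Library-owned symbol instead of #define-ing away a reserved word.
   Keyed off _MSC_VER, as suggested in the first bullet. */
#if defined(_MSC_VER)
#  define MYLIB_CALL __stdcall
#else
#  define MYLIB_CALL /* nothing */
#endif

/* The function type carries the convention along with it. */
typedef void MYLIB_CALL FAILHANDLER(int code, const char *msg);

static FAILHANDLER *current_handler = NULL;
static int last_code = 0;

/* Install a new handler and return the previous one. */
FAILHANDLER * MYLIB_CALL set_fail_handler(FAILHANDLER *pHdlr)
{
    FAILHANDLER *old = current_handler;

    current_handler = pHdlr;
    return old;
}

/* Sample handler that just remembers the last failure code. */
void MYLIB_CALL record_failure(int code, const char *msg)
{
    (void)msg;
    last_code = code;
}
```

The code never mentions __stdcall directly, so nothing has to be preprocessed away on other platforms; the cost is that every exported declaration must remember to use the library's own macro.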
Monday, June 20, 2011
Friday, June 17, 2011
debugging
You can start ddms from the command line. Just run the ddms command and it will pop up a GUI similar to the one in Eclipse. The SDK directory containing ddms must be on the PATH first.
Things to do:
find utility in java similar to find on unix.
In platform-tools there is "adb". "adb devices" shows the android devices attached.
You can use adb in eclipse to take screenshots of the droid and to explore the filesystem.
Installing an app on a device.
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>adb devices
List of devices attached
HT068P900465 device
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>adb -s HT068P900465 install DroidTumbler.apk
254 KB/s (20911 bytes in 0.080s)
pkg: /data/local/tmp/DroidTumbler.apk
Success
Pushing a file to the device and pulling it back.
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>adb -s HT068P900465 push readme.txt /sdcard/mine/foo/readme.txt
2 KB/s (43 bytes in 0.020s)
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>type readme.txt
line1: This is a test
line2: Test line two
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>adb -s HT068P900465 pull /sdcard/mine/foo/readme.txt readme.enc
4 KB/s (43 bytes in 0.010s)
<on android run encryption program on the file.>
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>type readme.enc
↑
C:\Documents and Settings\davis\workspace\DroidTumbler\bin>
Thursday, June 16, 2011
Tridroid Meeting Notes
Code repository for book Beginning Android Games
http://code.google.com/p/beginning-android-games/
Programming Web Site
http://www.mybringback.com/
June 25th google will delete unused blogger accounts.