Microsoft KB Archive/51501

Article ID: 51501

Article Last Modified on 8/16/2005



APPLIES TO

  • Microsoft QuickBasic 4.0
  • Microsoft QuickBasic 4.0b
  • Microsoft QuickBasic 4.5 for MS-DOS
  • Microsoft BASIC Compiler 6.0
  • Microsoft BASIC Compiler 6.0b
  • Microsoft BASIC Professional Development System 7.0
  • Microsoft BASIC Professional Development System 7.1
  • Microsoft Macro Assembler 5.0
  • Microsoft Macro Assembler 5.1 Standard Edition



This article was previously published under Q51501

SUMMARY

This article is Part 1 of 2 of a complete tutorial, with examples, on passing all types of parameters between compiled Basic and assembly language.

The examples in BAS2MASM (but not the tutorial section) are also available in this database as multiple separate ENDUSER articles, which can be found as a group by querying on the word BAS2MASM.

MORE INFORMATION

HOW TO PASS PARAMETERS BETWEEN BASIC AND ASSEMBLY LANGUAGE

This document explains how Microsoft Basic compiled programs can pass parameters to and from Microsoft Macro Assembler (MASM) programs. This document assumes that you have a fundamental understanding of Basic and assembly language.

Microsoft Basic supports calls to routines written in Microsoft Macro Assembler, FORTRAN, Pascal, and C. This document describes the necessary syntax for calling Microsoft assembly-language procedures and contains a series of examples demonstrating the interlanguage calling capabilities between Basic and assembly language. The sample programs apply to the following Microsoft products:

  1. Microsoft QuickBasic versions 4.00, 4.00b, and 4.50 for MS-DOS
  2. Microsoft Basic Compiler versions 6.00 and 6.00b for MS-DOS and MS OS/2
  3. Microsoft Basic Professional Development System (PDS) versions 7.00 and 7.10 for MS-DOS and MS OS/2
  4. Microsoft Macro Assembler (MASM) versions 5.00 and 5.10 for MS-DOS and MS OS/2
  5. Microsoft QuickAssembler versions 2.01 and 2.51 (which are integrated as part of Microsoft QuickC Compiler with QuickAssembler versions 2.01 and 2.51) for MS-DOS

Microsoft Basic can be linked with all versions of MASM or QuickAssembler. However, we recommend that you use the latest version of MASM or QuickAssembler with the examples in this application note.

For more information about interlanguage calling, refer to the "Microsoft Mixed-Language Programming Guide," which is available with C 5.00 and 5.10 and MASM 5.00 and 5.10.

                     MAKING MIXED-LANGUAGE CALLS
                     ===========================

Mixed-language programming always involves a call; specifically, it
involves a function or subprogram call. For example, a Basic main
module may need to execute a specific task that you would like to
program separately. Instead of calling a Basic subprogram, however,
you can call an assembly-language procedure.

Mixed-language calls require multiple modules. Instead of compiling
all of your source modules with the same compiler, you use different
compilers. In the example mentioned above, you would compile the main-
module source file with the Basic compiler, assemble another source
file (written in assembly language) with the assembler, and then link
together the two object files.

There are two types of routines that can be called. Their principal
difference is that some return values, and others do not. (Note: In
this document, "routine" refers to any function or subprogram
procedure that can be called from another module.)

   Note: Basic DEF FN functions and GOSUB subroutines cannot be called
   from another language.

   Basic has a much more complex environment and initialization
   procedure than assembly language. Because of this, Basic must be
   the initial environment that the program starts in, and from there,
   assembly-language routines can be called (which can in turn call
   Basic routines). This means that a program cannot start in assembly
   language and then call Basic routines.


               THE BASIC INTERFACE TO ASSEMBLY LANGUAGE
               ========================================

The Basic DECLARE statement provides a flexible and convenient
interface to assembly language. When you call a routine, the DECLARE
statement syntax is as follows:

   DECLARE {FUNCTION | SUB} <name> [CDECL] [ALIAS "aliasname"]
   [(<parameter-list>)]

The <name> is the name of the function or subprogram that you want to
call as it appears in the Basic source file. The following are the
recommended steps for using the DECLARE statement when calling
assembly language:

1. For each distinct assembly-language routine you plan to call, put a
   DECLARE statement in your Basic source file before the routine is
   called.

2. If you are calling a MASM routine with a name longer than 31
   characters, use the ALIAS feature. The use of ALIAS is explained
   below.

3. Use the parameter list to determine how each parameter is to be
   passed. The use of the parameter list is explained below.

4. Once the routine is properly declared, call it just as you would a
   Basic subprogram or function.


NAMING-CONVENTION REQUIREMENTS
==============================

The term "naming convention" refers to the way that a compiler alters
the name of the routine before placing it into an object file.

It is important that you adopt a compatible naming convention when you
issue a mixed-language call. If the name of the called routine is
stored differently in each object file, then the linker will not be
able to find a match. Instead, it will report an unresolved external.

Microsoft compilers place machine code into object files, but they
also place into object files the names of all routines and common
blocks that need to be accessed publicly. (Note: Basic variables are
never public symbols.) That way, the linker can compare the name of a
routine called in one module to the name of a routine defined in
another module, and recognize a match.

Basic and MASM use the same naming conventions. They both translate
each letter of public names to uppercase. Basic drops the type
declaration character (%, &, !, #, $). Basic recognizes the first 40
characters of a routine name, while MASM recognizes the first 31
characters of a name.


CALLING-CONVENTION REQUIREMENTS
===============================

The term "calling convention" refers to the way that a language
implements a call. The choice of calling convention affects the actual
machine instructions that a compiler generates to execute (and return
from) a function, procedure, or subroutine call.

The use of a calling convention affects programming in two ways:

1. The calling routine uses a calling convention to determine in what
   order to pass arguments (parameters) to another routine. The
   convention can usually be specified in a mixed-language interface.

2. The called routine uses a calling convention to determine in what
   order to receive the parameters that were passed to it. In most
   languages, this convention can be specified in the routine's
   heading. Basic, however, always uses its own convention to receive
   parameters.

Basic's calling convention pushes parameters onto the stack in the
order in which they appear in the source code. For example, the Basic
statement CALL Calc(A, B) pushes argument A onto the stack before it
pushes B. This convention also specifies that the stack is restored by
the called routine just before returning control to the caller. (The
stack is restored by removing parameters.)


USING ALIAS
===========

The use of ALIAS may be necessary because assembly language places the
first 31 characters of a name into an object file, whereas Basic
places up to 40 characters of a name into an object file.

  Note: You do not need the ALIAS feature to remove type
  declaration characters (%, &, !, #, $). Basic automatically
  removes these characters when it generates object code. Thus,
  Fact% in Basic matches FACT in assembly language.

The ALIAS keyword directs Basic to place aliasname into the object
file, instead of <name>. The Basic source file still contains calls to
<name>. However, these calls are interpreted as if they were actually
calls to aliasname. Use ALIAS when a Basic name that is longer than 31
characters must be shared with assembly language, or when the
assembly-language routine name contains characters that are illegal in
a Basic procedure name.

For example:

  DECLARE FUNCTION  QuadraticPolynomialFunctionLeastSquares% _
                    ALIAS "QUADRATI" (a, b, c)

(The trailing underscore is the Basic line-continuation character.)

In the example above, QUADRATI, the aliasname, contains the first
eight characters of the name QuadraticPolynomialFunctionLeastSquares%.
This causes Basic to place QUADRATI into the object file, a name well
within the 31 characters that MASM recognizes. The assembly-language
routine must be made public under that same name.


USING THE PARAMETER LIST
========================

The <parameter-list> syntax is displayed below, followed by
explanations of each field:

   [BYVAL | SEG] <variable> [AS <type>] [, ...]

  Note: You can use BYVAL or SEG, but not both.

Use the BYVAL keyword to declare a value parameter. In each subsequent
call, the corresponding argument will be passed by value.

   Note: Basic provides two ways of "passing by value." The usual
   method of passing by value is to use an extra set of parentheses,
   as in the following:

      CALL HOLM((A))

   This method actually creates a temporary value, whose address is
   passed. In contrast, BYVAL provides a true method of passing by
   value, because the value itself is passed, not an address. Only by
   using BYVAL will a Basic program be compatible with an
   assembly-language routine that expects a value parameter.

Use the SEG keyword to declare a far reference parameter. In each
subsequent call, the far (segmented) address of the corresponding
argument will be passed.

You can choose any legal name for <variable>, but only the type
associated with the name has any significance to Basic. As with other
variables, the type can be indicated with a type declaration character
(%, &, !, #, $) or the implicit declaration.

You can use the "AS type" clause to override the type declaration of
<variable>. The type field can be INTEGER, LONG, SINGLE, DOUBLE,
STRING, a user-defined type, or ANY, which directs Basic to permit any
type of data to be passed as the argument.

For example:

   DECLARE FUNCTION Calc2! (BYVAL a%, BYVAL b%, BYVAL c!)

In the example above, Calc2! is declared as an assembly-language
routine that takes three arguments: the first two are integers passed
by value, and the last is a single-precision real number passed by
value.


ALTERNATIVE BASIC INTERFACES
============================

You can also specify parameter-passing methods without using a DECLARE
statement, or by using a DECLARE statement that omits the parameter
list. Either of the following techniques can be used:

1. You can make the call with the CALLS statement. The CALLS statement
   causes each parameter to be passed by far reference.

2. You can use the BYVAL and SEG keywords in the actual parameter list
   when you make the call, as follows:

      CALL Fun2(BYVAL Term1, BYVAL Term2, SEG Sum)

In the example above, BYVAL and SEG have the same meaning that they
have in a Basic DECLARE statement. When you use BYVAL and SEG this
way, however, you need to be careful because neither the type nor the
number of parameters will be checked as they would be in a DECLARE
statement.


              SETTING UP THE ASSEMBLY-LANGUAGE PROCEDURE
              ==========================================

The linker cannot combine the assembly-language procedure with the
calling program unless compatible segments are used and the procedure
itself is declared properly. The following points may be helpful:

1. If you have version 5.00 or later of the Macro Assembler, use the
   .MODEL directive at the beginning of the source file; this
   directive automatically causes the appropriate return to be
   generated (NEAR for small or compact model, FAR otherwise). Modules
   called from Basic should be declared as .MODEL MEDIUM. If you have
   a version of the assembler earlier than 5.00, declare the procedure
   FAR.

2. If you have version 5.00 or later of the Microsoft Macro Assembler
   (MASM), use the simplified segment directives .CODE to declare the
   code segment and .DATA to declare the data segment. (Having a code
   segment is sufficient if you do not have data declarations.) If you
   are using an earlier version of the assembler, the SEGMENT, GROUP,
   and ASSUME directives must be used.

3. The procedure label must be declared public with the PUBLIC
   directive. This declaration makes the procedure available to be
   called by other modules. Also, any data you want to make public to
   other modules must be declared as PUBLIC.

4. Global data or procedures accessed by the routine must be declared
   EXTRN. The safest way to use EXTRN is to place the directive
   outside any segment definition (however, near data must go inside
   the data segment).
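
A module that follows points 1 through 3 might look like the sketch
below. The names GetCount and Count are hypothetical, and the Basic
declaration in the comments is an assumption, not part of the original
article. The sketch assumes MASM 5.00 or later and relies on two
conventions: DS points to DGROUP when Basic calls the routine, and a
2-byte function result is returned in AX.

```asm
; Basic side (assumed):
;    DECLARE FUNCTION GetCount% ()
;    PRINT GetCount%

        .MODEL MEDIUM           ; point 1: FAR returns are generated

        .DATA                   ; point 2: data segment (in DGROUP)
Count   DW    0                 ; number of calls made so far

        .CODE                   ; point 2: code segment
        PUBLIC GetCount         ; point 3: make the label public
GetCount PROC
        inc   Count             ; DS already addresses DGROUP
        mov   ax, Count         ; integer result goes back in AX
        ret                     ; no parameters, so a plain far RET
GetCount ENDP
        END
```

Each call from Basic would return one more than the previous call,
because Count lives in the module's own data segment between calls.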


PRESERVING REGISTERS
====================

There are several registers that need to be preserved in a mixed-
language program. These registers are as follows:

   CX, BX
   BP, SI, DI, SP
   CS, DS, SS, ES

The direction flag should also be preserved.


ENTERING THE ASSEMBLY-LANGUAGE PROCEDURE
========================================

The following two instructions begin the procedure:

   push   bp
   mov    bp, sp

This sequence establishes BP as the "framepointer." The framepointer
is used to access parameters and local data, which are located on the
stack. SP cannot be used for this purpose because it is not an index
or base register. Also, the value of SP may change as more data is
pushed onto the stack. However, the value of the base register BP will
remain constant throughout the procedure, so that each parameter can
be addressed as a fixed displacement off of BP.

The instruction sequence above first saves the value of BP because it
will be needed by the calling procedure as soon as the current
procedure terminates. Then BP is loaded with the value of SP to
capture the value of the pointer at the time of entry to the
procedure.


ALLOCATING LOCAL DATA (OPTIONAL)
================================

An assembly-language procedure can use the same technique for
implementing local data that is used by high-level languages. To set
up local data space, decrease the contents of SP in the third
instruction of the procedure. (To ensure correct execution, you should
always increase or decrease SP by an even amount.) Decreasing SP
reserves space on the stack for the local data. The space must be
restored at the end of the procedure, as shown below:

   push   bp
   mov    bp, sp
   sub    sp, space

In the text above, space is the total size in bytes of the local data.
Local variables are then accessed as fixed, negative displacements off
of BP.

For example:

   push   bp
   mov    bp, sp
   sub    sp, 4
      .
      .
      .
   mov    WORD PTR [bp-2], 0
   mov    WORD PTR [bp-4], 0

The example above uses two local variables, each of which is 2 bytes
in size. SP is decreased by 4, since there are 4 bytes of local data.
Later, each of the variables is initialized to 0 (zero). These
variables are never formally declared with any assembler directive;
the programmer must keep track of them manually.

Local variables are also called dynamic, stack, or automatic
variables.


EXITING THE PROCEDURE
=====================

Several steps may be involved in terminating the procedure:

1. If any of the registers SS, DS, SI, etc., have been saved, these
   must be popped off the stack in the reverse order that they were
   saved.

2. If local data space was allocated at the beginning of the
   procedure, SP must be restored with the instruction MOV SP, BP.

3. Restore BP with POP BP. This step is always necessary.

4. Finally, if you are not using CDECL and the C calling conventions,
   return to the calling program with the RET <n> instruction (where
   <n> is the number of bytes to pop off the stack) to adjust the
   stack with respect to the parameters that were pushed by the
   caller.
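
The entry, local-data, and exit steps fit together roughly as in the
following sketch. The routine name Skeleton and its Basic declaration
are hypothetical; the body only copies its parameter into a local
variable so that every step has something to act on.

```asm
; Basic side (assumed):
;    DECLARE SUB Skeleton (BYVAL n AS INTEGER)

        .MODEL MEDIUM
        .CODE
        PUBLIC Skeleton
Skeleton PROC
        push  bp                ; entry sequence
        mov   bp, sp
        sub   sp, 4             ; two 2-byte locals: [bp-2], [bp-4]
        push  si                ; saved registers sit below the locals
        push  di

        mov   ax, [bp+6]        ; n: 2 (saved BP) + 4 (far return)
        mov   WORD PTR [bp-2], ax
        mov   WORD PTR [bp-4], 0
                                ; (the real work would go here)

        pop   di                ; step 1: pop in reverse order
        pop   si
        mov   sp, bp            ; step 2: release the local data
        pop   bp                ; step 3: restore the caller's BP
        ret   2                 ; step 4: remove the 2-byte parameter
Skeleton ENDP
        END
```

Because MOV SP, BP discards everything below the saved BP, the saved
registers must be popped before that instruction, not after it.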


ASSEMBLY-LANGUAGE CALLS TO BASIC
================================

No Basic routine can be executed unless the main program is in Basic,
because a Basic routine requires the environment to be initialized in
a way that is unique to Basic. MASM will not perform this special
initialization.

However, a program can start up in Basic and call an assembly-language
function that does most of the work of the program; that function can
then call Basic subprograms and functions as needed.

The following rules are recommended when you call Basic from assembly
language:

1. Start up in a Basic main module. You must use the DECLARE statement
   to provide an interface to the assembly-language module.

2. In the assembly-language module, declare the Basic routine as
   EXTRN.

3. Make sure that all data is passed as a near pointer. Basic can pass
   data in a variety of ways, but is unable to receive data in any
   form other than near reference.

   Note: With near pointers, the program assumes that the data is
   in the default data segment. If you want to pass data that is
   not in the default data segment, then first copy the data to a
   variable that is in the default data segment.

   Note: Microsoft Basic Professional Development System (PDS)
   version 7.10 allows a Basic routine to be passed parameters by
   value.


THE MICROSOFT SEGMENT MODEL
===========================

If you use the simplified segment directives by themselves, you do not
need to know the names assigned for each segment. However, versions of
the Macro Assembler earlier than 5.00 do not support these directives.
With earlier versions of the assembler, you should use the SEGMENT,
GROUP, ASSUME, and ENDS directives equivalent to the simplified
segment directives.

The following table shows the default segment names created by the
.MODEL MEDIUM directive used with Basic. Use of these segments ensures
compatibility with Microsoft languages and will help you access public
symbols. This table is followed by a list of three steps, illustrating
how to make the actual declarations, and a sample program.

   Directive   Name         Align     Combine   Class     Group
   ---------   ----         -----     -------   -----     -----

   .CODE       name_TEXT    WORD      PUBLIC    'CODE'
   .DATA       _DATA        WORD      PUBLIC    'DATA'    DGROUP
   .CONST      CONST        WORD      PUBLIC    'CONST'   DGROUP
   .DATA?      _BSS         WORD      PUBLIC    'BSS'     DGROUP
   .STACK      STACK        PARA      STACK     'STACK'   DGROUP

The directives in the table refer to the following kinds of segments:

   Directive      Description of Segment
   ---------      ----------------------

   .CODE          The segment containing all the code for the module.

   .DATA          Initialized data.

   .DATA?         Uninitialized data. Microsoft compilers store
                  uninitialized data separately because it can be more
                  efficiently stored than initialized data. (Note:
                  Basic does not use uninitialized data.)

   .FARDATA and
   .FARDATA?      Data placed here will not be combined with the
                  corresponding segments in other modules. The segment
                  of data placed here can always be determined,
                  however, with the assembler SEG operator.

   .CONST         Constant data. Microsoft compilers use this segment
                  for such items as string and floating-point
                  constants.

   .STACK         Stack. Normally, this segment is declared in the
                  main module for you and should not be redeclared.

The following steps describe how to use this table to create
directives:

1. Refer to the table to look up the segment name, align type, combine
   type, and class for your code and data segments. Use all of these
   attributes when you define a segment. For example, the code segment
   is declared as follows:

      test_TEXT    SEGMENT   WORD PUBLIC 'CODE'

   The attributes are taken from the table. The name follows the
   name_TEXT pattern shown there; test_TEXT is used here to match the
   sample program below.

2. If you have segments in DGROUP, put them into DGROUP with the GROUP
   directive, as in the following:

      DGROUP   GROUP     _DATA, _BSS

3. Use ASSUME and ENDS as you would normally. Upon entering routines
   called directly from Basic, DS and SS will both point to DGROUP.

The following example shows an assembly-language program without the
simplified segment directives from version 5.00 of the Microsoft Macro
Assembler:

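  test_TEXT  SEGMENT WORD PUBLIC 'CODE'
             ASSUME   cs:test_TEXT
             PUBLIC Power2
  Power2     PROC FAR            ; Basic always makes a far call
             push bp
             mov bp, sp
             push cx             ; CX is on the list to preserve

             mov ax, [bp+6]      ; second argument (expects BYVAL)
             mov cx, [bp+8]      ; first argument (expects BYVAL)
             shl ax, cl          ; AX = second arg * 2^(first arg)

             pop cx
             pop bp
             ret 4               ; remove the two 2-byte arguments
  Power2     ENDP
  test_TEXT  ENDS
             END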
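
For comparison, with MASM 5.00 or later the same routine could be
written with the simplified segment directives, roughly as sketched
below. The .MODEL MEDIUM directive then supplies the segment name,
the segment attributes, the ASSUME, and the FAR procedure type. The
Basic declaration in the comment is an assumption; the article does
not give one for Power2.

```asm
; Basic side (assumed):
;    DECLARE FUNCTION Power2% (BYVAL Count%, BYVAL Value%)

        .MODEL MEDIUM
        .CODE
        PUBLIC Power2
Power2  PROC                    ; FAR by default in medium model
        push  bp
        mov   bp, sp
        push  cx                ; CX is on the list to preserve

        mov   ax, [bp+6]        ; Value (second argument)
        mov   cx, [bp+8]        ; Count (first argument)
        shl   ax, cl            ; AX = Value * 2^Count

        pop   cx
        pop   bp
        ret   4                 ; remove the two 2-byte arguments
Power2  ENDP
        END
```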


                        COMPILING AND LINKING
                        =====================

After you have written your source files and resolved the issues
raised in the above sections, you are ready to compile individual
modules and then link them together.

Before linking, each program module must be compiled or assembled with
the appropriate compiler or assembler.


                         ACCESSING PARAMETERS
                         ====================

PARAMETER-PASSING REQUIREMENTS
==============================

Microsoft compilers support three methods for passing a parameter:

   Method         Description
   ------         -----------

   Near reference Passes a variable's near (offset) address. This
                  method gives the called routine direct access to the
                  variable itself. Any change the routine makes to the
                  parameter will be reflected in the calling routine.

   Far reference  Passes a variable's far (segmented) address. This
                  method is similar to passing by near reference,
                  except that a longer address is passed.

   By value       Passes only the variable's value, not address. With
                  this method, the called routine knows the value of
                  the parameter, but has no access to the original
                  variable. Changes to the value parameter have no
                  effect on the value of the parameter in the calling
                  routine, once the routine terminates.

Because there are different parameter-passing methods, please note the
following:

1. Make sure that the called routine and the calling routine use the
   same method for passing each parameter (argument). In most cases,
   you will need to check the parameter-passing defaults used by each
   language, and possibly make adjustments. Each language has keywords
   or language features that allow you to change the parameter-passing
   method.

2. You may want to use a particular parameter-passing method rather
   than using the default for the language.


BASIC ARGUMENTS
===============

The default for Basic is to pass all arguments by near reference. This
can be overridden by using the SEG directive or CALLS instead of CALL.
Both of these methods cause Basic to pass both the segment and offset.
These methods can be used only to call a non-Basic routine because
Basic receives all parameters by near reference.

   Note: Although Basic can pass parameters to other languages by far
   reference by using the SEG keyword or CALLS, other languages can
   call Basic routines only when the parameters are passed by near
   reference. You cannot DECLARE or CALL a Basic routine with
   parameters that have SEG or BYVAL attributes. SEG and BYVAL are
   used only for parameters of non-Basic routines.

   Note: Basic PDS version 7.10 allows a Basic routine to be passed
   parameters by value.


BASIC STACK FRAME
=================

The following diagram illustrates the Basic stack frame as it appears
upon entry to the assembly-language routine:

           High Addresses

          +--------------------+
        A |   Arg 1 address    | <-- BP + 8
          |--------------------|
        B |   Arg 2 address    | <-- BP + 6
          |--------------------|
          |   Return address   |     BP + 4
          |      (4 bytes)     |     BP + 2
          |--------------------|
          |      Saved BP      | <-- BP
          +--------------------+

           Low Addresses


ASSEMBLY-LANGUAGE ARGUMENTS
===========================

Once you have established the procedure's framepointer, allocated
local data space (if desired), and pushed any registers that need to
be preserved, you can write the main body of the procedure. To write
instructions that can access parameters, consider the general picture
of the stack frame after a procedure call, as illustrated in the
following figure:


          High Addresses

                    +------------------+
                    |    Parameter     |
                    |------------------|
                    |    Parameter     |
                    |------------------|
                    |        .         |
                    |        .         |     Everything from the
                    |        .         |     return address up is
       Stack grows  |------------------|     generated automatically
       downward with|    Parameter     |     by the compiler.
       each push or |------------------|
       call         |  Return Address  |
                    |------------------|
                    |     Saved BP     | <-- Framepointer (BP)
                    |------------------|     points here.
                    | Local Data Space |     Everything from the
                    |------------------|     saved BP down is
                    |     Saved SI     |     generated by your
                    |------------------|     assembly-language code.
                    |     Saved DI     | <-- SP points to last
                    +------------------+     item placed on
                                             stack.

          Low Addresses


The stack frame for the procedure is established by the following
sequence of events:

1. The calling program pushes each of the parameters on the stack,
   after which SP points to the last parameter pushed.

2. The calling program issues a CALL instruction, which causes the
   return address (the place in the calling program to which control
   will ultimately return) to be placed on the stack.  This address
   may be either 2 bytes long (for near calls) or 4 bytes long (for
   far calls).  SP now points to this address.  (Note: When dealing
   with Basic, the return address will always be a far address [4
   bytes].)

3. The first instruction of the called procedure saves the old value
   of BP, with the instruction PUSH BP.  SP now points to the saved
   copy of BP.

4. BP is used to capture the current value of SP, with the instruction
   MOV BP, SP.  Therefore, BP now points to the old value of BP.

5. Whereas BP remains constant throughout the procedure, SP may be
   decreased to provide room on the stack for local data or saved
   registers.

In general, the displacement (off of BP) for a parameter X is equal to
the following:

   2  + size of return address
      + total size of parameters between X and BP

For example, consider a FAR procedure (all Basic procedures are FAR)
that has received one parameter, a 2-byte address. The displacement of
the parameter would be as follows:

   Argument's displacement     = 2 + size of return address
                               = 2 + 4
                               = 6

The argument can thus be loaded into BX with the following
instruction:

   mov bx, [bp+6]

Once you determine the displacement of each parameter, you may want to
use string equates or structures so that the parameters can be
referenced with a single identifier name in your assembly-language
source code. For example, the parameter above at bp+6 can be
conveniently accessed if you put the following statement at the
beginning of the assembly-language source file:

   Arg1   EQU  [bp+6]

You could then refer to this parameter as Arg1 in any instruction. Use
of this feature is optional.
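
The displacement rule and the equates can be combined as in the
following sketch for a routine with two near-reference parameters.
AddInto, ArgA, and ArgB are hypothetical names, and the Basic
declaration in the comments is an assumption. Each parameter is a
2-byte near address, so the first one pushed lies 2 bytes above the
second.

```asm
; Basic side (assumed):
;    DECLARE SUB AddInto (a AS INTEGER, b AS INTEGER)
;    CALL AddInto(x%, y%)        ' intended effect: x% = x% + y%

        .MODEL MEDIUM
        .CODE

ArgB    EQU   [bp+6]            ; 2 (saved BP) + 4 (far return)
ArgA    EQU   [bp+8]            ; 2 + 4 + 2 (ArgB lies in between)

        PUBLIC AddInto
AddInto PROC
        push  bp
        mov   bp, sp
        push  bx                ; BX is on the list to preserve

        mov   bx, ArgB          ; BX = near address of b
        mov   ax, [bx]          ; AX = value of b
        mov   bx, ArgA          ; BX = near address of a
        add   [bx], ax          ; a = a + b, changed in the caller

        pop   bx
        pop   bp
        ret   4                 ; remove two 2-byte addresses
AddInto ENDP
        END
```

Registers pushed after MOV BP, SP do not change the displacements,
because BP stays fixed for the rest of the procedure.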


PASSING BASIC ARGUMENTS BY VALUE
================================

An argument is passed by value when the called routine is first
declared with a DECLARE statement, and the BYVAL keyword is applied to
the argument. For example:

   DECLARE SUB AssemProc (BYVAL a AS INTEGER)


PASSING BASIC ARGUMENTS BY NEAR REFERENCE
=========================================

The Basic default is to pass by near reference. Use of SEG, BYVAL, or
CALLS changes this default.


PASSING BASIC ARGUMENTS BY FAR REFERENCE
========================================

Basic passes each argument in a call by far reference when CALLS is
used to invoke a routine. Using SEG to modify a parameter in a
preceding DECLARE statement also causes a Basic CALL to pass
parameters by far reference.

   Note: CALLS cannot be used to call a routine that is named in a
   DECLARE statement. For this reason, the use of the SEG directive is
   the preferred method of passing variables by far reference.
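
On the assembly-language side, a far reference occupies 4 bytes on
the stack, with the offset at the lower address and the segment
above it. The sketch below mixes one BYVAL parameter with one SEG
parameter to show the resulting displacements. AddTo and its Basic
declaration are hypothetical.

```asm
; Basic side (assumed):
;    DECLARE SUB AddTo (BYVAL n AS INTEGER, SEG total AS INTEGER)
;    CALL AddTo(5, sum%)         ' intended effect: sum% = sum% + 5

        .MODEL MEDIUM
        .CODE
        PUBLIC AddTo
AddTo   PROC
        push  bp
        mov   bp, sp
        push  es                ; ES and BX are on the list of
        push  bx                ; registers to preserve

        mov   ax, [bp+10]       ; n: 2 + 4 + 4 (far address below it)
        les   bx, DWORD PTR [bp+6]  ; ES:BX = far address of total
        add   es:[bx], ax       ; total = total + n

        pop   bx
        pop   es
        pop   bp
        ret   6                 ; 2 bytes for n + 4 for the address
AddTo   ENDP
        END
```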


                              DATA TYPES
                              ==========

NUMERICAL FORMATS
=================

Numerical data formats are the simplest kinds of data to pass between
assembly language and Basic. The following chart shows the equivalent
data types in each language:

   Basic        Assembly Language
   -----        -----------------

   x%, INTEGER  DW
   ...          DB, DF, DT    <-- These are not available in Basic.
   x&, LONG     DD
   x!, SINGLE   DD
   x#, DOUBLE   DQ


USER-DEFINED TYPES
==================

The elements in a user-defined type are stored contiguously in memory,
one after the other. When a Basic user-defined type appears in an
argument list, Basic passes the address of the beginning element of
the user-defined type.

The routine that receives the user-defined type must know the format
of the type beforehand. The assembly-language routine should then
expect to receive a pointer to a structure of this type.
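
One way to keep the two layouts in step is to mirror the Basic TYPE
with a MASM STRUC, as in the following sketch. PointType, px, py,
and SwapXY are hypothetical names, and the Basic code in the comments
is an assumption. The fields must appear in the same order and with
the same sizes on both sides.

```asm
; Basic side (assumed):
;    TYPE PointType
;       px AS INTEGER
;       py AS INTEGER
;    END TYPE
;    DECLARE SUB SwapXY (p AS PointType)

        .MODEL MEDIUM

PointType STRUC                 ; same order and sizes as the TYPE
px      DW    ?
py      DW    ?
PointType ENDS

        .CODE
        PUBLIC SwapXY
SwapXY  PROC
        push  bp
        mov   bp, sp
        push  bx                ; BX is on the list to preserve

        mov   bx, [bp+6]        ; near address of the first element
        mov   ax, [bx].px       ; AX = p.px
        xchg  ax, [bx].py       ; p.py = old px, AX = old py
        mov   [bx].px, ax       ; p.px = old py

        pop   bx
        pop   bp
        ret   2                 ; remove the 2-byte address
SwapXY  ENDP
        END
```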


BASIC STRING FORMATS
====================

Near Variable-Length Strings
----------------------------

Variable-length strings in Basic have 4-byte string descriptors:

        +------------------+------------------+
        |      Length      | Address (offset) |
        +------------------+------------------+
              (2 bytes)          (2 bytes)

The first field of the string descriptor contains a 2-byte integer
indicating the length of the actual string text. The second field
contains the address of the text. This address is an offset into the
default data area (DGROUP) and is assigned by Basic's string-space
management routines. These management routines need to be available to
reassign this address whenever the length of the string changes, yet
the routines are available only to Basic. Therefore, an assembly-
language routine should not alter the length or address of a Basic
variable-length string.

   Note: Fixed-length strings do not have a string descriptor.


Passing Variable-Length Strings from Basic
------------------------------------------

When a Basic variable-length string (such as A$) appears in an
argument list, Basic passes a string descriptor rather than the string
data itself.

   Warning: When you pass a string from Basic to assembly language,
   the called routine should under no circumstances alter the length
   or address of the string.

The routine that receives the string must be aware that if any Basic
routine is called, Basic's string-space management routines may change
the location of the string data without warning. In this case, the
values in the string descriptor may change, so the receiving routine
should not rely on a length or address that it read from the
descriptor before the call.

The Basic functions SADD and LEN extract parts of the string
descriptor. SADD extracts the address of the actual string data, and
LEN extracts the length. The results of these functions can then be
passed to an assembly-language routine.

Basic should pass the result of the SADD function by value. Bear in
mind that the string's address, not the string itself, will be passed
by value. This amounts to passing the string itself by reference. The
Basic module passes the string address, and the other module receives
the string address. The address returned by SADD is declared as type
INTEGER, but is actually equivalent to a near pointer.

There are two methods for passing a variable-length string from Basic
to assembly language. The first method is to pass the string address
and string length as separate arguments, using the SADD and LEN
functions. The second method is to pass the string descriptor itself,
with a call statement such as the following:

   CALL AsmRoutine(A$)

The assembly-language routine should then expect to receive a pointer
to a string descriptor of this type.
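
The Basic side of both methods might look like the following sketch.
The routine names AddrLen and ByDesc are hypothetical; each would have
to be supplied by an assembly-language module linked with the program.

   DECLARE SUB AddrLen (BYVAL Addr AS INTEGER, BYVAL Length AS INTEGER)
   DECLARE SUB ByDesc (S AS STRING)

   A$ = "Sample text"

   ' Method 1: address and length as two separate by-value arguments.
   CALL AddrLen(SADD(A$), LEN(A$))

   ' Method 2: the routine receives the near address of the 4-byte
   ' string descriptor of A$.
   CALL ByDesc(A$)

With the first method the routine never sees the descriptor, which
makes it harder to alter the descriptor by accident.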


Passing Near String Descriptors from Assembly Language
------------------------------------------------------

To pass an assembly-language string to Basic, first allocate a string
in assembly language. Then create a structure identical to a Basic
string descriptor. Pass this structure by near reference. Make sure
that the string originates in assembly language, not in Basic.
Otherwise, Basic may attempt to move the string around in memory.

   Warning: Microsoft does not recommend creating your own string
   descriptors in assembler functions because it is very easy to
   inadvertently destroy portions of the data segment. The Basic
   routine should not reassign the value or length of a string passed
   from assembly language.

The preferred method is to create the strings in Basic and then modify
their contents in the assembler function without altering their string
descriptors.


Far Variable-Length Strings
---------------------------

Microsoft Basic Professional Development System (PDS) versions 7.00
and 7.10 allow for the use of far strings. Information on using far
strings with other languages is covered in the "Microsoft Basic 7.0:
Programmer's Guide," in Chapter 13, "Mixed-Language Programming with
Far-Strings."


Fixed-Length Strings
--------------------

Fixed-length strings in Basic are stored simply as contiguous bytes of
characters, with no terminating character. There is no string
descriptor for a fixed-length string.

To pass a fixed-length string to a routine, the string must be put
into a user-defined type. For example:

   TYPE FixType
       A AS STRING * 10
   END TYPE

The string is then passed like any other user-defined type.
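
A minimal sketch of the complete Basic side follows. The routine name
FixRtn and the variable name Rec are hypothetical; FixRtn would have
to be supplied by an assembly-language module.

   TYPE FixType
       A AS STRING * 10
   END TYPE

   DECLARE SUB FixRtn (F AS FixType)

   DIM Rec AS FixType
   Rec.A = "ABCDEFGHIJ"

   ' Passed by near reference: the routine receives the offset of the
   ' 10 bytes of text. There is no descriptor and no terminator.
   CALL FixRtn(Rec)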


ARRAYS
======

There are several special problems that you need to be aware of when
passing arrays between Basic and assembly language:

1. Arrays are implemented differently in Basic than in other
   languages, so you must take special precautions when passing an
   array from Basic to assembly language.

2. Arrays are declared differently in assembly language and Basic.

3. Because Basic uses an array descriptor, passed arrays must be
   created in Basic.


Passing Arrays from Basic
-------------------------

To pass an array to an assembly-language routine, pass the address of
only the base (first) element. The remaining elements are stored
contiguously after it.


Passed Arrays Must Be Created in Basic
--------------------------------------

Basic keeps track of all arrays in a special structure called an array
descriptor. The array descriptor is unique to Basic and is not
available in any other language. Because of this, to pass an array
from assembly language to Basic, the array must first be created in
Basic, then passed to the assembly-language routine. The assembly-
language routine may then alter the values in the array, but it cannot
change the length of the array.

The array descriptor is similar in some respects to a string
descriptor. The array descriptor is necessary because Basic may shift
the location of array data in memory. Therefore, you can safely pass
arrays from Basic only if you follow rules 1 and 2 below; rule 3
describes an alternative to which those precautions do not apply:

1. Pass the array's address by applying the VARPTR function to the
   first element of the array and passing the result by value. To pass
   the far address of the array, apply both the VARPTR and VARSEG
   functions and pass each result by value. The assembler gets the
   address of the first element and considers it the address of the
   entire array.

2. The routine that receives the array must not, under any
   circumstances, make a call back to Basic. If it does, then the
   location of the array may change, and the address that was passed
   to the routine will become meaningless.

3. Basic can pass any member of an array by value. With this method,
   the above precautions do not apply.
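
The three rules above can be sketched on the Basic side as follows.
The routine names NearArr, FarArr, and OneElem are hypothetical and
would have to be supplied by an assembly-language module. The near
call also assumes that the array is stored in DGROUP.

   DEFINT A-Z
   DECLARE SUB NearArr (BYVAL Offset AS INTEGER)
   DECLARE SUB FarArr (BYVAL Segment AS INTEGER, BYVAL Offset AS INTEGER)
   DECLARE SUB OneElem (BYVAL Value AS INTEGER)

   DIM Arr(1 TO 10)

   ' Near address of the first element, passed by value.
   CALL NearArr(VARPTR(Arr(1)))

   ' Far address: segment and offset, each passed by value.
   CALL FarArr(VARSEG(Arr(1)), VARPTR(Arr(1)))

   ' A single element passed by value; no address is involved.
   CALL OneElem(Arr(3))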


Array Ordering
--------------

There are two types of ordering: row-major and column-major.

Basic uses column-major ordering, in which the leftmost dimension
changes fastest. When you use Basic with the BC command line, you can
select the /R compile option, which specifies that row-major order is
to be used, rather than column-major order.
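
As an illustration of column-major order, the following sketch prints
the byte offset of each element of a two-dimensional INTEGER array
relative to its first element. It assumes lower bounds of 0, 2-byte
elements, and that /R is not used. The constant names are
hypothetical.

   DEFINT A-Z
   CONST ROWS = 2, COLS = 3, ELEMSIZE = 2

   ' Models DIM A(0 TO ROWS - 1, 0 TO COLS - 1) of INTEGER elements.
   FOR J = 0 TO COLS - 1
      FOR I = 0 TO ROWS - 1
         ' The leftmost subscript (I) changes fastest in memory.
         Offset = (J * ROWS + I) * ELEMSIZE
         PRINT "A("; I; ","; J; ") is at byte offset"; Offset
      NEXT I
   NEXT J

The elements are listed in the order A(0,0), A(1,0), A(0,1), A(1,1),
A(0,2), A(1,2), at offsets 0, 2, 4, 6, 8, and 10. An assembly-language
routine that receives the address of the first element can use the
same formula to locate any other element.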


COMMON BLOCKS
=============

You can pass individual members of a Basic COMMON block in an argument
list, just as you can any data. However, you can also give an
assembly-language routine access to the entire COMMON block at once.

Assembly language can reference the items of a COMMON block by first
declaring a structure with fields that correspond to the COMMON block
variables. Having defined a structure with the appropriate fields, the
assembly-language routine must then get the address of the COMMON
block.

To pass the address of the COMMON block, pass the address of the first
variable in the block. The assembly-language routine should expect to
receive a structure by reference.

For named COMMON blocks, there is an alternative method. In the
assembly-language program, a segment is set up with the same name as
the COMMON block and then grouped with DGROUP, as follows:

   BNAME SEGMENT COMMON 'BC_VARS'
      x dw 1 dup (?)
      y dw 1 dup (?)
      z dw 1 dup (?)
   BNAME ENDS

   DGROUP GROUP BNAME

The above assembler code matches with the following Basic code using a
named COMMON block:

   DEFINT A-Z
   COMMON /BNAME/ x,y,z

Passing arrays through the COMMON block is done in a similar fashion.
However, only static arrays can be passed to assembler through COMMON.

   Note: Microsoft does not support passing dynamic arrays through
   COMMON to assembler (since this depends upon a Microsoft
   proprietary dynamic array descriptor format that changes from
   version to version). Dynamic arrays can be passed to assembler only
   as parameters in a CALL statement.

When static arrays are used, the entire array is stored in the COMMON
block.
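
On the Basic side, a static array is placed in COMMON by dimensioning
it with constant subscripts before the COMMON statement. The following
is a minimal sketch that extends the named block shown earlier with a
hypothetical five-element array named Table. A matching assembler
segment would then need to reserve five words after x and y.

   DEFINT A-Z
   DIM Table(1 TO 5)            ' static: constant bounds, before COMMON
   COMMON /BNAME/ x, y, Table()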

Note that variables in COMMON that follow STRING*n variables, where n
is odd, are aligned on the next word (even-address) boundary. Thus,
you must define an extra dummy byte using db 1 in the assembler code
following STRING*n variables (where n is odd). A dummy byte is not
necessary after STRING*n variables when n is even.

        HOW TO RETURN VALUES FROM ASSEMBLY-LANGUAGE FUNCTIONS
        =====================================================

Assembler "functions" are not called with the CALL statement; they are
invoked on the right-hand side of an equal sign (=) in compiled Basic.
When Basic calls an assembly-language function, either the return
value or a pointer to the return value is passed back in the AX
register (DX:AX for a LONG), as shown in the following chart:

   Data Type      How Value Is Returned
   ---------      ---------------------

   INTEGER        The value is placed in AX.

   LONG           The high-order portion is placed in DX. The low-order
                  portion is placed in AX.

   SINGLE         The value is placed in the location provided by
                  Basic. The segment is DS. Basic pushes an extra
                  parameter on the stack, after all the other
                  parameters, that contains the offset of the memory
                  location that is to hold the return value. The
                  offset, located at BP+6, should be placed in AX
                  before the function exits.

   DOUBLE         The value is placed in the location provided by
                  Basic. The segment is DS. Basic pushes an extra
                  parameter on the stack, after all the other
                  parameters, that contains the offset of the memory
                  location that is to hold the return value. The
                  offset, located at BP+6, should be placed in AX
                  before the function exits.

   VARIABLE-
   LENGTH STRING  Pointer to a descriptor (offset in AX).

   Note: Basic does not allow functions that return a fixed-length-
   string type or a user-defined type.
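
On the Basic side, the return type is given by the type suffix in the
DECLARE FUNCTION statement. A minimal sketch follows. The function
names AddInts and Scale are hypothetical and would have to be supplied
by an assembly-language module.

   DECLARE FUNCTION AddInts% (BYVAL A AS INTEGER, BYVAL B AS INTEGER)
   DECLARE FUNCTION Scale! (BYVAL N AS INTEGER)

   ' INTEGER result: the routine leaves the value in AX.
   Total% = AddInts%(3, 4)

   ' SINGLE result: the routine stores the value at the offset that
   ' Basic pushed last, then returns that offset in AX.
   Ratio! = Scale!(10)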

                  DEBUGGING MIXED-LANGUAGE PROGRAMS
                  =================================

Microsoft CodeView is very useful when trying to debug mixed-language
programs. With CodeView you can trace through the source code of both
assembly language and Basic and watch variables in both languages.

To compile programs for use with CodeView, use the /Zi switch on the
compile line for both the assembler and the Basic compiler. Then when
linking, use the /CO switch.

CodeView is a multilanguage source code debugger supplied with
Microsoft Basic Compiler versions 6.00 and 6.00b; Microsoft Basic
Professional Development System (PDS) versions 7.00 and 7.10;
Microsoft C Optimizing Compiler versions 5.00 and 5.10; Microsoft
Macro Assembler versions 5.00 and 5.10; and Microsoft FORTRAN Compiler
versions 4.00 and 5.00.

              COMPILING AND LINKING THE SAMPLE PROGRAMS
              =========================================

The following is a series of examples, demonstrating the interlanguage
calling capabilities between Basic and assembler.

When compiling the sample Basic programs, use the following compile
line:

   BC /O Basicprogramname;

When compiling the sample MASM programs, use the following compile
line for MASM 5.00 or 5.10:

   MASM Assemprogramname;

Or, use the following compile line for QuickAssembler 2.01:

   QCL Assemprogramname;

To link the programs together, use the following LINK line:

   LINK Basicprogramname Assemprogramname;

   Note: All the examples using variable-length strings assume the use
   of near variable-length strings. These examples will not work in
   the QuickBasic Extended (QBX.EXE) environment, or when compiling
   with the BC /Fs option (far strings), in Microsoft Basic
   Professional Development System (PDS) versions 7.00 and 7.10.

                   APPENDIX A: MISCELLANEOUS TOPICS
                   ================================

Basic SUPPORTS MASM 5.10 UPDATE .MODEL AND PROC EXTENSIONS
==========================================================

Microsoft Macro Assembler (MASM) version 5.10 includes several new
features (not found in MASM version 5.00 or earlier) that simplify
assembly-language routines linked with high-level-language programs.
Two of these features are as follows:

1. An extension to the .MODEL directive that automatically sets up
   naming, calling, and return conventions for a given high-level
   language. For example:

      .MODEL MEDIUM,Basic

2. A modification of the PROC directive that handles most of the
   procedure entry automatically. The PROC directive saves specified
   registers, defines text macros for passed arguments, and generates
   stack setup code on entry and stack tear-down code on exit.

Section 5 of the "Microsoft Macro Assembler Version 5.1 Update" manual
discusses the new features.

PROBLEM CALLING ASSEMBLER ROUTINE WITH LABEL ON END DIRECTIVE
=============================================================

A QuickBasic .EXE program will hang at run time if it is LINKed to an
assembly-language routine that uses a label on the END directive. The
same programs execute successfully when run inside the QB.EXE editor
with the assembly-language routine in a Quick library.

Although versions of QuickBasic prior to version 4.00 allow a label on
the END directive in a LINKed assembly-language program, versions 4.00
and 4.00b require that the assembly-language END directive have no
label.

When the linker creates an executable program, it successively
examines each .OBJ file and determines whether that file has a
specified entry point. The first .OBJ file that specifies an entry
point is assumed by the linker to be the main program, and program
execution begins there.

In the assembly-language routines, the purpose of a label with an END
directive is to indicate to the linker the program's starting address
or entry point (where program execution is to start). Therefore, if no
entry point is found in the QuickBasic routine, program execution will
begin in the assembly-language routines (in effect, the Basic code is
totally bypassed).

In previous versions of the QuickBasic compiler, the QuickBasic object
code contained an entry-point specifier. Therefore, when the
QuickBasic object files were listed before the assembly-language
object files on the LINK command line, the linker recognized that the
QuickBasic program was the main program.

However, in QuickBasic version 4.00, the entry-point information is no
longer in the object file; instead, it resides in the run-time module
(for example, BCOM40.LIB or BRUN40.LIB). Because these files are
LINKed after the Basic and assembly-language .OBJ files, if the
assembly-language routine specifies an entry point, the linker will
incorrectly assume that program execution is to begin in the assembly-
language routine.

Results of testing with previous versions of QuickBasic indicate that
the programs run successfully both inside the editor and as .EXE files
when compiled with versions 2.00, 2.01, and 3.00 of the QuickBasic
compiler.

There are two workarounds to correct this problem in version 4.00:

1. Remove the label on the END directive (that is, remove the entry-
   point specification in your assembly-language routine) and
   reassemble.

2. If the assembly-language routine cannot be changed, place its .OBJ
   module into a .LIB file and link with that library. The module can
   then be used successfully without removing the label from the END
   directive.

ASSEMBLER ROUTINES MUST NOT ASSUME ES EQUALS DS
===============================================

If CALLed assembler routines do string manipulation and use the ES
register, then the results inside the QB.EXE editor may differ from
the executable .EXE program if the assembler routines assume the ES
and DS registers are equal.

The ES and DS registers should not be assumed to be equal in
QuickBasic versions 4.00 and later.

Generally, the ES and DS registers are equal for the executable
program; however, this is not always a valid assumption. The assembler
routines must explicitly set ES equal to DS, as shown in the code
example below.

The following assembler code sets ES equal to DS:

      push bp
      mov bp, sp
      push es         ;save the caller's es
      push ds         ;these two lines set the
      pop es          ;es register equal to ds
        .
        .             ;body of program
        .
      pop es          ;at end of program need to
      pop bp          ;restore saved registers
      ret

QUICK LIBRARY WITH 0 (ZERO) BYTES IN FIRST CODE SEGMENT
=======================================================

A Quick library containing leading zeros in the first CODE segment is
invalid, causing the message "Error in loading file <name> - Invalid
format" when you try to load it in QuickBasic. For example, this error
can occur if an assembly-language routine puts data that is
initialized to 0 (zero) in the first CODE segment, and it is
subsequently listed first on the LINK command line when you make a
Quick library. If you have this problem, do either of the following:

1. Link with a Basic module first on the LINK command line.

-or-

2. In whatever module comes first on the LINK command line, make sure
   that the first code segment starts with a nonzero byte.
                

This article is continued in the following article in the Microsoft Knowledge Base:

ARTICLE-ID: Q71275 TITLE : "How to Pass Parameters Between Basic and Assembly" (Part 2/2)


Additional query words: QuickBas

Keywords: KB51501