Microsoft KB Archive/239527

= ACC2000: Database Bloats When Importing Large Text File =

Article ID: 239527

Article Last Modified on 6/29/2004

-

APPLIES TO


 * Microsoft Access 2000 Standard Edition

-



This article was previously published under Q239527



Moderate: Requires basic macro, coding, and interoperability skills.

This article applies only to a Microsoft Access database (.mdb).



SYMPTOMS
When you import a large text file into a Microsoft Jet 4.0 database, the file size increases beyond the expected size, based on the size of the text file.



CAUSE
There are two possible causes for this problem, and you may be experiencing both.
 * Your database is using record-level locking. When importing a text file into a database, record-level locking is used incorrectly, which creates a separate 4 kilobyte page in memory for each record that is imported.
 * You are importing the text file into a new table. When you import text data into a new table, the Import wizard does not allow you to specify that the data be compressed. All data that is imported into Access is two-byte, unicode data. If unicode data is not compressed, it requires twice the usual space to store the same information.



RESOLUTION
If the import ran without errors but your database has experienced bloating, you can reclaim the extra space in your database by compacting the database. To compact a database, on the Tools menu, point to Database Utilities, and then click Compact and Repair Database. To avoid the problem in the future, you can follow these suggestions.

Record-level locking causes database bloat
Download the most recent version of the Microsoft Data Access Components 2.1 (MDAC 2.1). This behavior no longer occurs in MDAC version MDAC 2.1.2.4202.3 (GA), which you may also see referred to as MDAC 2.1 SP2, and later. For more information about the most recent MDAC 2.1, visit the following Microsoft Web site:

http://msdn.microsoft.com/dataaccess

Unicode text data being imported uncompressed
Import the text file into an existing table where the UnicodeCompression property for each Text, Memo, and Hyperlink field is set to Yes (Default).



STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.



MORE INFORMATION
Microsoft Access 2000 uses the unicode character-encoding scheme. In unicode, all characters are represented by two bytes, instead of a single byte as in earlier versions of Access. Setting the UnicodeCompression property to Yes offsets the extra space required for two-byte character representation when working with characters that have a 0 (zero) as the first byte. All Latin characters, which includes English, have a 0 (zero) as the first byte and will be compressed.

In addition to reducing the size of text data stored in your database, using unicode compression can also improve the performance of your database. Because the text data is stored compressed, more records can be read into memory at a time, which will reduce the number of times Access has to access the disk drive to get data. Fewer disk-drive reads means faster data access.

